Full Job Description
Cloud Infrastructure and Automation Team Security Incident and Event Manager
In 2019, HP Print launched a broad initiative to create a unified User-Centric Digital Ecosystem (UCDE), with the following mission: lead the creation of HP’s pan-print platform, upon which world-class solutions and experiences are developed and deployed, providing continuous delivery of value to our customers. The first suite of services, HP+, was released to the market in fall 2020.
Our suite of cloud-enhanced solutions spans print devices, mobile apps, web front ends, and multiple cloud-hosted services that work in concert to provide a diverse set of solutions to HP’s print customers. The Cloud Infrastructure and Automation organization (CIA) provides operational support services to solution teams across this ecosystem.
For this role, we are hiring a technical manager who can ensure all performance incidents are rapidly managed to full restoration via an incident “war room”. In addition, every event will go through a follow-up retrospective and remediation review to identify and assign work that eliminates or minimizes future occurrences. This role will work with on-call DevOps and with the engineering teams supporting solutions and services, and will help ensure a world-class ecosystem that sustains and scales all solutions dependent on those core services.
This position is for a SW program manager who will focus on operational reliability and response to performance issues in production across HP’s UCDE cloud platform. The engineer will collaborate with SW developers, Quality engineers, Ops engineers, and Technical Program Managers to ensure our services meet high standards for scale, performance, robustness, uptime, responsiveness, and agility. To meet these requirements, the engineer will collaborate with managers and technical leads to refine and evolve HP’s event response strategy so that HP’s uptime performance goals are met. The engineer in this role will develop a broad knowledge of the system architecture to ensure that monitoring and response processes match defined Key Performance Indicators and Service Levels. The engineer will also be involved in reporting overall system performance to program owners and Sr. Leadership.
The candidate will need the ability to think at a cross-team, cross-service level and to have (or build) the breadth of knowledge needed to support a very broad group of teams across India, South Korea, the US, Canada, and Brazil, all of which contribute code to the UCDE services space.
The candidate will also need a solid understanding of how the interconnected services within UCDE can impact one another in ways that result in a poor customer-facing experience. Working across multiple ecosystem partners, this role will drive with an urgency that matches the impact of the event to ensure rapid system restoration, and will ensure that status is communicated during and after the event, from the engineering level to executives and, in some cases, customers.
- Bachelor’s degree in a technical discipline (CS or closely related degree)
Experience and knowledge:
- 5+ years of experience dealing with production-level events
- Demonstrated SW development experience
- Experience with web development/deployment technologies (e.g., HTML, HTTP, REST APIs, .NET/Java, XML/JSON, Apigee, GitOps, AWS, Docker, Kubernetes, etc.)
- Cloud operations monitoring tools experience (Splunk, CloudWatch, Grafana, etc.)
- Strong teamwork/team building, leadership, communication, collaboration & partnering skills
- SW development lifecycle, quality processes, agile development
- Proven problem solving & decision-making capabilities
- Significant Web/Cloud development/deployment experience
- Amazon Web Services experience
- DevOps experience
- Significant Cloud Ops monitoring experience
- IoT development experience
- Print industry experience
- Presentation and documentation skills
- History of using data analysis and analytics to isolate and scope issues
- Address: Vancouver, WA, USA
- Salary Offer: $50,000 ~ $100,000
- Experience Level: Manager
- Total Years Experience: 5-10
- Academic Degree: Bachelor's