Job Description
You Are:
A self-starter who is highly collaborative, open, transparent, and bottom-line focused. To be successful, this individual must bring both the energy and the experience required to drive business-aligned change in a complex environment, while building a strong service relationship with their peers in IT. They will work closely with their peers, as well as the broader technology organization, to support the highest levels of quality, availability, and reliability.
Day to Day:
- Review AWS billing to make sure the charges accurately reflect the company's compute usage.
- Define, implement, and support a well-governed infrastructure capacity and performance process, supported by forecasting and demand management activities, which ensures consistent service performance, avoids urgent and unplanned investments, and provides consistent information for proactive decision-making.
- Transform capacity and performance activities from reactive to proactive, enabling improved visibility for service providers and consumers.
- Promote the use of current tools, and drive the evaluation of new tools, to capitalize on predictive capabilities and anomaly detection.
- Collaborate across technical domains to harmonize the collection, analysis, and reporting of performance and capacity data.
- Drive the continuous review of capacity and performance metrics, mining the data for optimization opportunities that lower unit costs while limiting incremental risk.
- Avoid unplanned and urgent upgrades that could undermine budgets, the planning process, and service owners' credibility.
- Minimize performance degradation that could impact the business so severely that business processing stops.
- Support transformation programs, such as data center consolidation, which can be planned with greater precision and future-proofing when mature capacity planning is practiced.
- Provide service providers with timely capacity, performance and fault analysis.
- Alert on anomalies in performance and capacity before they escalate into service-impacting outages.
- Ensure that key metrics are measured and that data collection is consistent across all platforms.
- Work closely with the observability team to ensure end-to-end visibility and rationalize enterprise monitoring tools.
- Take responsibility for Configuration Items (CIs), CI relationships, and CMDB integration to ensure normalization of data across technical domains and service management.
- Ensure monitoring is in line with SLAs/SLTs.
- Provide engineering analysis of pending environment changes to ensure SLAs can be maintained.
- Support the definition of SLAs, KPIs, and forecasting (demand management) measures for delivering Cloud services.
- Partner with adjacent infrastructure and application domain engineers to ensure cohesive, end-to-end solutions meet business objectives in a cost-effective way.
- Lead by example and take advantage of peer coaching opportunities to share knowledge and experience with others.
- Provide third-level support to operational teams so that Incident and Problem Management feeds continuous improvement.
- Support a culture of strong accountability by reinforcing the need for leveraging consistent inputs and outputs between plan, build, and run functions.
- Establish and maintain excellent partnerships with key internal providers of server, storage, network, platform services (middleware & database), systems management, incident, problem, and change management, and other key IT services.
- Work with application support, enterprise testing and quality assurance to leverage and incorporate their shared services into the infrastructure capacity and performance process.
- Support the attraction, development, and retention of talent through mentoring and leading by example.
You Have:
- A solid understanding of AWS or another cloud provider's infrastructure service offerings.
- Bachelor’s degree in computer science or a related field, or equivalent technical experience.
- 15+ years of professional experience, preferably with a depth of IT experience that includes operations, engineering, and architecture services in large-scale enterprises, along with 10+ years of experience providing infrastructure engineering services.
- Demonstrated knowledge of key trends and disruptors
- Strong product and solution knowledge with a proven ability to drive technology and culture change.
- Excellent knowledge of ITIL and service-based delivery models with 24x7x365 operations.
- Prior experience identifying the need, building the business case, gaining executive support, designing, implementing, and supporting a modern infrastructure capacity and performance management process is highly desirable.
- Demonstrated ability to understand and decompose enterprise systems, methodically analyze complex problems, and provide insightful and actionable recommendations.
- Strong experience with VMware, VCE (vBlock), EMC, NetApp, Cisco, Microsoft, and Red Hat.
- Excellent track record of taking a structured, logical, and methodical approach to problem solving, data gathering, and analysis.
Job Tags
Immediate start