Lead Site Reliability Engineer
-
Location:
United States (Remote)
-
Sector:
-
Job type:
-
Salary:
260,000 - 290,000
-
Contact:
Stephen Tepsick
-
Contact email:
s.tepsick@hamlynwilliams.com
-
Job ref:
SRE-210
-
Published:
8 months ago
-
Expiry date:
2021-12-31
-
Startdate:
ASAP
An exciting late-stage, consumer-facing platform company, top 10 in daily active users (DAUs). This is an opportunity to keep the platform operational at one of the most revolutionary tech companies of our time.
Responsibilities & Deliverables
Your deliverables as a Lead Site Reliability Engineer will include, but are not limited to, the following:
- Lead Site Reliability Process and Technical Management of the Cloud Native Platforms
- Design and Architecture of SRE Processes
- Oversee Change Management
- Heavy Automation
- SLA Reviews
- Creating PWORs
- Post Mortem Reviews
- Resiliency Improvements
- Azure Platform
Required Skills & Experience
- A bachelor's or master's degree in IT (preferably computer science)
- 10+ years of experience in software development
- Expert in Java or another object-oriented language
- HA and Highly Scalable Design
- Experience implementing SRE standards for resiliency and scalability of Java/Node.js-based microservices in the cloud
- Experience implementing development/delivery practices
Mandatory Skills
- Failure Mode Analysis of Architectures
- Root Cause Analysis of Incidents using Incident Post Mortems
- Working knowledge of Cloud IaaS & PaaS Platforms in Azure
- Kubernetes
- Monitoring Cloud Platform services
- Azure DevOps CD Tools
- Agile