SRE - Senior Site Reliability Engineer
Since 1993, EPAM Systems, Inc. (NYSE: EPAM) has leveraged its advanced software engineering heritage to become the foremost global digital transformation services provider, leading the industry in digital and physical product development and digital platform engineering services. Through its innovative strategy; integrated advisory, consulting and design capabilities; and unique ‘Engineering DNA’, EPAM’s globally deployed hybrid teams help make the future real for clients and communities around the world by powering better enterprise, education and health platforms that connect people, optimize experiences, and improve people’s lives. EPAM was selected by Newsweek as a 2021 Most Loved Workplace.
EPAM’s 61,300 employees work in global multi-disciplinary teams that serve customers in more than 50 countries across five continents.
As a recognized leader, EPAM is listed among the top 15 companies in Information Technology Services on the Fortune 1000 and ranked as the top IT services company on Fortune’s 100 Fastest-Growing Companies list for the last three consecutive years.
EPAM is also listed among Ad Age’s top 25 World’s Largest Agency Companies, and in 2020 Consulting Magazine named EPAM Continuum a top 20 Fastest-Growing organization.
- Ability to rapidly and effectively understand and translate requirements into technical solutions.
- Ability to reason about performance, security, and process interactions in complex distributed systems. Passionate about managing operational risk.
- Ability to work effectively as part of a diverse multi-disciplined team.
- Motivated and self-organized, with good time and work management skills.
- 5 to 9 years of experience required.
- Required: a Systems Engineer with a development background and an understanding of Kubernetes and containers:
- Good knowledge of Infrastructure (networking, operating systems)
- Good knowledge of Linux
- Good knowledge of Kubernetes and Docker
- Good debugging skills
- Skills to handle operational issues
- Strong skills in at least one of Python, Bash, or PowerShell
- Strong problem-solving and analytical skills, and a good grasp of algorithms
- Familiarity with cloud monitoring and an understanding of the SLI concept
- Ability to communicate technical concepts effectively, both in writing and orally, as well as the interpersonal skills required to collaborate effectively with colleagues across diverse technology teams and locations.
- Familiarity with any cloud provider (especially GCP or Azure)
- Identify, craft, and maintain SLIs and SLOs for teams, as well as metrics such as MTTR, Lead Time for Change, Deployment Frequency, and Change Failure Rate
- Ability to work with application teams to set up observability and telemetry.
- Experience with any SRE tool, preferably Grafana, Dynatrace, or Splunk
Nice to have
- Package management solutions like Nix, Apt, Yum
- Nice to have experience working with Windows
- Nice to have knowledge of CI/CD (especially Azure DevOps)
- Nice to have knowledge of Kubernetes
- Nice to have knowledge of Istio
- Nice to have knowledge of GitOps tools (like ArgoCD)
- Insurance Coverage
- Paid leave, including maternity, paternity, bereavement, and special COVID-19 leave.
- Financial assistance for medical crisis
- Retiral Benefits – VPF and NPS
- Customized Mindfulness and Wellness programs
- EPAM Hobby Clubs
- Hybrid Work Model
- Soft loans to set up workspace at home
- Stable workload
- Relocation opportunities with ‘EPAM without Borders’ program
- Certification trainings for technical and soft skills
- Unlimited access to the LinkedIn Learning platform
- Access to internal learning programs set up by world-class trainers
- Community networking and idea creation platforms
- Mentorship programs
- Self-driven career progression tool
Send us your CV to get a personalized offer.