#301392 • Source: EPAM Systems
Site Reliability Engineer
Experience
Senior
Location
—
Work mode
Remote
Employment type
Full-time
CI/CD, Kubernetes, Docker
About the offer
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field
- 3+ years of hands-on experience in Site Reliability Engineering or related roles
- Proven experience with at least one cloud platform (AWS, GCP, or Azure)
- Experience implementing SRE practices such as SLOs/SLIs, error budgets, postmortems, toil reduction, capacity planning, and incident management
- Proficiency in Python or another scripting/programming language
- Strong background in monitoring tools
- Proficiency in CI/CD tools, infrastructure as code, and configuration management
- Solid knowledge of containerization and container orchestration technologies (Docker, Kubernetes)
- English language proficiency at an Upper-Intermediate level (B2) or higher
Nice to have:
- Expertise in deploying and managing LLMs, including techniques such as retrieval-augmented generation (RAG)
- Certification in Kubernetes, AWS/GCP/Azure, or similar technologies
- Proven experience in DevOps
- Knowledge of managing and optimizing AI/ML models in production environments, including basic deployment, monitoring, and maintenance
Responsibilities
- Collaborate with development, security, quality, and operations teams to implement SRE practices and ensure system reliability
- Define and support the required levels of reliability, availability, and performance for services and applications
- Design and deliver Cloud-based solutions tailored to client needs
- Troubleshoot and mitigate infrastructure and application issues, and support their resolution, in a timely manner
- Implement monitoring for infrastructure and application reliability
- Communicate technical concepts clearly to both engineering teams and management stakeholders
Benefits
- Engineering community of industry professionals
- Friendly team and enjoyable working environment
- Flexible schedule and opportunity to work remotely within Poland
- Chance to work abroad for up to 60 days annually
- Business-driven relocation opportunities
- Outstanding career roadmap
- Leadership development, career advising, soft skills, and well-being programs
- Certification (GCP, Azure, AWS)
- Unlimited access to LinkedIn Learning, Get Abstract, and Cloud Guru
- English classes
- Stable income (Employment Contract or B2B)
- Participation in the Employee Stock Purchase Plan
- Benefits package (health insurance, multisport, shopping vouchers)
- Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
- Referral bonuses
- Corporate, social, and well-being events