Listing #301392, source: EPAM Systems
EPAM Systems

Site Reliability Engineer

Experience

Senior

Location

Work mode

Remote

Employment type

Full-time

CI/CD, Kubernetes, Docker

About the offer

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field
  • 3+ years of hands-on experience in Site Reliability Engineering or related roles
  • Proven experience with at least one cloud platform (AWS, GCP, or Azure)
  • Experience implementing SRE practices such as SLOs/SLIs, error budgets, postmortems, toil reduction, capacity planning, and incident management
  • Proficiency in Python or another scripting/programming language
  • Strong background in monitoring tools
  • Proficiency in CI/CD tools, infrastructure as code, and configuration management
  • Solid knowledge of containerization and container orchestration technologies (Docker, Kubernetes)
  • English language proficiency at an Upper-Intermediate level (B2) or higher

Nice to have:

  • Expertise in deploying and managing LLMs, including techniques such as retrieval-augmented generation (RAG)
  • Certification in Kubernetes, AWS/GCP/Azure, or similar technologies
  • Proven experience in DevOps
  • Knowledge of managing and optimizing AI/ML models in production environments, including basic deployment, monitoring, and maintenance

Responsibilities

  • Collaborate with development, security, quality, and operations teams to implement SRE practices and ensure system reliability
  • Define and support the required levels of reliability, availability, and performance for services and applications
  • Design and deliver Cloud-based solutions tailored to client needs
  • Troubleshoot and mitigate infrastructure and application issues, and support their resolution in a timely manner
  • Implement monitoring for infrastructure and application reliability
  • Communicate technical concepts clearly to both engineering teams and management stakeholders

Benefits

  • Engineering community of industry professionals
  • Friendly team and enjoyable working environment
  • Flexible schedule and opportunity to work remotely within Poland
  • Chance to work abroad for up to 60 days annually
  • Business-driven relocation opportunities
  • Outstanding career roadmap
  • Leadership development, career advising, soft skills, and well-being programs
  • Certification (GCP, Azure, AWS)
  • Unlimited access to LinkedIn Learning, getAbstract, and A Cloud Guru
  • English classes
  • Stable income (Employment Contract or B2B)
  • Participation in the Employee Stock Purchase Plan
  • Benefits package (health insurance, multisport, shopping vouchers)
  • Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
  • Referral bonuses
  • Corporate, social, and well-being events