Antal Sp. z o.o.

Site Reliability Engineer

~21 840 - 35 280 PLN / month

Mid · Full-time

#299438 · Added 2 months ago · 56

Source: Antal

Tech Stack / Keywords

Cloud, Microservices, Architecture, Google Cloud Platform, Java, Spring Boot, Spring, Apache

Company and position

Antal is a leading recruitment and HR advisory company, present in Poland since 1996 and later expanded to the Czech Republic and Hungary. Across the CEE region, we employ around 150 professionals who deliver a full range of services – from specialist and executive recruitment, employee outsourcing and HR consulting, to employer branding and market research.

Our division-based structure combines deep industry expertise with functional specialisation, enabling us to provide tailored solutions for companies in every sector. We act as a trusted partner for both employers and candidates, sharing our knowledge and guiding them through every stage of the talent journey. We connect exceptional people with the right opportunities and help organisations build successful teams.

Requirements

4+ years of experience supporting and/or developing distributed systems (Java-based environments)
Strong troubleshooting and analytical skills
Experience with disaster recovery processes
Hands-on experience with application lifecycle and CI/CD tooling (JIRA, Confluence, Jenkins, Ansible)
Experience supporting complex, cross-platform systems (Java / Python environments)
Knowledge of Agile/Kanban delivery models
Experience implementing monitoring and logging frameworks (e.g. Grafana, InfluxDB, Prometheus, Splunk, Loki or similar)
Basic knowledge of relational databases (Oracle, PostgreSQL)
Understanding of cloud platforms (preferably GCP)
Familiarity with Unix/Linux environments
Ability to lead technical discussions with global support teams
Strong communication skills and ability to work across regions

Technical Requirements:

Core Java knowledge
Application support experience
Monitoring tools (Grafana, InfluxDB, Prometheus or similar)
Basic cloud knowledge (GCP preferred)
Automation tools (Jenkins, Ansible)
Knowledge of relational databases (Oracle, PostgreSQL)

Responsibilities

Manage application support operations with a focus on resiliency, availability and performance
Coordinate production incident resolution and conduct post-mortems / root cause analysis
Investigate and resolve complex production issues across distributed systems
Contribute to continuous service improvement and knowledge base documentation
Actively engage in Incident, Problem and Service Management processes
Apply SRE principles to enhance reliability, scalability and observability
Develop and improve monitoring, alerting and incident detection mechanisms
Support hybrid cloud environments and automation initiatives
Work in a 2-shift rotation (8:00 AM start / 4:00 PM start)
Participate in weekend and on-call rotations

Offer

Opportunity to work on business-critical global risk platforms
Participation in a large-scale cloud and architecture transformation
Modern technology stack and DevOps culture
Hybrid working model (2 days per week in Kraków office)
Long-term project within a stable, global financial environment
