Head of DevOps / AI Ops
No salary information
C-Level / Manager · Full-time
#318395 · Added about a month ago · 27
Source: Blue Media
Tech Stack / Keywords
DevOps, AI, AWS, GCP, Cloud, CI/CD, MuleSoft, Security
Company and position
Autopay Global is the newest member of the Autopay family, aiming to expand the reach of the group’s state-of-the-art payment integration and payment data technologies to the international market, providing seamless integration with local PSPs, support for multiple currencies and compliance with local frameworks.
Requirements
- 10+ years in DevOps/SRE/platform engineering with startups; 3-5+ years leading SRE/DevOps teams for multi-service production systems in payments or banking
- Proven experience in IT audits and compliance, including industry standards (e.g., PCI DSS) and regulatory requirements from banking/payment authorities
- Hands-on expertise running Kubernetes in production (EKS and/or GKE) with strong networking fundamentals (VPC design, private connectivity, TLS, DNS)
- Deep experience with cloud-native observability (metrics, logs, tracing) and building actionable alerting and on-call hygiene
- Proven implementation of SLOs, error budgets, capacity planning, and DR for high-availability services
- Strong security engineering instincts: IAM design, secrets management, encryption, secure CI/CD, and audit logging
- Experience operating real-time data systems (Kafka/MSK/Confluent, Pub/Sub, stream processing) and API-heavy platforms
- Operating ML/LLM inference or agent runtimes in production: rollout/rollback, safe configuration, monitoring and alerting
- Implementing evaluation gates for models/agents (offline regression, golden sets, canaries) and closing the loop with production feedback signals
- Monitoring for drift and regression (data drift, embedding drift, tool-call failure rates, latency regressions) and establishing kill switches for rapid containment
- Predictive capacity & resource optimization leveraging AI/ML
Nice to have:
- Experience with MuleSoft Anypoint and multi-cloud data movement (AWS and GCP)
Responsibilities
- Define and operationalize SLOs for critical services
- Own incident management: on-call structure, alerting standards, runbooks, postmortems, and corrective action tracking
- Build and maintain the multi-cloud runtime platform (EKS/GKE or equivalent)
- Establish a paved road for engineers: standardized service templates, CI/CD pipelines, IaC modules, environment management, and production readiness checks
- Implement cloud-native observability across AWS and GCP, including metrics, logs, alarms and dashboards using respective cloud platform tools
- Establish unified cross-cloud telemetry conventions
- Integrate and use the Dynatrace Platform where suitable
- Drive governance, monitoring, and optimization of integrations using MuleSoft Anypoint, enabling secure and reliable connectivity with external services
- Harden security posture: least-privilege IAM, key management, perimeter controls and secure CI/CD
- Own multi-cloud cost controls and capacity planning for peak campaign traffic and paid media bursts
- Lead AI Ops: implement model/agent versioning, evaluation suites, rollout and rollback; monitor model/agent quality and enforce guardrails
- Partner with Engineering, Data, Security, and Compliance to meet SOC2 and PCI-aligned operational controls
- Hire and lead a high-performing AI Ops organization
Offer
- Leadership role in a fast-growing, global fintech company
- Possibility to work with cutting-edge tools and technologies
- Independence in decision-making
- Friendly working environment, team support, no dress code