Head of DevOps / AI Ops

No salary information
C-Level / Manager · Full-time
#318395 · Posted about a month ago · 27
Source: Blue Media

Tech Stack / Keywords

DevOps, AI, AWS, GCP, Cloud, CI/CD, MuleSoft, Security

Company and position

Autopay Global is the newest member of the Autopay family, aiming to expand the reach of the group’s state-of-the-art payment integration and payment data technologies to the international market, providing seamless integration with local PSPs, support for multiple currencies and compliance with local frameworks.


Requirements

  • 10+ years in DevOps/SRE/platform engineering with startups; 3-5+ years leading SRE/DevOps teams for multi-service production systems in payments or banking
  • Proven experience in IT audits and compliance, including industry standards (e.g., PCI DSS) and regulatory requirements from banking/payment authorities
  • Hands-on expertise running Kubernetes in production (EKS and/or GKE) with strong networking fundamentals (VPC design, private connectivity, TLS, DNS)
  • Deep experience with cloud-native observability (metrics, logs, tracing) and building actionable alerting and on-call hygiene
  • Proven implementation of SLOs, error budgets, capacity planning, and DR for high-availability services
  • Strong security engineering instincts: IAM design, secrets management, encryption, secure CI/CD, and audit logging
  • Experience operating real-time data systems (Kafka/MSK/Confluent, Pub/Sub, stream processing) and API-heavy platforms
  • Experience operating ML/LLM inference or agent runtimes in production: rollout/rollback, safe configuration, monitoring and alerting
  • Experience implementing evaluation gates for models/agents (offline regression, golden sets, canaries) and closing the loop with production feedback signals
  • Experience monitoring for drift and regression (data drift, embedding drift, tool-call failure rates, latency regressions) and establishing kill switches for rapid containment
  • Experience with predictive capacity and resource optimization using AI/ML

Nice to have:

  • Experience with MuleSoft Anypoint and multi-cloud data movement (AWS and GCP)

Responsibilities

  • Define and operationalize SLOs for critical services
  • Own incident management: on-call structure, alerting standards, runbooks, postmortems, and corrective action tracking
  • Build and maintain the multi-cloud runtime platform (EKS/GKE or equivalent)
  • Establish a paved road for engineers: standardized service templates, CI/CD pipelines, IaC modules, environment management, and production readiness checks
  • Implement cloud-native observability across AWS and GCP, including metrics, logs, alarms and dashboards using respective cloud platform tools
  • Establish unified cross-cloud telemetry conventions
  • Integrate and use the Dynatrace platform where appropriate
  • Drive governance, monitoring, and optimization of integrations using MuleSoft Anypoint, enabling secure and reliable connectivity with external services
  • Harden security posture: least-privilege IAM, key management, perimeter controls and secure CI/CD
  • Own multi-cloud cost controls and capacity planning for peak campaign traffic and paid media bursts
  • Lead AI Ops: implement model/agent versioning, evaluation suites, rollout and rollback; monitor model/agent quality and enforce guardrails
  • Partner with Engineering, Data, Security, and Compliance to meet SOC 2 and PCI-aligned operational controls
  • Hire and lead a high-performing AI Ops organization

Offer

  • Leadership role in a fast-growing, global fintech company
  • Opportunity to work with cutting-edge tools and technologies
  • Independence in decision-making
  • Friendly working environment, team support, no dress code