Sr. Site Reliability Engineer

No salary information
Senior · Full-time
#325794 · Posted 20 days ago · 23
Source: nofluffjobs.com

Tech Stack / Keywords

Cloud platform, AWS, Azure, Kubernetes, Service mesh, Istio, Infrastructure as Code, Terraform

Company and position

Visa is a world leader in payments technology, facilitating transactions between consumers, merchants, financial institutions and government entities across more than 200 countries and territories.


Requirements

  • Strong hands-on experience with public cloud platforms (AWS preferred, Azure).
  • Experience administering production Kubernetes environments at scale.
  • Experience with service mesh technologies (Istio preferred, App Mesh, Linkerd).
  • Strong understanding of observability tooling and Golden Signals concepts.
  • Knowledge of incident management concepts and on-call operations.
  • Experience with Infrastructure as Code (e.g., Terraform).
  • Understanding of cloud-native, containerized microservices architecture.
  • Strong collaboration and communication skills.

Responsibilities

Platform Ownership & Reliability:

  • Own the end-to-end lifecycle (design, provisioning, upgrades, and decommissioning) of core platform components including cloud infrastructure primitives, Kubernetes clusters and cluster services, networking, ingress, service discovery, service mesh and supporting data-plane components.
  • Ensure platform components are resilient by design, applying SRE principles such as fault isolation, graceful degradation, capacity planning, saturation control, reduced operational toil, and clear failure modes.
  • Continuously assess and mitigate reliability risks, proactively improving platform stability and operational readiness.

Infrastructure Bootstrap & Automation Leadership:

  • Lead design and implementation of infrastructure bootstrap orchestration including automated cluster and environment provisioning, deterministic and repeatable platform bring-up and teardown, and dependency-aware orchestration across cloud, network, and Kubernetes layers.
  • Drive an Infrastructure-as-Code and GitOps-first approach, ensuring platform components are reproducible, auditable, automated, testable, and reversible, with minimal manual intervention.
  • Identify automation gaps and lead initiatives to reduce human effort, onboarding time, and operational risk.

SRE Practices & Operational Excellence:

  • Apply and promote SRE practices, including clear ownership and runbooks for platform components, participation in the on-call rotation as the platform reliability escalation point, incident response, post-incident reviews, and problem management.
  • Improve platform operability by simplifying day-2 operations, standardizing upgrade and rollback strategies, and reducing Mean Time to Detect (MTTD) and Mean Time to Recover (MTTR).
  • Ensure platform operations align with security, compliance, and internal control requirements.

Offer

  • Sport subscription
  • Private healthcare
  • International projects
  • Free coffee
  • Playroom
  • Free snacks
  • In-house trainings
  • Modern office

Other information

This is a remote position. Job duties do not need to be performed in proximity to a Visa office location. Employees in remote positions may be required to be present at a Visa office with scheduled notice.

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.
