Head of Data Engineering
No salary information
C-Level / Manager · Full-time
#318396 · Added about a month ago · 34
Source: Blue Media
Tech Stack / Keywords
Architecture · PySpark · Databricks · AI · Google Cloud · Unity · Databases · Cloud
Company and position
Autopay Global is the newest member of the Autopay family. It aims to bring the group’s state-of-the-art payment integration and payment data technologies to the international market, providing seamless integration with local PSPs, support for multiple currencies, and compliance with local regulatory frameworks.
Requirements
- 10+ years in data engineering, still hands-on and able to build a platform from the ground up while forming a team; 3-5+ years leading data platform teams with ownership of production data SLAs
- Deep hands-on expertise with PySpark and Spark performance tuning (shuffle optimization, partitioning, checkpointing, incremental loads); see the incremental-load sketch after this list
- Strong experience with Databricks (jobs/workflows, Delta Lake, governance) and building lakehouse architectures on GCS
- Proven delivery of streaming + batch data platforms that power real-time product experiences (not just analytics)
- Experience building feature stores and ML-ready datasets with point-in-time correctness and strong governance (a point-in-time join sketch follows this list)
- Strong grasp of privacy and compliance in data systems: PII handling, consent, and auditability
- Google Vertex AI experience: building data pipelines that feed training, evaluation, and inference workflows; understanding of dataset/version management
- Hands-on experience supporting RAG systems: document ingestion, chunking, embedding generation, retrieval evaluation, and index refresh strategies
- Experience with retrieval-aware training approaches (e.g., retrieval augmented fine-tuning / RAFT) and producing high-quality supervised datasets with provenance
- Ability to collaborate with AI Engineers on MCP-based tools and agent workflows (tool schemas, rate limits, caching, and audit logs)
Nice to have:
- Experience with identity resolution inputs
- Experience building near-real-time segmentation, customer lifetime value (CLV), and propensity scoring pipelines
- Familiarity with vector databases and multi-cloud data movement patterns
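To make the Spark tuning requirement concrete, here is a minimal sketch of an incremental, idempotent load on Databricks. It assumes an ambient `spark` session with Delta Lake available; the table and column names (`payments_bronze`, `payments_silver`, `payment_id`, `event_date`) are hypothetical, not taken from the posting.

```python
# Minimal incremental-load sketch (assumes Databricks: `spark` session and
# Delta Lake available). All table and column names are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import functions as F

# Read only partitions newer than the last processed watermark, so each run
# shuffles an incremental slice rather than the whole table.
last_watermark = "2024-01-01"  # in practice, persisted in a state/checkpoint table
updates = (
    spark.read.table("payments_bronze")
    .where(F.col("event_date") > last_watermark)
    .repartition("event_date")  # align shuffle output with the partition layout
)

# MERGE keeps the load idempotent: replaying a window upserts instead of
# duplicating rows.
(
    DeltaTable.forName(spark, "payments_silver").alias("t")
    .merge(updates.alias("s"), "t.payment_id = s.payment_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```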
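And a sketch of the point-in-time correctness the feature-store requirement asks for: each label row only sees feature values observed at or before its own timestamp, which prevents leakage into training data. Again, all table and column names are hypothetical placeholders.

```python
# Point-in-time (as-of) feature join sketch; all names are hypothetical.
from pyspark.sql import Window, functions as F

labels = spark.read.table("training_labels")      # entity_id, label_ts, label
features = spark.read.table("feature_snapshots")  # entity_id, feature_ts, feature columns

# Join each label with every feature row observed at or before label_ts ...
joined = labels.alias("l").join(
    features.alias("f"),
    (F.col("l.entity_id") == F.col("f.entity_id"))
    & (F.col("f.feature_ts") <= F.col("l.label_ts")),
    "left",
)

# ... then keep only the most recent one: the as-of value.
w = Window.partitionBy(F.col("l.entity_id"), F.col("l.label_ts")).orderBy(
    F.col("f.feature_ts").desc()
)
point_in_time = (
    joined.withColumn("rn", F.row_number().over(w))
    .where(F.col("rn") == 1)
    .drop("rn")
)
```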
Responsibilities
- Define the lakehouse reference architecture on GCS with Databricks/Delta Lake
- Build and operate PySpark pipelines in Databricks for both streaming and batch workloads
- Implement streaming ingestion (see the Structured Streaming sketch after this list)
- Own the Customer 360 / CDP layer: unify events, transactions, and user identifiers
- Deliver a real-time feature layer (feature store) that publishes segments, scores, and vectors
- Create and maintain embeddings and retrieval indexes to power RAG in Autopay AI Core (chunking strategies, metadata, refresh policies, and retrieval evaluation); a chunking sketch follows this list
- Establish data governance with Dataplex/Data Catalog and/or Unity Catalog
- Own data observability for pipelines: freshness, completeness, schema drift, anomaly detection, and automated remediation workflows (a freshness/schema-drift sketch follows this list)
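A minimal sketch of the streaming ingestion step, assuming Structured Streaming on Databricks with a Kafka source; the broker address, topic, checkpoint path, and table name are hypothetical placeholders.

```python
# Streaming ingestion into a bronze Delta table (Structured Streaming).
# Broker address, topic, checkpoint path, and table name are hypothetical.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "payment-events")
    .load()
)

(
    raw.selectExpr("CAST(value AS STRING) AS raw_event", "timestamp AS ingested_at")
    .writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/payment_events")  # exactly-once recovery
    .outputMode("append")
    .toTable("payment_events_bronze")
)
```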
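The RAG responsibility mentions chunking strategies and refresh policies; below is a minimal, plain-Python chunking sketch with overlap and provenance metadata. The embedding and index-upsert steps are left abstract because the posting names no specific model or vector store.

```python
# Fixed-size chunking with overlap, carrying provenance for index refresh.
from typing import Iterator

def chunk_document(text: str, doc_id: str, size: int = 1000, overlap: int = 200) -> Iterator[dict]:
    """Yield overlapping chunks keyed by (doc_id, chunk_index).

    Keying chunks this way lets a refreshed document replace its old chunks
    in the retrieval index instead of accumulating stale ones.
    """
    step = size - overlap
    for i, start in enumerate(range(0, max(len(text) - overlap, 1), step)):
        yield {
            "doc_id": doc_id,                  # provenance: source document
            "chunk_index": i,                  # position within the document
            "text": text[start:start + size],  # overlapping character window
        }

# Each chunk's text would then go through an embedding model and be upserted
# into the retrieval index; re-ingesting a document overwrites its keys.
```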
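Finally, a sketch of two of the observability signals listed above, freshness and schema drift, as they might run in a scheduled Databricks job; the table name, expected schema, and SLA threshold are hypothetical.

```python
# Freshness + schema-drift checks for a pipeline table; names are hypothetical.
from datetime import datetime, timedelta
from pyspark.sql import functions as F

table = "payments_silver"

# Freshness: the newest event timestamp must be within the SLA window.
# (Spark returns session-timezone-naive datetimes; UTC is assumed here.)
latest = spark.read.table(table).agg(F.max("event_ts")).collect()[0][0]
if latest is None or datetime.utcnow() - latest > timedelta(minutes=15):
    raise RuntimeError(f"{table} violates the freshness SLA (latest={latest})")

# Schema drift: the live schema must match what downstream consumers expect.
expected = {"payment_id", "event_ts", "amount", "currency"}
actual = set(spark.read.table(table).columns)
if actual != expected:
    raise RuntimeError(f"{table} schema drift, differing columns: {actual ^ expected}")
```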
Offer
- A leadership role in a fast-growing, global fintech company
- The opportunity to work with cutting-edge tools and technologies
- Independence in decision-making
- Friendly working environment, team support, no dress code