Head of Data Engineering
No salary information
C-Level / Manager · Full-time
#318396 · Added about a month ago · 34
Source: Blue Media
Tech Stack / Keywords
Architecture · PySpark · Databricks · AI · Google Cloud · Unity · Databases · Cloud
Company and position
Autopay Global is the newest member of the Autopay family. It aims to bring the group’s state-of-the-art payment integration and payment data technologies to the international market, providing seamless integration with local PSPs, support for multiple currencies, and compliance with local regulatory frameworks.
Requirements
- 10+ years in data engineering, still hands-on and able to build a platform from the ground up while forming a team; 3-5+ years leading data platform teams with ownership of production data SLAs
- Deep hands-on expertise with PySpark and Spark performance tuning (shuffle optimization, partitioning, checkpointing, incremental loads); see the incremental-load sketch after this list
- Strong experience with Databricks (jobs/workflows, Delta Lake, governance) and building lakehouse architectures on GCS
- Proven delivery of streaming + batch data platforms that power real-time product experiences (not just analytics)
- Experience building feature stores and ML-ready datasets with point-in-time correctness and strong governance (a point-in-time join sketch follows this list)
- Strong grasp of privacy and compliance in data systems: PII handling, consent, and auditability
- Google Vertex AI experience: building data pipelines that feed training, evaluation, and inference workflows; understanding of dataset/version management
- Hands-on experience supporting RAG systems: document ingestion, chunking, embedding generation, retrieval evaluation, and index refresh strategies
- Experience with retrieval-aware training approaches (e.g., retrieval augmented fine-tuning / RAFT) and producing high-quality supervised datasets with provenance
- Ability to collaborate with AI Engineers on MCP-based tools and agent workflows (tool schemas, rate limits, caching, and audit logs)
Nice to have:
- Experience with identity resolution inputs
- Experience building near-real-time segmentation, customer lifetime value (CLV), and propensity scoring pipelines
- Familiarity with vector databases and multi-cloud data movement patterns
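To make the Spark tuning requirement concrete, here is a minimal sketch of an incremental, idempotent load on Databricks. It assumes an ambient `spark` session with Delta Lake available; the table and column names (`payments_bronze`, `payments_silver`, `payment_id`, `event_date`) are hypothetical, not taken from the posting.

```python
# Minimal incremental-load sketch (assumes Databricks: `spark` session and
# Delta Lake available). All table and column names are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import functions as F

# Read only partitions newer than the last processed watermark, so each run
# shuffles an incremental slice rather than the whole table.
last_watermark = "2024-01-01"  # in practice, persisted in a state/checkpoint table
updates = (
    spark.read.table("payments_bronze")
    .where(F.col("event_date") > last_watermark)
    .repartition("event_date")  # align shuffle output with the partition layout
)

# MERGE keeps the load idempotent: replaying a window upserts instead of
# duplicating rows.
(
    DeltaTable.forName(spark, "payments_silver").alias("t")
    .merge(updates.alias("s"), "t.payment_id = s.payment_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```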
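And a sketch of the point-in-time correctness the feature-store requirement asks for: each label row only sees feature values observed at or before its own timestamp, which prevents leakage into training data. Again, all table and column names are hypothetical placeholders.

```python
# Point-in-time (as-of) feature join sketch; all names are hypothetical.
from pyspark.sql import Window, functions as F

labels = spark.read.table("training_labels")      # entity_id, label_ts, label
features = spark.read.table("feature_snapshots")  # entity_id, feature_ts, feature columns

# Join each label with every feature row observed at or before label_ts ...
joined = labels.alias("l").join(
    features.alias("f"),
    (F.col("l.entity_id") == F.col("f.entity_id"))
    & (F.col("f.feature_ts") <= F.col("l.label_ts")),
    "left",
)

# ... then keep only the most recent one: the as-of value.
w = Window.partitionBy(F.col("l.entity_id"), F.col("l.label_ts")).orderBy(
    F.col("f.feature_ts").desc()
)
point_in_time = (
    joined.withColumn("rn", F.row_number().over(w))
    .where(F.col("rn") == 1)
    .drop("rn")
)
```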
Responsibilities
- Define the lakehouse reference architecture on GCS with Databricks/Delta Lake
- Build and operate PySpark pipelines in Databricks for both streaming and batch workloads
- Implement streaming ingestion (see the Structured Streaming sketch after this list)
- Own the Customer 360 / CDP layer: unify events, transactions, and user identifiers
- Deliver a real-time feature layer (feature store) that publishes segments, scores, and vectors
- Create and maintain embeddings and retrieval indexes to power RAG in Autopay AI Core (chunking strategies, metadata, refresh policies, and retrieval evaluation); a chunking sketch follows this list
- Establish data governance with Dataplex/Data Catalog and/or Unity Catalog
- Own data observability for pipelines: freshness, completeness, schema drift, anomaly detection, and automated remediation workflows (a freshness/schema-drift sketch follows this list)
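A minimal sketch of the streaming ingestion step, assuming Structured Streaming on Databricks with a Kafka source; the broker address, topic, checkpoint path, and table name are hypothetical placeholders.

```python
# Streaming ingestion into a bronze Delta table (Structured Streaming).
# Broker address, topic, checkpoint path, and table name are hypothetical.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "payment-events")
    .load()
)

(
    raw.selectExpr("CAST(value AS STRING) AS raw_event", "timestamp AS ingested_at")
    .writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/payment_events")  # exactly-once recovery
    .outputMode("append")
    .toTable("payment_events_bronze")
)
```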
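The RAG responsibility mentions chunking strategies and refresh policies; below is a minimal, plain-Python chunking sketch with overlap and provenance metadata. The embedding and index-upsert steps are left abstract because the posting names no specific model or vector store.

```python
# Fixed-size chunking with overlap, carrying provenance for index refresh.
from typing import Iterator

def chunk_document(text: str, doc_id: str, size: int = 1000, overlap: int = 200) -> Iterator[dict]:
    """Yield overlapping chunks keyed by (doc_id, chunk_index).

    Keying chunks this way lets a refreshed document replace its old chunks
    in the retrieval index instead of accumulating stale ones.
    """
    step = size - overlap
    for i, start in enumerate(range(0, max(len(text) - overlap, 1), step)):
        yield {
            "doc_id": doc_id,                  # provenance: source document
            "chunk_index": i,                  # position within the document
            "text": text[start:start + size],  # overlapping character window
        }

# Each chunk's text would then go through an embedding model and be upserted
# into the retrieval index; re-ingesting a document overwrites its keys.
```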
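Finally, a sketch of two of the observability signals listed above, freshness and schema drift, as they might run in a scheduled Databricks job; the table name, expected schema, and SLA threshold are hypothetical.

```python
# Freshness + schema-drift checks for a pipeline table; names are hypothetical.
from datetime import datetime, timedelta
from pyspark.sql import functions as F

table = "payments_silver"

# Freshness: the newest event timestamp must be within the SLA window.
# (Spark returns session-timezone-naive datetimes; UTC is assumed here.)
latest = spark.read.table(table).agg(F.max("event_ts")).collect()[0][0]
if latest is None or datetime.utcnow() - latest > timedelta(minutes=15):
    raise RuntimeError(f"{table} violates the freshness SLA (latest={latest})")

# Schema drift: the live schema must match what downstream consumers expect.
expected = {"payment_id", "event_ts", "amount", "currency"}
actual = set(spark.read.table(table).columns)
if actual != expected:
    raise RuntimeError(f"{table} schema drift, differing columns: {actual ^ expected}")
```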
Offer
- A leadership role in a fast-growing, global fintech company
- The opportunity to work with cutting-edge tools and technologies
- Independence in decision-making
- Friendly working environment, team support, no dress code