Data Engineer

130–143 PLN/hour, B2B (net)
Mid · Full-time · B2B
#330373 · Added 12 days ago
Source: nofluffjobs.com

Tech Stack / Keywords

Backbone · AI · Machine learning · ETL · Automated testing · Google Cloud Storage · Apache Spark · Flink · Kafka · Terraform · Docker · Kubernetes · SQL · Python · Java · Scala · Data pipelines · Apache Beam · Spark · Pub/Sub · REST API · Microservices architecture

Company and position

We are seeking a seasoned Data Engineer with 3 years of experience to be the backbone of our AI initiatives, ensuring high-quality, high-velocity data for our models. You will work proactively in a fast-paced, startup-like environment, treating ownership and ambiguity as opportunities to build structure.


Requirements

  • Proficiency in Python and SQL.
  • Experience with Java or Scala is a plus.
  • Hands-on experience with Apache Spark, Flink, or Kafka.
  • Experience supporting AI projects, including handling embeddings, managing datasets for LLM fine-tuning, or working with tools like LangChain or LlamaIndex.
  • Comfortable with Terraform or Docker/Kubernetes to manage data environments.
  • Communicative English.

Nice to have:

  • BigQuery optimization and partitioning strategies.
  • Experience building and deploying data pipelines within the Vertex AI ecosystem.
  • Experience with managed Apache Beam or Spark services (Dataflow/Dataproc).
  • Experience building real-time event-driven architectures with Google Pub/Sub.
  • Understanding of REST APIs and microservices architecture.

Responsibilities

  • Design and scale data architectures specifically tailored for Machine Learning lifecycles, including feature stores, vector databases, and model training pipelines.
  • Architect, build, and maintain robust ETL/ELT pipelines handling structured and unstructured data with a focus on low latency and high reliability.
  • Identify bottlenecks in the data lifecycle and automate manual processes without waiting for tickets.
  • Work closely with Data Scientists and AI Researchers to understand model requirements and translate them into technical data specifications.
  • Implement automated testing and monitoring for data integrity.
  • Build and maintain data lakes and feature stores on Google Cloud Storage.
  • Implement real-time and batch processing architectures for AI-driven applications.
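The responsibilities above center on ETL pipelines with automated integrity checks. As a minimal, framework-free sketch of that pattern (all names hypothetical, standard library only; a production pipeline would use Spark or Beam from the stack listed above):

```python
from dataclasses import dataclass

@dataclass
class Event:
    user_id: str
    value: float

def extract(raw_rows):
    """Parse raw CSV-like rows into Event records, skipping malformed ones."""
    events = []
    for row in raw_rows:
        parts = row.split(",")
        if len(parts) != 2:
            continue  # drop malformed input instead of failing the whole batch
        user_id, value = parts
        try:
            events.append(Event(user_id=user_id.strip(), value=float(value)))
        except ValueError:
            continue  # non-numeric value: skip the record
    return events

def transform(events):
    """Aggregate event values per user (a toy stand-in for a feature)."""
    totals = {}
    for e in events:
        totals[e.user_id] = totals.get(e.user_id, 0.0) + e.value
    return totals

def check_integrity(events, totals):
    """Automated integrity check: aggregated sum must equal the input sum."""
    assert abs(sum(e.value for e in events) - sum(totals.values())) < 1e-9

raw = ["u1,1.5", "u2,2.0", "u1,0.5", "broken_row", "u3,abc"]
events = extract(raw)      # keeps 3 well-formed records
features = transform(events)
check_integrity(events, features)
```

The integrity assertion illustrates the "automated testing and monitoring for data integrity" duty: the load step refuses to publish features whose totals disagree with the extracted input.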

Offer

  • Udemy for Business access.
  • Sport subscription.
  • Training budget.
  • Private healthcare.
  • International projects.
ITMAGINATION
