Data Engineer

130–143 PLN/hour, B2B (net)
Mid · Full-time · B2B
#330373 · Added 12 days ago
Source: nofluffjobs.com

Tech Stack / Keywords

Backbone · AI · Machine learning · ETL · Automated testing · Google Cloud Storage · Apache Spark · Flink · Kafka · Terraform · Docker · Kubernetes · SQL · Python · Java · Scala · Data pipelines · Apache Beam · Spark · Pub/Sub · REST API · Microservices architecture

Company and position

We are seeking a seasoned Data Engineer with 3 years of experience to be the backbone of our AI initiatives, ensuring high-quality, high-velocity data for our models. You will work proactively in a fast-paced, startup-like environment, treating ownership and ambiguity as opportunities to build structure.


Requirements

  • Proficiency in Python and SQL.
  • Experience with Java or Scala is a plus.
  • Hands-on experience with Apache Spark, Flink, or Kafka.
  • Experience supporting AI projects, including handling embeddings, managing datasets for LLM fine-tuning, or working with tools like LangChain or LlamaIndex.
  • Comfortable with Terraform or Docker/Kubernetes to manage data environments.
  • Communicative English.

Nice to have:

  • BigQuery optimization and partitioning strategies.
  • Experience building and deploying data pipelines within the Vertex AI ecosystem.
  • Experience with managed Apache Beam or Spark services (Dataflow/Dataproc).
  • Experience building real-time event-driven architectures with Google Pub/Sub.
  • Understanding of REST APIs and microservices architecture.

Responsibilities

  • Design and scale data architectures specifically tailored for Machine Learning lifecycles, including feature stores, vector databases, and model training pipelines.
  • Architect, build, and maintain robust ETL/ELT pipelines handling structured and unstructured data with a focus on low latency and high reliability.
  • Identify bottlenecks in the data lifecycle and automate manual processes without waiting for tickets.
  • Work closely with Data Scientists and AI Researchers to understand model requirements and translate them into technical data specifications.
  • Implement automated testing and monitoring for data integrity.
  • Build and maintain data lakes and feature stores on Google Cloud Storage.
  • Implement real-time and batch processing architectures for AI-driven applications.
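The responsibilities above center on ETL pipelines with automated integrity checks. As a minimal, framework-free sketch of that pattern (all names hypothetical, standard library only; a production pipeline would use Spark or Beam from the stack listed above):

```python
from dataclasses import dataclass

@dataclass
class Event:
    user_id: str
    value: float

def extract(raw_rows):
    """Parse raw CSV-like rows into Event records, skipping malformed ones."""
    events = []
    for row in raw_rows:
        parts = row.split(",")
        if len(parts) != 2:
            continue  # drop malformed input instead of failing the whole batch
        user_id, value = parts
        try:
            events.append(Event(user_id=user_id.strip(), value=float(value)))
        except ValueError:
            continue  # non-numeric value: skip the record
    return events

def transform(events):
    """Aggregate event values per user (a toy stand-in for a feature)."""
    totals = {}
    for e in events:
        totals[e.user_id] = totals.get(e.user_id, 0.0) + e.value
    return totals

def check_integrity(events, totals):
    """Automated integrity check: aggregated sum must equal the input sum."""
    assert abs(sum(e.value for e in events) - sum(totals.values())) < 1e-9

raw = ["u1,1.5", "u2,2.0", "u1,0.5", "broken_row", "u3,abc"]
events = extract(raw)      # keeps 3 well-formed records
features = transform(events)
check_integrity(events, features)
```

The integrity assertion illustrates the "automated testing and monitoring for data integrity" duty: the load step refuses to publish features whose totals disagree with the extracted input.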

Offer

  • Udemy for Business access.
  • Sport subscription.
  • Training budget.
  • Private healthcare.
  • International projects.
ITMAGINATION
