AI Engineer

No salary information
Senior · Full-time
#333559 · Added 7 days ago · 7
Source: jobs.ashbyhq.com

Tech Stack / Keywords

AI, Ruby, Node.js, Next.js, TypeScript, LLM, JSON, Testing

Company and position

Ruby Labs is a leading tech company that creates and operates innovative consumer products across the health, education, and entertainment industries.


Requirements

  • Deep knowledge of Node.js & Next.js to build reliable services and handle complex LLM-generated data.
  • Proven experience in building dynamic prompts dependent on input variables and context injection.
  • Experience with OpenRouter APIs, managing rate limits, and selecting cost-effective models.
  • Understanding of LLM observability principles, including tracing, test datasets, and scoring systems (Langfuse or similar).
  • Experience with evaluation methodologies like RAGAS or custom “LLM-as-a-judge” systems.
  • Analytical mindset to transform raw generation logs into actionable business metrics and technical insights.
  • Iterative mindset focused on continuous product improvement through feedback loops.

Nice to have:

  • Practical experience in fine-tuning models for specific domain tasks or JSON compliance.
  • Understanding of Retrieval-Augmented Generation (RAG) architecture including indexing, retrieval, and re-ranking.
  • Basic knowledge of Python for data science scripts or AI evaluation libraries.

Responsibilities

  • Advanced Prompt Engineering: Designing complex, dynamic prompt templates with conditional logic and efficiently reusing information and context within prompts to maximize generation quality and reasoning.
  • Structured Outputs & Schemas: Implementing various response formats (JSON mode, function calling, Zod/JSON schemas) to ensure AI outputs are predictable and ready for seamless integration into application logic.
  • Prompt Engineering & Evaluations: Building robust evaluation pipelines and using Langfuse to collect feedback and score the quality of responses in real time.
  • Tracing & Debugging: Performing deep debugging of complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
  • AI A/B Testing: Running systematic experiments across different models via OpenRouter and analyzing results based on quantitative metrics.
  • Data-Driven Decisions: Making deployment decisions for new prompts or models strictly based on quantitative benchmarks and trace data.
  • Output Scoring & Analysis: Developing scoring systems to analyze the “Problem → Solution” chain and identify root causes of hallucinations or logic errors using Langfuse analytics.
  • Model Performance & Fine-Tuning: Regularly re-evaluating model performance as new architectures emerge and performing fine-tuning when necessary to meet specific domain requirements.

Offer

  • Remote work environment allowing work from anywhere.
  • Unlimited paid time off (PTO).
  • Paid national holidays.
  • Company-provided Apple MacBook for employees who need one.
  • Independent contractor agreement offering flexibility, autonomy, tax advantages, and entrepreneurial opportunities.

Other information

Applicants must be located within approximately ±4 hours of Central European Time (CET) to ensure effective collaboration and communication during working hours.

Ruby Labs

31 active job listings
