Ruby Labs

AI Engineer

No salary information provided

Mid · Full-time

#329983 · Added 13 days ago · 25

Source: jobs.ashbyhq.com

Tech Stack / Keywords

AI, Ruby, Node.js, Next.js, TypeScript, LLM, JSON, Testing

Company and role

Ruby Labs is a leading tech company that creates and operates innovative consumer products across the health, education, and entertainment industries.

Requirements

Deep knowledge of Node.js & Next.js to build reliable services and handle complex LLM-generated data.
Proven experience in building dynamic prompts dependent on input variables and context injection.
Experience with OpenRouter APIs, managing rate limits, and selecting cost-effective models.
Understanding of LLM observability principles, including tracing, test datasets, and scoring systems (Langfuse or similar).
Experience with evaluation methodologies like RAGAS or custom “LLM-as-a-judge” systems.
Analytical mindset to transform raw generation logs into actionable business metrics and technical insights.
Iterative mindset focused on continuous product improvement through feedback loops.

Nice to have:

Practical experience in fine-tuning models for specific domain tasks or JSON compliance.
Understanding of Retrieval-Augmented Generation (RAG) architecture including indexing, retrieval, and re-ranking.
Basic knowledge of Python for data science scripts or AI evaluation libraries.

Responsibilities

Advanced Prompt Engineering: Designing complex, dynamic prompt templates with conditional logic and efficiently reusing information and context within prompts to maximize generation quality and reasoning.
Structured Outputs & Schemas: Implementing structured response formats (JSON mode, function calling, Zod/JSON schemas) so that AI outputs are predictable and integrate cleanly into application logic.
Prompt Engineering & Evaluations: Building robust evaluation pipelines and using Langfuse to collect feedback and score the quality of responses in real time.
Tracing & Debugging: Performing deep debugging of complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
AI A/B Testing: Running systematic experiments across different models via OpenRouter and analyzing results based on quantitative metrics.
Data-Driven Decisions: Making deployment decisions for new prompts or models strictly based on quantitative benchmarks and trace data.
Output Scoring & Analysis: Developing scoring systems to analyze the “Problem → Solution” chain and identify root causes of hallucinations or logic errors using Langfuse analytics.
Model Performance & Fine-Tuning: Regularly re-evaluating model performance as new architectures emerge and performing fine-tuning when necessary to meet specific domain requirements.
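
To make the first two responsibilities concrete, here is a minimal TypeScript sketch of a dynamic prompt template with conditional logic and context injection, paired with a hand-written validator for the structured output. All names (`PromptInput`, `buildPrompt`, `Answer`, `parseAnswer`) and the response shape are hypothetical and chosen for illustration; they are not taken from Ruby Labs' codebase. A real project would more likely use Zod or a JSON Schema validator than a manual check.

```typescript
// Hypothetical types and names, for illustration only.
interface PromptInput {
  topic: string;
  audience?: "beginner" | "expert";
  context?: string[]; // snippets to inject, e.g. retrieved documents
}

// Build a prompt whose sections depend on which inputs are present.
function buildPrompt(input: PromptInput): string {
  const parts: string[] = [`Explain ${input.topic}.`];

  // Conditional logic: tone instructions vary with the audience.
  if (input.audience === "beginner") {
    parts.push("Avoid jargon and define every technical term.");
  } else if (input.audience === "expert") {
    parts.push("Assume domain knowledge and be concise.");
  }

  // Context injection: only add the section when context exists.
  if (input.context && input.context.length > 0) {
    parts.push("Use only the following context:");
    input.context.forEach((c, i) => parts.push(`[${i + 1}] ${c}`));
  }

  parts.push(
    'Respond with JSON: {"summary": string, "confidence": number between 0 and 1}.'
  );
  return parts.join("\n");
}

// Assumed response shape requested by the prompt above.
interface Answer {
  summary: string;
  confidence: number;
}

// Parse and validate raw model output; return null instead of throwing
// so callers can decide whether to retry or fall back.
function parseAnswer(raw: string): Answer | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON at all
  }
  if (typeof data !== "object" || data === null) return null;

  const { summary, confidence } = data as Record<string, unknown>;
  if (typeof summary !== "string") return null;
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    return null;
  }
  return { summary, confidence };
}
```

Returning `null` on malformed output keeps the retry or fallback decision with the caller, which is one common way to keep LLM responses from leaking unchecked into application logic.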

What we offer

Remote work environment allowing work from anywhere.
Unlimited paid time off (PTO).
Paid national holidays.
Company-provided Apple MacBook for employees who need one.
Independent contractor agreement offering flexibility, autonomy, tax advantages, networking opportunities, and the freedom to work from anywhere.

Flexible hours

Paid leave

Paid holidays

Company phone

Other information

Applicants must be located within approximately ±4 hours of the Central European Time (CET) zone to ensure optimal collaboration and communication during working hours.
