AI Engineer
No salary information
Mid · Full-time
#329983 · Added 13 days ago · 25
Source: jobs.ashbyhq.com
Tech Stack / Keywords
AI, Ruby, Node.js, Next.js, TypeScript, LLM, JSON, Testing
Company and position
Ruby Labs is a leading tech company that creates and operates innovative consumer products across the health, education, and entertainment industries.
Requirements
- Deep knowledge of Node.js & Next.js to build reliable services and handle complex LLM-generated data.
- Proven experience in building dynamic prompts dependent on input variables and context injection.
- Experience with OpenRouter APIs, managing rate limits, and selecting cost-effective models.
- Understanding of LLM observability principles, including tracing, test datasets, and scoring systems (Langfuse or similar).
- Experience with evaluation methodologies like RAGAS or custom “LLM-as-a-judge” systems.
- Analytical mindset to transform raw generation logs into actionable business metrics and technical insights.
- Iterative mindset focused on continuous product improvement through feedback loops.
Nice to have:
- Practical experience in fine-tuning models for specific domain tasks or JSON compliance.
- Understanding of Retrieval-Augmented Generation (RAG) architecture including indexing, retrieval, and re-ranking.
- Basic knowledge of Python for data science scripts or AI evaluation libraries.
Responsibilities
- Advanced Prompt Engineering: Designing complex, dynamic prompt templates with conditional logic and efficiently reusing information and context within prompts to maximize generation quality and reasoning.
- Structured Outputs & Schemas: Implementing structured response formats (JSON mode, function calling, Zod/JSON schemas) to ensure AI outputs are predictable and ready for seamless integration into application logic.
- Prompt Engineering & Evaluations: Building robust evaluation pipelines and using Langfuse to collect feedback and score the quality of responses in real time.
- Tracing & Debugging: Performing deep debugging of complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
- AI A/B Testing: Running systematic experiments across different models via OpenRouter and analyzing results based on quantitative metrics.
- Data-Driven Decisions: Making deployment decisions for new prompts or models strictly based on quantitative benchmarks and trace data.
- Output Scoring & Analysis: Developing scoring systems to analyze the “Problem → Solution” chain and identify root causes of hallucinations or logic errors using Langfuse analytics.
- Model Performance & Fine-Tuning: Regularly re-evaluating model performance as new architectures emerge and performing fine-tuning when necessary to meet specific domain requirements.
Offer
- Remote work environment allowing work from anywhere.
- Unlimited paid time off (PTO).
- Paid national holidays.
- Company-provided Apple MacBook for employees who need one.
- Independent contractor agreement offering flexibility, autonomy, tax advantages, networking opportunities, and freedom to work from anywhere.
Flexible hours
Paid time off
Paid holidays
Company phone
Other information
Applicants must be located within approximately ±4 hours of the Central European Time (CET) zone to ensure optimal collaboration and communication during working hours.
Ruby Labs
31 active job offers