#300577 • 1 • Source: Mindrift
Evaluation Scenario Writer - AI Agent Testing Specialist
Experience
Mid
Location
—
Work mode
Remote
Time commitment
Part-time
AI, Testing, Test Automation, Software Development, Python, pytest, React, Docker
About the offer
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.
Requirements
- Degree in Computer Science, Software Engineering, or a related field
- 5+ years in software development, primarily Python (pytest, async/await, subprocess, file operations)
- Background in full-stack development, with an equal focus on building React-based interfaces and robust back-end systems
- Experience writing functional and integration tests, not just running them
- Experience with Docker (running evaluations locally in containers)
- CI/CD understanding (GitHub Actions as a user: triggers, labels, reading results)
- English proficiency at B2 level
Responsibilities
- Review and refine coding tasks based on provided production codebases, keeping scope, requirements, and information sources realistic
- Write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks
- Craft “fair but hard” challenges where the AI has all the context it needs, but has to work for it (information scattered across files and external sources, complex reasoning required)
- Analyze AI failures to understand what the model struggles with vs. what it masters
- Iterate based on feedback from expert QA reviewers who score your work on 7 quality criteria
Benefits
- Paid contributions, with rates up to $30/hour
- Fixed project rate or individual rates, depending on the project
- Some projects include incentive payments
Additional information
Please submit your CV in English and indicate your level of English proficiency.