For an AI lab, we are looking for an AI Agent Evaluation Analyst to help train an AI model (a Large Language Model, or LLM).
You will help AI make sense of the world. As a consultant, you may be invited to take part in online projects to train the model in your domain of expertise.
This flexible role accommodates both experts seeking part-time engagement (a minimum of a few hours per week) and those interested in full-time opportunities.
- Reviewing evaluation tasks and scenarios for logic, completeness, and realism.
- Identifying inconsistencies, missing assumptions, or unclear decision points.
- Helping define clear expected behaviors (gold standards) for AI agents.
- Annotating cause-effect relationships, reasoning paths, and plausible alternatives.
- Thinking through complex systems and policies as a human would to ensure agents are tested properly.
- Working closely with QA, writers, or developers to suggest refinements and improve edge-case coverage.