LLM Trainer & Reasoning Specialist with 3+ years shaping high-fidelity prompts, evaluation rubrics, and gold-standard benchmarks across science and technology, legal principles, and health and lifestyle.
I translate complex, policy-heavy instructions into clear, auditable workflows (including rationale notes, decision trees, inter-rater calibration sets, and defect taxonomies) that increase rater agreement, reduce rework, and raise throughput.
Proven record of quality at scale: a 98.85% audited QA pass rate across 815 reviewed tasks and consistent SLA delivery in fully remote, fast-iteration environments. Strengths include reasoning-first prompt design (tiered variants, constraints, uncertainty language), evaluation operations (analytic/holistic rubrics, partial-credit logic, severity tagging, AQL spot checks), and safety/factuality governance (bias/fairness screens, non-advice framing, evidence-bounded prompts).
I partner closely with research and evaluation leads to convert model error analyses into derived prompts, adversarial test sets, clearer acceptance criteria, and versioned SOPs with complete audit trails. Tool-fluent with enterprise annotation and QA platforms; meticulous about metadata hygiene, template reuse, and documentation that scales across raters and projects.
2025 © FRATCH.IO GmbH. All rights reserved.