Robin L.

AI Tutor — STEM Prompting & Review

Stockholm, Sweden

Experience

Jan 2025 - Present
9 months

AI Tutor — STEM Prompting & Review

Mindrift

  • Authored prompts and evaluation rubrics for math/physics/CS; required clear step-by-step solutions, unit handling, and error-explanation notes.
  • Reviewed model outputs for correctness and reasoning quality; created calibration sets and difficulty bands.
Jan 2025 - Present
9 months

AI Data Annotator — Multimodal Labeling & QA

RWS (TrainAI)

  • Labeled text/image data to evolving schemas; stabilized label taxonomies; maintained rationale notes and edge-case logs.
Jan 2024 - Present
1 year 9 months

AI Linguistic Trainer (Swedish/English) — Criteria & Evaluation Design

Outlier AI

  • Built scoring criteria for helpfulness, safety, factuality, and style; turned policy into example-led guidelines.
  • Wrote adversarial prompts to probe ambiguity, safety, and factual precision; tuned thresholds to raise rater agreement.
Jan 2023 - Present
2 years 9 months

Content Creator — Swedish (Chaya Project)

TransPerfect / DataForce

  • Produced 1,000+ Swedish SMS/email samples; enforced strict style/safety guidelines.
  • Result: 98.85% QA pass rate across 815 reviewed tasks.
Jan 2022 - Present
3 years 9 months

Swedish AI Linguist & Trainer

DataAnnotation.tech

  • Reviewed/refined model outputs across registers; designed task-specific rubrics; led small calibration passes.

Summary

LLM Trainer & Reasoning Specialist with 3+ years shaping high-fidelity prompts, evaluation rubrics, and gold-standard benchmarks across science/technology, legal principles, and health & lifestyle.

I translate complex, policy-heavy instructions into clear, auditable workflows—including rationale notes, decision trees, inter-rater calibration sets, and defect taxonomies—that increase agreement, reduce rework, and raise throughput.

Proven record of quality at scale: a 98.85% audited QA pass rate across 815 reviewed tasks and consistent SLA delivery in fully remote, fast-iteration environments. Strengths include reasoning-first prompt design (tiered variants, constraints, uncertainty language), evaluation operations (analytic/holistic rubrics, partial-credit logic, severity tagging, AQL spot checks), and safety/factuality governance (bias/fairness screens, non-advice framing, evidence-bounded prompts).

I partner closely with research and evaluation leads to convert model error analyses into derived prompts, adversarial test sets, clearer acceptance criteria, and versioned SOPs with complete audit trails. Tool-fluent with enterprise annotation and QA platforms; meticulous about metadata hygiene, template reuse, and documentation that scales across raters and projects.

Languages

Swedish: Native
English: Advanced

Education

Stensund Folkhögskola

Behavioral Science · Sweden · Graduated with Distinction, Top 5%
