Christine Haehner-Murdock

AI Evaluation Case Study – Constraint Override Failure

Siegen, Germany

Experience

AI Evaluation Case Study – Constraint Override Failure

  • Designed and executed a structured evaluation harness testing whether explicit response-mode constraints persist across multi-turn LLM interactions
  • 42 experimental runs across multiple frontier models
  • Comparison of frontloaded vs late constraint insertion
  • Analysis of drift onset, override success, and stylistic divergence
  • Reproducible dataset and documented evaluation protocol
  • Focus: instruction-hierarchy robustness under conversational state accumulation
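
The comparison of frontloaded vs late constraint insertion and the drift-onset measurement described above could be sketched roughly as follows. This is an illustrative assumption, not the harness used in the case study: the word-limit constraint, the function names (`build_prompts`, `drift_onset`, `run`), and the stub model are all hypothetical.

```python
from typing import Callable, List, Optional

# Hypothetical response-mode constraint used only for illustration.
CONSTRAINT = "Answer in at most 20 words."

def complies(reply: str, max_words: int = 20) -> bool:
    """Return True if the reply respects the (hypothetical) word limit."""
    return len(reply.split()) <= max_words

def build_prompts(user_turns: List[str], insert_at: int) -> List[str]:
    """Prepend the constraint to the user turn at index insert_at.

    insert_at=0 models the frontloaded condition; larger values model
    late insertion.
    """
    prompts = list(user_turns)
    prompts[insert_at] = f"{CONSTRAINT} {prompts[insert_at]}"
    return prompts

def drift_onset(replies: List[str], insert_at: int) -> Optional[int]:
    """Index of the first reply at or after insert_at that violates the
    constraint, or None if every such reply complies."""
    for i in range(insert_at, len(replies)):
        if not complies(replies[i]):
            return i
    return None

def run(model: Callable[[List[str]], str],
        user_turns: List[str], insert_at: int) -> Optional[int]:
    """Play the conversation turn by turn and report drift onset."""
    history: List[str] = []
    replies: List[str] = []
    for prompt in build_prompts(user_turns, insert_at):
        history.append(prompt)
        reply = model(history)
        replies.append(reply)
        history.append(reply)
    return drift_onset(replies, insert_at)

def stub_model(history: List[str]) -> str:
    """Stand-in for a real model: replies grow as the history grows."""
    return " ".join(["word"] * (5 * len(history)))

if __name__ == "__main__":
    turns = ["q1", "q2", "q3", "q4"]
    print("frontloaded:", run(stub_model, turns, insert_at=0))
    print("late:", run(stub_model, turns, insert_at=3))
```

In a real setup the stub would be replaced by a call to the model under test, and the per-run results would be logged (for example as JSON or CSV rows) so that the two conditions can be compared across runs.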

Independent Researcher

Self-employed · Germany

  • Ongoing development of evaluation experiments and datasets documenting model behavior under structured interaction constraints

Senior Educator – Social Sciences & Economics

Self-employed

  • Taught and developed curriculum in sociology, economics, political science, and philosophy
  • Focus areas: institutional systems, governance structures, economic decision frameworks, complex systems analysis

Summary

Independent AI evaluation researcher focusing on behavioral testing of large language models in structured conversational settings. Designs evaluation harnesses to analyze constraint persistence, instruction-hierarchy robustness, and response-mode stability in multi-turn interactions.

Background in social sciences and economics with a focus on socio-technical systems, institutional dynamics, and decision frameworks.

Interested in model evaluation, safety testing, prompt robustness analysis, and human-AI interaction frameworks.

Conducts independent research on AI evaluation methods and human-AI interaction systems, including ongoing development of evaluation experiments and datasets that document model behavior under structured interaction constraints.

Skills

AI Evaluation

  • Prompt Harness Design For Model Behavior Testing
  • Multi-turn Interaction Evaluation
  • Instruction Hierarchy Analysis
  • Prompt Robustness Testing
  • Evaluation Protocol Design
  • Structured Experiment Documentation

Technical Literacy

  • Git / GitHub Workflow
  • JSON / CSV Experiment Logging
  • Structured Dataset Design
  • Prompt Engineering

Analytical Domains

  • Socio-technical Systems
  • Institutional Analysis
  • Governance Frameworks
  • Economic Incentive Structures

Languages

German (Native)
English (Advanced)

Education

Master’s Degree · Social Sciences

Frequently asked questions

Do you have questions? Here you can find further information.

Where is Christine based?

Christine is based in Siegen, Germany.

What languages does Christine speak?

Christine speaks the following languages: German (Native), English (Advanced).

What roles would Christine be best suited for?

Based on recent experience (AI Evaluation Case Study – Constraint Override Failure, Independent Researcher, Senior Educator – Social Sciences & Economics), Christine would be well suited for roles in AI evaluation, independent research, and social sciences and economics education.

What is Christine's latest experience?

Christine's most recent listed experience is AI Evaluation Case Study – Constraint Override Failure.

What is Christine's education?

Christine holds a Master's degree in Social Sciences.

What is Christine's availability?

Christine will be available part-time from March 2026.

What is Christine's rate?

Christine's rate depends on the specific project requirements. Please use the Meet button on the profile to schedule a meeting and discuss the details.

How do I hire Christine?

To hire Christine, click the Meet button on the profile to request a meeting and discuss your project needs.

Average rates for similar positions

Rates are based on recent contracts and do not include FRATCH margin.

Market avg: 790-950 €
The rates shown represent the typical market range for freelancers in this position based on recent contracts on our platform.
Actual rates may vary depending on seniority level, experience, skill specialization, project complexity, and engagement length.