Project details
Recommended projects
AI Evaluation Consultant (m/w/d)
We are seeking an analytical and technically-minded professional to:
- Evaluate AI outputs and processes
- Ensure quality, accuracy, and reliability
- Identify logical errors, risks, and structural inconsistencies
- Provide actionable insights and recommendations to the team
Ideal candidates:
- Consultants, auditors, analysts, data researchers, or business/technical analysts with strong reasoning skills
- Professionals curious about AI, process improvement, and quality evaluation
- Problem-solvers who enjoy analyzing complex systems, logic, and scenarios
Key Responsibilities:
- Lead evaluation of AI outputs and related processes
- Review tasks against expected/ideal scenarios; identify gaps and risks
- Provide structured, actionable recommendations to engineers, domain experts, and managers
- Maintain and improve evaluation guidelines, checklists, and SOPs
- Suggest new approaches, tools, and processes to enhance AI evaluation
AI Agent Evaluation Analyst (m/w/d)
We’re looking for QA analysts for autonomous AI agents to join a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll balance quality assurance, research, and logical problem-solving. This opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.
You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you’ve ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking — you might be a great fit.
What you’ll be doing:
- Reviewing evaluation tasks and scenarios for logic, completeness, and realism.
- Identifying inconsistencies, missing assumptions, or unclear decision points.
- Helping define clear expected behaviors (gold standards) for AI agents.
- Annotating cause-effect relationships, reasoning paths, and plausible alternatives.
- Thinking through complex systems and policies as a human would to ensure agents are tested properly.
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage.
Evaluation Scenario Writer (m/w/d)
We’re looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You’ll create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You’ll work to ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. You’ll need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.
Although every project is unique, you might typically:
- Design structured test scenarios based on real-world tasks.
- Define the golden path and acceptable agent behavior.
- Annotate task steps, expected outputs, and edge cases.
- Work with developers to test your scenarios and improve clarity.
- Review agent outputs and adapt tests accordingly.
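As an illustration of what a scenario with a golden path might look like in practice, here is a minimal Python sketch. The `Scenario` structure, the action names, and the strict ordering rule are all assumptions made for this example, not a format prescribed by the project.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical scenario format: a name, the golden path, and tolerated extras."""
    name: str
    golden_path: list[str]
    acceptable_extras: set[str] = field(default_factory=set)


def compare_run(scenario: Scenario, agent_actions: list[str]) -> dict:
    """Compare an agent's action sequence against the scenario's golden path."""
    # Ignore actions the scenario explicitly tolerates (e.g. a greeting).
    core = [a for a in agent_actions if a not in scenario.acceptable_extras]
    return {
        # Assumption: the remaining actions must match the golden path in order.
        "passed": core == scenario.golden_path,
        "missing": [s for s in scenario.golden_path if s not in core],
        "unexpected": [a for a in core if a not in scenario.golden_path],
    }


refund = Scenario(
    name="refund_request",
    golden_path=["lookup_order", "check_policy", "issue_refund"],
    acceptable_extras={"greet_customer"},
)

good_run = compare_run(
    refund, ["greet_customer", "lookup_order", "check_policy", "issue_refund"]
)
bad_run = compare_run(refund, ["lookup_order", "issue_refund"])
print(good_run["passed"])   # True
print(bad_run["missing"])   # ['check_policy']
```

Real scenarios would likely need looser matching (alternative valid paths, partial credit), which is exactly the kind of edge case the role asks you to annotate.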
Freelance Chemistry Expert for AI Model Training (m/f/d)
An AI lab is looking for freelance chemistry experts to evaluate AI models. The goal of the project is to assess the performance, accuracy, and reliability of AI models applied in chemistry contexts. The role involves working closely with the development team to ensure the models meet industry standards and provide actionable insights.
This is a remote part-time role that can be flexibly tailored to your availability – from just a few hours per week to full-time.
Key responsibilities:
- Evaluate AI models for chemistry applications.
- Analyze model outputs and provide feedback for improvement.
- Collaborate with the development team to ensure alignment with industry standards.
- Document findings and recommendations for model optimization.
- Conduct tests to validate model performance and reliability.
Freelance Biology Expert for AI Model Training (m/f/d)
An AI lab is looking for freelance biology experts to evaluate AI models. The goal of the project is to assess the performance, accuracy, and reliability of AI models applied across all areas of biology. The role involves working closely with the development team to ensure the models meet industry standards and provide actionable insights.
This is a remote part-time role that can be flexibly tailored to your availability – from just a few hours per week to full-time.
Key responsibilities:
- Evaluate AI models for biology applications.
- Analyze model outputs and provide feedback for improvement.
- Collaborate with the development team to ensure alignment with industry standards.
- Document findings and recommendations for model optimization.
- Conduct tests to validate model performance and reliability.
Freelance Automotive Engineer (with Python) - Quality Assurance / AI Trainer
Generative AI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
Although every project is unique, you might typically:
- Content Creation & Refinement: Create and refine content to ensure accuracy and relevance across a variety of topics in Physics, while also developing references and examples of tasks.
- Experts Acquisition: Assess the qualification tests of experts, ensuring their competency.
- Chat Moderation: Provide support by addressing project-related questions from other experts in Discord chats, especially those related to project guidelines.
- Auditing Work: Review and evaluate tasks completed by other experts, ensuring they align with project guidelines. Provide constructive feedback, verify expertise-related information, and edit content as necessary to improve quality.
Freelance Civil Engineer with Python Experience (m/f/d)
A company is looking for freelance civil engineering experts to evaluate AI models. The goal of the project is to assess the performance, accuracy, and reliability of AI models applied in civil engineering contexts. The role involves working closely with the development team to ensure the models meet industry standards and provide actionable insights.
Key responsibilities:
- Evaluate AI models for civil engineering applications.
- Analyze model outputs and provide feedback for improvement.
- Collaborate with the development team to ensure alignment with industry standards.
- Document findings and recommendations for model optimization.
- Conduct tests to validate model performance and reliability.
Freelance Ruby Developer (m/f/d)
For an AI lab we are looking for a Ruby Developer to train an AI model (Large Language Model - LLM).
You help AI make sense of the world. As a consultant, you may be invited to take part in online projects to train the model in your domain of expertise.
This flexible role accommodates both experts seeking part-time engagement (at least a few hours per week) and those interested in full-time opportunities.
- Code generation and code review
- Prompt evaluation and complex data annotation
- Training and evaluation of large language models
- Benchmarking and agent-based code execution in sandboxed environments
- Working across multiple programming languages (Python, JavaScript/TypeScript, Rust, SQL, etc.)
- Adapting guidelines for new domains and use cases
- Collaborating with project leads, solution engineers, and supply managers on complex or experimental projects
Quality Compliance Auditor (GCP/GCLP/GVP) (M/W/D)
A company is looking for an experienced Quality Compliance Auditor who will be responsible for ensuring compliance with GCP, GCLP and GVP standards. The project aims to conduct internal and external audits, prepare and support regulatory inspections, and identify compliance gaps and derive corrective actions.
The role includes planning and conducting audits, assisting with regulatory inspections and ensuring compliance with ICH guidelines as well as EMA/FDA regulations.
- Conducting internal and external audits (GCP, GCLP, GVP)
- Preparing and supporting regulatory inspections (e.g. MHRA, FDA, EMA)
- Identifying compliance gaps and deriving corrective actions
Senior Project Manager Customer Interaction
A company is seeking support for a project to evaluate, implement, and further develop quality surveys in digital channels. The goal is to increase customer satisfaction in digital channels by evaluating, implementing, and enhancing survey methods so that customer satisfaction is collected consistently across all channels. Potential improvements are to be identified and implemented.
The role includes consulting, developing and implementing measures to collect and improve customer satisfaction in digital channels. Main tasks:
- Consulting on survey methods for gathering customer experience and quality in digital channels, market standards, benchmarks and future orientation.
- Developing a future model for quality in digital channels, relevant KPIs and survey methods as well as standard processes.
- Implementing the decided measures including interface management and coordination with technology partners and social partners.
- Testing implemented measures to collect data and ensure all required criteria are met.
- Consolidating and listing existing and missing customer survey methods/quality KPIs across all responsible digital channels.
- Advising on the preparation of decision templates and implementing the necessary measures.
- Identifying improvement potentials and developing a standard process for transparency and implementation.
Business Analyst – SAP S/4HANA Output Management (f/m/d)
- A company is looking for an experienced business analyst to support the transformation from SAP ECC to S/4HANA Utilities.
- The project aims to analyze, document, and optimize output and archiving processes, as well as create functional designs and specifications.
- The analyst will work closely with product owners, IT, and business units to align on feasibility, effort, and prioritization of requirements.
Freelance Physics Expert (with Python) - Quality Assurance / AI Trainer
Generative AI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
Although every project is unique, you might typically:
- Content Creation & Refinement: Create and refine content to ensure accuracy and relevance across a variety of topics in Physics, while also developing references and examples of tasks.
- Experts Acquisition: Assess the qualification tests of experts, ensuring their competency.
- Chat Moderation: Provide support by addressing project-related questions from other experts in Discord chats, especially those related to project guidelines.
- Auditing Work: Review and evaluate tasks completed by other experts, ensuring they align with project guidelines. Provide constructive feedback, verify expertise-related information, and edit content as necessary to improve quality.
Senior Regulatory Compliance Expert (FDA Inspection Preparation) (m/f/d)
A company is looking for a Senior Regulatory Compliance Expert to support its team in getting ready for FDA inspections. The role includes conducting mock inspections, providing strategic advice on inspection readiness, and assisting with pre-approval and routine inspections. The ideal candidate has extensive expertise in compliance with legal requirements, especially FDA standards, and plays a key role in ensuring the company meets global compliance demands.
- Conduct mock inspections according to FDA standards
- Provide strategic advice on inspection readiness
- Support pre-approval and routine inspections
Freelance Electrical Engineer with Python Experience (m/w/d)
Generative AI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
Although every project is unique, you might typically:
- Content Creation & Refinement: Create and refine content to ensure accuracy and relevance across a variety of topics in Physics, while also developing references and examples of tasks.
- Experts Acquisition: Assess the qualification tests of experts, ensuring their competency.
- Chat Moderation: Provide support by addressing project-related questions from other experts in Discord chats, especially those related to project guidelines.
- Auditing Work: Review and evaluate tasks completed by other experts, ensuring they align with project guidelines. Provide constructive feedback, verify expertise-related information, and edit content as necessary to improve quality.
AI Consultant - Machine Learning (m/w/d)
For an AI lab we are looking for machine learning experts to train an AI model (Large Language Model - LLM).
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
If you join, you’ll have the opportunity to collaborate on these projects. Although every project is unique, you might typically:
- Design original computational STEM problems that simulate real scientific workflows
- Create problems that require Python programming to solve
- Ensure problems are computationally intensive and cannot be solved manually within reasonable timeframes (days/weeks)
- Develop problems requiring non-trivial reasoning chains and creative problem-solving approaches
- Verify solutions using Python with standard libraries (numpy, pandas, scipy, sklearn)
- Document problem statements clearly and provide verified correct answers
Freelance Statistics Expert with Python Experience (m/f/d)
For an AI lab we are looking for a statistics expert with Python experience to train an AI model (Large Language Model - LLM).
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
If you join, you’ll have the opportunity to collaborate on these projects. Although every project is unique, you might typically:
- Generate prompts that challenge AI.
- Define comprehensive scoring criteria to evaluate the accuracy of the AI’s answers.
- Correct the model’s responses based on your domain-specific knowledge.
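To make "scoring criteria" more concrete, the following Python sketch shows one possible shape for a weighted rubric. The criterion names and point values are invented for illustration; actual projects define their own criteria.

```python
# Hypothetical rubric: (criterion name, points awarded if the criterion is met).
RUBRIC = [
    ("final_answer_correct", 5),
    ("method_valid", 3),
    ("assumptions_stated", 2),
]


def score_answer(checks: dict[str, bool]) -> float:
    """Return the fraction of rubric points earned, between 0.0 and 1.0."""
    total = sum(points for _, points in RUBRIC)
    # Criteria missing from `checks` count as not met.
    earned = sum(points for name, points in RUBRIC if checks.get(name, False))
    return earned / total


review = {
    "final_answer_correct": True,
    "method_valid": True,
    "assumptions_stated": False,
}
print(score_answer(review))  # 0.8
```

Integer points keep the arithmetic exact; a reviewer would fill in `checks` by hand after reading the model's answer against the reference solution.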
Cyber Risk Consulting (Senior Level)
- Identification and analysis of cyber risks arising from changes in the digital landscape and the increasing capabilities of attackers.
- Development and assignment of appropriate countermeasures and creation of roadmaps for effectively managing digital threats.
- Translation of security incidents and threats into concrete business-relevant risks with suitable countermeasures.
- Continuous improvement of processes for managing the cyber risk lifecycle and increasing the maturity of the Cyber Risk Desk.
- Preparation of project reports on the status, impact, and necessary actions related to identified risks.
- Creation of risk analyses and management processes that comply with applicable regulatory standards (SOX, PCI, data protection).
- Conducting an initial risk assessment (likelihood, impact, risk level) including a precise description of risks, impacts, and probability of occurrence.
- Evaluation and detailed description of the residual risk after potential implementation of the identified risk mitigation measures.
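The relationship between likelihood, impact, risk level, and residual risk can be sketched in a few lines of Python. The 1 to 5 rating scales, the multiplication rule, and the thresholds below are common conventions assumed for this example; the listing does not specify which methodology is used.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Combine two 1-5 ratings into a score (assumed rule: likelihood x impact)."""
    for value in (likelihood, impact):
        if not 1 <= value <= 5:
            raise ValueError("ratings must be between 1 and 5")
    return likelihood * impact


def risk_level(score: int) -> str:
    """Map a score to a level using illustrative thresholds."""
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"


# Initial assessment: likely (4) and severe (5).
initial = risk_score(likelihood=4, impact=5)
# Residual risk after a mitigation that is assumed to lower likelihood to 2.
residual = risk_score(likelihood=2, impact=5)

print(initial, risk_level(initial))    # 20 high
print(residual, risk_level(residual))  # 10 medium
```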
ERP-Transformation Manager (m/w/d)
An established company is looking for an experienced ERP Transformation Manager to take full responsibility for planning and steering a comprehensive ERP transformation program. The project's goal is to harmonize processes, implement a new ERP system, and meet IFRS requirements.
The ERP Transformation Manager will analyze, redesign, and standardize the commercial core processes in civil and rail construction. This includes translating IFRS requirements into system structures and posting logic, closely coordinating with Finance, Controlling, Project Management, and IT departments.
The role includes managing the ERP rollout, including fit-gap analysis, process design, test management, and migration. In addition, a unified reporting and KPI framework for group financial statements and project management will be established. The manager will act as the central interface between operational units, Finance, management, and the group, and will set up a sustainable change and training concept for users.
- Planning and steering the ERP transformation program (IFRS transition, process harmonization, ERP rollout)
- Analyzing, redesigning, and standardizing commercial core processes
- Translating IFRS requirements into system structures and posting logic
- Managing the ERP rollout, including fit-gap analysis, process design, test management, and migration
- Building a unified reporting and KPI framework
- Stakeholder management and ensuring smooth communication
- Leading interdisciplinary project teams and managing external consultants and implementation partners
- Establishing a sustainable change and training concept
- Ensuring measurable process improvements after the ERP system goes live
ITSM Specialist BIA/BCM (m/f/d)
We are looking for temporary support in a highly regulated environment for an ITSM Specialist (m/f/d) – Business Impact Analysis & BCM.
We are seeking experienced ITSM specialists with solid knowledge in conducting and evaluating Business Impact Analyses (BIA) according to current standards (e.g. BSI 200-4). The focus is on independently conducting BIAs for time-critical business processes as well as further developing business continuity structures.
Tasks
- Conducting structured interviews with process owners
- Analyzing and evaluating critical process dependencies
- Creating management reports, heatmaps, and recommendations for action
- Further developing the BIA/BCM documentation in Confluence and ServiceNow
Project Manager Magazines / Magazine Production (m/f/d)
- Responsibility for coordinating and managing the entire production process of magazine publications
- Planning and monitoring issue structure, deadlines, advertisements, and workflows
- Close collaboration with editorial, publishing management, marketing, IT, sales, printers, and service providers
- Quality assurance of layouts, copy, and print approvals
- Cost calculation and organization of supplementary products (e.g., inserts, posters, expansions)
- Active role in strategic projects, conferences, and the launch of new formats
Interim Chief Transformation Officer (m/f/d)
A Private Equity–backed German B2B SaaS company is seeking an experienced Interim Chief Transformation Officer (CTO / CTrO) to lead the merger and integration of two B2B SaaS companies, each with approximately 50–100 employees. The companies operate from locations in the Rhine-Ruhr area, Munich, and Hamburg. The objective of this mandate is to design and execute a holistic post-merger transformation, covering strategic direction, product portfolio, target operating model, and scalable processes to enable sustainable and efficient growth.
Mandate Objective
End-to-end ownership of the post-merger transformation, with a strong focus on value creation, operational excellence, and scalability.
Key Responsibilities
- Design and execute a comprehensive transformation and post-merger integration (PMI) strategy, including robust business cases
- Identify, assess, and prioritize value creation and optimization opportunities across organization, processes, governance, and technology
- Streamline and consolidate the product portfolio, including active steering of product and customer migrations
- Harmonize and further develop product, sales, and go-to-market organizations (e.g. role models, interfaces, incentives, governance)
- Define and implement the future target operating model (organizational structure, decision frameworks, responsibilities)
- Act as a key sparring partner to the Private Equity investor, management teams, and other senior stakeholders
- Lead, align, and motivate cross-functional teams through a complex change and integration journey
- Establish clear governance, KPIs, and reporting structure
AI Consultants - Data Science (m/w/d)
We are seeking experienced data scientists to create computationally intensive data science problems for an advanced AI evaluation project. This is a remote, project-based opportunity for experts who can design challenging problems that require computational methods to solve and mirror the full data science lifecycle - from data acquisition and processing to statistical analysis and actionable business insights.
What You'll Do
- Design original computational data science problems that simulate real-world analytical workflows across industries (telecom, finance, government, e-commerce, healthcare)
- Create problems requiring Python programming to solve (using pandas, numpy, scipy, sklearn, statsmodels, matplotlib, seaborn)
- Ensure problems are computationally intensive and cannot be solved manually within reasonable timeframes (days/weeks)
- Develop problems requiring non-trivial reasoning chains in data processing, statistical analysis, feature engineering, predictive modeling, and insight extraction
- Create deterministic problems with reproducible answers - avoid stochastic elements or require fixed random seeds for exact reproducibility
- Base problems on real business challenges: customer analytics, risk assessment, fraud detection, forecasting, optimization, and operational efficiency
- Design end-to-end problems spanning the complete data science pipeline (data ingestion → cleaning → EDA → modeling → validation → deployment considerations)
- Incorporate big data processing scenarios requiring scalable computational approaches
- Verify solutions using Python with standard data science libraries and statistical methods
- Document problem statements clearly with realistic business contexts and provide verified correct answers
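The requirement for deterministic, reproducible answers can be illustrated with a small Python sketch. It uses only the standard library so it runs anywhere; the dataset, the seed, and the function names are made up for this example, and in a real problem a seeded NumPy generator would play the same role as `random.Random` here.

```python
import random
import statistics


def make_dataset(seed: int, n: int = 1000) -> list[float]:
    """Generate synthetic values from a local, explicitly seeded generator."""
    rng = random.Random(seed)  # independent of the global random state
    return [rng.gauss(100.0, 15.0) for _ in range(n)]


def reference_answer(seed: int) -> float:
    """Compute the quantity the problem asks for, rounded to a fixed precision."""
    data = make_dataset(seed)
    return round(statistics.median(data), 6)


# Running the pipeline twice with the same seed must give the identical answer.
first = reference_answer(seed=42)
second = reference_answer(seed=42)
assert first == second
print("reproducible:", first == second)  # reproducible: True
```

Stating the seed and the rounding precision in the problem statement lets a solver's result be compared against the verified answer exactly.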
Freelance Cybersecurity Consultant for AI Red Teaming
For an AI lab we are looking for cybersecurity consultants to train an AI model (Large Language Model - LLM).
You help AI make sense of the world. As a consultant, you may be invited to take part in online projects to train the model in your domain of expertise.
This flexible role accommodates both experts seeking part-time engagement (at least a few hours per week) and those interested in full-time opportunities.
- Evaluate and red team AI models and agents and machine learning systems for vulnerabilities and safety risks.
- Create offline reproducible & auto-evaluable test cases to test safety & capability of AI agents.
- Develop and implement automation scripts, custom tools, environments and test harnesses.
- Lead or contribute to security research initiatives, especially in AI safety, creating and implementing realistic and challenging attack scenarios for the model.
- Advise on cybersecurity best practices and policy implications.
SAP FI/CO Consultant (m/f/d) – Focus SAP R/3 - S/4HANA Transition
For an industrial company we are looking for an experienced SAP FI/CO Senior Consultant (m/f/d) with a strong focus on SAP R/3 (ECC) and proven experience in S/4HANA transition projects. The goal of the project is the functional and system-side analysis of the existing R/3 processes in the Finance and Controlling area and support in preparing and implementing the migration to S/4HANA.
Responsibilities
- Analysis of existing FI/CO processes in SAP R/3 (general ledger, accounts receivable/payable, asset accounting, CO objects, CO PA, etc.)
- Conducting a gap analysis between SAP ECC and S/4HANA
- Assessing the impact of new S/4 features (Universal Journal, Business Partner, new asset accounting etc.)
- Identifying optimization potentials & recommendations for process standardization
- Creating a preparation and migration plan (Delivery: end of January)
- Running a remote workshop with the business units
- Advising on FI/CO best practices in industrial environments
- Preparing professional concept and documentation materials
Commissioning & Qualification (C&Q) Engineer (m/f/d)
A company is looking for an experienced Commissioning & Qualification (C&Q) Engineer to qualify and commission production equipment according to GMP standards. The goal of the project is to ensure the technical and organizational prerequisites for GMP-compliant qualification of the production equipment.
- Independent execution of commissioning and qualification activities, especially in IOQ
- Operation of PCS7 systems
- Working with single-use equipment
- Carrying out commissioning and qualification activities for production equipment
- Ensuring all technical and organizational prerequisites for C&Q
- GMP-compliant qualification of the related production equipment
Project Manager Brand Guardianship (m/f/d)
The service is requested as part of the Brand Image Pool photoshoot project. The project includes:
- Managing sub-tasks throughout the entire duration of the image pool photoshoot project from January to June
- Taking on Brand Guardianship tasks during the pool shoot project period
Detailed service description (not tied to a specific person):
- Independently defining, managing, and executing the project. This ranges from project management to creating roadmaps and project presentations
- Developing ideas and concepts for measures
- Actively managing project risks
- Actively managing project issues including professional advice on escalations
- Preparing and following up on stakeholder and steering board meetings
- Defining the project scope and major project phases
- Providing clear and timely information to the client regarding scope, quality, schedule, budget, and status
EHS Specialist – Body in White (M/W/D)
A company is looking for an experienced EHS Specialist to support their Body in White (BIW) operations. Body in White refers to the stage in car manufacturing where the vehicle's sheet metal components are welded together to form the body shell, prior to painting and the installation of the engine, chassis, or interior trim. The goal of the project is to ensure compliance with environmental, health, and safety regulations during this critical manufacturing phase while optimizing processes and maintaining high safety standards.
The role involves collaborating with production and engineering teams to identify risks, implement safety measures, and foster a culture of safety within the organization.
Key responsibilities:
- Conduct risk assessments and ensure compliance with EHS regulations specific to BIW operations.
- Develop and implement safety protocols and procedures tailored to BIW processes.
- Monitor and report on EHS performance metrics within the BIW stage.
- Provide training and guidance to employees on EHS best practices in automotive manufacturing.
- Investigate incidents and implement corrective actions to prevent recurrence.
- Collaborate with cross-functional teams to improve safety standards and processes in BIW.
EHS Specialist – Cell Manufacturing
A company in the automotive and robotics industry is seeking an experienced EHS Specialist to support cell manufacturing processes. The goal of the project is to ensure compliance with environmental, health, and safety regulations and to promote a safety culture within the manufacturing environment.
The role requires close collaboration with cross-functional teams to implement and maintain EHS standards, conduct risk assessments, and drive continuous improvement initiatives.
Key responsibilities:
- Developing, implementing and maintaining EHS policies and procedures tailored specifically to cell manufacturing.
- Conducting regular risk assessments and audits to ensure regulatory compliance.
- Training and guiding employees on EHS best practices.
- Investigating incidents and implementing corrective actions to prevent recurrence.
- Collaborating with internal teams to promote a safety and sustainability culture.
- Monitoring and reporting on EHS performance metrics.
Chemist with Python Experience (m/w/d)
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills. If you join the platform as an AI Tutor in Chemistry, you’ll have the opportunity to collaborate on these projects.
Although every project is unique, you might typically:
- Generate prompts that challenge AI.
- Define comprehensive scoring criteria to evaluate the accuracy of the AI’s answers.
- Correct the model’s responses based on your domain-specific knowledge.
IT Project Manager ServiceNow (Senior)
- A company in the energy and energy services sector is looking for an experienced IT project manager for a ServiceNow project.
- The goal of the project is to lead and successfully implement an enterprise ServiceNow project focusing on ITSM and Customer Service Management (CSM).
- The role includes planning, controlling, and ensuring a stable project flow in close collaboration with internal and external stakeholders.
- Operational & strategic service management of the ServiceNow platform
- Process ownership for ITSM and CSM (B2B & B2C)
- Process design, governance & continuous optimization
- Managing external providers and vendors
- Monitoring, KPI analysis & deriving improvements
- Ensuring stable platform operations
AI Consultant for Vibe Coding (m/w/d)
An AI lab is looking for an AI Trainer for Vibe Coding. This role involves producing accurate, well-reasoned outputs across diverse domains, leveraging automation and AI tools. The position requires expertise in coding and optimizing Python scripts, handling large datasets, improving AI-generated content, and formatting and troubleshooting technical workflows.
This is a remote part-time role that can be flexibly tailored to your availability – from just a few hours per week to full-time.
Key responsibilities:
- Conduct advanced web research and data mining using multiple tools to locate and extract information from official sources. Use LLMs and advanced prompts to refine search strategies and validate data accuracy by cross-referencing authoritative sources.
- Perform web scraping and data extraction by navigating complex website structures and multi-level pages (regions → companies → detailed pages). Handle dynamic content, archived pages, and various HTML formats, and organize extracted data into clean, well-formatted CSV files.
- Write and optimize Python scripts for data processing and analysis using libraries such as pandas, BeautifulSoup, Selenium, and matplotlib. Transform raw data into structured formats (CSV, JSON, tables) and create visualizations when required.
- Carry out data processing and quality assurance by cleaning, validating, and structuring datasets. Ensure data integrity across multiple sources, apply formatting specifications, and run verification steps to maintain high output quality.
- Apply strong problem-solving and task execution skills to break down complex workflows, troubleshoot technical issues independently, and adapt quickly between different domains and task types with minimal supervision.
- Produce clear documentation and high-quality outputs that follow exact requirements for file formats, naming conventions, and data structure. Maintain reproducible workflows and well-organized code.
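For a sense of the extraction work described above, here is a minimal Python sketch that turns an HTML table into clean CSV. To stay self-contained it uses the standard library's `html.parser` instead of BeautifulSoup, and the sample markup, class name, and column names are invented for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip header rows, which contain only <th> cells
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


# Hypothetical page fragment standing in for a scraped company listing.
SAMPLE_HTML = """
<table>
  <tr><th>Company</th><th>Region</th></tr>
  <tr><td>Example GmbH</td><td>North</td></tr>
  <tr><td>Sample AG</td><td>South</td></tr>
</table>
"""


def html_table_to_csv(html: str) -> str:
    """Parse table rows from HTML and return them as CSV text with a header."""
    parser = TableCellParser()
    parser.feed(html)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["company", "region"])
    writer.writerows(parser.rows)
    return buffer.getvalue()


print(html_table_to_csv(SAMPLE_HTML))
```

Real pages with dynamic content or nested structures would need a browser-driven tool such as Selenium and more defensive parsing than this sketch provides.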
Frontend Developer for an HR Platform (Angular Experience)
Reach out to us if you are interested in working with us on the project.
AI Evaluation Consultant (m/w/d)
Industry
Information Technology (IT)
Areas
Audit
Quality Assurance (QA)
Project info
- Period: 19.01.2026 - 18.03.2026
- Capacity: from 95%
- Daily rate: 440 - 480€
- Language: English (Advanced)
- Remote: from 95%
Requirements
- Scenario validation, data analysis, auditing, or consulting experience
- Analytical work in research, technical/business analysis, or risk evaluation
Knowledge & Skills:
- Strong analytical and critical thinking
- Attention to detail, reliability, and an ownership mindset
- Technical understanding: JSON/YAML, basic Git/GitHub
- Independent, proactive mindset
Nice to Have:
- Scenario-based testing, annotation workflows, AI/LLM evaluation
- Experience in cross-functional teams