Project details
Recommended projects
MCP & Tools Python Developer (m/f/d)
We’re on the hunt for hands-on Python engineers for a new project focused on developing Model Context Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You’ll implement base methods for agent action verification, integrate with internal and client infrastructures, and help fill tooling gaps across the team.
What you’ll be doing:
- Developing and maintaining MCP-compatible evaluation servers
- Implementing logic to check agent actions against scenario definitions
- Creating or extending tools that writers and QAs use to test agents
- Working closely with infrastructure engineers to ensure compatibility
- Occasionally helping with test writing or debug sessions when needed
Although we’re only looking for experts for this current project, contributors with consistently high-quality submissions may receive an invitation for ongoing collaboration across future projects.
Freelance Mechanical Engineer with Python Experience (m/f/d)
For an AI lab, we are looking for a Mechanical Engineer with Python experience to train an AI model (Large Language Model - LLM).
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
If you join, you’ll have the opportunity to collaborate on these projects. Although every project is unique, you might typically:
- Content Creation & Refinement: Create and refine content to ensure accuracy and relevance across a variety of topics in Mechanical Engineering, while also developing references and examples of tasks.
- Experts Acquisition: Assess the qualification tests of experts, ensuring their competency.
- Chat Moderation: Provide support by addressing project-related questions from other experts in Discord chats, especially those related to project guidelines.
- Auditing Work: Review and evaluate tasks completed by other experts, ensuring they align with project guidelines. Provide constructive feedback, verify expertise-related information, and edit content as necessary to improve quality.
Freelance Electrical Engineer with Python Experience (m/f/d)
Generative AI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
Although every project is unique, you might typically:
- Content Creation & Refinement: Create and refine content to ensure accuracy and relevance across a variety of topics in Physics, while also developing references and examples of tasks.
- Experts Acquisition: Assess the qualification tests of experts, ensuring their competency.
- Chat Moderation: Provide support by addressing project-related questions from other experts in Discord chats, especially those related to project guidelines.
- Auditing Work: Review and evaluate tasks completed by other experts, ensuring they align with project guidelines. Provide constructive feedback, verify expertise-related information, and edit content as necessary to improve quality.
Freelance Java Developer (all genders)
For an AI lab, we’re looking for a Java Developer to train an AI model (large language model - LLM).
You’ll help AI make sense of the world. As a consultant, you may be invited to join online projects to train the model in your area of expertise.
This flexible role suits both experts seeking part-time work (minimum a few hours per week) and those interested in full-time opportunities.
- Code generation and code review
- Prompt evaluation and complex data annotation
- Training and evaluation of large language models
- Benchmarking and agent-based code execution in sandboxed environments
- Working across multiple programming languages
- Adapting guidelines for new domains and use cases
- Following project-specific rubrics and requirements
- Collaborating with project leads, solution engineers, and supply managers on complex or experimental projects
AI Consultant - Machine Learning (m/f/d)
For an AI lab, we are looking for machine learning experts to train an AI model (Large Language Model - LLM). GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills. If you join, you’ll have the opportunity to collaborate on these projects. Although every project is unique, you might typically:
- Design original computational STEM problems that simulate real scientific workflows
- Create problems that require Python programming to solve
- Ensure problems are computationally intensive and cannot be solved manually within reasonable timeframes (days/weeks)
- Develop problems requiring non-trivial reasoning chains and creative problem-solving approaches
- Verify solutions using Python with standard libraries (numpy, pandas, scipy, sklearn)
- Document problem statements clearly and provide verified correct answers
Freelance Ruby Developer (m/f/d)
For an AI lab we are looking for a Ruby Developer to train an AI model (Large Language Model - LLM).
You help AI make sense of the world. As a consultant, you may be invited to take part in online projects to train the model in your domain of expertise.
This flexible role accommodates both experts seeking part-time engagement (at least a few hours per week) and those interested in full-time opportunities.
- Code generation and code review
- Prompt evaluation and complex data annotation
- Training and evaluation of large language models
- Benchmarking and agent-based code execution in sandboxed environments
- Working across multiple programming languages (Python, JavaScript/TypeScript, Rust, SQL, etc.)
- Adapting guidelines for new domains and use cases
- Collaborating with project leads, solution engineers, and supply managers on complex or experimental projects
Freelance Automotive Engineer (with Python) - Quality Assurance / AI Trainer
Generative AI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
Although every project is unique, you might typically:
- Content Creation & Refinement: Create and refine content to ensure accuracy and relevance across a variety of topics in Physics, while also developing references and examples of tasks.
- Experts Acquisition: Assess the qualification tests of experts, ensuring their competency.
- Chat Moderation: Provide support by addressing project-related questions from other experts in Discord chats, especially those related to project guidelines.
- Auditing Work: Review and evaluate tasks completed by other experts, ensuring they align with project guidelines. Provide constructive feedback, verify expertise-related information, and edit content as necessary to improve quality.
Fullstack Engineer (m/f/d)
- Product and web development in a data-driven environment
- Contribute to software architecture for new data products
- Collaboration in interdisciplinary teams (including data scientists and business developers)
Freelance Civil Engineer with Python Experience (m/f/d)
A company is looking for freelance civil engineering experts to evaluate AI models. The goal of the project is to assess the performance, accuracy, and reliability of AI models applied in civil engineering contexts. The role involves working closely with the development team to ensure the models meet industry standards and provide actionable insights.
Key responsibilities:
- Evaluate AI models for civil engineering applications.
- Analyze model outputs and provide feedback for improvement.
- Collaborate with the development team to ensure alignment with industry standards.
- Document findings and recommendations for model optimization.
- Conduct tests to validate model performance and reliability.
Physicist with Python Experience (m/f/d)
For an AI lab, we are looking for physicists with Python experience to train an AI model (Large Language Model - LLM).
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills. If you join the platform as a physicist, you’ll have the opportunity to collaborate on these projects.
Although every project is unique, you might typically:
- Design original computational physics problems that simulate real research workflows.
- Create problems requiring Python programming to solve (using numpy, scipy, sympy).
- Ensure problems are computationally intensive and cannot be solved manually within reasonable timeframes (days/weeks).
- Develop problems requiring non-trivial reasoning chains.
- Base problems on real research challenges or practical applications from physical practice.
- Verify solutions using Python with standard libraries.
- Document problem statements clearly and provide verified correct answers.
Freelance Statistics Expert with Python Experience (m/f/d)
For an AI lab, we are looking for a Statistics Expert with Python experience to train an AI model (Large Language Model - LLM).
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
If you join, you’ll have the opportunity to collaborate on these projects. Although every project is unique, you might typically:
- Generate prompts that challenge AI.
- Define comprehensive scoring criteria to evaluate the accuracy of the AI’s answers.
- Correct the model’s responses based on your domain-specific knowledge.
Mathematician with Python Experience (m/f/d)
For an AI lab, we are looking for mathematicians with Python experience to train an AI model (Large Language Model - LLM).
As a consultant, you may be invited to take part in online projects to train the model in your domain of expertise.
This flexible role accommodates both experts seeking part-time engagement (minimum a few hours per week) and those interested in full-time opportunities.
Although every project is unique, you might typically:
- Design original computational mathematics problems that simulate real mathematical research workflows.
- Create problems requiring Python programming to solve (using numpy, scipy, sympy).
- Ensure problems are computationally intensive and cannot be solved manually within reasonable timeframes (days/weeks).
- Develop problems requiring non-trivial reasoning chains in areas like number theory, combinatorics, graph theory, and numerical analysis.
- Base problems on real research challenges or practical applications from mathematical practice.
- Verify solutions using Python with standard mathematical libraries.
- Document problem statements clearly and provide verified correct answers.
Support in:
- Number Theory: Prime factorization, Diophantine equations, modular arithmetic, cryptographic computations.
- Combinatorics: Enumerations, partitions, generating functions, combinatorial optimization.
- Graph Theory: Network analysis, path finding, graph coloring, spanning trees.
- Numerical Analysis: Root finding, numerical integration, differential equations, matrix computations.
- Discrete Mathematics: Recurrence relations, algorithmic complexity, discrete optimization.
- Algebra: Polynomial computations, group theory calculations, matrix decompositions.
Freelance Physics Expert (with Python) - Quality Assurance / AI Trainer
Generative AI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills.
Although every project is unique, you might typically:
- Content Creation & Refinement: Create and refine content to ensure accuracy and relevance across a variety of topics in Physics, while also developing references and examples of tasks.
- Experts Acquisition: Assess the qualification tests of experts, ensuring their competency.
- Chat Moderation: Provide support by addressing project-related questions from other experts in Discord chats, especially those related to project guidelines.
- Auditing Work: Review and evaluate tasks completed by other experts, ensuring they align with project guidelines. Provide constructive feedback, verify expertise-related information, and edit content as necessary to improve quality.
AI Consultant for Vibe Coding (m/f/d)
An AI Lab is looking for an AI Trainer for Vibe Coding. This role involves producing accurate, well-reasoned outputs across diverse domains, leveraging automation and AI tools. The position requires expertise in coding and optimizing Python scripts, handling large datasets, improving AI-generated content, and formatting and troubleshooting technical workflows.
This is a remote part-time role that can be flexibly tailored to your availability – from just a few hours per week to full-time.
Key responsibilities:
- Conduct advanced web research and data mining using multiple tools to locate and extract information from official sources. Use LLMs and advanced prompts to refine search strategies and validate data accuracy by cross-referencing authoritative sources.
- Perform web scraping and data extraction by navigating complex website structures and multi-level pages (regions → companies → detailed pages). Handle dynamic content, archived pages, and various HTML formats, and organize extracted data into clean, well-formatted CSV files.
- Write and optimize Python scripts for data processing and analysis using libraries such as pandas, BeautifulSoup, Selenium, and matplotlib. Transform raw data into structured formats (CSV, JSON, tables) and create visualizations when required.
- Carry out data processing and quality assurance by cleaning, validating, and structuring datasets. Ensure data integrity across multiple sources, apply formatting specifications, and run verification steps to maintain high output quality.
- Apply strong problem-solving and task execution skills to break down complex workflows, troubleshoot technical issues independently, and adapt quickly between different domains and task types with minimal supervision.
- Produce clear documentation and high-quality outputs that follow exact requirements for file formats, naming conventions, and data structure. Maintain reproducible workflows and well-organized code.
Evaluation Scenario Writer (m/f/d)
We’re looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You’ll create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You’ll work to ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. You’ll need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.
Although every project is unique, you might typically:
- Design structured test scenarios based on real-world tasks.
- Define the golden path and acceptable agent behavior.
- Annotate task steps, expected outputs, and edge cases.
- Work with devs to test your scenarios and improve clarity.
- Review agent outputs and adapt tests accordingly.
Senior Firmware / Edge Expert for EdgeX (HX) Integration
We are looking for an experienced Senior Firmware/Edge Expert to support a one-time migration project for integrating EdgeX (HX) into existing firmware. The current firmware, based on C++ and FreeRTOS, is no longer compliant with modern security regulations such as NIS2 and EU requirements. As a result, the migration will first involve porting the existing business logic to Linux and then moving towards a containerized architecture using EdgeX as the framework for edge devices.
The goal is to create a universal, configurable platform with shared core modules and product-specific components layered on top, supporting multiple products in the future.
Your Responsibilities
- Firmware Migration: Port existing business logic from C++/FreeRTOS to Linux to ensure identical behavior on customer devices.
- Driver and Hardware Abstraction: Implement missing drivers and hardware abstraction to ensure smooth operation.
- System Architecture & Service Communication: Focus on EdgeX integration with a clear separation between hardware layer (drivers, readers, devices) and application/business logic (isolated services).
- Optimization & Enhancement: While the primary focus is on migration, optimization and feature enhancements will follow after successful integration.
- Collaboration & Knowledge Transfer: Work independently and ensure a clear transfer of knowledge to internal engineers.
- Work Model: Fully remote
- Language: English (strong communication skills required)
- Start Date: ASAP (Realistically January/February 2026)
- Duration: Estimated up to 6 months, with potential for earlier completion in 3–4 months
- Key Value: Knowledge transfer to internal engineers and ensuring smooth integration
Developer for Consent Management Implementation (m/f/d)
The consent layers previously provided by third-party CMPs on the web for our international brands are being replaced; they need to be reimplemented so they can be operated and served in-house. This requires solid knowledge of TypeScript, Vue.js, and traditional web presentation technologies (HTML and CSS). The goal is to deliver executable code that implements all requirements and includes automated tests that prove correct functionality.
Scope of the engagement: The focus of the service is on developing elements for decision-making on the approach and on implementing measures along the resulting project path. This specifically includes the following service packages:
- Code implementation
- Implementation of executable tests that must pass on delivery, test coverage >= 80%
- Creation of code documentation
- Creation of brand-specific cmp-config files.
- Creation of a project (including asset management requirements) as a copy of the consent management platform.
- Removal of netID references.
- Creation of brand-specific settings and files for custom purposes/providers.
- Adding new brand-specific CSS themes (variable values, logos, etc.).
- Inclusion of the required official IAB GVL translations (ES, FR) in the weekly synchronization with the GVL.
- Implementation of I18n and preparation of brand-specific data sources
- Implementation of PMC 2.0 backend usage modules
- Implementation of the playout logic
- Implementation of the layer initialization process (mode=default and mode=resurface)
- CDN upload and release process
- Project documentation
Project implementation:
- The desired result should be written in TypeScript and Vue.js, built with Vite, tested with Vitest.
Freelance Chemistry Expert for AI Model Training (m/f/d)
An AI lab is looking for freelance chemistry experts to evaluate AI models. The goal of the project is to assess the performance, accuracy, and reliability of AI models applied in chemistry contexts. The role involves working closely with the development team to ensure the models meet industry standards and provide actionable insights.
This is a remote part-time role that can be flexibly tailored to your availability – from just a few hours per week to full-time.
Key responsibilities:
- Evaluate AI models for chemistry applications.
- Analyze model outputs and provide feedback for improvement.
- Collaborate with the development team to ensure alignment with industry standards.
- Document findings and recommendations for model optimization.
- Conduct tests to validate model performance and reliability.
Chemist with Python Experience (m/f/d)
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills. If you join the platform as an AI Tutor in Chemistry, you’ll have the opportunity to collaborate on these projects.
Although every project is unique, you might typically:
- Generate prompts that challenge AI.
- Define comprehensive scoring criteria to evaluate the accuracy of the AI’s answers.
- Correct the model’s responses based on your domain-specific knowledge.
Freelance Biology Expert for AI Model Training (m/f/d)
An AI lab is looking for freelance biology experts to evaluate AI models. The goal of the project is to assess the performance, accuracy, and reliability of AI models applied in biology (all areas) contexts. The role involves working closely with the development team to ensure the models meet industry standards and provide actionable insights.
This is a remote part-time role that can be flexibly tailored to your availability – from just a few hours per week to full-time.
Key responsibilities:
- Evaluate AI models for biology applications.
- Analyze model outputs and provide feedback for improvement.
- Collaborate with the development team to ensure alignment with industry standards.
- Document findings and recommendations for model optimization.
- Conduct tests to validate model performance and reliability.
AI Agent Evaluation Analyst (m/f/d)
We’re on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll have to balance quality assurance, research, and logical problem-solving. This project opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.
You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you’ve ever excelled in things like consulting, CHGK, Olympiads, case solving, or systems thinking — you might be a great fit.
What you’ll be doing:
- Reviewing evaluation tasks and scenarios for logic, completeness, and realism.
- Identifying inconsistencies, missing assumptions, or unclear decision points.
- Helping define clear expected behaviors (gold standards) for AI agents.
- Annotating cause-effect relationships, reasoning paths, and plausible alternatives.
- Thinking through complex systems and policies as a human would to ensure agents are tested properly.
- Working closely with QA, writers, or developers to suggest refinements or edge case coverage.
Freelance Cybersecurity Consultant for AI Red Teaming
For an AI lab we are looking for cybersecurity consultants to train an AI model (Large Language Model - LLM).
You help AI make sense of the world. As a consultant, you may be invited to take part in online projects to train the model in your domain of expertise.
This flexible role accommodates both experts seeking part-time engagement (minimum a few hours per week) and those interested in full-time opportunities.
- Evaluate and red team AI models and agents and machine learning systems for vulnerabilities and safety risks.
- Create offline reproducible & auto-evaluable test cases to test safety & capability of AI agents.
- Develop and implement automation scripts, custom tools, environments and test harnesses.
- Lead or contribute to security research initiatives, especially in AI safety, creating and implementing realistic and challenging attack scenarios for the model.
- Advise on cybersecurity best practices and policy implications.
Data Engineer (m/f/d)
A company is looking for an experienced Data Engineer to carry out a migration from Snowflake to ClickHouse. The focus is on using Apache Spark for data processing and on managing and optimizing Kubernetes environments. The goal is to build and operate a powerful and scalable data platform.
- Executing the migration from Snowflake to ClickHouse
- Developing and optimizing data pipelines with Apache Spark
- Managing and optimizing Kubernetes clusters
- Ensuring the performance and scalability of the data platform
- Implementing solutions in Python
- Optional: Working with Snowplow for data analytics
AI Evaluation Consultant (m/f/d)
We are seeking an analytical and technically-minded professional to:
- Evaluate AI outputs and processes
- Ensure quality, accuracy, and reliability
- Identify logical errors, risks, and structural inconsistencies
- Provide actionable insights and recommendations to the team
Ideal candidates:
- Consultants, auditors, analysts, data researchers, or business/technical analysts with strong reasoning skills
- Professionals curious about AI, process improvement, and quality evaluation
- Problem-solvers who enjoy analyzing complex systems, logic, and scenarios
Key Responsibilities:
- Lead evaluation of AI outputs and related processes
- Review tasks against expected/ideal scenarios; identify gaps and risks
- Provide structured, actionable recommendations to engineers, domain experts, and managers
- Maintain and improve evaluation guidelines, checklists, SOPs
- Suggest new approaches, tools, and processes to enhance AI evaluation
Senior Factor 10 Developer (IPS / IPM) (m/f/d)
An insurance company in Nuremberg is looking for a Senior Factor 10 Developer with expertise in IPS and IPM. The project includes developing and optimizing software solutions in the insurance sector, focusing on high performance and reliability. The role requires solid knowledge of Factor 10 and its applications in the insurance industry.
Key responsibilities:
- Developing and optimizing applications with Factor 10, especially in IPS and IPM.
- Collaborating with interdisciplinary teams to ensure seamless integration and functionality.
- Analyzing and resolving complex technical issues.
- Providing technical guidance and mentoring to junior developers.
- Ensuring compliance with industry standards and best practices.
Senior Web Developer (m/f/d)
- You develop modern, high-performance web frontends with React, TypeScript, HTML, and CSS
- You implement responsive designs with a focus on accessibility and performance
- You plan and execute unit and integration tests (for example with Playwright)
- Troubleshooting in development, testing, or live environments
Biologist with Python Experience (m/f/d)
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills. If you join the platform as an AI Tutor in Biology, you’ll have the opportunity to collaborate on these projects.
Although every project is unique, you might typically:
- Generate prompts that challenge AI.
- Define comprehensive scoring criteria to evaluate the accuracy of the AI’s answers.
- Correct the model’s responses based on your domain-specific knowledge.
AI Consultants - Data Science (m/f/d)
We are seeking experienced data scientists to create computationally intensive data science problems for an advanced AI evaluation project. This is a remote, project-based opportunity for experts who can design challenging problems that require computational methods to solve and mirror the full data science lifecycle - from data acquisition and processing to statistical analysis and actionable business insights.
What You'll Do
- Design original computational data science problems that simulate real-world analytical workflows across industries (telecom, finance, government, e-commerce, healthcare)
- Create problems requiring Python programming to solve (using pandas, numpy, scipy, sklearn, statsmodels, matplotlib, seaborn)
- Ensure problems are computationally intensive and cannot be solved manually within reasonable timeframes (days/weeks)
- Develop problems requiring non-trivial reasoning chains in data processing, statistical analysis, feature engineering, predictive modeling, and insight extraction
- Create deterministic problems with reproducible answers - avoid stochastic elements or require fixed random seeds for exact reproducibility
- Base problems on real business challenges: customer analytics, risk assessment, fraud detection, forecasting, optimization, and operational efficiency
- Design end-to-end problems spanning the complete data science pipeline (data ingestion → cleaning → EDA → modeling → validation → deployment considerations)
- Incorporate big data processing scenarios requiring scalable computational approaches
- Verify solutions using Python with standard data science libraries and statistical methods
- Document problem statements clearly with realistic business contexts and provide verified correct answers
Project Manager Magazines / Magazine Production (m/f/d)
- Responsibility for coordinating and managing the entire production process of magazine publications
- Planning and monitoring issue structure, deadlines, advertisements, and workflows
- Close collaboration with editorial, publishing management, marketing, IT, sales, printers, and service providers
- Quality assurance of layouts, copy, and print approvals
- Cost calculation and organization of supplementary products (e.g., inserts, posters, expansions)
- Active role in strategic projects, conferences, and the launch of new formats
Requirement and Content Manager (m/f/d)
A company is looking for support for a project focused on optimizing the buy and leave journey to improve customer acquisition and retention. The goal is to increase efficiency in the value chain, speed up time to market, and reduce the total cost of ownership (TCO). The Adobe Experience Manager (AEM) platform plays a central role, especially for implementing new features like compositions, templates, and micro-frontends.
Main tasks:
- Definition and design of requirements in the Adobe Experience Manager CMS area, including setting the development order of compositions, components, templates, and micro-frontends.
- Supporting documentation and feedback loops with tools like Jira and Confluence.
- Project-related consulting of development teams and requirement owners during the development phase.
- Analysis of the existing Adobe Experience Manager CMS infrastructure and deriving recommendations to optimize content and site structure, AEM interfaces, and performance.
- Creating documentation on using the provided compositions & components and sharing this information with internal teams.
- Professional consulting of business and technology departments as well as external partners as part of the change program.
- Advising on technical requirements, including content structure, site structure, micro-frontends, product data modeling, compositions & components, templates, headless CMS, and AEM interfaces.
- Support on special topics like accessibility, CIAM, multilingual, personalization, and campaigning.
ERP Transformation Manager (m/f/d)
An established company is looking for an experienced ERP Transformation Manager to take full responsibility for planning and steering a comprehensive ERP transformation program. The project's goal is harmonizing processes, implementing a new ERP system, and meeting IFRS requirements.
The ERP Transformation Manager will analyze, redesign, and standardize the commercial core processes in civil and rail construction. This includes translating IFRS requirements into system structures and posting logic, closely coordinating with Finance, Controlling, Project Management, and IT departments.
The role includes managing the ERP rollout, including fit-gap analysis, process design, test management, and migration. In addition, a unified reporting and KPI framework for group financial statements and project management will be established. The manager will act as the central interface between operational units, Finance, management, and the group, and will set up a sustainable change and training concept for users.
- Planning and steering the ERP transformation program (IFRS transition, process harmonization, ERP rollout)
- Analyzing, redesigning, and standardizing commercial core processes
- Translating IFRS requirements into system structures and posting logic
- Managing the ERP rollout, including fit-gap analysis, process design, test management, and migration
- Building a unified reporting and KPI framework
- Stakeholder management and ensuring smooth communication
- Leading interdisciplinary project teams and managing external consultants and implementation partners
- Establishing a sustainable change and training concept
- Ensuring measurable process improvements after the ERP system goes live
Frontend Developer for an HR Platform with Angular Experience
Reach out to us if you are interested in working with us on the project.
MCP & Tools Python Developer (m/f/d)
Industry
Information Technology (IT)
Area
Information Technology (IT)
Project info
- Daily rate: from 290€
- Language: English (Advanced)
- Remote: 100%
Description
We’re on the hunt for hands-on Python engineers for a new project focused on developing Model Context Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You’ll implement base methods for agent action verification, integrate with internal and client infrastructures, and help fill tooling gaps across the team.
What you’ll be doing:
- Developing and maintaining MCP-compatible evaluation servers
- Implementing logic to check agent actions against scenario definitions
- Creating or extending tools that writers and QAs use to test agents
- Working closely with infrastructure engineers to ensure compatibility
- Occasionally helping with test writing or debug sessions when needed
Although we’re only looking for experts for this current project, contributors with consistently high-quality submissions may receive an invitation for ongoing collaboration across future projects.
Requirements
- 4+ years of Python development experience, ideally in backend or tools
- Solid experience building APIs, testing frameworks, or protocol-based interfaces
- Understanding of Docker, Linux CLI, and HTTP-based communication
- Ability to integrate new tools into existing infrastructures
- Familiarity with how LLM agents are prompted, executed, and evaluated
- Clear documentation and communication skills - you’ll work with QA and writers
We also value applicants who have:
- Experience with Model Context Protocol (MCP) or similar structured agent-server interfaces
- Knowledge of FastAPI or similar async web frameworks
- Experience working with LLM logs, scoring functions, or sandbox environments
- Ability to support dev environments (devcontainers, CI configs, linters)
- JavaScript experience