Stephan Baier
Freelance Data Scientist
Experience
Freelance Data Scientist
Baier Data & AI Consulting
Team Lead Data Science
Check24 GmbH
Setting up a hybrid ML architecture on AWS and on-premises
Developing custom machine learning models for computer vision and information extraction
Evaluating and prototyping with different agentic AI tools, LLMs, and MCP
Collaborating with product owners to define functional requirements and security aspects
Leading a team of data scientists and data engineers
OCR pipeline: Training an EasyOCR model on domain-specific and synthetic datasets (ID cards, passports, driver’s licenses)
Segmentation model: Implemented a segmentation model in PyTorch for precise cropping of documents and perspective correction
ID classifier: Developed a CNN-based model in PyTorch to classify document types
Hologram detection: Developed a specialized classification model in PyTorch to verify holograms
Converted models to ONNX and TensorFlow Lite, including quantization and pruning of model weights to meet real-time performance requirements
Achieved average inference times of under 200 ms on mobile devices
Over one million successful real-time identity verifications
Reduced manual verification effort by more than 90%
Tech stack: AWS SageMaker, Bedrock, Rekognition; multimodal LLMs, Pydantic, FastMCP, prompt engineering; PyTorch, PyTorch Lightning, TensorFlow Lite, ONNX; TorchVision, OpenCV, EasyOCR
Lead Machine Learning Engineer
RS Alpha Capital GmbH
Set up an on-premises Kubernetes cluster using Apache Ranger
Automated GPU-based training jobs
Built CI/CD pipelines with ArgoCD and GitLab for automated model deployment
Implemented MLOps pipelines with Dagster and ClearML
Tech stack: Kubernetes, Apache Ranger, PyTorch, ArgoCD, GitLab CI/CD, Docker, Grafana, Prometheus, on-prem GPU cluster
Counteracted model degradation through automated retraining and deployment with minimal manual effort
Ensured compliance with regulatory requirements through detailed monitoring and a highly available infrastructure with 99.99% uptime
Senior Data Science Consultant
Data Reply GmbH
Streaming ML for Customer Message Processing (Oct 2019 – May 2021)
Built a real-time ML pipeline for classifying customer communications
Created live dashboards for message flow, predictions, and system monitoring
Designed a VAIT-compliant ML lifecycle with audit trails, Jenkins CI, and deployment on Kubernetes
Tech stack: Kafka, OpenShift, Jenkins, MLflow, Python, scikit-learn, XGBoost, Universal Sentence Encoder
Enabled real-time processing of over 20,000 messages per day
Delivered fully traceable and regulatory-compliant ML workflows
NLP Pipeline for Semantic Search and Entity Linking (Aug 2018 – Sept 2019)
Developed and implemented an NLP pipeline using BERT-based models for NER and entity disambiguation
Built a semantic search engine with Elasticsearch and Kibana dashboards for query analysis
Linked extracted entities and relationships in a Neo4j knowledge graph, enabling graph-based search and interactive exploration via Neo4j Bloom
Tech stack: PyTorch, BERT, Elasticsearch, Kibana, Neo4j, Bloom, Azure
Provided more timely risk assessment for credit insurance by incorporating current news events
Increased efficiency for claims adjusters by enabling faster identification of relevant historical claims
Sales Funnel Optimization for Loan Products (May 2018 – July 2018)
Applied clustering and classification models to segment user behavior and identify drop-off patterns in the sales funnel
Performed data analysis, feature engineering, and model development
Developed an interactive web application to visualize conversion paths, customer segments, and model results for business stakeholders
Tech stack: Python, PySpark, Pandas, NumPy, scikit-learn, SQL, Plotly, Dash
Industrial Research Fellow
Siemens AG
Software Developer
Steria Mummert AG
Summary
I am an experienced Data Scientist and Machine Learning Engineer with a strong academic background in computer science and artificial intelligence. My focus is on consulting, implementing, and operationalizing state-of-the-art machine learning solutions.
Skills
- Programming & Frameworks: Python (Pandas, scikit-learn, PySpark, FastAPI), Java
- Machine Learning & Deep Learning: PyTorch, TensorFlow, Keras, MLflow, Azure ML, AWS SageMaker
- Natural Language Processing: LLM, RAG, Knowledge Graph, Small Language Model Fine-tuning
- Computer Vision: OCR, OpenCV, TorchVision, Image Classification, Object Detection, Segmentation
- Data Engineering & Orchestration: SQL, Apache Kafka, Elasticsearch, Dagster
- Cloud & Virtualization: AWS, Azure, Docker, Kubernetes, Jenkins, CI/CD
Education
Ludwig Maximilian University
Ph.D. · Computer Science · Munich, Germany · summa cum laude (with highest honours)
Ludwig Maximilian University
M.Sc. · Computer Science · Munich, Germany
Ludwig Maximilian University
B.Sc. · Computer Science · Munich, Germany
Certifications & licenses
AWS Cloud Practitioner
Certified Kubernetes Application Developer
Chartered Financial Analyst (CFA) Level 1
Confluent Certified Developer For Apache Kafka
Professional Scrum Master (PSM 1)