Jens Daube

Product Owner & Senior Data Scientist

Frankfurt, Germany

Experience

Jun 2023 - Present
2 years 2 months
Frankfurt, Germany

Product Owner & Senior Data Scientist

Legal Tech

  • Led an international team of six developers in a Scrum environment
  • Defined the project’s strategic goals in coordination with stakeholders and the development team
  • Performed prompt engineering on language models to improve answer accuracy and relevance
  • Implemented LangChain components for a RAG chatbot to answer legal questions
  • Technologies: GPT-4, LangChain, Python (Pandas, sklearn, Streamlit), Docker, GitLab, ChromaDB
May 2023 - Apr 2024
1 year
Frankfurt, Germany

Senior Data Scientist

Central Bank

  • Developed a scalable, high-performance enterprise search solution
  • Implemented a Retrieval-Augmented Generation (RAG) model to deliver valid, context-aware answers to document queries
  • Built and managed messaging queues to ensure reliable, scalable data processing and transfer between system components
  • Created RESTful APIs, including security and authentication, to provide search functionality and integrate the enterprise search solution into existing applications and systems
  • Technologies: Elasticsearch, Kibana, LLaMA, SQL, FastAPI, Docker, Python (Pandas, sklearn, PyTorch), Hugging Face
Sep 2022 - Apr 2023
8 months
GG

Senior Data Scientist

Financial Supervisory Authority

  • Developed and deployed an early warning system using structured and unstructured data to monitor fund default risk
  • Built a GPT-4-based chatbot for the regulatory authority to answer questions on annual and quarterly reports
  • Implemented and configured CI/CD pipelines to automate build, test, and deployment processes
  • Collaborated closely with subject matter experts to understand requirements for the early warning system
  • Technologies: GPT-4, LangChain, Python (Pandas, NumPy, PyTorch, sklearn), SQL, GitLab, Docker, Kubernetes, Apache Spark, ChromaDB
Dec 2021 - Aug 2022
9 months
Berlin, Germany

Senior Data Scientist

Public Authority

  • Led the project, held regular client meetings, and ensured all requirements and expectations were met
  • Developed and trained models to analyze economic and financial market reports
  • Improved model performance through hyperparameter tuning, feature engineering, and regularization
  • Worked with experts to validate results and adapt models to the agency’s specific needs
  • Technologies: Python (Pandas, NumPy, spaCy, sklearn, Keras), Hugging Face, GitLab, Docker
Mar 2021 - Nov 2021
9 months
Salzburg, Austria

Senior Data Scientist

Beverage Manufacturer

  • Analyzed historical sales data to identify patterns, trends, and seasonal effects impacting sales
  • Applied time series techniques such as ARIMA and exponential smoothing, along with ML models such as random forests and LSTMs, to improve forecast accuracy
  • Integrated external data (weather, marketing campaigns) to further enhance sales predictions
  • Conducted sensitivity analyses and scenario modeling to spot risks and opportunities early and develop strategies
  • Technologies: Python (Pandas, NumPy, Seaborn, sklearn), GCP (Dataproc, BigQuery, Cloud Functions, Vertex AI), SQL, GitLab
May 2020 - Feb 2021
10 months
Zürich, Switzerland

Senior Machine Learning Engineer

Universal Bank

  • Built and implemented preprocessing pipelines to standardize and structure traders’ communication data
  • Used the pre-trained FinBERT model to generate word embeddings from financial texts
  • Developed models for network analysis, anomaly detection, and clustering
  • Created and automated end-to-end workflows for model training, validation, and deployment
  • Technologies: Python (spaCy, sklearn, TensorFlow), Hugging Face, SQL, Elasticsearch, Docker, Kubernetes, GitHub, Jenkins, MLflow
Oct 2019 - Apr 2020
7 months
Utrecht, Netherlands

Data Scientist

Traffic Data Authority

  • Designed and implemented cloud-based system architectures on Azure
  • Set up and configured Kubeflow and MLflow to manage and automate ML workflows
  • Built and trained ML models to detect unusual traffic patterns and events
  • Developed and ran tests to ensure solution functionality, reliability, and security
  • Technologies: Python (TensorFlow, PyTorch, Pandas, NumPy), Azure (Kubernetes Service, DevOps, Storage), Kubeflow, MLflow, Helm
Mar 2019 - Sep 2019
7 months
Stuttgart, Germany

Machine Learning Engineer

Automotive Group

  • Collected and cleaned historical sales data and data on external factors such as market trends, economic indicators, and seasonal influences
  • Identified and engineered relevant features to boost model accuracy
  • Developed and trained various ML models for sales forecasting, including specialized time series models such as Prophet and ARIMA
  • Implemented an Explainable AI module using SHAP to increase transparency and interpretability of results
  • Technologies: Python (Prophet, statsmodels, Keras, Pandas, NumPy, SHAP), SQL, GitLab
Aug 2018 - Feb 2019
7 months
Düsseldorf, Germany

Data Scientist

Asset Manager

  • Developed containerized microservices, including APIs and test specs, for named entity recognition using custom and pre-trained AI models
  • Built models to score fund report sentiment
  • Identified and engineered key features from text data to improve model performance and selected top features for training
  • Optimized hyperparameters through systematic search and advanced methods like Bayesian optimization
  • Technologies: Python (Pandas, NumPy, NLTK, spaCy, TensorFlow), Flask, Azure (Databricks, Cognitive Services, Machine Learning, DevOps)
Jan 2018 - Jul 2018
7 months
Vienna, Austria

Data Scientist

Universal Bank

  • Designed a document data processing workflow with seamless OCR and NLP module integration
  • Implemented OCR algorithms to automatically recognize text in image files (TIFF, JPEG, PNG) and containerized the OCR microservices with Docker
  • Built NLP models to extract information from recognized text
  • Extracted key features from OCR and NLP data to boost model performance and improve information extraction
  • Technologies: Python (Tesseract, spaCy, NLTK, Pandas, NumPy), Docker, Kubernetes, GitLab
Jun 2017 - Dec 2017
7 months
Berlin, Germany

Data Analyst

Credit Bank

  • Developed and validated credit risk models in Python, including Monte Carlo simulations for various risk scenarios
  • Used SQL to manage and query large datasets, then prepared, cleaned, and explored data in Python to spot key features and patterns
  • Validated models via backtesting and historical data analysis, then fine-tuned them based on the validation results
  • Integrated models into the bank’s IT system for production use and set up continuous monitoring and optimization
  • Technologies: Microsoft SQL, Python (Pandas, NumPy, SciPy, sklearn, Seaborn), GitLab, Docker

Languages

German
Native
English
Advanced

Education

Feb 2018 - Nov 2020

Universität Mannheim

Master in Data Science · Data Science · Mannheim, Germany · 1.3

Sep 2013 - Mar 2017

Universität Mannheim

Bachelor of Science · Business Mathematics (Wirtschaftsmathematik) · Mannheim, Germany · 1.7

Certifications & licenses

Professional Scrum Master I (PSM I)