Dan Thach
Lead Data Scientist / AI Platform Engineer
Experience
Lead Data Scientist / AI Platform Engineer
Tkxel
- Led design and deployment of an internal LLM-based assistant platform serving over 1,000 daily users, improving response accuracy by 28% and reducing inference cost by 15% through optimized RAG pipeline orchestration.
- Developed company-wide AI governance framework with model observability and safety layers, ensuring compliance and reducing hallucination incidents in customer-facing models.
- Built modular RAG 2.0 pipelines using LangChain and custom orchestration to enable dynamic context retrieval across product domains.
- Partnered with data engineering teams to align feature store schemas with ML and analytics workflows, accelerating model iteration.
- Mentored junior engineers on LLMOps best practices and scalable deployment strategies across cloud and on-prem environments.
- Tech & Tools: GPT-4/5 APIs, Llama 3, LangChain, HuggingFace, Pinecone, Weaviate, Ray, KServe, MLflow, Kubeflow, Airflow, Feast, Evidently, Prometheus, AWS/GCP
Senior Machine Learning Engineer (NLP Focus)
Meta
- Led fine-tuning and deployment of transformer-based models for document intelligence, boosting text extraction accuracy by 21% and cutting inference latency by 35% through efficient model serving.
- Designed and implemented scalable ML pipelines and retraining workflows, reducing manual retraining cycles by 40% and improving model monitoring coverage organization-wide.
- Built end-to-end NLP components including tokenization, embeddings, and evaluation systems to support enterprise search and knowledge extraction.
- Pioneered early retrieval-augmented generation (RAG) prototypes for internal document Q&A solutions.
- Collaborated with platform teams to integrate ML observability and CI/CD automation within Kubernetes-based workflows.
- Conducted comparative A/B testing between transformer-based and conventional NLP models for production-grade deployment decisions.
- Acted as a mentor and reviewer for junior engineers, standardizing best practices for NLP experimentation and deployment.
- Tech & Tools: HuggingFace Transformers, BERT, RoBERTa, T5, GPT-3 API, Sentence Transformers, MLflow, DVC, Kubeflow, SageMaker, Feast, Evidently, Spark, Kafka, Delta Lake
Machine Learning Engineer
Semantic Visions
- Designed and deployed end-to-end ML microservices for recommendation and NLP features, ensuring reliable model serving and monitoring.
- Implemented A/B testing frameworks for ML system evaluation, improving iteration speed and data-driven decision making.
- Collaborated on MLOps pipeline development, including CI/CD workflows, model versioning, and automatic retraining.
- Partnered with data engineering teams to optimize ETL and feature pipelines using Spark and Airflow.
- Contributed to early adoption of MLflow and model observability dashboards, improving transparency across deployed models.
- Tech & Tools: TensorFlow 2.x, PyTorch, scikit-learn, BERT, Docker, Airflow, Kubernetes, Flask/FastAPI, MLflow, Spark, Kafka, BigQuery, Prometheus, Grafana
Data Engineer
Featurespace
- Designed and implemented ETL-to-ELT data pipelines using Spark and Airflow, enabling near real-time analytics for product metrics.
- Migrated key data workflows from on-premise to AWS and GCP, improving reliability and reducing latency.
- Built and maintained data marts and semantic layers supporting downstream analytics and early machine learning projects.
- Introduced Kafka streaming for processing event data, increasing scalability and monitoring capabilities.
- Partnered with analysts and data scientists to create efficient feature-ready data pipelines for experimentation.
- Tech & Tools: Python 3, SQL, Airflow, Spark, Hive, Kafka, AWS (Redshift, S3), GCP (BigQuery), Docker, Bash
Junior Data Engineer (Analytics & ETL)
UiPath
- Automated legacy Excel/VBA reports using Python and SQL, significantly reducing manual reporting cycles.
- Assisted in building initial BI dashboards and ETL pipelines supporting executive analytics.
- Participated in pilot Hadoop/Hive projects to evaluate distributed data processing for large datasets.
- Tech & Tools: SQL (MySQL, Postgres), Python 2.7/3, Excel/VBA, Tableau, Power BI, Linux, Bash
Industries Experience
Experienced in Information Technology (10.5 years) and Media and Entertainment (3 years).
Business Areas Experience
Experienced in Information Technology (10.5 years), Product Development (8 years), Research and Development (6 years), and Business Intelligence (2.5 years).
Summary
Senior Machine Learning Engineer with a strong hybrid foundation in data science, MLOps, and AI platform development, bridging the gap between scalable machine learning systems and business-driven modeling strategies. Over 10 years of experience spanning data engineering, ML infrastructure, NLP, and LLM-based applications, driving measurable impact through model performance optimization, experimentation, and production reliability. Adept at leading cross-functional ML initiatives, mentoring teams, and transforming complex data pipelines into deployable, value-oriented AI solutions across cloud environments.
Skills
- Programming Languages: Python (NumPy, pandas, PySpark), R, SQL, Bash
- Data Engineering & Processing: Spark, Kafka, Airflow, ETL/ELT Pipelines, Data Modeling (Star/Kimball)
- Machine Learning: scikit-learn, TensorFlow, PyTorch, XGBoost, Transformers (BERT, GPT, Llama)
- NLP & LLMs: HuggingFace, Sentence Transformers, RAG Architectures, Vector Databases (Pinecone, FAISS, Weaviate)
- MLOps & Platforms: MLflow, Kubeflow, Vertex AI, SageMaker, Docker, Kubernetes, CI/CD
- Experimentation & Analytics: A/B Testing, Causal Inference, Feature Engineering, Statistical Modeling
- Cloud & Infrastructure: AWS (S3, EC2, Lambda), GCP (Vertex, BigQuery), Azure ML, NVIDIA Triton/TensorRT
- Observability & Governance: Weights & Biases, Evidently, Prometheus, Grafana, Feast, Guardrails AI
- Soft Skills: Cross-functional Collaboration, Mentoring, Product-driven ML Strategy
Education
University of York
Master of Science in Computer Science · Computer Science · York, United Kingdom
Hanoi University of Science and Technology
Bachelor of Science in Computer Science · Computer Science · Hanoi, Viet Nam