Recommended expert

Milos Nikolic

Senior AI/ML Engineer

London, United Kingdom

Experience

Jan 2023 - Jul 2025
2 years 7 months
London, United Kingdom

Senior AI/ML Engineer

Meta AI

  • Agentic AI Platform (V3 Architecture): Architected and implemented a production multi-agent system with 5-stage orchestration (Planner, Actioner, Executor, Feedback, Evaluator); achieved 85% task success rate (+20% vs V2) through modular pipeline design and self-correcting feedback loops.
  • RAG Retrieval System: Built end-to-end retrieval pipeline with sliding-window chunking (50 lines + 10-line overlap), hybrid BM25+vector search, and parallel LLM summarization; improved retrieval precision by 85% and reduced hallucinations by 35% through semantic indexing.
  • Enterprise Connectors: Developed Confluence and Jira integrations with full authentication, webhook support, and error handling; enabled real-time knowledge base updates and cross-platform data synchronization for agent context.
  • LLM-as-Judge Evaluation Framework: Implemented automated evaluation system with golden test cases, tournament scoring, and regression testing; shifted from subjective 'looks good' assessments to quantitative evaluation with 15+ benchmark configurations across 6 task categories.
  • Web-Scale Ranking Platform: Built TensorFlow/JAX/TFX pipelines on Vertex AI + Dataflow + BigQuery; improved CTR by 18% on 100M+ sessions/month; cut p95 latency by 35% (≈210 ms → 136 ms) via feature-store redesign and hard negatives.
  • Privacy-preserving personalization: Launched federated learning with differential privacy (DP) for PT/ES/EN markets; ensured offline/online metric parity (AUC/PR, calibration) and automated drift alerts.
  • Experimentation Suite: Standardized A/B & interleaving tests; reusable metrics/dashboards reduced time-to-decision from 2–3 weeks to <5 days.
  • GenAI/RAG evaluation: Built offline evaluator and guardrails that reduced hallucinations by ~35% and improved answer F1 by +7 pts while lowering p95 latency by 20%.
  • ML Reliability Program: Ran Kubernetes/Docker microservices with model registry, shadow/canary deploys, and rollbacks; maintained 99.9% inference SLO and MTTR <10 min; mentored 6 data scientists and ML engineers and partnered with 4 product teams (BR/US).

Feb 2020 - Dec 2022
2 years 11 months
Amsterdam, Netherlands

Data Scientist

Databricks

  • SageMaker Churn & propensity platform: Productionized models on SageMaker with model registry, CI/CD, and blue-green deployment, reducing churn by 22% across three pilot cohorts (around 45k users); included monitoring via MLflow and custom drift detectors.
  • Real-time lakehouse: Designed an S3, Glue, Athena, and EMR data plane ingesting over 10 TB per day; implemented streaming features with Kafka and Spark to support about 1.8k QPS of Lambda/Fargate inference.
  • LATAM Regulated Templates: Delivered reference architectures that reduced time-to-production from roughly 3 weeks to 6 hours and lowered infrastructure costs by 18% through improved observability.
  • Model Governance: Implemented feature lineage, PII safeguards, and model calibration (ECE, Brier) to ensure consistent and auditable performance.

Dec 2017 - Nov 2019
2 years
Berlin, Germany

Principal ML Consultant

Capgemini Invent

  • Enterprise labeling platform: Flask/React system to train/retrain CV/NLP models; reduced dataset turnaround by 50% (4 wks → 2 wks) for a tier-1 bank & public-sector client.
  • Anti-spoofing & error monitoring: Deployed scikit-learn/PyTorch models and a centralized Flask/DB2 error API; reduced critical incidents by 23% QoQ.
  • Serverless identity: Built Cloud Functions + Cloud SQL user-management; lowered access-ticket resolution time by 30% and simplified audits.

Industries Experience

Experienced in Information Technology (5.5 years), Banking and Finance (2 years), and Government and Administration (2 years).

Business Areas Experience

Experienced in Information Technology (7.5 years), Business Intelligence (3 years), Product Development (2.5 years), and Research and Development (2.5 years).

Summary

Senior AI/ML Engineer with over 10 years delivering production-grade AI that drives measurable business impact.

Scope: Agentic AI, GenAI/RAG, ranking, NLP/CV, and large-scale experimentation; built calibrated, monitored, and drift-resilient ML systems (AUC/PR, ECE).

Platforms: Python; TensorFlow/PyTorch; AWS (SageMaker) & GCP (Vertex AI); Kubernetes/Airflow/MLflow; feature stores; 99.9% real-time inference SLO.

Results: +18% CTR at web scale, -35% p95 latency, -22% churn, -18% infra cost.

Core stack (years of experience): Python (10+), TensorFlow (6+), PyTorch (5+), Scikit-learn (9+), XGBoost/LightGBM (7+), Transformers/HuggingFace (5+), LangChain/RAG (3+), Vector Databases (FAISS, Pinecone, PostgreSQL) (3+), Airflow/MLflow/Kubernetes/Docker (6+), AWS (SageMaker, S3, Glue, Athena, EMR, Lambda, Fargate) (4+), GCP (Vertex AI, Dataflow, BigQuery) (3+), Spark/Kafka (5+), Feature Store/TFX (3+), SQL/Snowflake/BigQuery (7+), FastAPI/Flask (6+), REST/GraphQL (5+), CI/CD (Jenkins, GitHub Actions) (5+), NLP (8+), Computer Vision (7+), Federated Learning & Responsible AI (2+).

Skills

  • Agentic AI & Orchestration: Multi-agent (Planner, Tools, Critic); Tool Routing; Short/Long-term Memory; Self-reflection; Guardrails; LLM-as-Judge
  • RAG & Retrieval: Hybrid BM25 + Dense; Cross-encoder Reranking; Query-intent Routing; Semantic Chunking; Vector Stores (FAISS, Pinecone)
  • Serving & Systems: Ray Serve; NVIDIA Triton; GPU Inference; FastAPI Microservices; Distributed Training; API Design; Kubernetes/Docker
  • MLOps & Evaluation: MLflow; Model Registry; CI/CD; Feature Stores; Monitoring & Drift/Bias; Prometheus/OpenTelemetry
  • Data & Streaming: Spark; Kafka; Redis; SQL; HL7/FHIR Pipelines
  • Cloud & Infra: AWS (SageMaker, Lambda, S3); GCP (Vertex AI, BigQuery); Azure (Azure ML)
  • Databases & Warehouses: PostgreSQL; Azure Cosmos DB; DynamoDB; Snowflake; BigQuery
  • Languages & Frameworks: Python, C++, Java, JavaScript/TypeScript; React (TypeScript); FastAPI; Flask; REST/GraphQL

Languages

English
Advanced
German
Elementary
Dutch
Elementary

Education

Aug 2013 - Jun 2015

Nanyang Technological University (NTU)

Master of Science in Computer Science · Computer Science · Singapore

Aug 2009 - Jun 2013

Nanyang Technological University (NTU)

Bachelor of Science in Computer Science · Computer Science · Singapore

Frequently asked questions

Do you have questions? Here you can find further information.

Where is Milos based?

Milos is based in London, United Kingdom.

What languages does Milos speak?

Milos speaks the following languages: English (Advanced), German (Elementary), Dutch (Elementary).

How many years of experience does Milos have?

Milos has at least 7 years of experience. During this time, Milos has worked in at least 3 different roles and for 3 different companies. The average length of individual experience is 2 years and 5 months. Note that Milos may not have shared all experience and actually has more experience.

What roles would Milos be best suited for?

Based on recent experience, Milos would be well-suited for roles such as: Senior AI/ML Engineer, Data Scientist, Principal ML Consultant.

What is Milos's latest experience?

Milos's most recent position is Senior AI/ML Engineer at Meta AI.

What companies has Milos worked for in recent years?

In recent years, Milos has worked for Meta AI and Databricks.

Which industries is Milos most experienced in?

Milos is most experienced in industries like Information Technology (IT), Banking and Finance, and Government and Public Administration.

Which business areas is Milos most experienced in?

Milos is most experienced in business areas like Information Technology (IT), Business Intelligence, and Product Development. Milos also has some experience in Research and Development (R&D).

Which industries has Milos worked in recently?

Milos has recently worked in industries like Information Technology (IT).

Which business areas has Milos worked in recently?

Milos has recently worked in business areas like Information Technology (IT), Business Intelligence, and Product Development.

What is Milos's education?

Milos holds a Master of Science in Computer Science and a Bachelor of Science in Computer Science, both from Nanyang Technological University (NTU).

What is the availability of Milos?

The availability of Milos needs to be confirmed.

What is the rate of Milos?

Milos's rate depends on the specific project requirements. Please use the Meet button on the profile to schedule a meeting and discuss the details.

How to hire Milos?

To hire Milos, click the Meet button on the profile to request a meeting and discuss your project needs.

Average rates for similar positions

Rates are based on recent contracts and do not include FRATCH margin.

Market avg: 750-910 €
The rates shown represent the typical market range for freelancers in this position based on recent contracts on our platform.
Actual rates may vary depending on seniority level, experience, skill specialization, project complexity, and engagement length.