Milos Nikolic
Senior AI/ML Engineer
Experience
Senior AI/ML Engineer
Meta AI
- Agentic AI Platform (V3 Architecture): Architected and implemented a production multi-agent system with 5-stage orchestration (Planner → Actioner → Executor → Feedback → Evaluator); achieved 85% task success rate (+20% vs V2) through modular pipeline design and self-correcting feedback loops.
- RAG Retrieval System: Built end-to-end retrieval pipeline with sliding-window chunking (50 lines + 10-line overlap), hybrid BM25+vector search, and parallel LLM summarization; improved retrieval precision by 85% and reduced hallucinations by 35% through semantic indexing.
- Enterprise Connectors: Developed Confluence and Jira integrations with full authentication, webhook support, and error handling; enabled real-time knowledge base updates and cross-platform data synchronization for agent context.
- LLM-as-Judge Evaluation Framework: Implemented automated evaluation system with golden test cases, tournament scoring, and regression testing; shifted from subjective 'looks good' assessments to quantitative evaluation with 15+ benchmark configurations across 6 task categories.
- Web-Scale Ranking Platform: Built TensorFlow/JAX/TFX pipelines on Vertex AI + Dataflow + BigQuery; improved CTR by 18% on 100M+ sessions/month; cut p95 latency by 35% (≈210 ms → 136 ms) via feature-store redesign and hard negatives.
- Privacy-Preserving Personalization: Launched federated learning + differential privacy (DP) for PT/ES/EN markets; ensured offline/online metric parity (AUC/PR, calibration) and automated drift alerts.
- Experimentation Suite: Standardized A/B & interleaving tests; reusable metrics/dashboards reduced time-to-decision from 2–3 weeks to <5 days.
- GenAI/RAG Evaluation: Built offline evaluator and guardrails that reduced hallucinations by ~35% and improved answer F1 by 7 pts while lowering p95 latency by 20%.
- ML Reliability Program: Ran Kubernetes/Docker microservices with model registry, shadow/canary deploys, and rollbacks; maintained 99.9% inference SLO and MTTR <10 min; mentored 6 data scientists and ML engineers and partnered with 4 product teams (BR/US).
Data Scientist
Databricks
- SageMaker Churn & Propensity Platform: Productionized models on SageMaker with model registry, CI/CD, and blue-green deployment, reducing churn by 22% across three pilot cohorts (around 45k users); included monitoring via MLflow and custom drift detectors.
- Real-Time Lakehouse: Designed an S3, Glue, Athena, and EMR data plane ingesting over 10 TB per day; implemented streaming features with Kafka and Spark to serve Lambda/Fargate inference at about 1.8k QPS.
- LATAM Regulated Templates: Delivered reference architectures that reduced time-to-production from roughly 3 weeks to 6 hours and lowered infrastructure costs by 18% through improved observability.
- Model Governance: Implemented feature lineage, PII safeguards, and model calibration (ECE, Brier) to ensure consistent and auditable performance.
Principal ML Consultant
Capgemini Invent
- Enterprise Labeling Platform: Built a Flask/React system to train and retrain CV/NLP models; reduced dataset turnaround by 50% (4 wks → 2 wks) for a tier-1 bank and a public-sector client.
- Anti-Spoofing & Error Monitoring: Deployed scikit-learn/PyTorch models and a centralized Flask/DB2 error API; reduced critical incidents by 23% QoQ.
- Serverless Identity: Built user management on Cloud Functions + Cloud SQL; lowered access-ticket resolution time by 30% and simplified audits.
Industries Experience
Experienced in Information Technology (5.5 years), Banking and Finance (2 years), and Government and Administration (2 years).
Business Areas Experience
Experienced in Information Technology (7.5 years), Business Intelligence (3 years), Product Development (2.5 years), and Research and Development (2.5 years).
Summary
Senior AI/ML Engineer with over 10 years delivering production-grade AI that drives measurable business impact.
Scope: Agentic AI, GenAI/RAG, ranking, NLP/CV, and large-scale experimentation; built calibrated, monitored, and drift-resilient ML systems (AUC/PR, ECE).
Platforms: Python; TensorFlow/PyTorch; AWS (SageMaker) & GCP (Vertex AI); Kubernetes/Airflow/MLflow; feature stores; 99.9% real-time inference SLO.
Results: +18% CTR at web scale, -35% p95 latency, -22% churn, -18% infra cost.
Core stack: Python (10+), TensorFlow (6+), PyTorch (5+), Scikit-learn (9+), XGBoost/LightGBM (7+), Transformers/HuggingFace (5+), LangChain/RAG (3+), Vector Databases (FAISS, Pinecone, PostgreSQL) (3+), Airflow/MLflow/Kubernetes/Docker (6+), AWS (SageMaker, S3, Glue, Athena, EMR, Lambda, Fargate) (4+), GCP (Vertex AI, Dataflow, BigQuery) (3+), Spark/Kafka (5+), Feature Store/TFX (3+), SQL/Snowflake/BigQuery (7+), FastAPI/Flask (6+), REST/GraphQL (5+), CI/CD (Jenkins, GitHub Actions) (5+), NLP (8+), Computer Vision (7+), Federated Learning & Responsible AI (2+).
Skills
- Agentic AI & Orchestration: Multi-agent (Planner, Tools, Critic); Tool Routing; Short/Long-term Memory; Self-reflection; Guardrails; LLM-as-Judge
- RAG & Retrieval: Hybrid BM25 + Dense; Cross-encoder Reranking; Query-intent Routing; Semantic Chunking; Vector Stores (FAISS, Pinecone)
- Serving & Systems: Ray Serve; NVIDIA Triton; GPU Inference; FastAPI Microservices; Distributed Training; API Design; Kubernetes/Docker
- MLOps & Evaluation: MLflow; Model Registry; CI/CD; Feature Stores; Monitoring & Drift/Bias; Prometheus/OpenTelemetry
- Data & Streaming: Spark; Kafka; Redis; SQL; HL7/FHIR Pipelines
- Cloud & Infra: AWS (SageMaker, Lambda, S3); GCP (Vertex AI, BigQuery); Azure (Azure ML)
- Databases & Warehouses: PostgreSQL; Azure Cosmos DB; DynamoDB; Snowflake; BigQuery
- Languages & Frameworks: Python, C++, Java, JavaScript/TypeScript; React (TypeScript); FastAPI; Flask; REST/GraphQL
Education
Nanyang Technological University (NTU)
Master of Science in Computer Science · Computer Science · Singapore
Nanyang Technological University (NTU)
Bachelor of Science in Computer Science · Computer Science · Singapore