Himanshu Negi
Principal (Data Scientist/Data Engineer/Gen AI Engineer)
Experience
Principal (Data Scientist/Data Engineer/Gen AI Engineer)
Marktguru Deutschland GmbH
Architected an agentic, real-time offer orchestration engine where specialized agents (retrieval, pricing/optimization, and policy/guardrails) coordinate to personalise promotions across customer touchpoints using RAG with FAISS over Delta Lake and low-latency Databricks Model Serving. Collaborated with product managers and commercial stakeholders to shape the roadmap and evaluate emerging agent patterns for production.
Designed an agent-based data quality service that orchestrates schema detection, entity normalization, and validator/exception-handling agents to clean multi-retailer SKU feeds at scale. Wrapped model calls in PySpark UDFs for distributed inference, automated via Databricks Workflows and CI/CD.
Developed a multimodal, agentic extraction pipeline where vision, parsing, and compliance agents collaborate to derive brand, packaging, and volume from scanned images using Claude 3 Sonnet with Swin Transformer encoders. Orchestrated via Azure Event Hub with outputs persisted to Delta Lake.
Implemented a GS1 taxonomy classification service built around cooperating agents for inference, drift monitoring, and auto-retraining governance using Falcon 180B (LoRA-tuned) with a batch pipeline on Databricks.
Created a hybrid agent workflow where a retrieval agent surfaces candidate matches via embeddings and a reasoning/verification agent (Mixtral 8x7B) adjudicates receipt-to-SKU alignment, integrated into a streaming Databricks pipeline.
Built a multimodal attribute inference pipeline structured as cooperating vision-language, rules/consistency, and compliance agents to fill Nutri-Score, nutrition fields, and packaging types from names and images using LLaMA 3-8B with CLIP embeddings.
Developed a GenAI-powered orchestration system that ingests recipes from multiple websites, parses ingredients through structured extraction agents, and dynamically links them to real-time retailer offers via tagging, semantic reasoning, and business-rule agents.
Architect (Data Scientist/Data Engineer)
ABL Solutions GmbH
Developed a machine learning system leveraging WiFi data to forecast passenger demand, optimizing resource allocation and improving supply chain efficiency, achieving a 20% improvement in route optimization.
Designed an AI-driven traffic optimization solution integrating IoT sensors and Google Maps API to analyze patterns and predict congestion, reducing traffic delays by 30% through real-time signal adjustments.
Built a machine learning-based predictive maintenance system using IoT sensor data to forecast equipment failures, minimizing downtime by 25% and optimizing maintenance workflows.
Created an AI-powered sentiment analysis model with GPT-4 to extract insights from social media, enabling real-time feedback integration and improving customer engagement by 15%.
Data Science Manager
Arable Labs
Developed and deployed a Random Forest model on AWS SageMaker to calibrate IoT device temperatures in greenhouses, leveraging physics-based features and achieving a 20% improvement in prediction accuracy for real-time monitoring across 8,000 devices.
Designed an AI-powered inventory system utilizing time series analysis to predict demand and automate replenishment processes, improving stock accuracy by 30%.
Developed an AI-driven quality control system using computer vision to detect manufacturing defects with 98% accuracy, reducing production errors by 35%.
Senior Data Scientist
Ecolab Digital Center
Designed a market basket analysis solution using Apriori and Azure ML to recommend healthcare products, automating lead generation via Power BI and boosting sales team productivity by 25%.
Built ensemble models with AdaBoost and CatBoost to predict evaporator health over 42 days, automated CI/CD workflows with Kubeflow, and reduced model deployment time by 30%.
Developed XGBoost-based models to forecast maintenance schedules, integrated with Power BI and Power Apps for real-time feedback, reducing downtime by 20%.
Created time-series models for tank-level forecasting using Kubeflow and CI/CD automation, delivering Power BI dashboards and improving inventory management efficiency by 35%.
Built a customer lifetime value model using RFM analysis and Power BI integration, improving customer prioritization and retention by 20%.
Developed a hybrid recommendation system combining collaborative filtering and content-based methods with Azure ML, enhancing product recommendation accuracy by 30%.
Assistant Manager Data Science
Genpact Ltd.
Designed a content-based recommendation system for personalized banking product suggestions, increasing sales by 20% and improving customer satisfaction.
Developed an interactive chatbot using IBM Watson Assistant for financial product recommendations, improving customer engagement by 30%.
Built a predictive scorecard using logistic regression and lift charts to identify high-probability loan customers, boosting loan acquisition rates by 25%.
Forecasted monthly truck dealer sales using ARIMA and LSTM models with 95% accuracy to enhance dealership planning.
Predicted individual medical costs using Ridge, Lasso, and Elastic Net regression models, improving cost estimation accuracy by 15%.
Deployed AI-driven models for supply chain demand forecasting and inventory optimization, reducing operational costs by 25%.
Created an AI-powered customer support chatbot with GPT-based NLP for e-commerce, reducing response times by 40%.
Technology Analyst Data Science
Infosys Ltd
Built a scalable recommendation engine using Apache Spark and Hive with item-to-item collaborative filtering, boosting e-commerce sales by 20%.
Developed a product recommendation system using FP-Growth in Spark ML on HDFS transactional data, increasing purchase frequency by 25%.
Designed an NLP-powered email classification system deployed on Azure, automating ticket routing and reducing manual effort by 40%.
Forecasted call volumes using ARIMAX and Holt-Winters models, optimizing staffing levels and reducing wait times by 15%.
Developed predictive models using Random Forest and SVM for HR recruitment, improving efficiency by 30% and enhancing post-offer join rates.
Built an XGBoost-based churn prediction model with MLOps integration, reducing churn rates by 25%.
Implemented a machine learning-based threat detection system using Random Forest, improving threat response times by 30%.
Summary
Accomplished GenAI Engineer, Data Scientist & Data Engineer | 10+ Years of Data Expertise | Dual Postgraduate Degrees & Honorary Doctorate in AI
Education
Manipal University
PGDBS, Majors in Statistics and Mathematics · India
Punjab T. University
Computer Science Engineering · India
Washington D. University
Doctor of Artificial Intelligence, Honorary doctorate, awarded in recognition · Artificial Intelligence · Washington, United States
Certifications & licenses
AWS Certified Data Scientist
Azure Certified Data Scientist
Certification in Statistical Learning
Stanford University
Deep Learning Specialization
Andrew Ng
GCP Certified Data Scientist
IBM Data Science and AI Certificate Level 1
IBM
IBM Data Science and AI Certificate Level 2
IBM
IBM Data Science and AI Certificate Level 3
IBM
MLOps Certified Data Scientist
Machine Learning Specialization
Andrew Ng