Himanshu N.

Principal (Data Scientist/Data Engineer/Gen AI Engineer)

Haar, Germany

Experience

Jan 2024 - Present
2 years
Munich, Germany

Principal (Data Scientist/Data Engineer/Gen AI Engineer)

Marktguru Deutschland GmbH

  • Architected an agentic, real-time offer orchestration engine where specialized agents (retrieval, pricing/optimization, and policy/guardrails) coordinate to personalize promotions across customer touchpoints. The system uses RAG with FAISS over Delta Lake and low-latency Databricks Model Serving to support on-the-fly decisions and partner integrations. Collaborated with PMs and commercial stakeholders to shape the roadmap and evaluate emerging agent patterns for production.

  • Designed an agent-based data quality service that orchestrates schema detection, entity normalization, and validator/exception-handling agents to clean multi-retailer SKU feeds at scale. Wrapped model calls in PySpark UDFs for distributed inference, automated via Databricks Workflows and CI/CD, enabling near-real-time readiness for downstream agent decisioning in supply and catalog processes.

  • Developed a multimodal, agentic extraction pipeline where vision, parsing, and compliance agents collaborate to derive brand, packaging, and volume from scanned images using Claude 3 Sonnet with Swin Transformer encoders. Orchestrated asynchronously via Azure Event Hubs, with enriched outputs persisted to Delta Lake for consumption by search, recommendations, and downstream operational agents.

  • Implemented a GS1 taxonomy classification service built around cooperating agents for inference, drift monitoring, and auto-retraining governance. Falcon 180B (LoRA-tuned) powers the classifier; a batch pipeline on Databricks triggers model refreshes when accuracy dips below thresholds, supporting reliable merchandising analytics and agent-driven discovery.

  • Created a hybrid agent workflow where a retrieval agent surfaces candidate matches via embeddings and a reasoning/verification agent (Mixtral 8x7B) adjudicates final receipt-to-SKU alignment. Integrated into a streaming Databricks pipeline to support near-real-time sales operations and exception handling across customer touchpoints.

  • Built a multimodal attribute inference pipeline structured as cooperating vision-language, rules/consistency, and compliance agents to fill NutriScore, nutrition fields, and packaging types from names and images using LLaMA 3-8B with CLIP embeddings. Designed for fast feedback loops so downstream agents can trust and act on enriched product records.

  • Developed a GenAI-powered orchestration system that ingests recipes from multiple websites, parses ingredients through structured extraction agents, and dynamically links them to real-time retailer offers via tagging agents. Designed a multi-agent workflow where retrieval agents identify candidate offers, semantic reasoning agents validate ingredient–offer matches, and business-rule agents ensure compliance. Integrated into customer touchpoints so that users could directly click on matched offers, driving incremental sales and partner revenue share.

Jan 2023 - Dec 2024
2 years
Nuremberg, Germany

Architect (Data Scientist/Data Engineer)

ABL solutions GmbH

  • Developed a machine learning system leveraging WiFi data to forecast passenger demand, optimizing resource allocation and improving supply chain efficiency. Achieved a 20% improvement in route optimization, enhancing service delivery for public transportation.

  • Designed an AI-driven traffic optimization solution integrating IoT sensors and Google Maps API to analyze patterns and predict congestion. Reduced traffic delays by 30% through real-time signal adjustments, empowering authorities with actionable insights for better flow management.

  • Built a machine learning-based predictive maintenance system using IoT sensor data to forecast equipment failures. Minimized downtime by 25% and optimized maintenance workflows, aligning predictive insights with cost-effective strategies.

  • Created an AI-powered sentiment analysis model with GPT-4, extracting insights from social media to guide marketing strategies. Enabled real-time feedback integration, improving customer engagement and response times by 15%.

Jan 2021 - Dec 2023
3 years
Bengaluru, India

Data Science Manager

Arable Labs

  • Developed and deployed a Random Forest model on AWS SageMaker to calibrate IoT device temperatures in greenhouses. Leveraged physics-based features, achieving a 20% improvement in prediction accuracy for real-time monitoring across 8,000 devices.

  • Designed an AI-powered inventory system utilizing time series analysis to predict demand, automating replenishment processes. Improved stock accuracy by 30%, reducing inefficiencies and aligning inventory with dynamic retail needs.

  • Developed an AI-driven quality control system using computer vision to detect defects, achieving 98% accuracy. Reduced production errors by 35%, ensuring compliance with manufacturing standards and boosting efficiency.

Dec 2018 - Nov 2021
3 years
Bengaluru, India

Senior Data Scientist

Ecolab Digital Center

  • Designed a market basket analysis solution using Apriori and Azure ML to recommend healthcare products. Automated the process and surfaced actionable leads via Power BI, boosting sales team productivity by 25%.

  • Built ensemble models with AdaBoost and CatBoost to predict evaporator health over 42 days. Automated CI/CD workflows with Kubeflow, reducing model deployment time by 30% and improving operational efficiency.

  • Developed XGBoost-based models to forecast maintenance schedules. Integrated Power BI with Power Apps for real-time feedback, enhancing proactive maintenance strategies and reducing downtime by 20%.

  • Created time-series models for tank-level forecasting, leveraging Kubeflow and CI/CD for workflow automation. Delivered Power BI dashboards, improving inventory management efficiency by 35%.

  • Built a CLTV model using RFM analysis to provide strategic sales insights. Integrated Power BI for stakeholder visibility, resulting in a 20% improvement in customer prioritization and retention.

  • Developed a hybrid recommendation system combining collaborative filtering and content-based methods with Azure ML, enhancing product recommendation accuracy by 30% and driving customer satisfaction.

Jan 2017 - Dec 2018
2 years
Bengaluru, India

Assistant Manager Data Science

Genpact Ltd.

  • Designed a content-based recommendation system to provide personalized banking product suggestions, increasing product sales by 20% and improving customer satisfaction through data-driven insights.

  • Developed an interactive chatbot for financial product recommendations using IBM Watson Assistant, improving customer engagement by 30% with a user-friendly interface hosted on a website.

  • Built a predictive scorecard using logistic regression and lift charts to identify high-probability loan customers, boosting loan acquisition rates by 25% through targeted marketing.

  • Forecasted monthly sales using advanced time series models, including ARIMA and LSTM, delivering 95% prediction accuracy to enhance dealership-level strategic planning.

  • Predicted individual medical costs using Ridge, Lasso, and Elastic Net regression models, achieving a 15% improvement in cost estimation accuracy for policy pricing and outreach strategies.

  • Deployed AI-driven models to forecast demand and optimize inventory, reducing operational costs by 25% and improving logistics efficiency for supply chain operations.

  • Created an AI-powered chatbot with GPT-based NLP for efficient query handling, reducing response times by 40% and enhancing customer experience on e-commerce platforms.

Jan 2014 - Dec 2017
4 years

Technology Analyst Data Science

Infosys Ltd

  • Built a scalable recommendation engine using Apache Spark and Hive with item-to-item collaborative filtering, personalizing customer experiences and boosting sales by 20% for e-commerce platforms.

  • Developed a product recommendation system using FP-Growth in Spark ML to analyze transactional data in HDFS, increasing purchase frequency by 25% and enhancing customer satisfaction.

  • Designed an NLP-powered email classification system deployed on Azure, automating ticket routing and reducing manual effort by 40%, improving operational efficiency.

  • Forecasted call volumes using advanced time series models like ARIMAX and Holt-Winters, optimizing staffing levels and reducing customer wait times by 15%.

  • Developed predictive models using Random Forest and SVM to streamline candidate selection, improving recruitment efficiency by 30% and enhancing post-offer join rates.

  • Built an XGBoost-based churn prediction model integrated with MLOps workflows, enabling proactive customer retention and reducing churn rates by 25%.

  • Implemented a machine learning-based threat detection system using Random Forest, improving threat response times by 30% and strengthening organizational security.

Summary

Accomplished GenAI Engineer, Data Scientist & Data Engineer | 10+ Years of Data Expertise | Dual Postgraduate Degrees & Honorary Doctorate in AI

Languages

English
Native
German
Advanced

Education

Amity University

Postgraduate Diploma in Machine Learning and Artificial Intelligence · Noida, India

Manipal University

PGDBS, Majors in Statistics and Mathematics · Manipal, India

Punjab T. University

Computer Science Engineering · India

...and 1 more

Certifications & licenses

AWS Certified Data Scientist

Azure Certified Data Scientist

Certification in Statistical Learning

Stanford University

Data Science and AI Certificates

IBM

Deep Learning Specialization

Andrew Ng

GCP Certified Data Scientist

MLOps Certified Data Scientist

Machine Learning Specialization

Andrew Ng
