Nagaraju Anthati
Senior Data Scientist
Experience
Senior Data Scientist
JPMC
- Designed and deployed Bayesian Marketing Mix Models (MMM) using PyMC 3+ and PySpark, quantifying ROI and channel-level elasticity across retail and asset management portfolios.
- Engineered ETL and feature pipelines in Airflow and AWS Databricks, automating ingestion of terabyte-scale marketing, transaction, and behavioral data from S3, Hive, Postgres, and Kafka.
- Built Delta Lake + Apache Iceberg architecture supporting adstock, carry-over, and seasonal transformations for model input.
- Implemented hierarchical Bayesian structures and regression-based MMMs using NumPy, PyMC, and TensorFlow Probability to model multi-region effects.
- Optimized PySpark jobs with Liquid Clustering and adaptive partitioning, cutting MMM data-prep runtime by ~40%.
- Automated model training, versioning, and deployment via MLflow and Databricks Asset Bundles, ensuring reproducibility and compliance.
- Streamed near-real-time ad-exposure and conversion data from multi-tenant Kafka clusters into model pipelines.
- Deployed probabilistic inference workflows on AWS EMR using distributed MCMC sampling, reducing convergence time significantly.
- Delivered model explainability dashboards in Plotly Dash visualizing posterior distributions, channel effects, and uncertainty intervals.
- Applied Bayesian regularization and feature selection techniques to optimize MMM performance.
- Integrated MMM outputs into Snowflake and AWS RDS for BI and marketing analytics consumption.
- Implemented data quality monitoring using Great Expectations and integrated validation across ETL workflows.
- Collaborated with quant research teams to embed MMM-driven elasticities into financial forecasting models.
- Automated CI/CD pipelines using Jules and ServiceNow for model retraining and deployment.
- Delivered cross-functional MMM insights to marketing, finance, and analytics teams to support budget optimization.
- Defined and implemented advanced eCommerce tracking for online transactions, allowing for granular reporting on product performance and customer journey analysis.
- Integrated Google Analytics with third-party platforms, such as Google Ads and CRM systems, enabling cross-platform attribution and seamless data flow.
- Utilized machine learning techniques, including regression analysis, decision trees, and clustering, to predict customer behavior and segment audiences for targeted marketing.
- Developed a multi-touch attribution model to accurately assign conversion credit across digital touchpoints, improving understanding of the customer journey.
- Implemented resiliency, reliability, and availability measures for asset and wealth management tools, both on premises and in the cloud.
- Handled reconciliation and reporting integration for fund positions, instruments, cash, and money markets.
- Managed change management and release processes using Jules pipelines and ServiceNow.
- Produced AFX merchant reports and performed P&L validation and reporting for various funds, assets, and instruments.
- Implemented a real-time daily load status solution using Geneos dashboards.
- Worked in a customer-facing role supporting various asset and wealth management MMM/ML activities.
Data Scientist - MMM/ML activities
GlaxoSmithKline
- Designed and implemented Bayesian MMM frameworks in PyMC to evaluate ROI across multichannel marketing campaigns in consumer health and pharma domains.
- Built end-to-end ETL pipelines using Airflow, Kafka, Azure Data Factory, and Databricks, integrating CRM, sales, and process data (>100 TB).
- Developed probabilistic regression models with hierarchical priors to capture campaign, region, and HCP-level heterogeneity.
- Built schema-evolving data models using open table formats and ADLS Gen2 integration.
- Implemented Bayesian inference workflows with MCMC sampling on Azure Databricks for channel elasticity estimation.
- Developed custom priors to reflect domain knowledge such as decay rates, carry-over, and saturation effects.
- Automated training and evaluation pipelines using Azure ML and MLflow with version-controlled experiments.
- Implemented streaming analytics using Kafka + Flink to continuously refresh MMM datasets from digital and field systems.
- Built PySpark feature stores and validation layers to ensure data quality and consistency.
- Conducted model diagnostics using WAIC, LOO-CV, and posterior predictive checks.
- Created Power BI and Plotly dashboards for marketing teams to visualize MMM insights and posterior ROI curves.
- Ensured data governance, lineage tracking, and GDPR/GxP compliance across all Azure data pipelines.
- Migrated legacy MMM workloads from on-prem HDP to Azure Databricks, improving scalability and reducing processing time by 60%.
- Built budget optimization simulators in Python using Bayesian Decision Theory principles.
- Partnered with commercial analytics teams to operationalize MMM insights into forecasting and promotional planning models.
- Streamed data from cloud-hosted Kafka sources using Kafka connectors and Flink.
- Created standardized SQL engine clusters using Presto DB.
- Created virtual cloud data warehouses in Snowflake and queried data using SnowSQL, Spark jobs, and Tez.
- Maintained documentation in Confluence and handled code review and build management with Groovy on Jenkins.
Data Engineer
Visa Europe
- Performed data analysis on CDH5 and CDH6 clusters using Hue.
- Managed autoscaling and maintenance of an AWS EMR cluster.
- Worked on large-scale data warehouse solutions to offload 800 TB of data from DB2 storage to Hadoop.
- Set up streaming processes for transactional and clearance data using Kinesis.
- Implemented workflow schedules using Airflow and Oozie.
- Implemented streaming ingestion from various data sources using a Confluent Kafka platform with 10 broker nodes.
Hadoop/Big Data Engineer
Solera Holdings
- Technologies: Hadoop, Sqoop, Hive, HBase, Spark, Akka, Lucene, Solr, Pig, Pentaho, Hue, Scala.
Big Data Hadoop Developer
Silicon Integra Limited
- Technologies: Hadoop, Sqoop, R, Kite SDK, Kudu, Hive (CDH5.4, CDH5.6), HBase, Impala, Hue, Spark, Oozie, AWS EMR, Azure, Solr, Pig, valuation and estimation algorithms, Paxata, Scala, Presto DB.
Hadoop Developer / Analyst Consultant
Nortech Solutions
- Technologies: Hadoop, Sqoop, Hive, HBase, Spark, Akka, Lucene, Solr, Pig, Pentaho, Hue, Scala.
Big Data Developer/Engineer
Nextgen Solutions
- Technologies: Hadoop, Hive, Scala, JSF, MongoDB, HBase, ActiveMQ, multi-threading.
Big Data/Hadoop Engineer
Tata Telecom
- Technologies: Hadoop Analytics, Pentaho, Java, Python, J2EE, Hadoop ecosystem.
Summary
Over 13 years of experience in planning, building, implementing, and integrating full-scale commercial projects across verticals including financial services, retail, insurance, banking, high-tech, social media, oil and gas, and networking/telecom.
Worked with cloud environments such as AWS and Azure, and with open-source cloud deployment and configuration tools such as OpenStack and OpenNebula. Hands-on experience with NoSQL databases including MongoDB, HBase, and Cassandra. Practiced Agile methods including TDD, BDD, pair programming, continuous integration, and Scrum. Worked with programming languages and platforms including Java, Scala, Python, Golang, C, PySpark, shell scripting, J2EE, JSF, the Apache Hadoop ecosystem, Hortonworks, Cloudera, Acceldata ODP, ETL practices, and analytics platforms.
Skills
PyMC
PyMC-Marketing
Bayesian Modelling
Regression
Classification/clustering
Timeseries Forecasting
Google Analytics
GenAI
LangChain/LangGraph
Milvus
Nielsen Marketing Cloud
Python
SQL
Java
Git
Docker
MongoDB
R
Presto DB
Linux/Unix
GitHub
Spring Boot
Artificial Intelligence
ETL
Cloud Services
Bash
Ansible
GraphQL
NoSQL
EKS
JupyterHub
Scala
Kubernetes
Apache Hadoop
Confluent/Kafka
Oracle Database
Azure ADF, Azure Data Lake
Databricks
Azure Synapse
Dataiku
SageMaker
Azure DSVM
Slurm/LSF
Data Analysis
Statistical Modelling
Model Deployment
Cloud Data Engineering
Advanced Analytics
Machine Learning
Generative AI
Solution Development
Streaming Data Pipelines
Staging Tables
Low Latency Solutions
Multitasking Abilities
Decision-making
Self-motivated
Education
Northumbria University
Master of Science, Computer Science · Newcastle upon Tyne, United Kingdom
JNTU
Bachelor of Technology, Electrical, Electronics and Communications Engineering · India