I am a highly experienced Data Scientist and Senior Statistician with over eight years of hands-on experience in statistical analysis, forecasting, and advanced data science across diverse sectors, including biomedical research, healthcare, insurance, transportation, and clinical studies. My academic foundation is robust, comprising a Ph.D. in Statistics, a Master’s degree in Statistics, and a Master’s degree in Mathematics, complemented by multiple professional certifications in AI, machine learning, and clinical research.
Throughout my career, I have specialized in designing and implementing machine learning and AI solutions that translate complex datasets into actionable insights. I excel at bridging diverse analytical approaches, ranging from classical statistics and Bayesian methods to deep learning and reinforcement learning, to provide clients with precise, data-driven decision support.
I have extensive experience in developing predictive and optimization models, with expertise spanning:
Machine Learning & AI: Deep learning, reinforcement learning, and predictive modeling.
Big Data & Cloud Platforms: Expertise in handling large-scale data using Spark and Hadoop, as well as Microsoft Azure services such as Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure SQL Database, Azure Apache Spark Pools, and Azure ML Studio.
Programming & Data Tools: Python, R, SAS, SQL, and related analytical frameworks.
Natural Language Processing (NLP): Analyzing and deriving insights from textual and unstructured data.
Clinical Research & Epidemiology: Statistical analysis and modeling for clinical trials, healthcare outcomes, and regulatory reporting.
Forecasting & Decision Support: Building robust mathematical models and predictive frameworks to support operational, strategic, and regulatory decisions.
In addition to hands-on analytical work, I have taught advanced statistical methods and mentored professionals in Bayesian statistics, machine learning, and big data analytics, helping teams translate complex analyses into practical, actionable recommendations.
I have successfully delivered AI and ML solutions that optimize decision-making processes for clients in health insurance, biopharma, and other data-intensive industries. My approach integrates statistical rigor with technological innovation, ensuring that analytical solutions are both scientifically sound and practically impactful.
I thrive on transforming complex data into clear, reliable insights that drive informed decision-making, product development, and operational efficiency. I am available to collaborate on projects involving data analysis, machine learning, AI, or any data-driven strategic initiative, and I am committed to delivering high-quality, client-focused solutions.
Regression models are well suited to analysing the association between health outcomes and environmental exposures. However, in urban health studies, where spatial and temporal changes matter, spatial and spatio-temporal variations are usually neglected. This thesis develops and applies regression methods that add latent random effects with Conditional Autoregressive (CAR) structures to classical regression models, accounting for spatial effects in cross-sectional analyses and for spatio-temporal effects in longitudinal analyses. The thesis is divided into two main parts. Firstly, methods are considered for data in which all variables are given at an areal level. The longitudinal Heinz Nixdorf Recall Study is used for application throughout the thesis. The association between the risk of depression and greenness at the district level is analysed. A spatial Poisson model with a latent CAR-structured random effect is applied at selected time points. A spatio-temporal extension of the Poisson model then yields a negative association between greenness and depression. The findings also suggest strong temporal autocorrelation and weak spatial effects. Even when spatial effects are weak enough to suggest that they could be neglected, as in this thesis, spatial and spatio-temporal random effects should be taken into account to provide reliable inference in urban health studies.
Secondly, to avoid ecological and atomistic fallacies caused by data aggregation and disaggregation, all data should be used at the finest spatial level available. Multilevel Conditional Autoregressive (CAR) models make it possible to use all variables simultaneously at their original spatial resolution and to explain the spatial effect in epidemiological studies. This is especially important where subjects are nested within geographical units. This second part of the thesis has two goals. First, it further develops multilevel models for longitudinal data by adding existing random effects with CAR structures that change over time. These new models are named MLM tCARs. In simulation studies comparing the MLM tCARs with the classical multilevel growth model, the MLM tCARs recover the true regression coefficients more accurately and fit the data better. The models are then applied, for comparison, to the association between greenness and depressive symptoms at the individual level in the longitudinal Heinz Nixdorf Recall Study. The results again show a negative association between greenness and depression and a decreasing linear individual time trend for all models. Spatial variation is once more very weak, and temporal autocorrelation is moderate. In addition, the thesis provides comprehensive decision trees for analysing data in epidemiological studies in which variables have a spatial background.
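To make the CAR idea above more concrete, the following is a minimal sketch of a spatial Poisson model with a CAR-structured random effect. The abstract does not state the exact specification used in the thesis, so this assumes one common variant, the proper CAR prior with precision matrix Q = tau * (D - rho * W). The toy grid, all function names, and all parameter values are hypothetical and chosen purely for illustration.

```python
import math


def grid_adjacency(n_rows, n_cols):
    """Binary adjacency matrix W for districts on a toy grid (shared edges only)."""
    n = n_rows * n_cols
    W = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:  # neighbour to the right
                W[i][i + 1] = W[i + 1][i] = 1
            if r + 1 < n_rows:  # neighbour below
                W[i][i + n_cols] = W[i + n_cols][i] = 1
    return W


def car_precision(W, rho, tau):
    """Proper CAR precision matrix Q = tau * (D - rho * W), D = diag(neighbour counts)."""
    n = len(W)
    return [
        [tau * ((sum(W[i]) if i == j else 0) - rho * W[i][j]) for j in range(n)]
        for i in range(n)
    ]


def car_conditional(W, phi, i, rho, tau):
    """Mean and variance of phi[i] given the effects of all other districts."""
    d_i = sum(W[i])  # number of neighbours of district i
    neighbour_sum = sum(W[i][j] * phi[j] for j in range(len(phi)))
    return rho * neighbour_sum / d_i, 1.0 / (tau * d_i)


def poisson_means(expected, greenness, phi, beta0, beta1):
    """Expected counts mu_i = E_i * exp(beta0 + beta1 * greenness_i + phi_i)."""
    return [
        e * math.exp(beta0 + beta1 * g + p)
        for e, g, p in zip(expected, greenness, phi)
    ]


# Hypothetical 2 x 2 grid of districts with made-up values (not study data).
W = grid_adjacency(2, 2)
phi = [0.2, -0.1, 0.3, 0.0]
Q = car_precision(W, rho=0.9, tau=2.0)
cond_mean, cond_var = car_conditional(W, phi, i=0, rho=0.9, tau=2.0)
mu = poisson_means([20.0] * 4, [0.1, 0.4, 0.6, 0.9], phi, beta0=0.0, beta1=-0.5)
```

In this sketch, each district's random effect is pulled towards the average of its neighbours, with `rho` controlling the strength of spatial dependence and `tau` the precision. A negative `beta1` lowers the expected count as greenness increases, which is the direction of association described in the abstract.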
My Master’s degree in Statistics provided me with a comprehensive and rigorous training in both theoretical foundations and applied methodologies of modern statistics. The program combined probability theory, statistical inference, and decision theory with advanced computational and data-driven approaches, ensuring a strong balance between mathematical rigor and real-world applications.
The curriculum covered a broad range of areas, including:
Theoretical Foundations: Probability theory, decision theory, estimation and hypothesis testing, Bayesian statistics, stochastic processes.
Applied Statistical Methods: Descriptive and inferential statistics, linear models, multivariate analysis, sampling techniques, advanced experimental design, nonlinear optimization.
Specialized Fields: Econometrics, risk theory in actuarial science, statistical methods in epidemiology and genetics, meta-analysis, spatial statistics, spline regression.
Data Science & Machine Learning: Classification methods and big data analytics, advanced statistical learning, introduction to data science, time series analysis.
This diverse training equipped me with the ability to:
Build and validate robust statistical models.
Apply advanced machine learning and data mining techniques to large and complex datasets.
Design and implement experiments with rigorous methodology.
Translate complex statistical findings into clear, actionable insights for decision-making.
Overall, the program strengthened both my theoretical expertise and my applied skills, preparing me to tackle a wide range of data-centric challenges across industries such as healthcare, insurance, finance, and scientific research.
My Master’s degree in Mathematics provided me with a deep and rigorous training across pure and applied mathematics, equipping me with advanced problem-solving skills, abstract reasoning, and the ability to translate complex mathematical concepts into practical solutions. The program covered both foundational mathematics and highly specialized areas relevant to modern applications in data science, optimization, and scientific computing.
The curriculum included:
Analysis & Functional Spaces: Measure theory and integration, Sobolev spaces, distribution theory, Fourier transform, functional analysis, topology, and complex analysis.
Geometry & Algebra: Differential geometry, Kähler and Riemannian geometry, rings and modules, general algebra, and topological vector spaces.
Differential Equations & Dynamical Systems: Ordinary and partial differential equations, numerical methods for PDEs, continuous and discrete dynamical systems, inverse problems.
Optimization & Numerical Methods: Nonlinear optimization, advanced numerical analysis, applied statistics, data and correspondence analysis.
Foundations & Logic: Set theory, mathematical logic, foundations of analysis and algebra, computer science for mathematics.
Probability & Statistics: Probability theory, applied statistics, and connections to real-world modeling.
Through this program, I developed strong analytical skills and the ability to:
Solve highly complex mathematical problems using both theoretical and numerical approaches.
Apply optimization and differential equations to model real-world phenomena.
Work across abstract mathematical structures (algebra, geometry, topology) and translate them into applied contexts.
Use advanced statistical and computational methods to analyze data and support decision-making.
This diverse mathematical background enables me to approach client projects with precision, creativity, and the flexibility to adapt rigorous methods to practical business and research challenges.
2025 © FRATCH.IO GmbH. All rights reserved.