Mahmoud T.

Data Scientist

Sfax, Tunisia

Experience

Feb 2024 - Present
1 year 9 months

Data Scientist

Sofrecom Tunisia, part of Orange Group

  • Developed automated PDF audit information extraction system using GPT-4
  • Implemented multi-page PDF processing with image conversion
  • Built robust validation system for extracted data
  • Designed modular Python architecture for scalable processing
  • Created custom data validation using Pydantic models
  • Built automated JSON output generation
  • Implemented PDF to image conversion with optimized resolution
  • Applied Base64 encoding for LLM compatibility
  • Performed structured data validation and normalization
  • Developed FTTH access orders forecast with 12-month horizon and multi-dimensional predictions
  • Created comprehensive feature engineering pipeline and used advanced time series decomposition techniques
  • Implemented ensemble modeling with R’s feasts and fable achieving ~5% RMSE
  • Migrated forecasting solution to Dataiku and converted R code into production-ready flow zones
  • Validated performance using Dataiku’s native models
  • Optimized co-financing prediction model by converting Python scripts to Dataiku workflows
  • Prioritized SQL and native Dataiku components for performance
  • Designed efficient Analytics Base Table creation process
  • Validated MLForecast function and tested additional predictive variables
  • Evaluated alternative modeling approaches and confirmed solution robustness through comparative analysis

Feb 2022 - Dec 2022
11 months

Part-time Statistics Lecturer

International School of Business

Mar 2021 - Feb 2023
2 years

Data Scientist

Kiota Intelligence

  • Built predictive models for startup survival, Series A funding probability, and pre-valuation modeling
  • Developed investor-startup matching algorithm
  • Created interactive Shiny dashboard for funding visualization with dynamic filtering tools
  • Developed multidimensional outlier detection system and temporal clustering analysis
  • Designed CSS-styled RMarkdown automated reports and implemented email distribution system
  • Developed statistical testing framework for funding rounds and interactive filtering for investor matching

Aug 2019 - Jan 2021
1 year 6 months

Machine Learning Competitor (R programmer)

Zindi platform for Data Science

  • Ranked 4th in AI4D Predict the Global Spread of COVID-19 time series competition
  • Placed in top 11% for Uber Movement SANRAL Cape Town road incident prediction
  • Ranked 6th in AI Hackathon Tunisia for fraud detection solution
  • Achieved top 36% in Tech4MentalHealth NLP classification for mental health chatbot
  • Ranked top 26% in Sendy Logistics ETA prediction for motorbike deliveries
  • Achieved top 13% in Financial Inclusion in Africa bank account usage prediction
  • Participated in IEEE Big Data Cup for customer support escalation prediction
  • Ranked top 31% in Uber Nairobi Ambulance Perambulation optimization
  • Ranked top 37% in Wazihub soil moisture prediction using IoT sensor data
  • Ranked top 38% in female-headed households wage prediction in South Africa
  • Ranked top 45% in Akeed restaurant recommendation engine for Oman
  • Ranked top 53% in UNICEF Flood Prediction in Malawi competition
  • Ranked top 52% in South African COVID-19 vulnerability mapping
  • Ranked top 59% in Sea Turtle Rescue weekly forecast challenge

Nov 2018 - Jan 2020
1 year 3 months

Data Scientist (R programmer)

Freelance

  • Designed comprehensive machine learning curriculum and hands-on labs
  • Created Python notebooks for supervised (linear/logistic regression, SVM, decision trees) and unsupervised learning (K-means)
  • Developed model optimization techniques and best practices in ML implementation
  • Built NLP classification system with scalable text classification model and automated category prediction pipeline
  • Created framework for future data classification
  • Developed weight prediction model from IoT sensor data and tuned algorithms to reduce prediction error
  • Built social media analytics platform with Facebook API integration, sentiment analysis, and actionable insights

Jan 2018 - Mar 2020
2 years 3 months

Data Scientist

Tunisia Telecom Group

  • Constructed Analytics Base Table from multiple data sources with robust data quality checks
  • Executed advanced feature engineering and transformations
  • Performed statistical testing and correlation analysis
  • Developed K-means clustering model for behavioral segmentation and detailed segment profiles
  • Designed automated classification system for new customer assignment
  • Delivered actionable customer insights for business strategy
  • Developed and validated hypotheses for dual-SIM usage patterns
  • Engineered customer scoring algorithm and detailed user profiles using SAS Enterprise Guide and SAS Enterprise Miner
  • Distinguished household-level connections and identified extended family networks
  • Collaborated with SAS, KPMG, and Business&Decision experts on churn, cross-sell, and community link analysis models

Jun 2016 - Dec 2017
1 year 7 months

Data Analyst

Tunisia Telecom Group

  • Developed SQL ad-hoc requests and dashboards for CVM performance, network quality KPIs, data service penetration, and sales analytics
  • Implemented VBA automation for PowerPoint reporting and established data quality verification protocols
  • Designed targeted marketing campaigns and Try & Buy offer frameworks
  • Generated data-driven product recommendations

Jun 2015 - Dec 2015
7 months
Toulouse, France

Research Statistician

LAAS-CNRS

  • Developed behavioral pattern recognition model for homeowner identification
  • Created real-time intrusion detection algorithm and automated alarm management system
  • Designed real-time data processing pipeline and statistical learning models for behavior analysis
  • Built automated decision-making system
  • Authored research book "Gestion Automatique d'un Système de Sécurisation des Biens à Domicile" published by European University Editions
  • Ranked 1st in IBM Watson Services competition and led AI labs as instructor

Summary

Data Scientist with over 9 years of experience, blending technical precision with strategic insight. I specialize in working with business-driven data, applying deep analytical thinking and iterative exploration to uncover meaningful patterns. My passion lies in transforming raw data into high-impact features, enabling models that align closely with real-world objectives and decision-making.

Languages

Arabic
Native
English
Advanced
French
Advanced

Education

Oct 2011 - Jun 2015

ESSAI

Engineering School · Statistics and Data Analysis

Oct 2008 - Jun 2011

IPEIS

Mathematics-Physics · Sfax, Tunisia

High School

High School degree · Computer Sciences

Certifications & licenses

Advanced R Programming

Coursera (from The Johns Hopkins University)

Creating Features For Time Series Data

Coursera (from SAS)

Practical Time Series Analysis

Coursera (from The State University Of New York)

Data Analyst In R Path

DataQuest

Forecasting Product Demand In R

DataCamp Courses

Statistical Learning (Using R)

Stanford Online

Time Series With R Track (6 Courses)

DataCamp Courses

Applied Data Science With R - Level 2

IBM Badges

Machine Learning By Andrew Ng

Stanford Online

Predictive Modeling And Text Mining

SAS Badges

Exploratory Data Analysis

SAS Badges

Data Analyst Track

Udacity

Build Your Own Chatbot - Level 1

IBM Badges

Node-Red Basics To Bots

IBM Badges

SAS Programming 1: Essentials

SAS Badges

Data Science Orientation

Coursera (from IBM)

Introduction To Anova, Regression And Logistic Regression

SAS Badges

Mining Massive Datasets (Big-Data Algorithms)

Stanford Online

Analyzing And Visualizing Data With Excel

edX

Inferential And Predictive Statistics For Business

Coursera (from Illinois University)

Text Mining And Analytics

Coursera (from Illinois University)

Cluster Analysis In Data Mining

Coursera (from Illinois University)

SQL

Stanford Online

Data Science: Data To Insights

MIT Professional X

Managing Big Data With MySQL And TERADATA

Coursera (from Duke University)

Querying With Transact-SQL

edX (from Microsoft)

Statistics With R: Correlation And Linear Regression

DataCamp Courses

Introduction To Python For Data Science

edX (from Microsoft)
