Contributed to training Google’s Gemini model on data science and data analysis tasks using custom SFT and RLHF techniques, improving model responses by 60%
Assisted in defining and solving business problems; one solution saved up to 100,000 USD per month by eliminating a human layer, backed by thorough data-driven analysis
Created high-quality notebooks using BigQuery and Python that directly contributed to training Google Colab’s AI coding assistant, improving the efficiency of data scientists by 50%
Created robust data pipelines in Hex and removed inconsistencies, improving data accuracy by 7-8%
Automated report generation and email delivery, saving an average of 2.5 hours of work per report
Created, maintained, and optimized data pipelines and SQL queries, reducing overall execution time from 2 hours to 3 minutes
Mar 2023 - Apr 2024
1 year 2 months
Islamabad, Pakistan
Associate Data Analyst
Global Rescue LLC
Performed data manipulation and automation using Python, saving an average of 5-6 hours per day
Analyzed travel data using Python and SQL, finding patterns that helped the company sell more subscriptions during peak season and generated about 15% more revenue in subscription sales
Developed a Streamlit dashboard to track and fix data issues and generate reports within seconds, eliminating manual work and improving overall efficiency by 90%
Worked as part of the Data and BizOps team on cost-cutting initiatives, one of which saved the company $12,000/month
Found and fixed a data parser that had been down for 2 months because of an incorrect field in the data, impacting 1000+ travellers
Jul 2021 - Apr 2024
2 years 10 months
Data Scientist – Level II
Fiverr
Developed an XGBoost classifier that detected 98% of fraudulent transactions, improving on the previous 90% accuracy and helping the bank save $500,000 in annual losses; built a complete system with 2-step verification
Performed hypothesis testing on online taxi cab data, which showed that people preferred card over cash; suggesting that drivers accept card payments helped improve their revenue by 30%
Developed a Random Forest classifier with 88% accuracy, trained and evaluated on a large dataset of tweets, to detect hate/offensive tweets using different NLP techniques; built a Streamlit web app to facilitate reporting of flagged tweets
Developed a conversational PDF Q&A chatbot using LangChain and OpenAI, with ChromaDB as the vector store for Retrieval-Augmented Generation (RAG) and Streamlit for an interface that takes PDFs as input; used prompt engineering that helped generate 100% accurate responses based on the given PDF
Languages
Urdu
Native
English
Advanced
Education
Oct 2018 - Jun 2022
IUB
Bachelor of Engineering · Computer Systems Engineering · Bahawalpur, Pakistan · 3.69/4.0