Wrote category-specific image prompts and scored 5,000+ model outputs using Apple’s RLHF rubric.
Built and applied prompt sets; evaluated grounding, reasoning, and compliance on every response, following structured guidelines.
Conducted jailbreak, bias, and prompt-injection tests; logged errors, inefficiencies, and edge conditions.
Reviewed model transcriptions and intent alignment; produced clear outcome notes marked as success / partial / failure.
Performed frame-level video tagging of clicks, drags, and UI states; generated timestamped annotations to train action-sequence GUI agents.
Created LaTeX/Markdown ground-truth pairs for OCR models; QA’d every sample for lossless parsing and consistency.
Prediction of AQI of Lahore City using Machine-Learning Models
Compiled a 20-year dataset (2003–2022) from Copernicus CAMS covering five pollutants plus temperature (65,000+ rows).
Benchmarked ARIMA, SARIMAX, STL decomposition, and XGBoost; evaluated with MSE and dynamic time warping (DTW).
Selected ARIMA(0,1,0) as the best model, reducing forecast error by 30% vs. alternatives; produced structured performance summaries.
Languages
Urdu
Native
English
Advanced
Education
Oct 2020 – Jun 2024
National University of Sciences and Technology
Bachelor's in Engineering · Engineering · Islamabad, Pakistan