Projects

Selected projects that show how I approach problems, from exploration and modelling through to pipeline design and automation.

Customer Churn Prediction

In progress

Built an end-to-end churn modelling workflow to help subscription-style businesses understand and predict which customers are likely to leave, using a Telco-style dataset.

  • Cleaned and preprocessed 7,000+ customer records, handling missing values, encoding categorical fields and scaling numerical features.
  • Performed EDA to uncover relationships between tenure, contract types, billing patterns and churn behaviour.
  • Improved model performance from a baseline of ~70% to ~82% through feature engineering, tuning and cross-validation, with a focus on business-relevant metrics.
Tech: Python • Pandas • scikit-learn
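
The preprocessing and cross-validation steps above could be sketched roughly as follows. This is an illustrative example on synthetic data, not the project code: the column names (tenure, monthly_charges, contract), the logistic regression model and the ROC AUC metric are all assumptions chosen for the sketch.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for a Telco-style table (hypothetical columns).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, n).astype(float),
    "monthly_charges": rng.uniform(20, 120, n),
    "contract": rng.choice(["month-to-month", "one-year", "two-year"], n),
})
# Inject some missing values so the imputer has work to do.
df.loc[rng.choice(n, 20, replace=False), "monthly_charges"] = np.nan

# Simulated target: short tenure and month-to-month contracts churn more.
logit = 1.0 - 0.05 * df["tenure"] + 1.5 * (df["contract"] == "month-to-month")
churn = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int).to_numpy()

numeric = ["tenure", "monthly_charges"]
categorical = ["contract"]

# Impute and scale numeric fields; one-hot encode categorical fields.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Keeping preprocessing inside the pipeline means it is re-fitted on each
# training fold, so no information leaks from the validation fold.
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, df, churn, cv=cv, scoring="roc_auc")
print(f"Mean ROC AUC over {len(scores)} folds: {scores.mean():.3f}")
```

Wrapping the cleaning steps and the model in one pipeline keeps the cross-validated score honest, which matters when comparing a baseline against tuned or feature-engineered variants.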

End-to-End Payroll ETL Framework

Python • CLI

Designed a command-line–driven ETL engine to automate payroll data workflows, integrating CSV and SQL sources into a single, reproducible pipeline.

  • Implemented a structured flow covering extract → DDL creation → transform → load, reducing manual preparation time from hours to minutes.
  • Made runs repeatable and auditable through a scripted, configurable pipeline design.
Tech: Python • Pandas • MySQL • ETL Automation
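
The extract → DDL creation → transform → load flow above could look something like the sketch below. It is a simplified illustration, not the framework itself: the payroll columns, the table name and the function names are hypothetical, and the standard-library sqlite3 module stands in for MySQL so the example runs anywhere. The command-line wrapper is left out.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; a real run would read a CSV file or a SQL query.
RAW_CSV = """employee_id,name,hours,hourly_rate
1,Ana,38,30.00
2,Ben,40,27.50
3,Cy,,25.00
"""

DDL = """
CREATE TABLE IF NOT EXISTS payroll (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    hours       REAL NOT NULL,
    gross_pay   REAL NOT NULL
)
"""


def extract(source):
    """Read raw rows from a file-like CSV source."""
    return list(csv.DictReader(source))


def create_schema(conn):
    """Create the target table if it does not exist yet."""
    conn.execute(DDL)


def transform(rows):
    """Drop rows with missing hours and derive gross pay."""
    records = []
    for row in rows:
        if not row["hours"]:
            continue  # skip incomplete records
        hours = float(row["hours"])
        gross = round(hours * float(row["hourly_rate"]), 2)
        records.append((int(row["employee_id"]), row["name"], hours, gross))
    return records


def load(conn, records):
    """Upsert records so that re-running the pipeline does not duplicate rows."""
    conn.executemany("INSERT OR REPLACE INTO payroll VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return len(records)


def run(conn, source):
    """Run the stages in order: extract, DDL creation, transform, load."""
    rows = extract(source)
    create_schema(conn)
    return load(conn, transform(rows))


conn = sqlite3.connect(":memory:")
loaded = run(conn, io.StringIO(RAW_CSV))
total = conn.execute("SELECT SUM(gross_pay) FROM payroll").fetchone()[0]
print(loaded, total)
```

Keeping each stage as its own function makes a run easy to audit, and the upsert in the load step means the same input can be replayed without changing the result.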

How I approach projects

  • Start from a clear business question and success metric.
  • Use targeted EDA to shape the data model and feature set, not just generate charts.
  • Keep clean, modular notebooks or scripts with clear stages.
  • Focus on evaluation that reflects business impact, not just a single score.
  • Document assumptions, limitations and next steps for future iterations or deployment.
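
As a small illustration of evaluating beyond a single score, the sketch below compares two sets of churn predictions that have identical accuracy but very different business value. The figures (value of a retained customer, cost of a retention offer) and the assumption that every contacted churner stays are hypothetical simplifications.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means churn."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def retention_value(y_true, y_pred, saved_value=200.0, offer_cost=20.0):
    """Net value of sending an offer to every customer flagged as a churner.

    Assumes (hypothetically) that each contacted churner is retained.
    """
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp * (saved_value - offer_cost) - fp * offer_cost


y_true   = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # flags few customers
broad    = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # flags more, with some false alarms

for name, preds in [("cautious", cautious), ("broad", broad)]:
    print(name, accuracy(y_true, preds), retention_value(y_true, preds))
```

Both prediction sets score 0.8 on accuracy, yet under these assumed costs the broader one is worth 500.0 against 180.0, because missing a churner costs far more than a wasted offer.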

Tableau Dashboards

BI • Storytelling

I build interactive Tableau dashboards that combine clear KPIs, drill-down exploration, and a short “so what” insight layer. Recent work includes a Victoria Road Crash analytics dashboard (real government open data) and a Global Graduate Employability dashboard (synthetic dataset for demonstration).