Programming Experience
RStudio
With over four years of experience in R and RStudio, I specialize in data cleaning, visualization, and statistical modeling.
Highlighted projects (as seen on GitHub):
Deepfake Detection Data – Applied regression and group comparison models to detect patterns in human responses to AI-generated stimuli.
Life Sampling Dataset – Processed and analyzed millions of longitudinal observations to identify developmental and aging trends.
Non-Profit Demographics – Built analyses and visualizations of mentorship and diversity data to inform program planning.
Libraries: tidyverse, ggplot2, dplyr, readr, lme4
Methods: Linear regression, multilevel regression, ANOVA, ANCOVA, MANOVA
Python
Experienced in using Python for machine learning, data analysis, and forecasting across research and applied projects.
Highlighted projects (as seen on GitHub):
Aviation Safety Forecasting – Applied time-series models (SARIMA) to forecast trends in reports from NASA's Aviation Safety Reporting System (ASRS).
Mentorship Demographics Forecasting – Built machine learning pipelines (Ridge regression, one-hot encoding) to forecast mentor/mentee participation.
Libraries: pandas, numpy, scikit-learn, matplotlib, seaborn
Methods: Regression modeling, classification, cross-validation, time-series forecasting, data visualization
SQL
Skilled in using SQL for data cleaning, querying, and extracting actionable insights from large, complex datasets.
Highlighted projects (as seen on GitHub):
COVID-19 Trends Analysis (Our World in Data) – Used SQL to aggregate and compare case counts, vaccination rates, and mortality trends across countries, demonstrating SQL for public health monitoring and time-series reporting.
Skills: Joins, subqueries, window functions, CTEs, aggregate functions
Applications: Business intelligence, customer behavior insights, public health analytics