5 years of data analytics experience in the automotive and defense industries.
Go Blue!
View My LinkedIn Profile
View My GitHub
I’ve skied since I was four years old, and now that my friends and I all have steady incomes, we can take ski trips together. However, where and when we ski is always random… but what if it doesn’t have to be? I built a tool that calls multiple APIs to aggregate historical weather details for U.S. ski resorts, predicts future conditions, and subjectively rates the resorts based on skiing preferences. I wanted to get more experience with APIs and streaming data, so I simulated Spark Streaming using Azure Event Hubs and Databricks. I also developed an interactive chart that showcases the top 3 mountains (think gold, silver, and bronze ratings, like the Olympics) to ski during different weeks of ski season (December to mid-April).
Demonstrated medallion data architecture and pipeline orchestration in Azure Databricks to combine and clean data for an RFM (recency, frequency, monetary) analysis that assigns a “value” to each customer based on their transaction history.
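To make the RFM idea concrete, here is a minimal sketch in plain Python rather than the Databricks pipeline itself. The function name `rfm_table`, the `(customer_id, purchase_date, amount)` tuple layout, and the sample rows are all hypothetical and only illustrate how recency, frequency, and monetary totals could be derived per customer.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Summarize (customer_id, purchase_date, amount) rows per customer.

    Hypothetical helper: the real project ran in Azure Databricks.
    """
    stats = defaultdict(lambda: {"last": None, "frequency": 0, "monetary": 0.0})
    for customer_id, purchase_date, amount in transactions:
        row = stats[customer_id]
        # Track the most recent purchase to compute recency later.
        if row["last"] is None or purchase_date > row["last"]:
            row["last"] = purchase_date
        row["frequency"] += 1
        row["monetary"] += amount
    return {
        cid: {
            "recency_days": (as_of - row["last"]).days,
            "frequency": row["frequency"],
            "monetary": round(row["monetary"], 2),
        }
        for cid, row in stats.items()
    }


# Made-up sample data for illustration only.
sample = [
    ("A", date(2024, 1, 5), 40.0),
    ("A", date(2024, 3, 1), 60.0),
    ("B", date(2023, 11, 20), 15.5),
]
rfm = rfm_table(sample, as_of=date(2024, 3, 31))
```

Customers with low recency and high frequency and monetary totals would typically be scored as the most valuable; how the three numbers are binned into a final score is a design choice this sketch leaves open.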
Identified anomalous transactions across merchants using machine learning and statistics. Generated reproducible, automated HTML reports that query SQLite databases and present suspicious activity. More recently, rolled up my sleeves to handle the data ingestion piece that feeds into the report, which is briefed quarterly to senior stakeholders. Due to the success of this project, the client requested follow-on work to investigate fraud in another sector. Became a solutions architect to lead the pipeline development using Python and AWS.
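One simple statistical screen for this kind of problem is a robust z-score built from the median and the median absolute deviation (MAD). The sketch below is an assumption about what such a check could look like, not the models used in the project; `flag_anomalies`, the 3.5 threshold, and the sample amounts are illustrative.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return indexes of amounts whose robust z-score exceeds the threshold."""
    med = median(amounts)
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        # No spread to measure against, so nothing can be flagged.
        return []
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [
        i for i, x in enumerate(amounts)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Made-up transaction amounts for one merchant.
amounts = [20.0, 22.5, 19.0, 21.0, 23.0, 20.5, 480.0]
flagged = flag_anomalies(amounts)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme transaction barely moves them, so the outlier cannot hide itself by inflating the spread.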
As a former chef’s assistant, I am exploring trends in cooking class fill rates through feature engineering and hypothesis testing to inform future models.
Developed a global supply map to highlight profitable, high-traffic shipping routes for cost-improvement projects. Created a dashboard that lets demand planners see incoming shipments on a per-shipper basis.
The goal is to increase revenue and decrease bike shortages at stations across Manhattan. Developed an interactive map using Leaflet to highlight asymmetric traffic routes and stock levels in real time. Recommended a pricing scheme driven by a quantile function of each station’s stock level.
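A quantile-based pricing rule could be sketched as follows. This is only an illustration under stated assumptions: the function `price_multipliers`, the quartile split, and the multiplier values (a surcharge where bikes are scarce, a discount where they pile up) are hypothetical and not the project's actual recommendation.

```python
from bisect import bisect_right
from statistics import quantiles


def price_multipliers(stock_by_station, multipliers=(1.25, 1.0, 1.0, 0.85)):
    """Map each station to a price multiplier based on its stock quartile."""
    # Three cut points that split the observed stock levels into quartiles.
    cuts = quantiles(stock_by_station.values(), n=4)
    return {
        station: multipliers[bisect_right(cuts, level)]
        for station, level in stock_by_station.items()
    }


# Made-up bike counts per station.
stock = {"S1": 2, "S2": 5, "S3": 8, "S4": 11, "S5": 14, "S6": 17, "S7": 20}
prices = price_multipliers(stock)
```

Tying the price to the quantile rather than the raw count keeps the scheme relative: a station is priced by how its stock compares with the rest of the network at that moment.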
Narcan can be a life-saving medicine to rapidly reverse an opioid overdose in emergency situations. According to the NHTSA, roughly 50% of EMS agencies run at the Basic Life Support level and cannot legally administer intravenous drugs, which includes Narcan. In this project, I wanted to investigate how quickly a patient can die from exposure to fentanyl versus oxycodone, with additional analyses that account for the confounding variables of age and race.
The additive manufacturing process features multiple stages of rigorous approvals and milestones that each engineer must follow to vet a part. A suite of dashboards was created for senior leadership to monitor this process for 1,000+ parts: how many parts are in each stage over time, where bottlenecks occur in the sub-processes, how well engineers performed, etc.
Examined survey results to understand workforce attrition and cluster employees based on income level, education field, and other significant variables. Evaluated logistic regression and ten machine learning models to predict turnover with 91% accuracy, and proposed improvements to retention and recruiting processes.
No: employee did not leave the company | Yes: employee did leave the company
Fit and fine-tuned ARIMA models to consider trend and seasonality to better forecast air quality levels. Validated prediction performance against actual levels with 80% accuracy.
In this project I used text processing techniques and metrics like term frequency-inverse document frequency to gather descriptive statistics and distinguish important words from notes. I looked into questions like “What percentage of notes that mention ‘depression’ also mention ‘pain’?” and vice versa.
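The co-occurrence question can be answered with a small helper like the one below. It is a sketch, not the project's code: `pct_cooccurrence`, the whole-word case-insensitive matching rule, and the sample notes are assumptions made for illustration.

```python
import re


def pct_cooccurrence(notes, term, other):
    """Percentage of notes mentioning `term` that also mention `other`."""

    def mentions(text, word):
        # Whole-word, case-insensitive match.
        pattern = rf"\b{re.escape(word)}\b"
        return re.search(pattern, text, flags=re.IGNORECASE) is not None

    with_term = [note for note in notes if mentions(note, term)]
    if not with_term:
        return 0.0
    both = sum(mentions(note, other) for note in with_term)
    return 100.0 * both / len(with_term)


# Made-up notes for illustration only.
notes = [
    "Patient reports depression and chronic pain.",
    "Depression screening negative for suicidal ideation.",
    "Knee pain after fall.",
    "Pain improved; mood stable.",
]
depression_then_pain = pct_cooccurrence(notes, "depression", "pain")
pain_then_depression = pct_cooccurrence(notes, "pain", "depression")
```

The two directions generally differ because the denominators differ, which is why the question is asked both ways.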