A small collection of work and project examples to peruse. 

File Storage API Template - Python

Every business has to work with files, moving them around and deriving insight from them. Files are also a good foundation for more interesting work with NLP and text analytics. I've built two general-purpose, containerized file management API templates in Python, one using Django and the other using Flask. Both store files in either AWS or Azure, chosen per request, and use Postgres for record management. To make the record management asynchronous, both use Celery, with the Flask app also taking advantage of Redis. Both are set up to be easily extended with richer file metadata, or with processing functionality like text extraction or file manipulation.
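As a rough illustration of the per-request storage choice (a minimal sketch, not the code from either template), the routing can be pictured as a small registry of backends. The names `InMemoryBackend`, `BACKENDS`, and `store_file` are hypothetical, and the backends here are in-memory stand-ins rather than real AWS or Azure clients.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class StorageBackend(Protocol):
    """Anything that can persist bytes under a name and return a location."""

    def save(self, name: str, data: bytes) -> str: ...


@dataclass
class InMemoryBackend:
    """Stand-in for a real cloud client (hypothetical; makes no network calls)."""

    scheme: str
    blobs: Dict[str, bytes] = field(default_factory=dict)

    def save(self, name: str, data: bytes) -> str:
        self.blobs[name] = data
        return f"{self.scheme}://{name}"


# Hypothetical registry keyed by the provider named in the request.
BACKENDS: Dict[str, StorageBackend] = {
    "aws": InMemoryBackend("s3"),
    "azure": InMemoryBackend("azure"),
}


def store_file(provider: str, name: str, data: bytes) -> str:
    """Route the upload to the backend the request asked for."""
    try:
        backend = BACKENDS[provider.lower()]
    except KeyError:
        raise ValueError(f"unsupported storage provider: {provider!r}")
    return backend.save(name, data)
```

Keeping the backends behind one small interface is one way to leave room for extra metadata or processing steps without touching the request handling.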

Languages:

Utilizes:


File Storage API Template - C#.NET

Another general-purpose file management API template, but this one is implemented in C#.NET with clean architecture principles. Like the Python versions, it is containerized, stores files in either AWS or Azure based on the request, and uses Postgres for async, queue-based record management. It's also set up to be easily extended with more metadata or more advanced file processing.

Languages:

Utilizes:


Qualtrics Attention Tracking

Qualtrics is one of the most widely used academic/scientific survey platforms, but one thing it's missing is the ability to track user attention to the current task. This project demonstrates how to integrate JavaScript within Qualtrics to monitor user activity and track task engagement. The script detects when a participant leaves or returns to the browser window and records these events in real time. All captured data is stored in a JSON object and saved as an embedded variable within the Qualtrics platform itself, so the data comes out as part of the survey results. I've also provided base scripts for unpacking the JSON for analysis in both Python and R.
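To give a feel for the unpacking step, here is a minimal sketch in Python. The JSON layout is an assumption made for illustration (a list of objects with an `event` of `"leave"` or `"enter"` and a millisecond timestamp `t`); the real embedded variable may be shaped differently, and `time_away_ms` is a hypothetical helper rather than one of the provided scripts.

```python
import json
from typing import Tuple


def time_away_ms(raw: str) -> Tuple[int, int]:
    """Return (number of completed absences, total milliseconds away).

    Assumes `raw` is a JSON list of {"event": "leave" | "enter", "t": <ms>}.
    """
    events = sorted(json.loads(raw), key=lambda e: e["t"])
    absences, total, left_at = 0, 0, None
    for e in events:
        if e["event"] == "leave" and left_at is None:
            left_at = e["t"]                 # participant left the window
        elif e["event"] == "enter" and left_at is not None:
            absences += 1                    # participant came back
            total += e["t"] - left_at
            left_at = None
    return absences, total


raw = (
    '[{"event": "leave", "t": 1000}, {"event": "enter", "t": 4500},'
    ' {"event": "leave", "t": 9000}, {"event": "enter", "t": 9800}]'
)
print(time_away_ms(raw))  # (2, 4300)
```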

Languages:

Utilizes:


Recommender System

The goal of this project was to deploy a recommender system app online that lets a user choose between a basic filter-based approach and a statistical approach built on user data. The app uses a classic public movie dataset (MovieLens) that has been used in recommendation research for quite a while. Since it's built with R's Shiny library, it has a pretty boilerplate appearance, but I'm pretty happy with the final result as the algorithms run as expected and nothing breaks :)
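For a sense of what a basic filter-based recommendation can look like, here is a minimal sketch in Python (the app itself is in R, and this is not its code). It assumes ratings flattened into hypothetical `(title, genre, rating)` rows and ranks titles in a chosen genre by average rating, ignoring titles with too few ratings.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (movie title, genre, rating) rows: a hypothetical flattening of the ratings data
Rating = Tuple[str, str, float]


def top_by_genre(
    ratings: List[Rating], genre: str, n: int = 3, min_count: int = 2
) -> List[str]:
    """Return up to n titles in `genre`, best average rating first."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for title, g, rating in ratings:
        if g == genre:
            scores[title].append(rating)
    ranked = sorted(
        (
            (sum(rs) / len(rs), title)
            for title, rs in scores.items()
            if len(rs) >= min_count          # skip titles with too few ratings
        ),
        key=lambda pair: (-pair[0], pair[1]),  # high average first, then by title
    )
    return [title for _, title in ranked[:n]]


sample = [
    ("A", "Comedy", 5), ("A", "Comedy", 4),
    ("B", "Comedy", 3), ("B", "Comedy", 4),
    ("C", "Comedy", 5),
    ("D", "Drama", 5), ("D", "Drama", 5),
]
print(top_by_genre(sample, "Comedy"))  # ['A', 'B']
```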

Languages:

Utilizes:


EM for Gaussian Mixtures

Although I have done tons of coding assignments, including them all here would be cumbersome, redundant, and frankly super boring. However, I found that this one from CS_598 Practical Statistical Learning was pretty challenging, and it's a good showcase of some coding in R. The HTML from the notebook is embedded in the site, so if you want to check it out and judge me on my loop structure or bad R vectorization, feel free.
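For readers who haven't met the algorithm, here is a minimal sketch of EM for a one-dimensional Gaussian mixture in plain Python. It is not the assignment's solution (that one is in R and lives in the notebook); the function name, starting values, and the variance floor are all choices made for this illustration.

```python
import math
from typing import List, Sequence, Tuple


def normal_pdf(x: float, mu: float, var: float) -> float:
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(
    xs: Sequence[float],
    mus: Sequence[float],
    variances: Sequence[float],
    weights: Sequence[float],
    n_iter: int = 50,
) -> Tuple[List[float], List[float], List[float]]:
    """Run a fixed number of EM iterations and return (means, variances, weights)."""
    k = len(mus)
    mus, variances, weights = list(mus), list(variances), list(weights)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(dens)
            # Fall back to uniform responsibilities if every density underflows.
            resp.append([d / total for d in dens] if total > 0 else [1 / k] * k)
        # M-step: re-estimate parameters from the weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)    # floor to avoid a collapsed component
    return mus, variances, weights


data = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1]
means, _, mix = em_gmm_1d(data, mus=[1.0, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
```

On this toy data the two estimated means settle near the centers of the two clumps of points, around 0.05 and 5.05.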

Languages:

Shows:


Parallel Programming

For the final project of CS484 Parallel Programming, the task was to code a repeating histogram sort algorithm in parallel, using different paradigms, on the University of Illinois campus cluster. This was a fun way for me to test my coding chops, because if there's one thing sure to add a challenge to coding or to an algorithm, it's having it run efficiently in parallel.
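The general shape of histogram sort can be sketched without a cluster. The Python below simulates the "ranks" sequentially as plain lists, so it only illustrates the idea and is not the project's implementation; the integer keys, the bisection probing, the tolerance value, and the function name are assumptions made for this sketch.

```python
from bisect import bisect_left, bisect_right
from typing import List


def histogram_sort(chunks: List[List[int]], tol: float = 0.05) -> List[List[int]]:
    """Simulate histogram sort; chunks[i] plays the role of rank i's local keys."""
    p = len(chunks)
    local = [sorted(c) for c in chunks]      # every rank sorts its own keys first
    n = sum(len(c) for c in local)
    lo_key = min(c[0] for c in local if c)
    hi_key = max(c[-1] for c in local if c)

    splitters = []
    for i in range(1, p):
        target = i * n // p                  # ideal count of keys <= splitter i
        lo, hi = lo_key, hi_key
        while lo < hi:
            probe = (lo + hi) // 2
            # "Global histogram": each rank counts its keys <= probe.
            # On a real cluster this sum would be a reduction across ranks.
            count = sum(bisect_right(c, probe) for c in local)
            if abs(count - target) <= tol * n / p:
                lo = probe                   # balanced enough, accept the probe
                break
            if count < target:
                lo = probe + 1
            else:
                hi = probe
        splitters.append(lo)

    # Data exchange: a key goes to the first bucket whose splitter is >= the key.
    buckets: List[List[int]] = [[] for _ in range(p)]
    for c in local:
        for key in c:
            buckets[bisect_left(splitters, key)].append(key)
    return [sorted(b) for b in buckets]      # final local sort on each rank


parts = histogram_sort([[9, 3, 7, 1], [8, 2, 6, 4], [5, 0, 11, 10]])
print(parts)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
```

The repeated probing is the part that makes the parallel version interesting: each round costs a global reduction, so the challenge is balancing the buckets in as few rounds as possible.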

Languages:

Utilizes:


If you're bored enough to check out the repo, see the solution.cpp file for the actual algorithm.

NER ML with TensorFlow

To get extra practice building neural networks from scratch, I decided to participate in eBay's 2022 University Machine Learning Competition to see how I stacked up. The goal of the competition was to build a Named Entity Recognition model to label a massive dataset of handbag listings. I obviously didn't win anything, but I don't think I did half bad training some models in my spare time. All in all, it was good practice!

Benchmark F1 Score: 0.800

Best F1 Score: 0.8488
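For context on the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it over sets of labeled entities in Python; the `(listing id, tag, text)` entity shape is hypothetical, and the competition's exact scoring rules may differ from this plain version.

```python
from typing import Set, Tuple

Entity = Tuple[int, str, str]  # (listing id, tag, token text): hypothetical shape


def f1_score(predicted: Set[Entity], gold: Set[Entity]) -> float:
    """Harmonic mean of precision and recall over exact entity matches."""
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


gold = {(1, "Brand", "coach"), (1, "Color", "black"),
        (2, "Brand", "gucci"), (2, "Material", "leather")}
pred = {(1, "Brand", "coach"), (1, "Color", "black"),
        (2, "Brand", "gucci"), (2, "Color", "tan")}
score = f1_score(pred, gold)  # 3 of 4 predictions correct, 3 of 4 gold found
```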

Languages:

Models tried with various embedding strategies and architectures:

Utilizes:


Some examples in the repo (not all because they're huge and who actually cares)

Financial News Sentiment Oscillator

Accumulates and analyzes news for a specified company via multiple APIs and creates relevancy-weighted news sentiment scores with advanced NLP. This was one of my favorite projects to build, as it brings together a lot of different pieces.
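One simple way to picture a relevancy-weighted score is a weighted average of per-article sentiment. The Python sketch below assumes each article has already been reduced to a sentiment in [-1, 1] and a non-negative relevancy weight; that representation and the function name are assumptions for illustration, and the repo's actual methodology is described in its own docs.

```python
from typing import Iterable, Tuple


def weighted_sentiment(articles: Iterable[Tuple[float, float]]) -> float:
    """Average sentiment, with each article weighted by its relevancy.

    articles: (sentiment in [-1, 1], relevancy weight >= 0) pairs.
    """
    articles = list(articles)
    total_weight = sum(weight for _, weight in articles)
    if total_weight == 0:
        return 0.0                           # no relevant news: neutral score
    return sum(s * weight for s, weight in articles) / total_weight


# A highly relevant positive story outweighs a less relevant negative one;
# the zero-weight article has no effect on the score.
score = weighted_sentiment([(0.8, 1.0), (-0.4, 0.5), (0.1, 0.0)])
```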

Languages:

Utilizes:

Check out the repo for more information. There are multiple ERDs to help understand the codebase and methodology, and a link to a usage tutorial.

Analyzing Neighborhoods in Chicago

For the culmination of the IBM Data Science certification, I took a value analysis approach to neighborhoods in the Chicagoland area, aimed at new home buyers. To add an extra challenge (and because I was curious), I wanted to see if the amount of tree cover had any effect on clustering with regard to housing prices and neighborhood value.

Languages:

Utilizes:

Data Processing Wizard

Designed to be used as an extension to Excel, this takes an assembly-line approach to routine data processing to create and validate large import files. I originally created the first prototype to help less technically savvy colleagues work with data faster, without needing to learn complex Excel functions or coding. It grew into a decent Excel tool that helps speed up data manipulation, validation, and importing while keeping human operators in the loop.

Languages:

Utilizes:


Report Generators

Creating custom Excel report generators and script runners with VBA is a bit of a passion of mine.

The baked-in integration with Excel and simple UserForm-code integration allow for incredibly swift development of tools to help with a plethora of routine data tasks.

Whether it be for reporting department metrics, or calculating the grades for my wife's psych 400 class, I say if you can put it into a spreadsheet, you might as well code a reusable solution. 

While you're at it, making it look ridiculous is always a bonus.