About

Shaquille Pearson

Software

[Research & Engineering]

My name is Shaquille, and I'm a Master's student in Computer Science at the University of Waterloo. I focus on solving complex software engineering challenges, particularly in build systems, dependency management, and emerging technologies like machine learning and blockchains. My goal is to keep advancing my expertise while making meaningful contributions to the software engineering community, whether in academia or industry.

Skills

Experience

Graduate Research Assistant @ (UoW)

  • Data Pipeline - Designed and built a data filtration pipeline with Python and GitHub's GraphQL API that processed over 1.27 million open-source projects.
  • Build Reproduction - Led efforts to create reproducible build environments for 982 builds in the NPM ecosystem using Docker, Python, GitHub Actions, and YAML files.
  • Algorithm Optimization - Identified and categorized 156 new ghost commit patterns within the Debian ecosystem. Designed mitigation strategies and improved SZZ algorithm accuracy by 14%.

January 2023 - Present

Instructional Apprentice/Teaching Assistant @ (UoW)

  • Technical Assistance - Assisted 30+ students with coding assignments and with debugging Python code.
  • Teamwork & Leadership - Worked closely with course instructors and fellow instructional apprentices to lead tutorials, proctor exams, and coordinate grading.
  • Communication & Collaboration - Communicated effectively with students and instructors via email, forums, and in-person meetings to address inquiries.

January 2023 - Present

Junior ICT Officer @ (DPI)

  • Content Automation - Developed an automation system using Node.js and the Axios library to interact with the DPI website’s REST CMS API, enabling automatic updates for content such as news articles and press releases.
  • Network Traffic Monitoring - Used Wireshark and Python to monitor network traffic and to identify and resolve bottlenecks caused by IP conflicts on DHCP-assigned devices, reducing these bottlenecks by 9%.
  • Database Optimization - Conducted a full audit of the MySQL database to minimize redundant data and implemented B-Tree indexing on frequently queried columns, which reduced query response times by 13%.

August 2022 - January 2023

Course Projects

Exploring Dependency Related Build Breakages In The NPM Ecosystem

This project analyzes dependency-related build failures within the NPM ecosystem by examining JavaScript projects. I utilized Git to track modifications in package.json files and employed Docker to create isolated, reproducible build environments. CI/CD pipelines, specifically GitHub Actions, were parsed to identify breaking changes, while tools like nektos/act and docker-compose were used to simulate the build process locally.

Code Review Practices On Ethereum Smart Contracts

This project assesses the effectiveness of code review for Ethereum smart contracts across major projects like Uniswap and Aave, which are among the most widely used protocols in the blockchain ecosystem and where security vulnerabilities can lead to significant financial losses. Using Git for version control and Slither for static analysis, I identified vulnerabilities such as reentrancy, unchecked transfers, and zero-address issues.

Predicting Build Breakage With Machine Learning

This project is a literature survey focused on applying machine learning techniques to predict build failures in Continuous Integration (CI) systems. It reviews a range of approaches, including Random Forests, Logistic Regression, and Deep Learning. The survey explores key factors influencing model performance, such as code complexity, commit frequency, and noise in the data. It also examines statistical methods like ANOVA and Principal Component Analysis (PCA).

Exploring The Prevalence Of Social Biases In State-Of-The-Art Large Language Models

This project investigates the prevalence of social biases in large language models like GPT-2, DistilGPT-2, Bloom-560M, and Facebook-OPT-350M, hosted on Hugging Face. We prompted these models with inputs designed to evoke toxic or biased responses, then used tools like the Perspective API to assess the generated content across attributes such as toxicity, identity attack, and profanity.

Personal Projects