Every student at Mass Academy is required to have a science project. We spend a month deciding on a project, and then we present our projects at the fair in February.

Science Project

My science project is titled Accurate Centroid-Determining Human Body Detection. I am using three neural networks to more accurately determine the location of people within depth images. At the time of writing, I have two of my three neural networks done. I chose this project because I am interested in the fields of computer vision and machine learning, and I wanted to have some experience before I committed myself to this path in college. Below is the code for one of my neural networks, called a sparse auto-encoder.

Problem Statement

Current centroid-determining human body detection algorithms that use RGB-D images reach at most 95% accuracy. Applications for this technology, ranging from security systems to video gaming, require higher accuracy.

Engineering Goal

The algorithm developed for this project will be more accurate than existing algorithms, as evidenced by a lower percent error in bounding box sizes and locations and a smaller distance between calculated and actual centroids.
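The two measures named in the goal can be sketched in a few lines of Python. This is only an illustration: the box format `(x_min, y_min, x_max, y_max)`, the function names, and the sample numbers are assumptions, not values or code from the project.

```python
import math


def centroid(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def centroid_distance(predicted, actual):
    """Euclidean distance in pixels between the two box centers."""
    px, py = centroid(predicted)
    ax, ay = centroid(actual)
    return math.hypot(px - ax, py - ay)


def area(box):
    """Area of a box in square pixels."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


def percent_error(measured, true_value):
    """Absolute percent error of a measured value against the true value."""
    return 100 * abs(measured - true_value) / abs(true_value)


# Made-up boxes, for illustration only.
actual_box = (10, 20, 50, 120)
predicted_box = (12, 24, 54, 124)

distance = centroid_distance(predicted_box, actual_box)
size_error = percent_error(area(predicted_box), area(actual_box))
print(f"centroid distance: {distance:.1f} px, size error: {size_error:.1f}%")
```

A lower value on both measures would mean the predicted box sits closer to the real person and is closer to the right size.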


Video calling platforms, security systems, and video games all use human body detection, but inaccurate methods of locating the person lower the overall accuracy of this technology. To improve on the current 95% maximum accuracy, a Sparse Auto-Encoder (SAE) and a Convolutional Neural Network (CNN) were used with Sliding Window Localization (SWL) to locate the centroids of human bodies. Depth images were taken with an Xbox Kinect. The SAE learns human body features from a depth image dataset, and the CNN then uses those features to find the human body features in the Kinect’s image. A histogram was then generated, and the centroids of similarly valued regions were found. SWL was then used to determine which region’s centroid represented a human. This method has an accuracy of 54.856%. This algorithm can be used in any system where locating a human body within an image is crucial.
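To give a rough idea of how sliding window localization works in general, here is a minimal sketch. It is not the project code: the window size, the stride, and especially `mean_score` (a stand-in for a real human classifier) are assumptions made for illustration, and the tiny image is invented.

```python
def sliding_windows(height, width, win_h, win_w, stride):
    """Yield the (top, left) corner of every window that fits in the image."""
    for top in range(0, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            yield top, left


def mean_score(image, top, left, win_h, win_w):
    """Hypothetical stand-in for a classifier: mean pixel value in the window."""
    total = sum(
        image[r][c]
        for r in range(top, top + win_h)
        for c in range(left, left + win_w)
    )
    return total / (win_h * win_w)


def localize(image, win_h, win_w, stride=1, score=mean_score):
    """Return the best-scoring window corner and its center as (row, col)."""
    height, width = len(image), len(image[0])
    best = max(
        sliding_windows(height, width, win_h, win_w, stride),
        key=lambda corner: score(image, corner[0], corner[1], win_h, win_w),
    )
    top, left = best
    center = (top + (win_h - 1) / 2, left + (win_w - 1) / 2)
    return best, center


# Invented 4x5 "image" with one bright 2x2 blob.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 9, 0],
    [0, 0, 9, 9, 0],
    [0, 0, 0, 0, 0],
]

corner, center = localize(image, win_h=2, win_w=2)
print(corner, center)
```

In a real pipeline the scoring function would be a trained classifier that says how human-like each window is, and the window would usually be tried at several sizes.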

Summary of Work

I used three neural networks for this project: a Sparse Auto-Encoder, a Convolutional Neural Network, and a binary human classifier. I used an additional Python file to test the accuracy of the centroid determination using a False Positives Per View (FPPV) algorithm, which compares the union and intersection of the actual and experimental bounding boxes of the human. Unfortunately, my project ended up being only 55% accurate, as determined by FPPV. My algorithm is not more accurate than current algorithms, but I learned a lot about machine learning, which was valuable to me. My summer job will be in the field of machine learning, partially because of this project.
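One common way to compare the intersection and union of two bounding boxes is the intersection-over-union (IoU) ratio. The sketch below shows that idea only. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the 0.5 match threshold are assumptions, and the project's test file may have combined the two areas differently.

```python
def box_area(box):
    """Area of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


def intersection_over_union(box_a, box_b):
    """Ratio of the overlapping area to the combined area of two boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not touch.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted, actual, threshold=0.5):
    """Count a detection as correct when IoU reaches the assumed threshold."""
    return intersection_over_union(predicted, actual) >= threshold


# Invented boxes, for illustration only.
actual_box = (0, 0, 10, 10)
predicted_box = (5, 0, 15, 10)

iou = intersection_over_union(predicted_box, actual_box)
print(f"IoU = {iou:.3f}, match = {is_match(predicted_box, actual_box)}")
```

Under this scheme, a detection that fails the threshold would be counted as a false positive for that view.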


In the actual STEM class, we learn about different parts of the research process as we go through our projects. In one unit, we focused on statistics. As a group project, I worked with some of my friends on a presentation where we discussed when one should use a paired or unpaired t-test. Below is that presentation.