STEM I

STEM is a class taught by Dr. C that builds scientific research and writing skills. We complete a five-month independent research project on a science topic of our choice, then present it at the February science fair. Some students move on to WRSEF (Worcester Regional Science & Engineering Fair), MSEF (Massachusetts Science & Engineering Fair), and even ISEF (International Science & Engineering Fair)! I decided to do mine in the astrophysics/machine learning field. Check out a description of my project below!

Lowering the Effect of Stellar Noise in the Radial Velocity Exoplanet Detection Method

ABSTRACT
Planets outside of the solar system, known as exoplanets, can be found and characterized using planetary detection methods. This project introduces a deep learning method for discovering planets based on radial velocity data. The model has strong implications for planets similar to Earth, which often go undetected because of their small size. It also has implications for false identifications of planets, which can happen when noise interferes with measurements or when humans make errors in the identification process. Keywords: radial velocity, stellar noise, convolutional neural network, deep learning

my stem graphical abstract

Click here to view my research proposal!

ENGINEERING QUESTION
Stellar noise negatively influences radial velocity exoplanet detection results. How can it be negated through a machine learning model?

ENGINEERING GOAL
The aim of this project is to accurately identify exoplanets in the presence of stellar noise using a deep learning model.

BACKGROUND
Have you ever wanted to find aliens? Well, you're in luck, because scientists have developed several techniques for finding planets outside of our solar system, known as exoplanets. One such technique is the radial velocity method, which involves observing the light from a star to see whether its spectrum shifts back and forth over time. Such a periodic shift indicates the existence of a planet orbiting the star and tugging on it. However, this wonderful technique does not come without drawbacks. Radial velocity measurements are particularly susceptible to noise. As instruments become more precise, we must look towards noise from the star itself, known as stellar noise. Stellar noise can have numerous causes, including but not limited to shifts in a star's radius and stellar flares. Additionally, human calculations in the process can introduce human error, creating the need for an automated exoplanet detector. This is the reasoning behind choosing a deep learning model for this project. Deep learning is a subset of machine learning characterized by neural networks, which loosely mimic the human brain to analyze information.
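To make the idea of a shifting spectrum concrete, here is a minimal sketch of the classical (non-relativistic) Doppler relation that links a wavelength shift to a star's velocity along our line of sight. The function names and the example wavelength are my own illustrations, not part of the project's code.

```python
C_M_PER_S = 299_792_458.0  # speed of light in metres per second


def radial_velocity(observed_nm: float, rest_nm: float) -> float:
    """Classical Doppler estimate: v = c * (observed - rest) / rest.

    A positive result means the star is moving away from us (redshift),
    a negative result means it is moving towards us (blueshift).
    """
    return C_M_PER_S * (observed_nm - rest_nm) / rest_nm


def shifted_wavelength(rest_nm: float, velocity_m_per_s: float) -> float:
    """Inverse relation: the wavelength we would observe for a given velocity."""
    return rest_nm * (1.0 + velocity_m_per_s / C_M_PER_S)


if __name__ == "__main__":
    rest = 656.281  # illustrative rest wavelength in nm (roughly the H-alpha line)
    observed = shifted_wavelength(rest, 10.0)  # star receding at 10 m/s
    print(f"shift: {observed - rest:.3e} nm")
    print(f"recovered velocity: {radial_velocity(observed, rest):.3f} m/s")
```

Running it shows how tiny the wavelength shift is for a velocity of only a few metres per second, which is why noise matters so much for this method.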

my stem infographic

METHODOLOGY
I chose to use a pre-made dataset for this project because collecting my own radial velocity data would have been incredibly difficult, and I likely would have needed the assistance of a lab. The dataset combines data from NASA's Kepler and TESS missions and has points from 1992 to 2025. First, I manually removed all planets not found through radial velocity from the dataset. After that, I was left with around 2578 points. The large majority of these points were exoplanets as opposed to non-exoplanets. Because of this imbalance, SMOTE (Synthetic Minority Oversampling Technique) was applied to generate synthetic data points for the minority class, which is the non-confirmed exoplanet points. After that, there were around 6000 data points. The last steps in data cleaning were to remove any unnecessary columns and to change strings to numbers so that the neural network could process them. The model was then written, trained, and tested to get the results presented.
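The two data preparation ideas above can be sketched in plain Python. This is a simplified illustration of the core SMOTE idea (interpolating between a minority point and one of its nearest minority neighbours) and of string-to-number encoding. It is not the library code or the exact settings used in the project, and the function names, the value of `k`, and the toy rows are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows from a list of numeric minority-class rows.

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # how far to move from base towards the neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def encode_strings(values):
    """Map each distinct string to an integer, in order of first appearance."""
    codes = {}
    return [codes.setdefault(value, len(codes)) for value in values]


if __name__ == "__main__":
    toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # made-up rows
    print(smote_sketch(toy_minority, n_new=4, k=2))
    print(encode_strings(["Radial Velocity", "Transit", "Radial Velocity"]))
```

Because every synthetic row is a blend of two real minority rows, oversampling this way balances the classes without simply duplicating existing points.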

my stem methodology graphic
my stem confusion matrix my stem roc graph

FIGURE 1
A confusion matrix of the model’s predictions on the testing data. The model correctly identified 765 positives and 760 negatives, and misclassified 10 points.
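As a quick check on the numbers in Figure 1, the overall accuracy follows directly from the counts in the caption. The sketch below only uses those three counts. Precision and recall are left out because the caption does not say how the 10 errors split between false positives and false negatives. The function name is my own.

```python
def accuracy(correct_positives: int, correct_negatives: int, misclassified: int) -> float:
    """Fraction of test points that the model labelled correctly."""
    total = correct_positives + correct_negatives + misclassified
    return (correct_positives + correct_negatives) / total


if __name__ == "__main__":
    # Counts taken from the Figure 1 caption.
    print(f"{accuracy(765, 760, 10):.4f}")  # prints 0.9935
```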

FIGURE 2
A graph of the ROC curve from the model's results. The curve indicates a very high level of classification performance.
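For readers new to ROC curves, here is a minimal sketch of how the points on such a curve can be computed from a model's predicted scores, along with the area under the curve (AUC). The labels and scores are made-up toy values, not the project's results, and the function names are assumptions.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    The threshold is swept from the highest score to the lowest, and a
    point is recorded after each distinct score value.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        is_last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_tie:
            points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the curve using the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0]           # 1 = exoplanet, 0 = not (toy data)
    toy_scores = [0.9, 0.8, 0.7, 0.1]   # hypothetical model outputs
    curve = roc_points(toy_labels, toy_scores)
    print(curve)
    print(auc(curve))
```

A curve that hugs the top-left corner gives an AUC close to 1, while a model that guesses at random gives an AUC close to 0.5.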

ANALYSIS & CONCLUSION
This project has implications for smaller, Earth-like planets that might go under the radar when it comes to detection, because their small size gives them a weaker gravitational effect on their stars. When the model can identify the data itself, it removes the possibility of human error. Because of the very high accuracy rate and scores, the model is likely overfitting. In the future, I will improve the model by identifying whether and why it is overfitting, and by adding more training and testing data to make up for the small amount of data it was trained on.


My Poster from the February STEM Fair