Dr. Crowthers’ STEM I class centers on one major independent research project. Over the summer, students brainstorm, narrow down, and finally choose a topic. Throughout the first half of the year, they research it and refine their presentation skills along the way. Each term has regular update meetings. During A Term, students present the findings of a paper they have read, building presentation skills that help when presenting their own work. In B Term and C Term, students present data from their own projects at update meetings. Peer discussions provide feedback that helps students improve their work.
Using Convolutional Neural Networks and Machine Learning to Diagnose Skin Cancer Moles
This project created a machine learning model using CNNs to detect skin cancer and give treatment recommendations. It could tell the difference between benign and malignant lesions with high accuracy and even used ChatGPT to provide helpful medical advice. There were some challenges, like an imbalanced dataset and a lack of diversity in skin tones, but the model still showed that AI can make skin cancer detection faster and more accessible. Future improvements could focus on making the dataset more inclusive, reducing overfitting, and testing the model in real-world settings.
Skin cancer diagnosis is often a lengthy and stressful process, requiring a biopsy, lab testing, and weeks of waiting for results. This delay can impact treatment plans and create unnecessary anxiety for patients. Machine Learning models offer a promising alternative by providing accurate diagnoses almost instantly, allowing for quicker decision-making and earlier medical intervention. This project aims to develop a Convolutional Neural Network (CNN)-based Machine Learning model capable of accurately classifying benign and malignant skin cancer moles. CNNs are highly effective in medical image analysis because they can identify intricate patterns and subtle differences that distinguish various skin conditions (Li et al., 2023). While existing models can accurately detect skin cancer, they often lack an integrated system that provides real-time medical advice or treatment recommendations specific to the diagnosis. As a result, patients may receive a diagnosis without clear guidance on the appropriate next steps for treatment or further evaluation. Doctors using this model as a secondary resource could also benefit, since its output can help them tailor the advice they give to their patients. By combining AI-driven diagnosis with personalized medical guidance, this model aims to provide a more useful tool for patients and healthcare professionals. Not only will it help detect skin cancer earlier, but it will also give users the next steps they should take, making it a practical solution for real-world use in dermatology and healthcare settings.
Click HERE to read the supplemental documents to the project!
Skin cancer is one of the most common forms of cancer and can be among the deadliest, especially if it is not detected early. Traditional diagnostic methods, such as biopsies, can take weeks to provide results, which can delay treatment. In malignant cases, this delay can lower survival rates.
This project aims to create an online tool that efficiently and accurately diagnoses benign and malignant skin cancer moles while providing personalized treatment advice and information about the diagnosis. The goal is to reduce delays in treatment, particularly for malignant cases, caused by long wait times for results.
Skin cancer diagnosis typically requires a healthcare visit, a biopsy, and a 2–3 week wait for lab results (Cancer Research UK, n.d.). This delay can be stressful, especially for malignant cases where early detection is crucial. Melanoma patients, for example, have a 99% five-year survival rate if diagnosed early, but this drops to 27% once the cancer has metastasized (Cleveland Clinic, n.d.). AI and Machine Learning (ML) are transforming healthcare by reducing human errors and offering personalized treatment plans (Cruz & Wishart, 2007; Khalifa & Albadawy, 2024). While doctors rely on training and experience, AI models can integrate patient history and data to deliver highly accurate, personalized treatment recommendations that enhance, rather than replace, medical expertise.
Model 1: Skin Lesion Classification
A Convolutional Neural Network (CNN) was trained on the HAM10000 Dataset to classify skin lesions into seven categories.
1. Data Preprocessing and Balancing
2. CNN Model and Training
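The balancing in step 1 can be illustrated with a minimal sketch of random oversampling. This is one common resampling approach, not necessarily the exact method used in the project; the function name `oversample`, the file names, and the class counts are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many items as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Keep every original item, then draw extras with replacement.
        extras = [rng.choice(items) for _ in range(target - len(items))]
        for item in items + extras:
            out_samples.append(item)
            out_labels.append(label)
    return out_samples, out_labels


# Toy stand-ins for image file names; the counts are illustrative only.
samples = [f"img_{i}.jpg" for i in range(10)]
labels = ["nv"] * 7 + ["mel"] * 2 + ["df"] * 1
balanced_samples, balanced_labels = oversample(samples, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only, so that duplicated images do not leak into the validation data.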
Model 2: AI-Powered Skin Lesion Diagnosis
Model 2 enhances Model 1 by integrating ChatGPT for medical guidance.
3. Image Processing and Prediction
4. AI-Generated Treatment Advice
5. Visualization and Deployment
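Steps 3 and 4 can be sketched roughly as follows: raw classifier scores are turned into probabilities, the most likely class is selected, and a text prompt is assembled for a chat model. This is a hedged illustration only. The scores are made up, the prompt wording and the helper names `top_prediction` and `build_prompt` are hypothetical, and no call to ChatGPT is made here. The class codes are assumed to follow the HAM10000 diagnosis labels.

```python
import math

# HAM10000 diagnosis codes (assumed) mapped to the names used on this page.
LABELS = {
    "akiec": "actinic keratoses",
    "bcc": "basal cell carcinoma",
    "bkl": "benign keratosis-like lesions",
    "df": "dermatofibroma",
    "mel": "melanoma",
    "nv": "melanocytic nevi",
    "vasc": "vascular lesions",
}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(scores):
    """Return (code, probability) for the most likely class."""
    probs = softmax(scores)
    codes = list(LABELS)
    best = max(range(len(probs)), key=probs.__getitem__)
    return codes[best], probs[best]


def build_prompt(code, prob):
    """Assemble a hypothetical prompt for a chat model."""
    return (
        f"A skin lesion classifier predicted {LABELS[code]} "
        f"with {prob:.0%} confidence. Summarize what this condition is, "
        "suggest precautions, and advise the user to see a dermatologist."
    )


# Made-up scores in the same order as LABELS.
scores = [0.1, 0.3, 0.2, -1.0, 2.5, 1.0, -0.5]
code, prob = top_prediction(scores)
print(build_prompt(code, prob))
```

The full list returned by `softmax` is what a probability chart like the ones in the example diagnoses would plot.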
Figure #1
Dataset Characteristics
The top-left graph shows the distribution of cell types, the top-right graph displays the distribution by sex, the bottom-left graph illustrates the distribution by localization, and the bottom-right graph depicts the age distribution (Kaggle, 2019).
Figure #2
Model Accuracy and Loss
The top graph shows training and validation accuracy across epochs, while the bottom graph shows training and validation loss. These trends help evaluate how well the model generalizes to new, unseen data.
Figure #3
Example Diagnosis #1
Prediction results showing the original image, preprocessed image, and a probability chart highlighting the confidence level of the model’s diagnosis of an image, along with the AI-powered medical information and precautions for the given diagnosis.
Figure #4
Example Diagnosis #2
Prediction results showing the original image, preprocessed image, and a probability chart highlighting the confidence level of the model’s diagnosis of another image, along with the AI-powered medical information and precautions for the given diagnosis.
Model 1: Skin Lesion Classification
The first model successfully classified skin lesion images into seven categories based on the dataset: melanoma (mel), melanocytic nevi (nv), benign keratosis-like lesions (bkl), basal cell carcinoma (bcc), actinic keratoses (akiec), vascular lesions (vasc), and dermatofibroma (df).
In addition to lesion classification, the model categorized cases based on sex, lesion location on the body, and patient age.
Model Performance:
• Training accuracy increased steadily, showing that the model learned well from the data.
• Validation accuracy fluctuated significantly, suggesting overfitting (where the model performed well on training data but struggled with new data).
• Training loss decreased continuously as the model minimized errors.
• Validation loss fluctuated and diverged, a common sign of overfitting. Training was stopped at an optimal number of epochs to prevent further performance decline.
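Choosing where to stop training can be sketched with a simple early-stopping rule: track the lowest validation loss seen so far and stop once it has not improved for a set number of epochs. This is a minimal sketch under assumptions. The `patience` value, the function name `best_epoch`, and the loss values are hypothetical and are not the project's measured results.

```python
def best_epoch(val_losses, patience=3):
    """Return the 0-based epoch with the lowest validation loss seen
    before `patience` consecutive epochs pass without improvement."""
    best_idx, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss has stalled; stop scanning
    return best_idx


# Illustrative numbers only, not the project's measured losses.
val_losses = [0.92, 0.80, 0.74, 0.78, 0.76, 0.81, 0.70]
print(best_epoch(val_losses))
```

A small patience stops sooner on a noisy validation curve, while a larger one tolerates fluctuation at the cost of more training time.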
Model 2: AI-Powered Skin Lesion Diagnosis
Building on the classification capabilities of Model 1, the second model allowed user input for a real-time lesion diagnosis. A user could upload an image of a mole, and the model would:
This addition bridges the gap between AI-based classification and medical insights, making the model more practical for real-world use.
This project successfully developed a CNN-based model to classify skin cancer lesions accurately while providing real-time AI-driven treatment recommendations for practical use.
The first model classified seven skin lesion types and used patient metadata (age, sex, lesion location) to refine predictions. While training accuracy improved, validation accuracy fluctuated, showing overfitting and a need for better generalization.
A major challenge was dataset imbalance, with more nevi cases than malignant ones. Resampling and augmentation helped, but larger, more diverse datasets would be helpful. Limited skin tone representation also shows the need for more inclusive data to improve AI fairness.
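Augmentation, mentioned above, creates extra training examples by transforming existing images. The sketch below shows two simple transformations on a tiny grid of numbers standing in for an image. The specific transformations and function names are assumptions for illustration; the project's actual augmentation settings are not stated here.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the grid a quarter turn clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image):
    """Return the original plus simple flipped and rotated variants."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


# A tiny 2x3 grid of pixel values stands in for a lesion photo.
image = [[1, 2, 3],
         [4, 5, 6]]
for variant in augment(image):
    print(variant)
```

Flips and rotations suit lesion photos because a mole has no fixed orientation, so the transformed image keeps the same diagnosis label.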
The second model let users upload images, get instant predictions, and receive ChatGPT treatment recommendations, making AI-powered dermatology more practical and useful in real-world settings.
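Limitations include dataset imbalance, limited skin tone representation, and overfitting that calls for better generalization.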
This project demonstrates how AI can improve dermatology by making skin cancer detection faster and more accessible. While the model performed well, more diverse data, better generalization, and real-world testing are needed for clinical use.