STEM is a unique course offered to juniors that focuses on scientific research and engineering. Students conduct their own scientific research project on a topic of their choice, reaching out to professors and authors, collecting data, and writing technical documents such as grant proposals and a STEM thesis. Students have the opportunity to submit their projects to science fairs and competitions. The class encourages students to work independently and efficiently.
Facial expression analysis is often not accepted as substantial evidence under the notion that it is unreliable. However, analyzing the responses displayed on a suspect's face, or studying the reactions present in evidence such as images and videos, can reveal a great deal of information. In many cases, though, recognizing an emotion alone is not significant enough to serve as evidence. The proposed models analyze an individual's facial features and classify the expression as one of the seven emotions they are trained on. Utilizing Keras along with several other deep learning packages, a model was constructed to classify emotions efficiently and accurately. The model contained seven layers: two convolutional layers, two pooling layers, two dense layers, and one flatten layer. The model was trained on a dataset imported from Kaggle, which was trimmed down to fit the purpose of this project. To maintain a 70:30 ratio between training and testing images while keeping an equal number of images for each emotion, the dataset, originally containing 10,000 images, was trimmed so that each emotion had 100 training images and 43 testing images. A multi-class model was constructed to evaluate all seven emotions chosen: happy, sad, fear, disgust, anger, surprise, and neutral. Along with the seven-class model, two binary models were constructed to differentiate between specific pairs of emotions that share many similar features, such as anger and disgust. All training accuracies were high, above 85%, whereas the validation accuracies were considerably higher in the binary models than in the multi-class model.
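As an illustration of the seven-layer architecture described above, a minimal Keras sketch might look like the following. The 48x48 grayscale input size, filter counts, kernel sizes, and dense-layer width are assumptions for illustration, since the exact hyperparameters are not restated here; the layer ordering follows the two convolutional, two pooling, one flatten, and two dense layers named in the abstract.

```python
from tensorflow.keras import layers, models

# A minimal sketch of the seven-layer network described above.
# Input size, filter counts, and kernel sizes are illustrative assumptions.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",
                  input_shape=(48, 48, 1)),   # convolutional layer 1
    layers.MaxPooling2D((2, 2)),              # pooling layer 1
    layers.Conv2D(64, (3, 3), activation="relu"),  # convolutional layer 2
    layers.MaxPooling2D((2, 2)),              # pooling layer 2
    layers.Flatten(),                         # flatten layer
    layers.Dense(128, activation="relu"),     # dense layer 1
    layers.Dense(7, activation="softmax"),    # dense layer 2: one unit per emotion
])

# Assumes one-hot encoded labels; integer labels would instead use
# sparse_categorical_crossentropy.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```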
Due to flaws present in the dataset itself, the accuracy of the seven-class model was low: the model reached a training accuracy of 96% but a validation accuracy of only 35%, as seen in figure 1. This indicates that the model is confident in classifying the images it was trained on but far less confident in classifying new data, which led to a high validation loss across the epochs. The training loss indicates how well the model fits the training data, while the validation loss indicates how well it fits new data. A high loss value usually means the model is producing erroneous output, while a low loss value indicates fewer errors. As seen in figure 2, the graph shows a significant gap between the validation loss and the training loss, with the validation loss increasing. The model had difficulty distinguishing the emotions, which led to this outcome.
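The gap described above can be read directly from the training history. The sketch below assumes the trimmed arrays from the abstract (here called x_train, y_train, x_test, and y_test, hypothetical names for the 700 training and 301 testing images) and an illustrative epoch count; it is not the exact training script used in the project.

```python
# `model` is the seven-class network from the sketch above.
# x_train/y_train and x_test/y_test are hypothetical names for the
# trimmed training and testing arrays (7 emotions x 100 and 7 x 43 images).
history = model.fit(
    x_train, y_train,
    validation_data=(x_test, y_test),
    epochs=30,       # illustrative epoch count, not stated in the paper
    batch_size=64,   # original batch size, later halved for the binary models
)

# Overfitting shows up as the two curves diverging: training loss keeps
# falling while validation loss rises, as described for figure 2.
print(history.history["loss"])      # training loss per epoch
print(history.history["val_loss"])  # validation loss per epoch
```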
To address this, the model was altered to accommodate only two emotions. This is beneficial for differentiating between two similar expressions without human bias interfering. The binary model was largely identical to the seven-class model, but the batch size was reduced to 32, half of its original value. The training and testing sets were the same size, and the same images were used, preserving the 70:30 ratio established at the beginning of the study.
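A sketch of how the binary variant might be defined follows, assuming the same convolutional base as before. The only changes documented above are the batch size of 32 and the restriction to two emotions, so the single sigmoid output unit and binary cross-entropy loss are standard choices assumed here; x_two, y_two, and the validation arrays are hypothetical names for the two-emotion subset.

```python
from tensorflow.keras import layers, models

# Binary variant: same convolutional base, one sigmoid output unit.
binary_model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(48, 48, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # one unit: one emotion vs the other
])
binary_model.compile(optimizer="adam",
                     loss="binary_crossentropy",
                     metrics=["accuracy"])

# Batch size halved to 32, as described above; data arrays are hypothetical
# placeholders for the two-emotion subset of the trimmed dataset.
binary_model.fit(x_two, y_two,
                 validation_data=(x_two_val, y_two_val),
                 epochs=30, batch_size=32)
```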
The second model classified images of anger and disgust. The two emotions share features that, to the human eye, are difficult to differentiate. In a courtroom, human opinion would be called upon to make the distinction, and that opinion could carry some level of bias for or against the defendant. The model was relatively accurate when presented with only two emotions. With a training accuracy of about 97% and a validation accuracy of about 73%, as seen in figure 3, this model performed much better than the seven-class model. The issue of increasing validation loss arose in this model as well, with the training loss steadily decreasing while the validation loss rose. The same flaws in the dataset were still present, but the model took in less of that confusing data, allowing for a higher accuracy rate.
The third model was tasked with classifying images of fear and surprise. These emotions are often confused with one another, as they share similar facial features: a frightened person can wear the same wide eyes and agape mouth as a surprised person. How an observer interprets such an expression can turn a court case in vastly different directions. Compared to the seven-class model, this model also performed with relatively high accuracy, presenting a training accuracy of 85% and a validation accuracy of 65%, as seen in figure 4. Though the anger vs. disgust model achieved higher training and validation accuracies, the training and validation losses were much more consistent in this model.
Though the results were not as high as expected, the overall goal of the project was reached. It can be said with confidence that, with a suitable dataset free of interfering features, a more accurate portrayal of the emotions present on one's face can be achieved. Despite these obstacles, the model was still able to perform with relative accuracy when presented with only two emotions, which in some cases is more beneficial: if a person is clearly in distress, there is little use in checking for happiness or neutrality. The overall goal of this project was to create a model that could be used as substantial evidence in a court setting, and that goal was achieved through the three constructed models.
Finding a dataset was a significant hurdle in the process of creating the models. A limited number of datasets contained the desired emotions, good image quality, and enough images to train the model, so some criteria had to be favored at the expense of others. The chosen dataset was a good size and contained the seven emotions used in this project, but it did not have the best-quality images for training. Even so, the models performed with unexpected accuracy, and there is confidence that the accuracy could be increased with a better dataset in future extensions.
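For concreteness, the balanced trim described in the abstract (100 training and 43 testing images per emotion) could be scripted roughly as below. The directory layout and the assumption that every class folder holds at least the quota of images are illustrative, not taken from the original workflow.

```python
import os
import random

# Illustrative sketch of the balanced trim: keep 100 training and 43 testing
# images per emotion (roughly a 70:30 split). The data/<split>/<emotion>/
# folder layout is an assumption.
EMOTIONS = ["happy", "sad", "fear", "disgust", "anger", "surprise", "neutral"]
QUOTAS = {"train": 100, "test": 43}

random.seed(0)  # make the trim reproducible
for split, quota in QUOTAS.items():
    for emotion in EMOTIONS:
        folder = os.path.join("data", split, emotion)
        files = sorted(os.listdir(folder))
        # Randomly keep `quota` images per class and delete the rest,
        # so every emotion contributes the same number of examples.
        keep = set(random.sample(files, quota))
        for name in files:
            if name not in keep:
                os.remove(os.path.join(folder, name))
```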
There has been a long history of judges needing to turn down cases and of biased juries. These human interferences become less impactful to the outcome of a case with the use of this model. Since the model removes human opinion as a barrier, there is less of a chance that the court will favor one decision over another because of a person's reaction. Likewise, instances where emotions are disregarded or brushed off will become less common. The model provides credible as well as substantial evidence behind any expression an interrogated defendant could wear, allowing attorneys to drive a case in ways previously unavailable or unknown.
As mentioned before, the model could be improved in a future extension with a better dataset to increase accuracy. Such a dataset would be free of images that are of poor quality, obscure the person's facial expression, or do not depict real human beings. All of these factors contributed to the current accuracy rates and can be improved with further research. This matters because it offers the justice system a tool to eliminate bias against those at a disadvantage. With an increased accuracy rate, courtrooms would place more trust in the new technology and be better able to aid those blinded by bias.
The justice system has had a long history of overlooking the emotional reactions of suspects. From built-in responses to danger, such as fight, flight, or freeze, to learned reactions, humans are complex beings. Catching these aspects of a conversation or encounter can unravel a whole interrogation, yet they have been dismissed for lack of credibility. This pertinent issue demands a solution, which is what this model provides. Utilizing the resources of Kaggle as well as the tools of Google Colab, the program was able to achieve unbiased results. The models constructed in this project provided concise answers to the unwritten question: what is someone feeling? The data unveiled hidden emotions that could never have been identified with the same accuracy by human eyes. With multiple Keras packages imported, along with OpenCV, a deep learning model with seven layers was constructed. This model creates opportunities for the justice system to become less biased and to consider the emotional responses present in evidence and interrogations.
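As a closing illustration, a hedged sketch of how such a trained model might be applied to a single face image using OpenCV follows. The file name, the 48x48 grayscale preprocessing, the emotion ordering, and the variable model (the seven-class network sketched earlier) are all assumptions for illustration.

```python
import cv2
import numpy as np

# `model` is the trained seven-class network from the earlier sketch.
# The label order is an assumption; it must match the training labels.
EMOTIONS = ["happy", "sad", "fear", "disgust", "anger", "surprise", "neutral"]

img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)  # load image as grayscale
img = cv2.resize(img, (48, 48))                     # match the model's input size
img = img.astype("float32") / 255.0                 # scale pixels to [0, 1]
batch = img.reshape(1, 48, 48, 1)                   # add batch and channel axes

probs = model.predict(batch)[0]                     # one probability per emotion
print(EMOTIONS[int(np.argmax(probs))])              # most likely emotion
```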