STEM I

STEM is taught by Dr. Crowthers (who is known by many as the STEM Overlord). From the start of the year until our school science fair in February, we work on our independent STEM research projects. This is an opportunity for us to conduct research and experimentation on a topic of our choosing, and present our findings. We also become acquainted with scientific and technical writing through assignments we complete based on our projects.

Below is an overview of my project for this year, in which I aim to use a multimodal approach to improve upon existing driver safety systems and effectively detect drowsy drivers in an automotive vehicle.

Fusion-Based Driver State Monitoring: A Multimodal Approach to Drowsiness Mitigation

Driver fatigue is a leading cause of vehicular accidents today. Although safety systems are increasingly being implemented in modern automotive vehicles, most detect drowsiness only after hazardous driving has already occurred, because they rely on single indicators derived from the movement of the vehicle rather than the physical state of the driver. My project aims to provide early intervention by developing a multimodal system that monitors the driver in real time. This design incorporates sensor fusion from three modules that are connected to a Raspberry Pi: a computer vision system to track eye closure, a force-sensitive resistor (FSR) array to detect postural slumping, and an accelerometer to identify irregular head movements characteristic of microsleeps. Furthermore, by utilizing convolutional neural network (CNN) based machine learning for the vision module, the system can provide a more accurate and reliable assessment of a driver's state than traditional methods. The sensor fusion also compensates for the weaknesses of individual sensors, minimizing spurious detections. Testing indicates that this design is a robust, low-cost safety system that can detect the earliest signs of fatigue and alert a driver within two seconds, potentially preventing life-threatening accidents before they occur.

Figures

Analysis

Multi-Sensor Performance Metrics

Figure 4 depicts the FSR and accelerometer readings during the "Awake" and "Drowsy" states. The x-axis represents FSR pressure in pounds, while the y-axis represents accelerometer tilt in degrees. A Student’s t-test was conducted to test for a difference between these states (p < 0.0001). The results provide convincing statistical evidence that the sensor readings in the awake state differ from those in the drowsy state. The drowsy state corresponds to a clear physical slump, characterised by a loss of pressure and increased head tilt.
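As a rough illustration of the test described above, the sketch below computes the pooled-variance Student's t statistic for two groups of readings using only the Python standard library. The function name student_t and the tilt values are hypothetical placeholders, not the project's code or data.

```python
import math
import statistics

def student_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df

# Hypothetical placeholder readings in degrees, NOT the project's measurements.
awake_tilt_deg = [2.0, 3.0, 4.0, 3.0, 3.0]
drowsy_tilt_deg = [20.0, 22.0, 25.0, 21.0, 22.0]

t_stat, df = student_t(drowsy_tilt_deg, awake_tilt_deg)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A large absolute t value relative to the degrees of freedom corresponds to a small p-value. Converting t to p requires the t distribution, which a statistics package such as SciPy (scipy.stats.ttest_ind) handles directly.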

Integrated Detection Accuracy

Figure 5 shows the detection accuracy of individual sensor modules compared to the integrated multimodal system. This was determined through trials of the camera, FSR, and accelerometer modules both independently and in fusion. Fisher’s Exact Test was conducted on these trials to test for a difference in accuracy (p < 0.05). The results indicate that the multimodal integration is significantly more accurate than each individual module.
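For readers unfamiliar with Fisher's Exact Test, the sketch below shows one way to compute a two-sided p-value for a 2x2 table of correct and incorrect detections. The function name and the trial counts are hypothetical, chosen only to illustrate the calculation. They are not the counts recorded in this project.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = 0.0
    for k in range(lowest, highest + 1):
        p = prob(k)
        # Count every table at least as unlikely as the observed one
        # (small tolerance guards against floating-point ties).
        if p <= observed * (1 + 1e-9):
            p_value += p
    return p_value

# Hypothetical counts: rows are systems, columns are (correct, incorrect).
fused_correct, fused_incorrect = 19, 1
camera_correct, camera_incorrect = 12, 8

p = fisher_exact_two_sided(fused_correct, fused_incorrect,
                           camera_correct, camera_incorrect)
print(f"two-sided p = {p:.4f}")
```

Fisher's Exact Test suits this kind of comparison because trial counts are small and each trial has only two outcomes, detected correctly or not.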

Environmental Resilience: Illuminance

Figure 6 depicts system accuracy across varying lighting conditions. Fisher’s Exact Test was conducted to compare the integrated system with camera vision alone under these conditions (p < 0.01). The results show that the integrated sensor fusion is significantly more robust against illuminance interference than vision-only methods. The hardware components (FSR and accelerometer) compensate when the camera's view is degraded, allowing the integrated system to maintain high reliability in poor lighting.

Environmental Resilience: Eyewear

Figure 7 shows system accuracy across different eyewear types: sunglasses and spectacles. The results were compared using Fisher’s Exact Test (p < 0.05). Eyewear affected the vision model's reliability, and there is convincing evidence that the sensor fusion is significantly more robust against eyewear interference than a standalone camera module.

System Latency Analysis

Figure 8 shows the system latency in milliseconds recorded over multiple trials. A one-sample t-test was conducted to compare the system latency with the 1500 ms human perception threshold (p < 0.0001). The mean latency of 473.88 ms is well under that threshold of approximately 1.5 seconds. This indicates that the system responds significantly faster than human perception, leaving adequate time for intervention before a hazardous event occurs.
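The one-sample comparison above can be sketched as follows. The latency values are hypothetical placeholders rather than the recorded trials, and one_sample_t is an illustrative helper name. Only the 1500 ms reference value comes from the analysis.

```python
import math
import statistics

def one_sample_t(samples, reference):
    """Return (t, degrees_of_freedom) for a one-sample t-test against reference."""
    n = len(samples)
    # Standard error of the mean, using the sample standard deviation.
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return (statistics.mean(samples) - reference) / std_err, n - 1

# Hypothetical latency trials in milliseconds, NOT the project's measurements.
latencies_ms = [450.0, 470.0, 480.0, 490.0, 460.0]
PERCEPTION_THRESHOLD_MS = 1500.0  # reference value used in the analysis

t_stat, df = one_sample_t(latencies_ms, PERCEPTION_THRESHOLD_MS)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A strongly negative t value indicates that the mean latency sits far below the reference threshold relative to the trial-to-trial variation.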

Discussion/Conclusion

Discussion

In experimentation, the independent performance of the YOLOv8 camera module, FSR array, and accelerometer was evaluated against the integrated multimodal system to determine the feasibility of the proposed sensor fusion approach for real-time safety. While individual modules are capable of identifying specific markers of fatigue, results indicate that they remain highly susceptible to environmental variables when operating alone.

This is particularly evident when addressing the system's robustness against common driving stressors. Standalone computer vision models are often hindered by fluctuations in lighting or the presence of driver accessories. However, the integrated system maintained a high level of accuracy across various illuminance levels and across different eyewear types, such as sunglasses and spectacles. This implies that when visual data is compromised, the hardware-based physical sensor data from the FSRs and accelerometer provides a critical redundancy. This multi-layered architecture allows the system to remain functional under the lighting and eyewear conditions that would typically cause a vision-only monitor to fail.
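The redundancy described above can be illustrated with a minimal sketch. The fusion rule, the 0 to 1 drowsiness scores, the threshold, and the name fuse_scores are all assumptions made for illustration. This is not the project's actual fusion logic, only one simple way such a fallback could work.

```python
def fuse_scores(camera, fsr, accel, threshold=0.5):
    """Decide whether to alert from per-module drowsiness scores.

    Each score runs from 0.0 (alert) to 1.0 (drowsy). A module passes None
    when it cannot trust its own input, for example a camera blocked by
    sunglasses. This averaging rule is a hypothetical illustration.
    """
    usable = [score for score in (camera, fsr, accel) if score is not None]
    if not usable:
        return False  # no trustworthy evidence, so do not raise an alert
    return sum(usable) / len(usable) >= threshold

# Sunglasses hide the eyes, so the camera reports None and is ignored.
# The FSR and accelerometer scores alone are enough to trigger the alert.
alert = fuse_scores(camera=None, fsr=0.9, accel=0.8)
print("alert driver" if alert else "no alert")
```

Averaging only the modules that report usable data means a single compromised sensor neither blocks nor forces an alert, which is the kind of behaviour the redundancy argument relies on.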

During a "nodding off" event, the FSR sensors detected an almost total loss of seatback pressure while the accelerometer identified a significant shift in head angle. Statistical analysis provided convincing evidence that the readings recorded during an alert state are distinct from those recorded during a drowsy state. This confirms that the drowsy state can be characterised by a measurable physical slump that is distinct from an active driving posture, allowing the system to categorize these shifts as reliable indicators of fatigue rather than random movement.

Furthermore, system efficiency was analysed to ensure that safety interventions occur within a viable timeframe. While the standard human perception threshold for a hazard is approximately 1500 ms, this system achieved a mean latency of 473.88 ms. By maintaining a response time well under the human standard, the device is capable of delivering an immediate audio alert to reawaken the driver. This rapid response time is a crucial factor in bridging the gap between driver drowsiness and the prevention of a potential collision.

Conclusion

In conclusion, this project achieved its objective through the development of a multimodal system that utilises YOLOv8 machine learning object detection alongside hardware sensor inputs to create a more robust solution for drowsiness mitigation. The results indicate that this sensor fusion approach is significantly more accurate than detection by any individual module, providing the reliability necessary for real-world implementation. Furthermore, the system maintained a high degree of effectiveness even when tested under environmental stressors, such as low lighting and eyewear, that typically act as obstructions for traditional vision-based systems. Statistical testing validated the use of physical position-based modules: postural slumping and head nodding are distinct indicators that effectively differentiate between the "Awake" and "Drowsy" states. With a mean latency of 473.88 ms, the system operates significantly faster than the typical human perception time, so that the driver can be reawakened with sufficient time before a hazardous driving event occurs. Ultimately, by fusing the visual and physical data into one functioning system, this project addresses the limitations of existing single-module monitors, offering a reliable tool to combat one of the leading causes of vehicular accidents and potentially save lives.
