STEM I

STEM I is a course in which MAMS juniors learn to read and write scientific literature, work in lab environments, and build other research skills while completing a project for the science fair. The course is taught by Dr. Kevin Crowthers, who also advises the juniors throughout the research process.

A Machine Learning Method to Optimizing the Tensile Strength of Starch-Based Bioplastics

This project tested four regression algorithms to determine which can best predict the tensile strength of starch-based bioplastics. The inspiration behind this project was that everyday synthetic plastics are extremely common yet dangerous to the environment and to humans. With rising concern over the climate, microplastics, and plastic pollution, some scientists have begun looking into a biodegradable, "green" type of plastic known as bioplastics. While these plastics are biodegradable and do not suffer from the aforementioned problems, they are quite weak. Therefore, using a regression algorithm to model the tensile strength of a bioplastic can allow that strength to be maximized. Ultimately, with an R-squared of roughly 0.52, polynomial regression was found to be the best of the four models at predicting the tensile strength of a starch-based bioplastic given its amylose and glycerol concentrations. In the future, a polynomial regression algorithm can be used to maximize tensile strength (or tune the strength to one's needs), allowing for a much more efficient and sustainable method of fabricating bioplastics.

ABSTRACT

Plastics are used all over the world, and recently, there has been a boom in the need for plastics due to the growing infrastructure and technologies of the modern era. However, with the rise of synthetic plastic production comes a great environmental cost. A large problem with synthetic plastics, or plastics made from petroleum, is that their production accounts for a significant amount of carbon dioxide emissions, which contribute to global warming and climate change. Therefore, many scientists have begun research on bioplastics, a type of biodegradable plastic that is created using organic materials, such as starch. Furthermore, the production of bioplastics emits much less carbon dioxide into the atmosphere. However, the field of bioplastics also faces many problems. For example, bioplastics are, on average, weaker than their synthetic counterparts. As a result, this project aims to use regression algorithms to estimate and optimize the tensile strength of starch-based bioplastics. The independent variables were the amylose, amylopectin, and glycerol concentrations. It was hypothesized that as the starch concentrations increased, the tensile strength would also increase (with the opposite occurring for glycerol) and that a regression algorithm could model these correlations. Multiple iterations of training and verifying (with a 70-30 split) were run to minimize the model’s error and maximize the model’s accuracy. Polynomial regression was found to be the most accurate predictor of tensile strength, using only the independent variables of amylose and glycerol.

Keywords: Bioplastics, Starch-Based, Amylose, Amylopectin, Glycerol, Machine Learning, Linear Regression, Polynomial Regression, Support Vector Regression

Graphical Abstract Pg. 1 Graphical Abstract Pg. 2

RESEARCH PROPOSAL SUBPAGE

A subpage containing the project's Grant Proposal as well as Project Notes can be found at this link: Link to Research Proposal!

PROBLEM STATEMENT + ENGINEERING GOAL

Given the relatively low strength of current starch-based bioplastics fabricated with glycerol, can a machine learning model be used to help predict and optimize the tensile strength of these bioplastics?

Design a regression algorithm that can accurately predict the tensile strength of a starch-based bioplastic given a subset of the independent variables of the amylose, amylopectin, and glycerol concentrations of said plastic.

BACKGROUND INFORMATION

Graphical Background

Currently, synthetic plastics are among the most harmful yet most widely used materials in the world. These plastics, fabricated from petroleum, are non-biodegradable: they end up in landfills, increase carbon dioxide emissions, and pollute the Earth (Ritchie, 2023; Pilapitiya & Ratnayake, 2024). Therefore, some scientists have been working with starch-based bioplastics, plastics that are fabricated with starch and are biodegradable. According to a review article by Nanda et al. (2022), roughly 1.35 million metric tons of biodegradable bioplastics were produced out of the 2.2 million metric tons of bioplastics produced in total in 2021. While this is significantly less than the 370 million tons of synthetic plastics reported in the same article, it shows great potential for the bioplastic field (Nanda et al., 2022). A significant problem with starch-based bioplastics is that they are considerably weaker than their synthetic counterparts (Abe et al., 2021). Therefore, the engineering goal for my project is to use machine learning to predict and optimize the tensile strength of a starch-based bioplastic. Modeling has been used in other papers thus far, such as Hendrawan et al. (2020), but that study used the Response Surface Method for just one specific starch (arrowroot), whereas my project uses regression for starch-based bioplastics in general.

METHODOLOGY

Graphical Methodology

Four regression algorithms were identified and used to predict the tensile strength of a bioplastic given its amylose, amylopectin, and glycerol concentrations: Linear, Polynomial, Support Vector, and Partial Least Squares Regression. The code used for the model is housed at this GitHub repository. These supervised algorithms were chosen because they predict the value of a dependent variable from the independent variables. The data used to train the model was collected from online sources (i.e., previous research papers) and was normalized to minimize differences that could have been caused by external factors. A 70/30 training/testing split was used, and the R-squared and RMSE values were averaged over 250 iterations. These two metrics were recorded because they indicate the model’s accuracy and the model’s error, respectively: R-squared is optimal at 1 (highest accuracy), and RMSE is optimal at 0 (lowest error). Both values were computed in the Python code (using .score() and mean_squared_error(), respectively).

DATA/FIGURES

Figure One Figure Two

RESULTS/ANALYSIS

The accuracies of the four regression algorithms can be seen in the figures above. Specifically, in the image on the left, the R-squared and RMSE of each regression algorithm appear one and two rows below the name of that algorithm, respectively. For example, the R-squared value of Linear Regression is roughly 0.24 and its RMSE is roughly 3.24. As shown in this image, the R-squared values of the Linear, Polynomial, Support Vector, and Partial Least Squares Regression algorithms were roughly 0.24, 0.52, 0.19, and 0.24, respectively, and the corresponding RMSE values were roughly 3.24, 2.57, 3.69, and 3.24. Polynomial regression therefore had both the highest R-squared and the lowest RMSE, making it the most promising of the four algorithms tested for predicting the tensile strength of starch-based bioplastics given their amylose and glycerol concentrations.

DISCUSSION/CONCLUSION

Although the R-squared values of the model can still be improved, these regression algorithms show that machine learning can be used to predict the tensile strength of a starch-based bioplastic. Our results show that polynomial regression performs better than linear, support vector, and partial least squares regression, with an R-squared of roughly 0.52 and an RMSE of roughly 2.57, the lowest of the four algorithms. Therefore, polynomial regression could viably be used to predict the tensile strength of starch-based bioplastics in an actual lab environment to help save time, although further work is needed to make the model more accurate. A potential limitation of this project is that the model cannot account for other factors in the bioplastic fabrication process, such as the use of nanoparticles or other starches to strengthen the plastic. Therefore, one possible future step would be to use polynomial regression to build a model that can take any combination of starch and plasticizer and predict the tensile strength. Other future steps include looking at more types of regression algorithms, such as gradient boosting or random forest regression. The machine learning model could also be used to predict other properties, such as the bioplastic’s Young’s modulus, the time required for the plastic to biodegrade, or its water resistance.
Overall, this project aimed to create a machine learning model that used the amylose, amylopectin, and glycerol concentrations of a bioplastic to predict the tensile strength of said bioplastic. The model was coded in Python using pre-existing data to train and test the model. A 70/30 training/testing split was used over 250 random iterations to find the average R-squared and RMSE values of the four regression algorithms being tested. The polynomial regression model performed the best of the four, with an R-squared value of 0.52 and an RMSE of 2.57, which suggests that, among the algorithms tested, polynomial regression is best suited for modeling the tensile strength of a bioplastic. Now that we have shown the usefulness of machine learning in the bioplastic fabrication process, with polynomial regression’s ability to make the process of testing plastics for tensile strength much more efficient, we can continue moving towards a much more sustainable future.
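One way the optimization step mentioned above could be approached, once a polynomial model has been fitted, is a simple grid search over candidate formulations. The sketch below is an illustration under stated assumptions rather than the project's method: it uses NumPy only, a degree-2 surface in amylose and glycerol, hypothetical helper names (quadratic_features, fit_quadratic, best_formulation), and made-up data and concentration ranges.

```python
import numpy as np


def quadratic_features(X):
    """Degree-2 polynomial terms for two inputs: 1, a, g, a^2, a*g, g^2."""
    a, g = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(a), a, g, a**2, a * g, g**2])


def fit_quadratic(X, y):
    """Least-squares fit of the quadratic surface; returns its six coefficients."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return coeffs


def best_formulation(coeffs, amylose_range, glycerol_range, steps=101):
    """Grid-search the fitted surface, staying inside the given ranges."""
    a_grid, g_grid = np.meshgrid(
        np.linspace(*amylose_range, steps), np.linspace(*glycerol_range, steps)
    )
    grid = np.column_stack([a_grid.ravel(), g_grid.ravel()])
    predicted = quadratic_features(grid) @ coeffs
    best = int(np.argmax(predicted))
    return float(grid[best, 0]), float(grid[best, 1]), float(predicted[best])


if __name__ == "__main__":
    # Made-up data with a strength peak near amylose = 30, glycerol = 25.
    rng = np.random.default_rng(0)
    X = np.column_stack([rng.uniform(10, 40, 200), rng.uniform(10, 50, 200)])
    y = (
        10
        - 0.02 * (X[:, 0] - 30) ** 2
        - 0.01 * (X[:, 1] - 25) ** 2
        + rng.normal(0, 0.2, 200)
    )
    amylose, glycerol, strength = best_formulation(
        fit_quadratic(X, y), (10, 40), (10, 50)
    )
    print(f"amylose={amylose:.1f}, glycerol={glycerol:.1f}, predicted={strength:.2f}")
```

Restricting the search to the ranges covered by the training data matters because a polynomial fit can behave erratically when extrapolated. With an R-squared near 0.52, any suggested formulation would still need to be confirmed in the lab.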

REFERENCES

Abe, M. M., Martins, J. R., Sanvezzo, P. B., Macedo, J. V., Branciforti, M. C., Halley, P., Botaro, V. R., & Brienzo, M. (2021). Advantages and Disadvantages of Bioplastics Production from Starch and Lignocellulosic Components. Polymers (Basel), 13(15). https://doi.org/10.3390/polym13152484

Guardia, C., Caseiro, J., & Pires, A. (2024). Machine learning to enhance sustainable plastics: A review. Journal of Cleaner Production, 474. https://doi.org/10.1016/j.jclepro.2024.143602

Hendrawan, Y., Putranto, A. W., Fauziah, T. R., & Argo, B. D. (2020). Modeling and Optimization of Tensile Strength of Arrowroot Bioplastic Using Response Surface Method. IOP Conference Series: Earth and Environmental Science, 515. https://doi.org/10.1088/1755-1315/515/1/012079

IPCC. (2021, August 9). Climate change widespread, rapid, and intensifying – IPCC. https://www.ipcc.ch/2021/08/09/ar6-wg1-20210809-pr/

Kuenneth, C., Lalonde, J., Marrone, B. L., Iverson, C. N., Ramprasad, R., & Pilania, G. (2022). Bioplastic design using multitask deep neural networks. Communications Materials, 3(96). https://doi.org/10.1038/s43246-022-00319-2

Li, Y., Tao, L., Wang, Q., Wang, F., Li, G., & Song, M. (2023). Potential Health Impact of Microplastics: A Review of Environmental Distribution, Human Exposure, and Toxic Effects. Environmental Health, 1(4). https://doi.org/10.1021/envhealth.3c00052

Marichelvam, M. K., Jawaid, M., & Asim, M. (2019). Corn and Rice Starch-Based Bio-Plastics as Alternative Packaging Materials. Fibers, 7(4). https://doi.org/10.3390/fib7040032

NASA. (n.d.). The Effects of Climate Change. https://science.nasa.gov/climate-change/effects/

Nguyen, T. K., That, N. T. T., Nguyen, N. T., & Nguyen, H. T. (2022). Development of Starch-Based Bioplastic from Jackfruit Seed. Advances in Polymer Technology, 2022(1). https://doi.org/10.1155/2022/6547461

Pilapitiya, P. G. C. N. T., & Ratnayake, A. S. (2024). The world of plastic waste: A review. Cleaner Materials, 11. https://doi.org/10.1016/j.clema.2024.100220

Ritchie, H. (2023, October 5). How much of global greenhouse gas emissions come from plastics? Our World in Data. https://ourworldindata.org/ghg-emissions-plastics

FEB FAIR POSTER

Feb Fair Poster