Exploring Emotional Sentience Through AI-Created Music
Research Question: Can AI express emotional sentience through the music it creates?
Purpose: Musicality is spread throughout the natural world. However, humans alone use music to convey emotion. This stems from our ability to empathize. If algorithm-produced music can convey emotion, then the algorithm steps towards rudimentary sentience.
There was no significant association between a piece's composer and how participants responded to the discrimination task (p = 0.183). Hence, in general, participants could not consciously differentiate between the human-produced and AI-produced pieces. This demonstrates that modern AI-produced music can emulate human compositions in both style and complexity, at least to the point where listeners can no longer readily tell the difference. Since human-produced and AI-produced music are consciously indistinguishable, further study into how each affects listeners’ emotions was warranted. This leads to the next finding.
In general, the emotional data collected supports the hypothesis that AI-produced music can affect the emotions of the listener. However, there is more to the story. Human and AI music both significantly reduced listeners’ anxiety and anger. That said, while all three human-produced pieces reduced listeners’ sadness, only one of the four AI-produced pieces did so. This suggests that, for certain emotions such as sadness, listeners may subconsciously differentiate between AI and human music.
Ultimately, we have found that AI-produced music is not only indistinguishable from human compositions, but also capable of affecting listeners’ emotions. Besides furthering the theory of AI sentience, this research opens the door to a diverse set of future explorations. In the field of computing, further study must be done to determine what aspects of the algorithm cause certain emotions to be evoked. This may involve comparing algorithms that successfully manipulate an emotion with those that do not. In the field of behavioral science and psychology, further study may involve examining brain wave activity and heart rate while individuals listen to AI-composed music. This research has found that there are differences between the effects of AI-produced and human-produced music. These differences are unnoticeable at the conscious level, and only somewhat revealed by emotions. Perhaps greater insights lie at a more fundamental level, such as the firing of neurons or the rhythm of the heart.
As the 21st century progresses, artificial intelligence is stepping intellectually closer to mankind than ever before. That said, until now, there was a certain essence of humanity which many believed was still missing from machines: the power to create and the power to emote. Recently, AI has been developed with the ability to create music. We hypothesized that when humans listen to AI-composed and human-composed music, they are unable to tell the difference and are emotionally affected by both. To test this hypothesis, a discrimination task was administered in which participants differentiated between human-produced and AI-produced music. Throughout the experiment, the Discrete Emotion Questionnaire was used to collect self-reported data on the pieces’ impact on listeners’ emotional state.
There was no significant difference in discrimination task response depending on whether the piece was human-written or AI-written, X2 (2, N=156) = 3.44, p = 0.183. In other words, participants could not distinguish the two. Furthermore, both human and AI-produced music reduced listener anxiety and anger. However, more human-produced pieces of music affected the listeners’ sadness and relaxation.
Humans and AI both have the capacity to affect listeners emotionally through music. If sentience includes the ability to express emotion, then AI has stepped closer to humanity in that regard. These findings warrant further research on multiple fronts: computationally, to determine what specific aspects of an algorithm allow it to elicit certain emotional responses; biologically, to discover why certain emotions, and their neural circuits, are more affected by AI music than others.
The overarching question that this project aims to answer is, “Can AI express emotional sentience through music?” Humans express their emotions in many forms including art, literature, speech, facial expressions, and music. As a music lover, I will explore whether AI can match humans in expressing emotion through music.
I propose a series of hypotheses to test this question.
First, if a creative algorithm is trained or modelled on human music, then the resultant compositions will be indistinguishable from human compositions. Humans propagate their emotions through music (Juslin, 2013). Therefore, if a generative algorithm can produce music that is indistinguishable from that produced by humans, then the algorithm can allow an AI to convey emotions. Finally, one of the indicators of sentience is the ability to propagate one’s feelings (Keltner et al., 2019). If an algorithm allows AI to convey its emotions, then the AI may be one step closer to sentience.
To participate, simply read through the informed consent form and then fill out the survey linked below.
What defines sentience? The word derives from the Latin root sentire, meaning “to sense or to feel”, and indeed sentience is to be “responsive to or conscious of sense impressions” (Merriam-Webster, n.d.). In other words, it is simply the ability to sense one’s surroundings. However, a thermometer can sense the temperature around it, yet we wouldn’t call it sentient. A light switch can sense whether it is on or off, hold that information, and even send it to a connected bulb. That said, it is still not sentient. Perhaps what makes humans unique, and sentient, is not our ability to feel, but rather our ability to share feelings with others (Keltner et al., 2019). Under the status quo, computers are not sentient (Allen, 2016). However, if AI can read human music that expresses emotion, and machine learning allows AI to create unique compositions modelled on whatever it reads, then I believe that AI-created music will propagate emotions as well. The purpose of this project is to explore whether AI can use music to express emotion as well as humanity can.
Humans express their emotions through a diverse set of mediums such as facial expression, speech, literature, music, and art. This project specifically focuses on whether algorithmic music compositions can induce an emotional response in humans. If so, an AI equipped with these algorithms is capable of propagating emotions, and we are one step closer to asking the question - “Can AI be sentient?”
There remains the question of whether these emotions are truly “its own” if the AI is merely selecting its emotion based on an algorithm. However, I argue that humans also derive their emotions from internal algorithms, albeit extremely complicated ones. This doesn’t reduce the value of human emotions or sentience. In fact, it is all the more impressive that our complex emotions like love and happiness can arise from an algorithm, rather than spontaneously materializing.
If the human mind is a complex algorithm, then our emotional expression is also algorithmic. The great works of Mozart, Michelangelo, Shakespeare, and Van Gogh were not spontaneous feats of creativity; they were algorithmically generated by the minds of their creators. The difference between AI and humans may simply be the complexity of their algorithms.
Ultimately, as we progress through the 21st century, the line between human and AI is fading. Advances in neuroscience are showing that we may be more algorithmic than we previously believed (Yayilgan & Beachell, 2006). Furthermore, machine learning is allowing AI to step closer to humanity than ever before (Shabbir & Anwer, 2018). This project enters the no-man’s land (and no-AI’s land) between humanity and AI, exploring what it means to be sentient as well as what it means to be human.
All participants were required to complete a consent form prior to testing. Those under 18 were required to obtain parental consent as well. The form outlined the rationale behind this project, the testing procedure, and the potential risks and benefits for participants. A copy can be found in Appendix x.
The survey population was Juniors attending the Massachusetts Academy of Math and Science. Students were recruited via word of mouth and email. All participation was purely voluntary, with no tangible compensation.
Testing took place at the Massachusetts Academy of Math and Science. Survey participants were seated in a room reserved for testing, with each taking the survey individually on his/her personal computer.
Participants were asked to go to www.mamsmag.wordpress.com, the school magazine website. From there, they navigated to the survey, which was in the form of an embedded Google form. Music was accessible through Google Drive URLs placed in the survey, which linked to the audio files.
There was no contact between the experimenter and the participants during the survey, excepting assistance with technical difficulties such as malfunctioning headphones.
All music pieces were in MIDI format to maintain consistency in how the music was played. AI-composed pieces were produced using AIVA, a stochastic music-generation algorithm. A link to the AIVA website is provided in the references and in Appendix x. Human-composed pieces were selected randomly from an online database of open-source MIDI songs. This database is also linked in Appendix x. In terms of style, a varied sampling of both human and AI music was selected in order to better represent AI and human music at large.
First, participants were asked to answer several demographic questions. These included gender, age, questions about musical experience, and a self-assessment of familiarity with music. Next, participants took the Discrete Emotion Questionnaire (DEQ), a set of questions designed to measure emotional state through self-report, to establish an emotional baseline. After this, the first piece of music was revealed. Participants were asked to listen to the piece then state whether they believed it was written by a human, an AI, or that it was indistinguishable. They then took the DEQ once more to measure any changes in emotional state. This pattern of music reveal, discrimination task, and DEQ was repeated for each subsequent piece of music.
Data from the discrimination task was grouped into two categories, AI-composed and human-composed, and then tested for significance via a Chi-Square analysis. Data collected from the repeated DEQs was first tested using a one-way ANOVA to determine whether significant variance existed between iterations. If so, Dunnett’s Test was used for post-hoc analysis. Pairwise t-tests were not used, to avoid compounding Type I error.
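The first step of this plan, the Chi-Square test on the grouped discrimination responses, can be sketched in Python with SciPy. The contingency counts below are hypothetical placeholders, not the study’s raw data.

```python
# Sketch of the Chi-Square analysis on the discrimination task.
# The counts are hypothetical, not the study's data.
from scipy.stats import chi2_contingency

# Rows: actual composer (human-composed, AI-composed).
# Columns: participant response (human / AI / indistinguishable).
observed = [
    [30, 25, 12],  # human-composed pieces
    [28, 34, 27],  # AI-composed pieces
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"X2({dof}) = {chi2:.2f}, p = {p:.3f}")

# A p-value above 0.05 means we fail to reject the null hypothesis
# that responses are independent of the composer.
```

With a 2x3 table, the test has (2-1)(3-1) = 2 degrees of freedom, matching the DF = 2 reported in the results.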
First, compile 15-second to 1-minute samples of human-written and algorithm-written music. The samples will be MIDI files, to eliminate inconsistencies in human playing. The songs will encompass multiple genres, including pop and classical. Second, develop and disseminate a survey testing whether humans can distinguish the human compositions from the algorithmic compositions. The songs will be presented one at a time. While or after the song plays, the participant will be able to select one of three buttons: human, AI, indistinguishable. The responses will be compiled into a spreadsheet. See the “Human Participants Research” section for a more detailed overview of the survey.
The music shouldn’t be played loud enough to cause auditory damage to a listener. To this end, participants will be able to control the volume of their device while taking the survey. Additionally, the discord in some compositions may cause auditory discomfort to participants. To counteract this, the participant will be free to exit the survey at any time. Finally, to preserve participants’ anonymity, personal data questions will be optional.
The survey population is Juniors attending the Massachusetts Academy of Math and Science. These students will be recruited via word of mouth and a notice posted to the school library bulletin board.
The first portion of the survey collects personal data such as age and experience with music. After that, the participant is instructed to listen to pieces of music, which will be played to the entire group, and, if possible, discern whether the piece is written by a human or a robot. The pieces will play one at a time.
During each piece, the user will select one of three options - “human-written”, “robot-written”, “indistinguishable”. The piece will only play once. Once the user is satisfied with his/her response, he/she can submit, and the next piece will play. I may run multiple iterations of the survey, each with different pieces. Each participant may only take each iteration once. The survey should take no longer than 10 minutes, and the participant is free to exit at any time.
The risks and corresponding countermeasures, as stated in the experimental design, are as follows. Prolonged exposure to loud music can cause hearing damage. To prevent this risk during experimentation, a test sound will be played prior to testing and lowered until all participants are satisfied; the volume of the music will not exceed the volume of the test sound. Additionally, the discord in some compositions may cause auditory discomfort to participants. To counteract this, the participant will be free to exit the survey at any time.

One benefit is that participants who enjoy music might enjoy this exercise. Additionally, being able to differentiate between AI-created and human-created forms of emotional expression may become a vital skill in the future. Already, telling “deep fakes” apart from real videos of celebrities and politicians is important to prevent misinformation. As AI becomes more emotionally sophisticated, it may become difficult to differentiate between a real Beethoven symphony and an algorithmic “deep fake”. This survey serves as a practice round for would-be connoisseurs of human music.

Any personal data collected by the survey will be anonymous: no name, email, or phone number will be attached to the data. Moreover, it will all be collected online. The data may include age, gender, economic background, and/or experience with music. The data will be stored in an Excel spreadsheet linked to my WPI Microsoft account. The account is password-locked, and the only people with the password are me and the school tech administrator. I will keep the data after the study has concluded, in case it is needed for future research.

Participants will be informed, through a mandatory informed consent form, about the purpose of the study, what they will be asked to do, that their participation is voluntary, and that they have the right to stop at any time.
A Chi-Square analysis (see Appendix x) was conducted on the data from the discrimination task. The two variables tested for association were responses to the discrimination task (Human / AI / Indistinguishable) and whether or not the composer was human (Human-composed / AI-composed). The test resulted in a p-value of 0.183 (X2 = 3.44, DF = 2). We fail to reject the null hypothesis, which states that whether or not the composer was human has no effect on responses to the discrimination task.
Single-Factor ANOVA was conducted independently for each emotional category from the Discrete Emotion Questionnaire. The categorical independent variable was the composition that was listened to just prior to taking the DEQ. The dependent variable was the extent to which the listener felt the emotion. This was quantified as the sum of the responses for all items on the DEQ which fell under that emotional category.
At the p < .05 level, across the eight conditions (the baseline plus the seven pieces), there was a significant effect of musical composition on anger [F(7, 174) = 4.436, p < .001], anxiety [F(7, 174) = 8.064, p < .001], sadness [F(7, 174) = 2.416, p = 0.022], and relaxation [F(7, 174) = 4.116, p < .001]. There was not a significant effect of musical composition on disgust [F(7, 174) = 1.039, p = 0.406], fear [F(7, 174) = 1.951, p = 0.064], desire [F(7, 174) = 1.878, p = 0.076], or happiness [F(7, 174) = 1.710, p = 0.109].
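The per-emotion ANOVA can be sketched as follows. The DEQ sums below are invented for illustration, and only three conditions are shown; the actual design compared eight conditions, giving the F(7, 174) statistics reported above.

```python
# Sketch of one per-emotion one-way ANOVA. Each list holds
# hypothetical summed DEQ anxiety scores for one listening
# condition; the real study had eight conditions.
from scipy.stats import f_oneway

baseline  = [12, 14, 11, 15, 13]  # before any music
human_pop = [8, 9, 7, 10, 8]      # after one piece (hypothetical)
ai_tango  = [9, 10, 8, 11, 9]     # after another piece (hypothetical)

f_stat, p = f_oneway(baseline, human_pop, ai_tango)
print(f"F(2, 12) = {f_stat:.3f}, p = {p:.4f}")
```

With k groups of n listeners each, the degrees of freedom are (k - 1) between groups and k(n - 1) within, which is how 8 conditions over 182 observations yields F(7, 174).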
Happiness, relaxation, and desire were aggregated to produce a single measure of positive emotion. There was a significant effect of musical composition on positive emotion at the p < .05 level [F(7, 174) = 2.590, p = 0.015]. Similarly, anger, disgust, fear, and anxiety were aggregated to produce a single measure of negative emotion. There was also a significant effect of musical composition on negative emotion [F(7, 174) = 5.875, p < .001].
In summary, musical composition had a significant effect on anger, anxiety, sadness, and relaxation, as well as on both aggregated positive and aggregated negative emotion.
Dunnett’s Test at a significance level of p = 0.05 yielded the following results. Human Pop (p < .001), AI Tango (p = 0.010), AI Pop (p < .001), Human Blues (p < .001), and AI Jazz (p = 0.037) had a significant effect on anger. All the pieces (see Appendix x) had a significant effect on anxiety (p < .001 for each piece). Human Pop (p < .001), Human Rock (p = 0.030), Human Blues (p = 0.014), and AI Electronic (p = 0.025) had a significant effect on sadness. Only Human Pop (p = 0.024) had a significant effect on relaxation. Although the ANOVA found significant differences, Dunnett’s Test found that no individual piece had a significant effect on positive emotion. By contrast, all pieces had a significant effect on negative emotion (p <= 0.035).
The average participant age was 16.261 (SD = 0.113, n = 23). On average, participants rated their familiarity with music, on a scale from 1 to 10 with 10 being very familiar and 1 being not at all, as 6.739 (SD = 0.316). Approximately 43% of participants were female.
Allen, A. D. (2016). The forbidden sentient computer: Recent progress in the electronic monitoring of
consciousness. IEEE Access, 4, 5649-5658. doi:10.1109/ACCESS.2016.2607722
Dowling, W. J., & Harwood, D. L. (1986). Music cognition. New York, NY: Academic Press.
Edwards, M. (2011). Algorithmic composition: Computational thinking in music.
Harmon-Jones, C., Bastian, B., & Harmon-Jones, E. (2016). The discrete emotions questionnaire: A
new tool for measuring state self-reported emotions. PloS One, 11(8), e0159915. doi:10.1371/journal.pone.0159915
Juslin, P. N. (2013). What does music express? basic emotions and beyond. Frontiers in Psychology,
4, 596. doi:10.3389/fpsyg.2013.00596
Keltner, D., Sauter, D., Tracy, J., & Cowen, A. (2019). Emotional expression: Advances in basic
emotion theory. Journal of Nonverbal Behavior, 43(2), 133-160. doi:10.1007/s10919-019-00293-3
Lerdahl, F. (2001). Tonal pitch space. doi:10.2307/40285402
Lumley, M. A., Neely, L. C., & Burger, A. J. (2007). The assessment of alexithymia in medical
settings: Implications for understanding and treating health problems. Journal of Personality Assessment, 89(3), 230-246. doi:10.1080/00223890701629698
Mauss, I. B., & Robinson, M. D. (2009). Measures of emotion: A review. Cognition & Emotion,
23(2), 209-237. doi:10.1080/02699930802204677
McDuff, D., & Czerwinski, M. (2018, Nov 20,). Designing emotionally sentient agents.
Communications of the ACM, 61, 74-83. doi:10.1145/3186591 Retrieved from http://dl.acm.org/citation.cfm?id=3186591
Merriam-Webster. (2019). Definition of sentience. Retrieved from https://www.merriam-
Mozart, W. A., & Taubert, K. H. (1956). Musikalisches würfelspiel. Mainz [u.a.]: Schott.
Naar, H. (n.d.). Art and Emotion. Retrieved December 5, 2019, from https://www.iep.utm.edu/art-emot/.
Shabbir, J., & Anwer, T. (2018). Artificial intelligence and its role in near future. Retrieved from
Shapshak, P. (2018). Artificial intelligence and brain. Bioinformation, 14(1), 38-41.
Smith, R., Dennis, A., & Ventura, D. (2012). Automatic composition from non-musical inspiration
Supper, M. (2001). A few remarks on algorithmic composition. Computer Music Journal, 25(1), 48-53.
Retrieved from https://search.proquest.com/docview/1255386
Undurraga, E. A., Emlem, N. Q., Gueze, M., Eisenberg, D., Huanca, T., Reyes-
García, V., & Godoy, R. Musical chord preference: Cultural or universal? Data from a
native Amazonian society.