One of a kind

Concept:

We want to explore rockabilly's place in American music and culture in the mid-1950s, and how Midi-rnn can generate music in the country style of that era. Rockabilly emerged in the early 1950s as a fusion of rock and roll and country music, and it was most popular with country fans in the 1950s. The music was propelled by catchy beats, an electric guitar, and an acoustic (upright) bass played with a slapping technique. We chose songs by two Rock and Roll Hall of Fame singers, Elvis Presley and Johnny Cash. Elvis Presley is known as “The King of Rock 'n' Roll” and played a huge role in the music industry during this time. Johnny Cash is also an imposing and influential figure in country music. Their careers coincided with the birth of rock & roll.

Technique:

A recurrent neural network (RNN) is a class of artificial neural network that makes use of sequential information. Midi-rnn is a tool I learned that generates monophonic melodies with machine learning using a basic LSTM RNN, a model architecture designed to work with sequences.
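
To make "sequential information" concrete, here is a minimal sketch of one common way to turn a monophonic melody into training examples for a sequence model: slide a fixed-size window over the notes and ask the model to predict the note that follows. This is an illustration only, not Midi-rnn's actual preprocessing; the function name `make_windows` and the window size of 4 are assumptions chosen for the example.

```python
def make_windows(pitches, window_size=4):
    """Return (context, next_pitch) pairs from a monophonic pitch sequence.

    Hypothetical helper for illustration; not part of Midi-rnn.
    """
    pairs = []
    for i in range(len(pitches) - window_size):
        context = pitches[i:i + window_size]   # the notes the model sees
        target = pitches[i + window_size]      # the note it should predict
        pairs.append((context, target))
    return pairs


# C D E F G F E written as MIDI note numbers
melody = [60, 62, 64, 65, 67, 65, 64]
for context, target in make_windows(melody):
    print(context, "->", target)
```

A melody shorter than the window produces no examples at all, which is one reason very sparse or very short tracks give a model like this little to learn from.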

We used Midi-rnn to train our models with 50+ country music MIDI files as our dataset. The training took about 2 hours. Then we imported the generated MIDI files into Logic Pro and assigned each MIDI track an instrument commonly used in country music.

Process:

First, we collected many MIDI files of Elvis Presley and Johnny Cash songs from the Internet, including the songs that both singers placed in the top 5 in 1958: No. 3, "Guess Things Happen That Way/Come In, Stranger" by Cash, and No. 5, "Don't/I Beg Of You" by Presley. The original MIDI files contain many tracks for many instruments.

We trained on the original data, and once training finished we found that the result was not what we expected.

The generated MIDI files contained very few notes and many gaps, and they were also too short. So we separated the source files into single-track MIDI files and trained again. We increased the number of epochs, changed the dropout to 0.5, and generated 10 MIDI files. The results seem better and more rhythmic.
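
The track-separation step can be sketched in code. We do not claim this is how the files were actually separated (an editor such as Logic Pro can export tracks too); the sketch below is one possible approach that relies only on the Standard MIDI File chunk layout (an `MThd` header followed by `MTrk` chunks) and Python's standard library. The function name `split_tracks` is hypothetical. One caveat: in multi-track files the tempo information usually lives in the first track, so the split files may not carry the original tempo.

```python
import struct
from typing import List


def split_tracks(data: bytes) -> List[bytes]:
    """Split a Standard MIDI File into one single-track file per MTrk chunk.

    Hypothetical helper for illustration; tempo events stored in another
    track are not copied into the outputs.
    """
    if data[:4] != b"MThd":
        raise ValueError("not a Standard MIDI File")
    header_len = struct.unpack(">I", data[4:8])[0]
    _format, _ntrks, division = struct.unpack(">HHH", data[8:14])

    pos = 8 + header_len
    outputs = []
    while pos + 8 <= len(data):
        chunk_type = data[pos:pos + 4]
        chunk_len = struct.unpack(">I", data[pos + 4:pos + 8])[0]
        body = data[pos + 8:pos + 8 + chunk_len]
        pos += 8 + chunk_len
        if chunk_type != b"MTrk":
            continue  # skip any non-track chunk
        # New header: format 0, exactly one track, same time division.
        header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, division)
        outputs.append(header + b"MTrk" + struct.pack(">I", len(body)) + body)
    return outputs
```

Each returned item is the full byte content of a new file, so it can be written out with something like `open(path, "wb").write(part)` before being placed in the training folder.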


Finally, we imported these 10 files into Logic Pro, assigned instruments commonly used in country music to them, and made 5 sample country-style melodies.

To make the project visually appealing, we decided to add a visualization to the generated music. We first tried using the Music Visualizer to display it on screen. We then went a step further and brought the music into augmented reality, connecting it with a Google Home speaker so that the visualization appears to sit on the speaker itself. The augmented reality part was built with a ZED Mini pass-through AR camera and an Oculus Rift, and we used Unity for prototyping.

The final result contains two parts:

  1. Five pieces of music generated by an RNN trained on 50+ country music songs, then post-processed in Logic Pro so that they sound more like “country music”.

https://drive.google.com/drive/folders/1WlNfLzXwcq0oeAsKA9FPVY1tByuynujm?usp=sharing

  2. A short video that showcases the visualization of the generated music in augmented reality, providing a richer visual experience.

Reflection:

In this project, we first tried to train the machine learning algorithm on a mix of different music styles because of the limited number of music files we could collect. The results were poor, and the style of the generated music was inconsistent. So I later used a smaller dataset (about 30 MIDI files) drawn from a single category, country music. The final result turned out to be more compelling than the previous ones. We therefore learned that for Midi-rnn, stylistic consistency in the training data matters more for a good result than having a larger dataset that mixes different styles of music.