Step-by-Step Guide to Creating a Drake Lyric Generator with Python and LSTM

Produced by Big Data Digest

Compiled by: Fei, Ni Ni, Mix Candy, QianTian Pei

A major application of AI going forward is building networks that can learn from data and then generate original content. This idea has been applied extensively in Natural Language Processing (NLP), which is how the AI community came to build so-called language models: a language model learns the structure of sentences and paragraphs, and uses that structure to generate new content.

In this article, I want to try generating rap lyrics similar to the style of the popular Canadian rapper Drake (a.k.a. #6god), which is definitely an interesting endeavor.

Additionally, I want to share a general workflow for machine learning projects, since I have found that many students want to build small projects but do not know where to start.

Gathering Data

First, we start by collecting Drake’s song library. To save time, I directly wrote a crawler to scrape lyrics from the website metrolyrics.com.

import re
import urllib.request as urllib2

import pandas as pd
from bs4 import BeautifulSoup
from unidecode import unidecode

quote_page = 'http://metrolyrics.com/{}-lyrics-drake.html'
filename = 'drake-songs.csv'

# the CSV already lists the titles of the songs to scrape
songs = pd.read_csv(filename)

for index, row in songs.iterrows():
    # fetch and parse the lyrics page for this song
    page = urllib2.urlopen(quote_page.format(row['song']))
    soup = BeautifulSoup(page, 'html.parser')
    verses = soup.find_all('p', attrs={'class': 'verse'})
    lyrics = ''
    for verse in verses:
        text = verse.text.strip()
        # drop section markers like [Chorus] and normalize unicode
        text = re.sub(r"\[.*\]\n", "", unidecode(text))
        # join verses and lines with a |-| delimiter
        if lyrics == '':
            lyrics = lyrics + text.replace('\n', '|-|')
        else:
            lyrics = lyrics + '|-|' + text.replace('\n', '|-|')
    # store the scraped lyrics back into the DataFrame
    songs.at[index, 'lyrics'] = lyrics
    print('saving {}'.format(row['song']))

print(songs.head())
print('writing to .csv')
songs.to_csv(filename, sep=',', encoding='utf-8')

I used BeautifulSoup, a Python package most people are familiar with, to scrape the page, following a tutorial by Justin Yek that took me only about five minutes to pick up. To clarify, the code above loops over `songs`, a DataFrame I had defined beforehand with the titles of the songs I wanted to fetch.

Tutorial:

https://medium.freecodecamp.org/how-to-scrape-websites-with-python-and-beautifulsoup-5946935d93fe


Figure 1. All song lyrics stored in a DataFrame

After running the crawler, I obtained a csv file that stores the lyrics in an appropriate structure. The next step is to preprocess the data and build the model.

Model Introduction

Now let’s see how the model generates text. This part is crucial to understand because it contains the real insights. I will start with the model design and key components in the lyric generation model, after which we can directly move into the implementation phase.

There are mainly two methods for building a language model:

1. Character-level model,

2. Word-level model.

The main difference between the two lies in the input and output of the model. Next, I will explain how both models work in detail.

Character-level Model

In a character-level model, the input is a sequence of characters (the seed), and the model predicts the next character; the seed plus the new character then serves as the input for generating the character after that, and so on. Note that since the input length must stay fixed, we drop the oldest character from the seed at each iteration. A simple intuitive example:


Figure 2. Iterative process of generating words in the character-level model

During each iteration, the model predicts the most likely next character given the seed; in other words, it maximizes the conditional probability P(new_char|seed), where new_char is any character in the alphabet.

In this example, the character set refers to the collection of all English letters and space symbols. (Note that the alphabet can include different letters depending on your needs, mainly depending on the type of language you are generating).
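To make the loop concrete, here is a minimal sketch of the generate-and-slide process. The deterministic `predict_next_char` is just a stand-in for the trained model (it cycles through the alphabet), so only the shape of the loop matters here, not the output quality.

```python
# Toy sketch of the character-level generation loop: a stand-in
# predictor (not the trained LSTM) picks the next character, and the
# seed window slides forward by one character each iteration.
alphabet = 'abcdefghijklmnopqrstuvwxyz '

def predict_next_char(seed):
    # Placeholder for argmax P(new_char | seed): here we simply
    # return the character after the seed's last one, so the loop
    # stays deterministic and testable.
    return alphabet[(alphabet.index(seed[-1]) + 1) % len(alphabet)]

def generate(seed, n_chars, window=5):
    text = seed
    for _ in range(n_chars):
        # predict from the most recent `window` characters only,
        # mirroring the fixed-length input of the real model
        new_char = predict_next_char(text[-window:])
        text += new_char
    return text

print(generate('hello', 3))  # -> 'hellopqr' with this toy predictor
```

In the real model, `predict_next_char` would be replaced by a forward pass through the trained LSTM.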

Word-level Model

The word-level model is very similar to the character-level model, but it generates the next word instead of a character. Here’s a simple example to illustrate this:


Figure 3. Iterative process of generating vocabulary in the word-level model

In this model, we look for the next word rather than the next character. Therefore, we want to maximize P(new_word|seed), where new_word is any word in the vocabulary.

It is important to note that the search space here is much larger than in the character-level model. In the character-level model we only choose among roughly 30 characters, while in the word-level model each iteration searches over the entire vocabulary, which is far larger, so each iteration runs more slowly. However, since we generate a whole word at a time rather than a single character, the trade-off is not too bad.

One more point about the word-level model: we can make the generated text more diverse by extracting the unique vocabulary from the dataset (this is usually done during the data preprocessing phase). And since a vocabulary can grow very large, there are many techniques, such as word embeddings, that improve the quality of the generated words, but that topic deserves its own article.
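As a minimal illustration of that preprocessing step, on a toy corpus rather than the real lyrics:

```python
# Toy sketch of extracting the unique vocabulary that defines the
# word-level model's search space for P(new_word | seed).
corpus = "tryna keep it simple is a struggle for me tryna keep it simple"
tokens = corpus.split()        # word-level tokenization
vocab = sorted(set(tokens))    # unique words only
print(len(tokens), len(vocab))  # 13 tokens, 9 unique words
```

The model then predicts an index into `vocab` at each step, instead of an index into the alphabet.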

This article mainly focuses on character-level models because they are easier to implement and understand, and they can be more easily transformed into complex word-level models.

Data Preprocessing

For the character-level model, we will follow these steps for data preprocessing:

1. Character Tokenization

For the character-level model, the input should be based on characters rather than strings. So, we first need to turn each line of lyrics into a collection of characters.

2. Define Character Set

In the previous step, we obtained all possible characters that may appear in the lyrics. Next, we need to identify all unique characters. Since the entire dataset is not large (only 140 songs), for simplicity, I only retained all English letters and some special symbols (such as spaces), ignoring numbers and other information (because the dataset is small, I prefer the model to predict fewer characters).

3. Create Training Sequences

Here we will use the concept of a sliding window. By sliding a fixed-length window along the sentence, we will establish the data sequences for training. The following image illustrates the sliding window operation well:


Figure 4. Obtaining input/output with a sliding window

By shifting one character each time, we obtain model inputs of length 20 characters and model outputs of length 1 character. The additional benefit of shifting just one space is that it greatly expands the size of the dataset.

4. Label Encoding Training Sequences

Finally, we do not want to directly handle raw characters (even though theoretically, each character is a number, so you could say ASCII code has already encoded each character for us). What we want to do is to correspond unique numbers to each character; this step is called label encoding. At the same time, we need to establish two very important mappings: character-to-index and index-to-character. With these two mappings, we can encode any character from the alphabet into the corresponding number, and similarly, decode the numerical index output by the model back into the corresponding character.

5. One-Hot Encoding of the Dataset

Because we are dealing with categorical data, where every character belongs to a distinct category, we one-hot encode the characters: each character becomes a vector of zeros with a single 1 at that character's index.

When we complete the above five steps, we are basically done. The next step is to build and train the model. If you want to dive deeper into the details, the full code for these five steps is available in the GitHub repository linked at the end of this article.
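A rough, self-contained sketch of the five steps on a toy string (the real code would run over the full lyrics from drake-songs.csv; the names `maxlen`, `chars`, `x`, and `y` mirror the model code later in the article):

```python
import numpy as np

# stand-in for the joined lyrics corpus
text = "tryna keep it simple is a struggle for me"
maxlen = 20  # input window length, as in the article

# step 2: define the character set (unique characters in the corpus)
chars = sorted(set(text))

# step 4: label encoding, with both directions of the mapping
char_to_idx = {c: i for i, c in enumerate(chars)}
idx_to_char = {i: c for i, c in enumerate(chars)}

# step 3: sliding window, shifting one character at a time,
# giving inputs of 20 characters and outputs of 1 character
sequences, next_chars = [], []
for i in range(len(text) - maxlen):
    sequences.append(text[i:i + maxlen])
    next_chars.append(text[i + maxlen])

# step 5: one-hot encode inputs and outputs
x = np.zeros((len(sequences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sequences), len(chars)), dtype=bool)
for i, seq in enumerate(sequences):
    for t, c in enumerate(seq):
        x[i, t, char_to_idx[c]] = 1
    y[i, char_to_idx[next_chars[i]]] = 1

print(x.shape, y.shape)
```

Step 1 (character tokenization) is implicit here because Python strings already iterate character by character; with the real dataset, the lyrics lines would be joined into `text` first.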


Build the Model

We will use a Recurrent Neural Network (RNN), more specifically a Long Short-Term Memory network (LSTM), based on the character set mentioned earlier to predict the next character. If these two concepts sound unfamiliar to you, I also provide a quick review of the related concepts:

Quick Review of RNN

Typically, the networks you have seen are feed-forward: many input points converge toward a single output, as shown in the figure below:


Figure 5. Neural Network Diagram

Here, the neural network has a fixed set of inputs and a single output. It suits cases where the inputs are independent of one another, because the order of the inputs does not affect the result. In our case, however, the order of the input characters matters a great deal, since the order is what forms the words.

An RNN solves the ordering problem by accepting sequential inputs and feeding each step's output back in as part of the input to the next step.


Figure 6. Simple RNN Diagram

For example, given the sequence Tryna_keep_it_simple, the next character to be produced should be _. This is exactly what we want the neural network to do. The network's input will be T -> x<1>, r -> x<2>, y -> x<3>, ..., e -> x<n>, and the corresponding output will be a space: y -> _.

Quick Review of LSTM

Simple RNNs still have a weakness: they are poor at carrying information from early cells to much later ones. For example, in the sentence Tryna keep it simple is a struggle for me, the last word me is hard to predict accurately without looking back at the words that appeared much earlier (otherwise it might come out as Baka, cat, potato, and so on).

LSTM can solve this problem well; it stores some information about past events (i.e., the words that appeared earlier) in each cell. As shown in the following image:


Figure 7. LSTM Diagram, excerpt from Andrew Ng’s Deep Learning Course

An LSTM cell not only passes along the previous cell's output a<n>, it also carries a cell state c<n> that accumulates information from earlier cells as part of the input to the next cell. This lets the LSTM retain context better, which makes it well suited to language-modeling predictions.

Programming the Model

I have learned a bit about Keras before, so this time I will use Keras as the framework to program the model. In fact, you could also choose to build your own model framework, but that would take more time.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation
from keras.optimizers import RMSprop

# create sequential network, because we are passing activations
# down the network
model = Sequential()
# add LSTM layer
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
# add Softmax layer to output one character
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
# compile the model and pick the loss and optimizer
model.compile(loss='categorical_crossentropy', optimizer=RMSprop(lr=0.01))
# train the model
model.fit(x, y, batch_size=128, epochs=30)

As seen above, we built the LSTM model and trained it in batches, feeding in subsets of the data at a time instead of everything at once, which keeps training tractable and a bit faster.

Generating Lyrics

After training the model, the next step is to generate the next character. We first need to use a simple string input from the user as a random seed. Then, we use the seed as input to the network to predict the next character, repeating this process until we generate some new lyrics, similar to what is shown in Figure 2.
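That loop can be sketched as follows. The `sample` helper is the standard temperature-sampling trick used in character-level text generation; `predict_probs` is a stub standing in for the trained model's `model.predict`, so this runs on its own but the real version would call the LSTM:

```python
import numpy as np

# mappings as built during preprocessing (toy character set here)
chars = sorted('abcdefghijklmnopqrstuvwxyz ')
char_to_idx = {c: i for i, c in enumerate(chars)}
idx_to_char = {i: c for i, c in enumerate(chars)}
maxlen = 20

def predict_probs(seed):
    # Stand-in for model.predict: always favours the space character.
    probs = np.full(len(chars), 0.001)
    probs[char_to_idx[' ']] = 1.0
    return probs / probs.sum()

def sample(probs, temperature=1.0):
    # Re-weight the distribution and draw one index; lower
    # temperature makes the output more conservative.
    logs = np.log(probs) / temperature
    exp = np.exp(logs)
    return int(np.argmax(np.random.multinomial(1, exp / exp.sum())))

def generate_lyrics(seed, n_chars):
    text = seed
    for _ in range(n_chars):
        # predict from the last maxlen characters, append, repeat
        probs = predict_probs(text[-maxlen:])
        text += idx_to_char[sample(probs, temperature=0.2)]
    return text

out = generate_lyrics('started from the bottom', 5)
print(out)
```

With the real model, `predict_probs` would one-hot encode the seed window and run it through the network, exactly as in training.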

Here are some examples of generated lyrics.

Note: These lyrics have not been reviewed or filtered; read with discretion.


You may notice that some of the generated words are nonsensical, which is a common issue with character-level models. This is because the input sequences are often cut off in the middle of words, causing the neural network model to learn and generate new words that are meaningful in relation to its input, but seem strange to us.

This is also a problem that can be solved in word-level models, but for a model built with only 200 lines of code, the results achieved by the character-level model are still impressive.

Other Applications

The lyric prediction function of the character-level model demonstrated here can be extended to other more useful applications.

For example, the same principle can be used to predict the next word to be input on an iPhone keyboard.


Figure 8. Keyboard input predicting the next word

Imagine if you build a highly accurate Python language model that can not only automatically fill in keywords or variable names but also fill in large chunks of code. How much time would that save for programmers!

You may have also noticed that the code in the article is not complete and is missing some parts. Please visit my GitHub for more details and learn how to build your own project model.

GitHub link:

https://github.com/nikolaevra/drake-lyric-generator

Related report:

https://towardsdatascience.com/generating-drake-rap-lyrics-using-language-models-and-lstms-8725d71b1b12
