Generating Drake Rap Lyrics using Language Models and LSTMs

阿新 • Published: 2018-12-29

About the Model

Now we are going to talk about the model for text generation. This is really what you are here for; it’s the real sauce, raw sauce. I’m going to start with the model design and some important elements that make lyric generation possible, and then we are going to jump into the implementation.

There are two main approaches to building Language Models: (1) Character-level Models and (2) Word-level models.

The main difference between the two comes from what the inputs and outputs are, and I’m going to explain exactly how each of them works here.

Character-level model

In the case of a character-level model, your input is a series of characters, seed, and your model is responsible for predicting the next character, new_char. Then you use seed + new_char together to generate the next character, and so on. Note that since your network input must always have the same shape, we actually drop the oldest character from the seed on every iteration of this process. Here is a simple visualization:

Fig. 2 Iterative process of word generation with Character-level Language Model
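The iteration in the figure can be sketched in a few lines of Python. This is only a sketch of the sliding window, not the LSTM itself: `predict_next_char` is a hypothetical stand-in for the trained model (it just returns the next letter of the alphabet so the loop runs), and `ALPHABET` assumes lowercase English letters plus a space.

```python
import string

# Assumed alphabet: lowercase English letters plus a space.
ALPHABET = string.ascii_lowercase + " "


def predict_next_char(seed: str) -> str:
    """Hypothetical stand-in for a trained model.

    Returns the character that follows the seed's last character in
    ALPHABET (wrapping around), only so the loop below is runnable.
    """
    idx = ALPHABET.index(seed[-1])
    return ALPHABET[(idx + 1) % len(ALPHABET)]


def generate(seed: str, n_chars: int) -> str:
    """Generate n_chars characters while keeping the model input a fixed length."""
    window = seed   # what the model sees; its length never changes
    output = seed   # everything produced so far
    for _ in range(n_chars):
        new_char = predict_next_char(window)
        output += new_char
        # Slide the window: drop the oldest character, append the new one.
        window = window[1:] + new_char
    return output


print(generate("abc", 4))
```

With a real model, only `predict_next_char` would change; the window handling stays the same.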

At every iteration, the model is basically predicting the next most likely character given the seed characters. In terms of conditional probability, this can be described as finding the new_char that maximizes P(new_char|seed), where new_char is any character from the alphabet. In our case, the alphabet is the set of all English letters plus a space character. (Note that your alphabet can be very different and can contain any characters you want, depending on the language you are building the model for.)
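To make P(new_char|seed) concrete, here is a minimal sketch that estimates it by counting, which is a much simpler technique than the LSTM used in this article. The function names and the toy text are made up for illustration; a neural model would produce the same kind of distribution over the alphabet, just learned differently.

```python
from collections import Counter, defaultdict


def fit_counts(text: str, seed_len: int) -> dict:
    """Count which character follows each seed of length seed_len."""
    counts = defaultdict(Counter)
    for i in range(len(text) - seed_len):
        seed = text[i:i + seed_len]
        counts[seed][text[i + seed_len]] += 1
    return counts


def next_char_distribution(counts: dict, seed: str) -> dict:
    """Estimate P(new_char | seed) from the counts."""
    followers = counts.get(seed, Counter())
    total = sum(followers.values())
    if total == 0:
        raise ValueError(f"seed {seed!r} never appeared in the training text")
    return {ch: n / total for ch, n in followers.items()}


def most_likely_char(counts: dict, seed: str) -> str:
    """Pick the new_char that maximizes P(new_char | seed)."""
    dist = next_char_distribution(counts, seed)
    return max(dist, key=dist.get)


toy_text = "la la la lo la "          # made-up training text
counts = fit_counts(toy_text, seed_len=2)
print(next_char_distribution(counts, " l"))
print(most_likely_char(counts, " l"))
```

In the toy text, the seed " l" is followed by "a" three times and by "o" once, so the estimate for "a" is 0.75 and it is the character picked.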

Word-level model

The word-level model is almost the same as the character-level one, but it generates the next word instead of the next character. Here is a simple example:

Fig. 3 Iterative process of word generation with Word-level Language Model

Now, in this model, we are still looking ahead by one unit, but this time our unit is a word, not a character. So we are looking for the new_word that maximizes P(new_word|seed), where new_word is any word from our vocabulary.
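The same loop at word level might look like the sketch below. Again, a simple count table stands in for the trained network, and the function names and toy sentence are hypothetical. The only real change from the character version is that the seed is a sequence of words rather than a string of characters.

```python
from collections import Counter, defaultdict


def fit_word_counts(words: list, seed_len: int) -> dict:
    """Count which word follows each seed of seed_len consecutive words."""
    counts = defaultdict(Counter)
    for i in range(len(words) - seed_len):
        seed = tuple(words[i:i + seed_len])
        counts[seed][words[i + seed_len]] += 1
    return counts


def generate_words(counts: dict, seed: list, n_words: int) -> list:
    """Greedily append the most likely next word, sliding the seed window."""
    window = tuple(seed)
    output = list(seed)
    for _ in range(n_words):
        candidates = counts.get(window)
        if not candidates:
            break  # this seed was never seen, so there is nothing to predict
        new_word = candidates.most_common(1)[0][0]
        output.append(new_word)
        # Drop the oldest word, append the new one.
        window = window[1:] + (new_word,)
    return output


toy_words = "we read the seed then we pick the next word".split()
word_counts = fit_word_counts(toy_words, seed_len=2)
print(" ".join(generate_words(word_counts, ["we", "read"], 3)))
```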

Notice that now we are searching through a much larger set than before. With the alphabet, we searched through approximately 30 items; now we are searching through many more items at every iteration, so the word-level algorithm is slower per iteration. But since we are generating a whole word instead of a single character, it is actually not that bad. As a final note on the word-level model: we can have a very diverse vocabulary, and we usually build it by finding all unique words in our dataset (usually done in the data preprocessing stage). Since vocabularies can get very large, there are many techniques that improve the efficiency of the algorithm, such as word embeddings, but that is for a later article.
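Building the alphabet and the vocabulary during preprocessing can be sketched as follows. The `normalize` and `build_index` helpers and the sample sentence are assumptions made for this example, not the article's actual preprocessing code. On a full lyrics dataset the vocabulary would be far larger than the alphabet; on a tiny sample like this one the gap is naturally small.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and keep only English letters and spaces."""
    text = re.sub(r"[^a-z ]", " ", text.lower())
    return re.sub(r" +", " ", text).strip()


def build_index(items) -> dict:
    """Map each unique item to an integer id, in sorted order."""
    return {item: i for i, item in enumerate(sorted(set(items)))}


sample = "Started small, and the vocabulary grew. The vocabulary grew fast."
clean = normalize(sample)

char_to_index = build_index(clean)          # iterating a string yields characters
word_to_index = build_index(clean.split())  # splitting on spaces yields words

print(len(char_to_index), "characters in the alphabet")
print(len(word_to_index), "words in the vocabulary")
```

The alphabet can never exceed 27 entries under this normalization, while the vocabulary keeps growing with every new word in the data.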

For the purposes of this article, I’m going to focus on the character-level model because it is simpler to implement, and an understanding of the character-level model can easily be transferred to the more complex word-level model later. As I’m writing this, I have also built a word-level model and will attach a link to it as soon as I’m done with the write-up [here] (or you can follow me to stay updated).
