whereas a lower score signals better predictive
capability. From Table 1, I observe a steep decline
in perplexity from 107.4 in the first epoch to 2.2 by
the sixth epoch. This dramatic decrease signifies that
the model has become significantly better at
predicting the next word in a sequence, representing
a substantial leap in learning from the data. This
improvement in perplexity scores correlates with the
qualitative improvements seen in the generated text
samples. Initially, the model produces text that is less relevant and more random, as reflected in its higher perplexity; as training proceeds and the
perplexity decreases, the output becomes more
coherent and contextually appropriate. This is a
typical observation in LSTM models, as they are
well-suited to capture and utilize the long-term
dependencies within the text data, which is crucial for
generating meaningful language sequences. The BLEU scores of both the LSTM and Markov chain models are relatively low, since both models are still at the stage of generating consecutive words rather than full sentences. However, the words in a sentence generated by the LSTM model fit readily into a common context, while the relationships among the words generated by the Markov chain model are comparatively weak.
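For clarity, the perplexity reported in Table 1 is the standard exponentiated average negative log-likelihood over a sequence of N tokens:

    PPL = exp( -(1/N) * sum_{i=1}^{N} log p(w_i | w_1, ..., w_{i-1}) ),

so the drop from 107.4 to 2.2 means the model assigns far higher probability, on average, to the correct next word. As an illustration of how the BLEU comparison can be computed, the following minimal Python sketch scores a generated line against a reference line using NLTK's sentence-level BLEU; the lyric strings are hypothetical placeholders, not actual samples from my models:

    # Minimal sketch: sentence-level BLEU with NLTK.
    # The lyric strings below are hypothetical placeholders.
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    reference = [list("我在城市的夜里追逐梦想")]  # reference lyric, one character per token
    candidate = list("我在夜里的城市追逐梦想")    # generated lyric, one character per token

    # Smoothing avoids zero scores when higher-order n-grams do not overlap,
    # which is common for short, word-by-word generations.
    smooth = SmoothingFunction().method1
    score = sentence_bleu(reference, candidate, weights=(0.5, 0.5),
                          smoothing_function=smooth)
    print(f"BLEU-2: {score:.3f}")

Character-level tokenization is used here because Chinese text has no whitespace word boundaries; a word segmenter such as jieba could be substituted.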
In summary, the LSTM model has demonstrated
a promising ability to learn from the corpus of
Chinese rap lyrics. The generated text samples,
although limited, suggest that the model is capturing
the nuances of the language and the style of the genre.
Meanwhile, the quantitative reduction in perplexity
offers a concrete measure of the model’s evolving
competence. Together, these outcomes underscore
the LSTM’s potential in natural language generation
tasks and its effectiveness in modeling complex
language patterns.
5 CONCLUSION
In this paper, I have successfully developed and
trained a Long Short-Term Memory (LSTM)
network-based language model for generating
Chinese rap lyrics that exhibit thematic coherence and
logical structure. Evaluating the model’s performance
across various training epochs, I observed a
significant enhancement in its predictive capabilities,
evidenced by both the improved quality of generated
text samples and a marked reduction in perplexity.
The initial generated text may have lacked coherence
and logic, but with continued training, the quality of
the text substantially improved. Perplexity dropped
from 107.4 in the first training epoch to 2.2 by the
sixth epoch, indicating a substantial increase in the
model’s effectiveness in learning from the data. The
conclusion drawn is that LSTM models are highly
suitable for processing and generating complex
language patterns, especially in natural language
generation tasks that require an understanding of
long-term dependencies. My model demonstrated the
potential to capture the unique rhythm and style of
Chinese rap lyrics and progressively learned to
generate new, creative lyrical content throughout the
training process. For future work, I plan to extend and
deepen my efforts in several areas:
(1) Model Structure Optimization. Although the
current LSTM model has shown promising
performance, I believe that deeper neural network
architectures or the introduction of more advanced
models, such as Transformers or BERT, could further
improve the quality of text generation.
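As a rough illustration of direction (1), the following PyTorch sketch shows how a small Transformer encoder with a causal mask could replace the recurrent core; all sizes (vocabulary, dimensions, layer counts) are illustrative assumptions, not my trained configuration:

    # Minimal sketch: a small causal Transformer language model in PyTorch.
    # All sizes (vocab_size, d_model, nhead, num_layers) are illustrative assumptions.
    import torch
    import torch.nn as nn

    class TransformerLyricLM(nn.Module):
        def __init__(self, vocab_size=8000, d_model=256, nhead=4,
                     num_layers=4, max_len=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.pos = nn.Embedding(max_len, d_model)  # learned positional embeddings
            layer = nn.TransformerEncoderLayer(d_model, nhead,
                                               dim_feedforward=4 * d_model,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers)
            self.out = nn.Linear(d_model, vocab_size)

        def forward(self, tokens):  # tokens: (batch, seq_len) of character ids
            seq_len = tokens.size(1)
            positions = torch.arange(seq_len, device=tokens.device)
            x = self.embed(tokens) + self.pos(positions)
            # Causal mask: each position may attend only to earlier positions.
            mask = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                         device=tokens.device), diagonal=1)
            h = self.encoder(x, mask=mask)
            return self.out(h)  # next-character logits

    model = TransformerLyricLM()
    logits = model(torch.randint(0, 8000, (2, 16)))  # shape: (2, 16, 8000)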
(2) Hyperparameter Tuning. I will explore a
broader hyperparameter space to find a better-performing model configuration. Additionally,
considering the significant impact of different
embedding dimensions on model performance, I aim
to employ automated hyperparameter search
methods, like Bayesian optimization, to determine the
optimal settings.
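A minimal sketch of such a search with the Optuna library follows; train_and_eval_perplexity is a hypothetical stand-in for my actual training loop, and the search ranges are assumptions:

    # Minimal sketch: Bayesian-style hyperparameter search with Optuna
    # (its default TPE sampler). Search ranges are illustrative.
    import optuna

    def train_and_eval_perplexity(embed_dim, hidden_dim, lr):
        """Hypothetical stand-in: train the LSTM with these settings and
        return validation perplexity."""
        raise NotImplementedError  # replace with the real training loop

    def objective(trial):
        embed_dim = trial.suggest_categorical("embed_dim", [64, 128, 256, 512])
        hidden_dim = trial.suggest_int("hidden_dim", 128, 1024, log=True)
        lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
        return train_and_eval_perplexity(embed_dim, hidden_dim, lr)  # lower is better

    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=50)
    print(study.best_params)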
(3) Dataset Expansion. To enhance the model’s
robustness and generalization ability, I plan to collect
and integrate a more diverse set of Chinese rap lyrics
data. Moreover, incorporating other forms of Chinese
textual data may help the model learn richer language
patterns.
(4) Creativity Assessment. I will develop new
metrics to quantify the creativity and diversity of the
lyrics generated by the model. While the perplexity metric focuses on prediction accuracy, I
hope to more comprehensively assess the quality of
generated text in the future.
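One concrete candidate for such a metric is the distinct-n ratio, the fraction of unique n-grams among all n-grams in the generated text, which is commonly used to quantify lexical diversity. A minimal sketch follows, with the sample string as a placeholder:

    # Minimal sketch: distinct-n, a simple lexical-diversity metric.
    def distinct_n(tokens, n):
        """Fraction of n-grams in `tokens` that are unique (higher = more diverse)."""
        if len(tokens) < n:
            return 0.0
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        return len(set(ngrams)) / len(ngrams)

    sample = list("我的梦想在远方我的梦想在心中")  # placeholder lyric, character tokens
    print(distinct_n(sample, 1), distinct_n(sample, 2))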
(5) Interactive Generation Tools. Ultimately, I
aim to develop an interactive platform that allows
users to input specific themes or keywords and have
the model generate corresponding lyrics. This will
make the model more engaging and practical for real-
world applications. Through continued research and improvement, I believe that LSTM models and other deep learning technologies will bring revolutionary progress to natural language processing, particularly natural language generation.