
Future research should include a broader exploration of hyperparameters, the investigation of hybrid methods in which task-specific information is injected into only a subset of hidden layers, and the application of this method to large-scale models for machine translation and text generation.
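The hybrid scheme mentioned above, where a task-specific signal reaches only some hidden layers, can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the toy layer stack, `control_code` vector, and `INJECT_LAYERS` set are all assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, N_LAYERS = 16, 4
INJECT_LAYERS = {1, 2}  # hypothetical choice: inject into a subset of layers only

# Toy stack of dense layers standing in for a deeper network.
weights = [rng.normal(scale=0.1, size=(HIDDEN, HIDDEN)) for _ in range(N_LAYERS)]

# Hypothetical control-code embedding carrying the task-specific information.
control_code = rng.normal(size=HIDDEN)

def forward(x, inject_layers):
    """Run the stack, adding the control embedding only at the selected layers."""
    h = x
    for i, w in enumerate(weights):
        h = np.tanh(h @ w)
        if i in inject_layers:
            h = h + control_code  # task-specific injection at this layer only
    return h

x = rng.normal(size=HIDDEN)
full = forward(x, set(range(N_LAYERS)))  # inject at every layer
hybrid = forward(x, INJECT_LAYERS)       # hybrid: subset of layers
plain = forward(x, set())                # no injection at all
```

Comparing `full`, `hybrid`, and `plain` shows that restricting the injection set interpolates between fully conditioned and unconditioned behavior, which is the trade-off such a hybrid would have to tune.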
Improving Controlled Text Generation via Neuron-Level Control Codes