the proposed framework with a generative adversarial
network for synthesizing high-quality sentence candidates.
ACKNOWLEDGEMENTS
The authors would like to thank the Ministry of
Education, Culture, Sports, Science and Technology
(MEXT) of Japan for providing the Japanese
Government (Monbukagakusho) Scholarship under which
this work was carried out. This work was also
supported in part by the Asian Office of Aerospace
Research and Development (AOARD), Air Force
Office of Scientific Research (Grant no. FA2386-19-1-4041).
REFERENCES
Ding, C., Thu, Y. K., Utiyama, M., and Sumita, E. (2016).
Word segmentation for Burmese (Myanmar). ACM
Transactions on Asian and Low-Resource Language
Information Processing (TALLIP), 15(4):1–10.
Du, J. and Way, A. (2017). Neural pre-translation for hybrid
machine translation.
Grégoire, F. and Langlais, P. (2018). Extracting parallel
sentences with bidirectional recurrent neural networks to
improve machine translation. In Proceedings of the
27th International Conference on Computational
Linguistics, pages 1442–1453.
Gu, J., Lu, Z., Li, H., and Li, V. O. (2016). Incorporating
copying mechanism in sequence-to-sequence
learning. arXiv preprint arXiv:1603.06393.
Hangya, V. and Fraser, A. (2019). Unsupervised parallel
sentence extraction with parallel segment detection
helps machine translation. In Proceedings of the 57th
Annual Meeting of the Association for Computational
Linguistics, pages 1224–1234.
Heafield, K., Pouzyrevsky, I., Clark, J. H., and Koehn,
P. (2013). Scalable modified Kneser-Ney language
model estimation. In Proceedings of the 51st Annual
Meeting of the Association for Computational
Linguistics (Volume 2: Short Papers), pages 690–696.
Klein, G., Kim, Y., Deng, Y., Nguyen, V., Senellart, J., and
Rush, A. M. (2018). OpenNMT: Neural machine
translation toolkit. arXiv preprint arXiv:1805.11462.
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C.,
Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran,
C., Zens, R., et al. (2007). Moses: Open source toolkit
for statistical machine translation. In Proceedings of
the 45th Annual Meeting of the ACL on Interactive
Poster and Demonstration Sessions, pages 177–180.
Association for Computational Linguistics.
Kudo, T. and Richardson, J. (2018). SentencePiece: A
simple and language independent subword tokenizer and
detokenizer for neural text processing. arXiv preprint
arXiv:1808.06226.
Luong, M.-T., Pham, H., and Manning, C. D. (2015).
Effective approaches to attention-based neural machine
translation. arXiv preprint arXiv:1508.04025.
Och, F. J. and Ney, H. (2003). A systematic comparison of
various statistical alignment models. Computational
Linguistics, 29(1):19–51.
Oo, Y. and Soe, K. M. (2019). Applying RNNs
architecture by jointly learning segmentation and stemming
for Myanmar language. In 2019 IEEE 8th Global
Conference on Consumer Electronics (GCCE), pages
391–393. IEEE.
Pa, W. P. and Thein, N. L. (2008). Myanmar word
segmentation using hybrid approach. In Proceedings of 6th
International Conference on Computer Applications,
Yangon, Myanmar, pages 166–170.
Phyu, M. L. and Hashimoto, K. (2017). Burmese word
segmentation with character clustering and CRFs. In
2017 14th International Joint Conference on
Computer Science and Software Engineering (JCSSE),
pages 1–6. IEEE.
Reimers, N. and Gurevych, I. (2019). Sentence-BERT:
Sentence embeddings using Siamese BERT-networks. arXiv
preprint arXiv:1908.10084.
Reimers, N. and Gurevych, I. (2020). Making
monolingual sentence embeddings multilingual using
knowledge distillation. arXiv preprint arXiv:2004.09813.
Riza, H., Purwoadi, M., Uliniansyah, T., Ti, A. A.,
Aljunied, S. M., Mai, L. C., Thang, V. T., Thai, N. P.,
Chea, V., Sam, S., et al. (2016). Introduction of the
Asian language treebank. In 2016 Conference of The
Oriental Chapter of International Committee for
Co-ordination and Standardization of Speech Databases
and Assessment Techniques (O-COCOSDA), pages 1–6.
IEEE.
Schroff, F., Kalenichenko, D., and Philbin, J. (2015).
FaceNet: A unified embedding for face recognition
and clustering. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, pages
815–823.
Sennrich, R., Haddow, B., and Birch, A. (2016a).
Improving neural machine translation models with
monolingual data. In Proceedings of the 54th Annual Meeting
of the Association for Computational Linguistics
(Volume 1: Long Papers), pages 86–96, Berlin, Germany.
Association for Computational Linguistics.
Sennrich, R., Haddow, B., and Birch, A. (2016b).
Neural machine translation of rare words with subword
units. In Proceedings of the 54th Annual Meeting of
the Association for Computational Linguistics
(Volume 1: Long Papers), pages 1715–1725, Berlin,
Germany. Association for Computational Linguistics.
Teahan, W. J., Wen, Y., McNab, R., and Witten, I. H.
(2000). A compression-based algorithm for
Chinese word segmentation. Computational Linguistics,
26(3):375–393.
Xu, G., Ko, Y., and Seo, J. (2019). Improving neural
machine translation by filtering synthetic parallel data.
Entropy, 21(12):1213.
Zhao, H., Utiyama, M., Sumita, E., and Lu, B.-L. (2013).
An empirical study on word segmentation for
Chinese machine translation. In International Conference
on Intelligent Text Processing and Computational
Linguistics, pages 248–263. Springer.
ICAART 2021 - 13th International Conference on Agents and Artificial Intelligence