transformer-based language models have achieved state-of-the-art
results on many NLP tasks. Incorporating the weak signals into
those methods may further increase performance.