Learning Knowledge Representation by Aligning Text and Triples via Finetuned Pretrained Language Models

Víctor Jesús Sotelo Chico; Julio Reis

doi:10.5220/0013015100003838

2024

Abstract

Representation learning has produced embeddings for structured and unstructured knowledge that are constructed independently and do not share a vector space. Alignment between text and RDF triples has been explored in natural language generation, from RDF verbalizers to generative models. Existing work has addressed the semantics of these data through unsupervised approaches designed to enable semantic alignment, with accompanying application studies. Existing text-triple datasets are limited and have been applied only to text-to-triple generation rather than to representation learning. This research proposes a supervised approach for representing triples. Our approach fine-tunes an existing pretrained model on triple-text pairs, exploring measures of semantic alignment between the elements of each pair. Our solution employs a data augmentation technique with a contrastive loss to address the dataset limitation. We apply a loss function that requires only positive examples, which suits the explored dataset. Our experimental evaluation measures the effectiveness of the fine-tuned models on two main tasks: 'Semantic Similarity' and 'Information Retrieval'. These tasks assess whether our models can learn triple representations while preserving the semantics learned by the text encoder models. Our contribution paves the way for better embeddings targeting text-triple alignment without requiring large amounts of data, bridging unstructured text and knowledge graph data.
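To make the idea of a contrastive loss that needs only positive pairs more concrete, the following is a minimal sketch, not the authors' implementation. It assumes an in-batch-negatives formulation, in which every other text in the batch serves as a negative for a given triple; the abstract does not name the exact loss. The function names, the `scale` value, the underscore-to-space linearization, and the toy two-dimensional vectors are all hypothetical and stand in for real encoder outputs.

```python
import math


def linearize_triple(subject, predicate, obj):
    """Flatten a (subject, predicate, object) triple into a plain string.

    Assumption: a simple space-joined form with underscores replaced,
    chosen only for illustration.
    """
    parts = (subject, predicate, obj)
    return " ".join(p.replace("_", " ") for p in parts)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(triple_vecs, text_vecs, scale=20.0):
    """Mean cross-entropy over scaled cosine similarities.

    Row i (a triple embedding) should score highest against column i
    (its paired text embedding); the other columns act as negatives,
    so only positive pairs have to be supplied.
    """
    n = len(triple_vecs)
    total = 0.0
    for i in range(n):
        logits = [scale * cosine(triple_vecs[i], text_vecs[j]) for j in range(n)]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum_exp - logits[i]
    return total / n


# Toy embeddings standing in for encoder outputs (hypothetical values).
triple_vecs = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[0.9, 0.1], [0.1, 0.9]]      # text i matches triple i
misaligned_texts = [[0.1, 0.9], [0.9, 0.1]]   # pairs swapped

loss_aligned = in_batch_contrastive_loss(triple_vecs, aligned_texts)
loss_misaligned = in_batch_contrastive_loss(triple_vecs, misaligned_texts)
```

In this sketch the loss is small when each triple embedding lies closest to its own text and large when the pairs are swapped, which is the behavior a fine-tuning objective of this kind would be expected to reward.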

Paper Citation

in Harvard Style

Jesús Sotelo Chico V. and Reis J. (2024). Learning Knowledge Representation by Aligning Text and Triples via Finetuned Pretrained Language Models. In Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD; ISBN 978-989-758-716-0, SciTePress, pages 51-62. DOI: 10.5220/0013015100003838

in Bibtex Style

@conference{keod24,
author={Víctor Jesús Sotelo Chico and Julio Reis},
title={Learning Knowledge Representation by Aligning Text and Triples via Finetuned Pretrained Language Models},
booktitle={Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD},
year={2024},
pages={51-62},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013015100003838},
isbn={978-989-758-716-0},
}

in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD
TI - Learning Knowledge Representation by Aligning Text and Triples via Finetuned Pretrained Language Models
SN - 978-989-758-716-0
AU - Jesús Sotelo Chico V.
AU - Reis J.
PY - 2024
SP - 51
EP - 62
DO - 10.5220/0013015100003838
PB - SciTePress