Dynamic Indexing for Incremental Entity Resolution in Data Integration Systems

Priscilla Kelly M. Vieira; Bernadette Farias Lóscio; Ana Carolina Salgado

Topics: Coupling and Integrating Heterogeneous Data Sources; Large Scale Databases; Organisational Issues on Systems Integration

In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS, 185-192, 2017, Porto, Portugal

Authors: Priscilla Kelly M. Vieira ¹; Bernadette Farias Lóscio ² and Ana Carolina Salgado ²

Affiliations: ¹ Federal University of Pernambuco and Federal Rural University of Pernambuco, Brazil; ² Federal University of Pernambuco, Brazil

Keyword(s): Data Integration, Entity Resolution, Data Matching, Duplicate Detection, Indexing.

Related Ontology Subjects/Areas/Topics: Coupling and Integrating Heterogeneous Data Sources; Data Engineering; Databases and Data Security; Databases and Information Systems Integration; Enterprise Information Systems; Large Scale Databases; Organisational Issues on Systems Integration

Abstract: Entity Resolution (ER) is the problem of identifying groups of tuples from one or multiple data sources that represent the same real-world entity. It is a crucial stage of data integration processes, which often need to integrate data at query time. The task becomes even more challenging in scenarios with dynamic data sources or large volumes of data. Because most ER techniques process all tuples at once, new solutions have been proposed to cope with large volumes of data. One possible approach consists of performing the ER process on query results rather than on the whole data set. It is also possible to reuse the results of previous ER tasks to reduce the number of comparisons between pairs of tuples at query time. Similarly, indexing techniques can help identify equivalent tuples and reduce the number of pairwise comparisons. In this context, this work proposes an indexing technique for incremental Entity Resolution processes. The expected contributions of this work are the specification, implementation, and evaluation of the proposed indexes. We performed experiments that measured the time spent storing, accessing, and updating the indexes. We concluded that reusing previous results makes the ER process more efficient than reprocessing the tuple comparisons, with similar quality of results.
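The reuse idea in the abstract can be illustrated with a minimal Python sketch. It is not the indexing technique proposed in the paper, whose details are not given here. It is a simple stand-in that caches pairwise match decisions so that later query results skip comparisons already made. The class name `ComparisonIndex`, its methods, the integer tuple identifiers, and the `same_entity` matcher are all hypothetical and introduced only for illustration.

```python
from itertools import combinations


class ComparisonIndex:
    """Hypothetical index caching pairwise match decisions between tuple ids."""

    def __init__(self):
        self.decisions = {}   # (id_a, id_b) with id_a < id_b -> bool
        self.comparisons = 0  # matcher calls actually executed

    def resolve(self, result, same_entity):
        """Resolve one query result, given as a dict {tuple_id: record}.

        Pairs already stored in the index are reused; only unseen pairs
        are passed to the (assumed) matcher `same_entity`.
        """
        matches = []
        for a, b in combinations(sorted(result), 2):
            key = (a, b)
            if key not in self.decisions:
                self.decisions[key] = same_entity(result[a], result[b])
                self.comparisons += 1
            if self.decisions[key]:
                matches.append(key)
        return matches

    def invalidate(self, tuple_id):
        """Drop cached decisions for a tuple whose source data changed."""
        self.decisions = {
            key: value
            for key, value in self.decisions.items()
            if tuple_id not in key
        }


def same_name(r1, r2):
    # Toy matcher (assumption): case-insensitive equality on the name field.
    return r1["name"].lower() == r2["name"].lower()


records = {
    1: {"name": "Ana Salgado"},
    2: {"name": "ana salgado"},
    3: {"name": "B. Loscio"},
}

index = ComparisonIndex()
first = index.resolve({1: records[1], 2: records[2]}, same_name)
second = index.resolve(records, same_name)  # pair (1, 2) is reused
```

In this sketch the second query performs only the two comparisons that involve tuple 3, and `invalidate` stands in for the index update that a dynamic data source would trigger when a tuple changes.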

CC BY-NC-ND 4.0

Paper citation in several formats:

Vieira, P. K. M., Lóscio, B. F. and Salgado, A. C. (2017). Dynamic Indexing for Incremental Entity Resolution in Data Integration Systems. In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS; ISBN 978-989-758-247-9; ISSN 2184-4992, SciTePress, pages 185-192. DOI: 10.5220/0006251801850192

@conference{iceis17,
author={Priscilla Kelly M. Vieira and Bernadette Farias Lóscio and Ana Carolina Salgado},
title={Dynamic Indexing for Incremental Entity Resolution in Data Integration Systems},
booktitle={Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2017},
pages={185-192},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006251801850192},
isbn={978-989-758-247-9},
issn={2184-4992},
}

TY - CONF

JO - Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Dynamic Indexing for Incremental Entity Resolution in Data Integration Systems
SN - 978-989-758-247-9
IS - 2184-4992
AU - Vieira, P.
AU - Lóscio, B.
AU - Salgado, A.
PY - 2017
SP - 185
EP - 192
DO - 10.5220/0006251801850192
PB - SciTePress