ENTERPRISE INFORMATION SEARCH SYSTEMS FOR HETEROGENEOUS CONTENT REPOSITORIES

Trieu C. Chieu, Shyh-Kwei Chen, Shiwa S. Fu

2007

Abstract

In large enterprises, business documents are typically stored in disparate, autonomous content repositories in a variety of formats. Efficient search and retrieval mechanisms are needed to cope with the heterogeneity and complexity of this environment. This paper presents a general architecture and two industrial implementations of a service-based information system that searches Lotus Notes databases and data sources with Web service interfaces. The first implementation is based on a federated database system that maps the various source schemas into a common interface and aggregates information from its native locations. It offers scalability and access to real-time information. The second implementation is based on a single-index, enterprise-scale search engine that crawls, parses and indexes document contents from the sources. It can score documents for relevance ranking and eliminate duplicates from search results. The relative merits and limitations of both implementations are discussed.
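
The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name in it (Document, notes_adapter, web_service_adapter, federated_search, SingleIndex) is hypothetical, the record field names are invented, and substring matching and term-frequency scoring are simple stand-ins for whatever the real federated database and search engine do. The federated function queries each source in place through a schema-mapping adapter, while the index class ingests documents once, drops exact duplicate bodies, and ranks hits.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]


@dataclass(frozen=True)
class Document:
    """Common interface that every source schema is mapped onto."""
    source: str
    title: str
    body: str


# Hypothetical adapters; the native field names are assumptions.
def notes_adapter(record: Record) -> Document:
    return Document("notes", record["Subject"], record["Body"])


def web_service_adapter(record: Record) -> Document:
    return Document("ws", record["name"], record["text"])


Source = Tuple[Callable[[], Iterable[Record]], Callable[[Record], Document]]


def federated_search(query: str, sources: List[Source]) -> List[Document]:
    """Read each source at request time and merge the matching documents."""
    hits = []
    for fetch, adapt in sources:
        for record in fetch():
            doc = adapt(record)
            if query.lower() in doc.body.lower():
                hits.append(doc)
    return hits


class SingleIndex:
    """Ingest documents once, then answer queries from an inverted index."""

    def __init__(self) -> None:
        self.docs: List[Document] = []
        self.postings: Dict[str, Dict[int, int]] = {}  # term -> {doc_id: count}
        self.seen: set = set()

    def add(self, doc: Document) -> None:
        key = doc.body.strip().lower()
        if key in self.seen:  # skip exact duplicate bodies
            return
        self.seen.add(key)
        doc_id = len(self.docs)
        self.docs.append(doc)
        for term in key.split():
            counts = self.postings.setdefault(term, {})
            counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, query: str) -> List[Document]:
        """Rank documents by summed term frequency, highest first."""
        scores: Dict[int, int] = {}
        for term in query.lower().split():
            for doc_id, tf in self.postings.get(term, {}).items():
                scores[doc_id] = scores.get(doc_id, 0) + tf
        ranked = sorted(scores, key=lambda d: (-scores[d], d))
        return [self.docs[d] for d in ranked]
```

In this sketch the federated path always sees the current contents of each source but returns unranked hits with duplicates intact, whereas the index path returns ranked, deduplicated hits that are only as fresh as the last ingestion.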



Paper Citation


in Harvard Style

Chieu, T. C., Chen, S.-K. and Fu, S. S. (2007). ENTERPRISE INFORMATION SEARCH SYSTEMS FOR HETEROGENEOUS CONTENT REPOSITORIES. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-972-8865-88-7, pages 365-371. DOI: 10.5220/0002349603650371


in Bibtex Style

@conference{iceis07,
author={Trieu C. Chieu and Shyh-Kwei Chen and Shiwa S. Fu},
title={ENTERPRISE INFORMATION SEARCH SYSTEMS FOR HETEROGENEOUS CONTENT REPOSITORIES},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2007},
pages={365-371},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002349603650371},
isbn={978-972-8865-88-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - ENTERPRISE INFORMATION SEARCH SYSTEMS FOR HETEROGENEOUS CONTENT REPOSITORIES
SN - 978-972-8865-88-7
AU - Chieu, Trieu C.
AU - Chen, Shyh-Kwei
AU - Fu, Shiwa S.
PY - 2007
SP - 365
EP - 371
DO - 10.5220/0002349603650371