2 IMPORTANCE OF DATA
INTEGRATION
Database integration has various benefits for the
organization. Database integration increases the
number of experts viewing and manipulating the
data by increasing the exposure of the data. This
wider exposure helps the organization detect and
correct more errors in the data. The utilization of
data in the workflow also increases, which enhances
data quality and trust and decreases business risk.
The process of integrating a particular category of
data can be thought of as a progression. Initially the
organization has only a single-source version of the
data. Over time more sources become available.
These multi-source versions are brought together
until a single comprehensive system is created that
relates these "variants" of the information and
resolves the inconsistencies (Steve Hawtin et al.,
2003).
Integrated data provides a framework that helps the
organization by delivering a complete view of a
customer and standardizing business processes and
data definitions. Integrated data helps to combine the
current and past values from disparate sources in
order to see the big picture. It also offloads the
processing burden from operational systems while
increasing the effectiveness of the organization's
data access and analysis capabilities (Steve Hawtin
et al., 2003).
3 RELATED WORK
Database integration is often divided into schema
integration, instance integration and application
integration. Schema integration reconciles schema
elements (e.g., entities, relations, attributes,
relationships) of heterogeneous databases (Kim W,
Seo J, 1991). Instance integration matches tuples and
attribute values. Attribute identification is the task of
schema integration that deals with identifying
matching attributes of heterogeneous databases.
Entity identification (Lim E-P et al., 1993) is the task
of instance integration that matches tuples from two
relations based on the values of their key attributes.
Application integration involves storing the data an
application manipulates in ways that other
applications can easily access (Ian Gorton et al.,
2004). Meta-data management has become a
sophisticated endeavour. Nearly all components of
modern information technology, such as Computer
Aided Software Engineering (CASE) tools,
Enterprise Application Integration (EAI)
environments, Extract/Transform/Load (ETL)
engines, warehouses, Enterprise Information
Integration (EII) and Business Intelligence (BI)
tools, contain a great deal of meta-data. Such
meta-data often drives much of a tool’s functionality
(John R Friedrich, 2005).
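Entity identification as described above can be
illustrated with a minimal sketch. It assumes that
attribute identification has already established that
the hypothetical key attributes cust_id and client_no
correspond to each other; the relation contents, the
function names and the normalization rule are
illustrative assumptions, not taken from the cited
work.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def normalize(value: object) -> str:
    """Normalize a key value so trivial formatting differences do not block a match."""
    return str(value).strip().lower()


def identify_entities(left: List[Row], right: List[Row],
                      left_key: str, right_key: str) -> List[Tuple[Row, Row]]:
    """Pair tuples from two relations whose key attribute values agree."""
    # Index the right relation by its normalized key value.
    index: Dict[str, List[Row]] = {}
    for row in right:
        index.setdefault(normalize(row[right_key]), []).append(row)

    # Look up each left tuple in the index and collect the matching pairs.
    matches: List[Tuple[Row, Row]] = []
    for row in left:
        for candidate in index.get(normalize(row[left_key]), []):
            matches.append((row, candidate))
    return matches


# Hypothetical relations from two heterogeneous databases.
customers = [{"cust_id": "C001", "name": "Ada"},
             {"cust_id": "C002", "name": "Grace"}]
clients = [{"client_no": "c001 ", "city": "London"},
           {"client_no": "C003", "city": "Oslo"}]

matched = identify_entities(customers, clients, "cust_id", "client_no")
```

Here only the tuples keyed C001 are paired. Real
entity identification usually has to cope with
missing or conflicting keys, which this sketch does
not attempt.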
Database integration can take many forms. There are
three main forms of integration: Extract Transform
and Load (ETL), Enterprise Application Integration
(EAI), and Enterprise Information Integration (EII).
ETL refers to a process of extracting data from
source systems, transforming the data so it will
integrate properly with data in the other source
systems, and then loading it into the data warehouse.
ETL simplifies the creation, maintenance and
expansion of data warehouses, data marts and
operational data stores. ETL may run in batch, near
real-time or, sometimes, real-time mode (Surajit
Chaudhuri et al., 2004; CoreIntegration, 2004).
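The three ETL steps can be sketched as a small batch
job. This is a minimal illustration, not a production
design: the in-memory SQLite databases stand in for a
source system and a data warehouse, and the table and
column names (orders, fact_orders, amount_cents) are
hypothetical.

```python
import sqlite3


def extract(conn: sqlite3.Connection) -> list:
    """Extract: read the raw rows from the source system."""
    return conn.execute("SELECT id, name, amount_cents FROM orders").fetchall()


def transform(rows: list) -> list:
    """Transform: clean up names and convert cents to currency units."""
    return [(row_id, name.strip().title(), cents / 100.0)
            for row_id, name, cents in rows]


def load(conn: sqlite3.Connection, rows: list) -> None:
    """Load: write the transformed rows into the warehouse table."""
    conn.executemany(
        "INSERT INTO fact_orders (order_id, customer, amount) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


# Hypothetical source system with inconsistently formatted data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, name TEXT, amount_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, "  ada lovelace", 1250), (2, "GRACE HOPPER ", 2000)])

# Hypothetical data warehouse.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders (order_id INTEGER, customer TEXT, amount REAL)")

load(warehouse, transform(extract(source)))
loaded = warehouse.execute(
    "SELECT order_id, customer, amount FROM fact_orders ORDER BY order_id"
).fetchall()
```

Running the three functions on a schedule gives the
batch mode; the same functions could be triggered per
change for near real-time operation.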
Enterprise Application Integration (EAI) combines
separate applications into a cooperating federation of
applications by placing a semantic layer on top of
each application that is part of the EAI
infrastructure. EAI is a business computing term for
plans, methods, and tools aimed at modernizing,
consolidating, and coordinating the overall computer
functionality in an enterprise (Jinyoul Lee, 2003).
Typically, an enterprise has existing legacy
applications and databases, and wants to continue to
use them while adding or migrating to a new set of
applications that takes advantage of Internet and
other new technologies.
Previously, integration of different systems required
rewriting code on the source and target systems,
which in turn consumed much time and money. Unlike
traditional integration, EAI uses special middleware
that serves as a bridge between different applications
for system integration. All applications
communicate using a common interface.
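The bridging role of the middleware can be sketched
as a simple publish/subscribe bus. Everything in this
sketch is a hypothetical illustration: the Middleware
class, the topic name order.created, the two
applications and the dictionary message format are
assumptions, not the interface of any particular EAI
product.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]  # assumed common message format


class Middleware:
    """Minimal message bus: applications talk to it, not to each other."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Message], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[Message], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: Message) -> None:
        for handler in self._subscribers.get(topic, []):
            handler(message)


class BillingApp:
    """Stands in for a legacy application whose code is left unchanged."""

    def __init__(self) -> None:
        self.invoices: List[Tuple[str, float]] = []

    def create_invoice(self, customer_name: str, total: float) -> None:
        self.invoices.append((customer_name, total))


class OrderApp:
    """Stands in for a newer application that only knows the bus."""

    def __init__(self, bus: Middleware) -> None:
        self._bus = bus

    def place_order(self, customer: str, amount: float) -> None:
        self._bus.publish("order.created",
                          {"customer": customer, "amount": amount})


bus = Middleware()
billing = BillingApp()

# Adapter: translates the common message into the legacy application's call.
bus.subscribe("order.created",
              lambda msg: billing.create_invoice(msg["customer"], msg["amount"]))

orders = OrderApp(bus)
orders.place_order("Ada", 12.5)
```

Neither application references the other; adding a
further application means writing one more adapter
rather than changing existing code.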
EII: Enterprise Information Integration
EII provides real-time access to aggregated
information and an infrastructure for integrated
enterprise data management. While the graphical EII
data mapping tools are easy to use and speed the
integration process, the information they capture is
valuable corporate information required for
enterprise data quality management. The metadata
that EII captures to drive data transformations is the
same information required for enterprise information
management (Beth Gold-Bernstein, 2004).
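The idea of metadata-driven, on-demand access can be
sketched as follows. The sketch assumes two
hypothetical sources (crm and billing) and a
hypothetical mapping table that plays the role of the
captured metadata; it queries the sources at request
time instead of copying their data into a warehouse.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Hypothetical mapping metadata: source field name -> unified field name.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "crm": {"cust_name": "customer", "town": "city"},
    "billing": {"client": "customer", "balance_due": "balance"},
}


def crm_source() -> List[Row]:
    """Stands in for a live query against a CRM system."""
    return [{"cust_name": "Ada", "town": "London"}]


def billing_source() -> List[Row]:
    """Stands in for a live query against a billing system."""
    return [{"client": "Ada", "balance_due": 40.0}]


SOURCES: Dict[str, Callable[[], List[Row]]] = {
    "crm": crm_source,
    "billing": billing_source,
}


def federated_view(customer: str) -> Row:
    """Query every source at request time and merge rows under unified names."""
    merged: Row = {}
    for name, fetch in SOURCES.items():
        mapping = MAPPINGS[name]
        for row in fetch():
            # The mapping metadata drives the transformation of each row.
            unified = {mapping[field]: value
                       for field, value in row.items() if field in mapping}
            if unified.get("customer") == customer:
                merged.update(unified)
    return merged


view = federated_view("Ada")
```

The same mapping table that drives the transformation
also documents how each source field relates to the
unified definition, which is the reuse the paragraph
above describes.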
DATABASES AND INFORMATION SYSTEMS INTEGRATION USING CALOPUS : A CASE STUDY