On the Generation of Dynamic Business Indicators

Fábio Alexandre Pereira dos Santos, Rui César das Neves and Joaquim Belo Filipe

Instituto Politécnico de Setúbal, Escola Superior de Tecnologia de Setúbal, Setúbal, Portugal

Keywords: Dynamic Generation of Data, Business Intelligence, OLAP, Data Representation, Infographics, Entity Framework.

Abstract: While information is rapidly gaining relevance for organizations, the systems that help companies analyse that information need to become more effective at several layers. Our ongoing research addresses mainly the presentation layer and the business layer of software systems development. The aim of this position paper is to discuss how to develop a system that is as flexible and configurable as possible and that supports multiple methods of analysing and visualizing the data relevant to an organization, allowing its users to define views on business data with appropriate graphics. One benefit of this system is mastery of the technology in a controlled environment: it helps solve a specific problem while remaining a generic, autonomous tool with a high degree of adaptability. Flexibility goes hand in hand with ensuring the consistency of the data model, i.e. the data analysed by a user must respect the syntactic and semantic constraints on how they relate to each other in the organization’s logic. This prevents the user from attempting to connect data that are not related. Based on the data types described in the metadata of the data model, the system provides users with a list of possible graphical representations for the selected information type. This list is filtered so that the user can select only the graphical representation types that are appropriate to the selected data types. This feature is innovative in that the system constrains the selection of visualization elements, thus avoiding potential conceptual errors.

1 INTRODUCTION

In recent times, increasing attention has been given to taking advantage of information and knowledge in organizations in order to gain competitive advantage.

Thus, information about organizations is now an essential component of support for their operations, since it helps create favourable internal and external conditions for reducing costs and providing innovative services. It also facilitates the construction of the knowledge required to plan and implement solutions for the problems and challenges that arise every day in an organization.

In organizations, there are agents responsible for decision making, and in this process they require rigorous tools to analyse the data and further improve the performance of the organization. Those agents need to access and process different types of relevant data.

Unfortunately, the development of information systems is usually more concerned with the information input areas, creating systems that collect information and guarantee its coherence while leaving the output areas with limited capabilities.

The main goal is the development of a tool that can be connected, in a standard way and independently of the application domain, to an already developed system, giving the end user information extraction capabilities. These capabilities will allow drilling down into all available information while maintaining the coherence and relations defined in the input area.

With this in mind, the system presented in this position paper is a tool to help companies mine their own data and obtain new relevant information that is not accessible from traditional information systems.

Currently, many systems do similar things, but most of them are not as specific and configurable as desired. The systems that are configurable often have features that most organizations do not need.

Alexandre Pereira dos Santos F., César das Neves R. and Belo Filipe J.

On the Generation of Dynamic Business Indicators.

DOI: 10.5220/0004199603900394

In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR-2012), pages 390-394

ISBN: 978-989-8565-29-7

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

2 IMPORTANT TERMS AND SYSTEM MODEL

This section clarifies a few critical terms that are needed to set the stage for everything that follows.

The first term is entity, which is defined as a concept in the application domain from which a data type is defined. A data type, in turn, is an element that represents the structure of a top-level concept and serves as a template for instances of entity types in a system.

Another important term is indicator, which is defined as a set of data belonging to one or more entities that are related to each other in the data model. It is not enough for these data to be related; they must also have the same meaning in the organization’s business logic.

An important aspect of the system that we will

implement based on these ideas is its generic and

configurable structure, independent of the

application domain and requiring only a set of

rigorously defined terms in order to allow navigation

through all information.

Thus, the aim is to develop a web-based system that allows structured access to information, according to the organization’s business logic. This gives the end user the opportunity to run queries through a configurable and appropriate interface. The integrity of these queries will be guaranteed by the modelling process required by ORM (Object/Relational Mapping): the ORM ensures that the end user’s access to data is restricted, avoiding mistakes, unlike conventional systems in which the ORM is applied on the developer side.

Based on this, the system is built around the ADO.NET Entity Framework (EF) from Microsoft. There are other ORM technologies available on the market, but we chose this one because it comes from a major software supplier, is the ORM embedded in the .NET Framework, and is the approach that we are studying.

It is also important to mention that changes in the

data model shouldn’t force the rewrite of the whole

system and should allow the reuse of all available

information.

The idea is to provide end users with a list of possible forms of graphical representation for an indicator. The aim is to constrain the actions available to the user, thereby reducing the probability of mistakes when selecting a graphical representation.

We chose graphical representation because, although there are several ways to display and process these types of data, it is the simplest. This conclusion is supported by (Tufte, 2001): “(…) graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful”.

Systems that support decision-making must offer the simplest forms of data representation, i.e., they should be able to present sums, dates and tables graphically with ease.

The system is also expected to illustrate facts that occurred over time, i.e., to present time series for given data. With only one dimension, this kind of representation displays a time series at an appropriate scale that cannot be achieved with another type of graphical representation.

Another important feature is the presentation of indicators with respect to space, because organizations typically have data that are related to locations or regions and that are of great importance to the organization’s business model.

The graphical presentation is one of the key

points in this system, because according to (Tufte,

2001) “(…) No doubt some graphics do distort the

underlying data, making it hard for the viewer to

learn the truth”. Thus, it is important that the system

can filter out these types of representations,

according to the types of data that compose the

indicator; this feature will be described in section 3

of this paper.

The system that we are implementing clearly falls within the scope of OLAP, i.e. On-Line Analytical Processing.

OLAP provides the ability to analyse different aspects of information in a fast and dynamic way, where large volumes of data are contained within a data warehouse. According to (Thomsen, 2002), “(…) OLAP is meant to contrast with OLTP (On-Line Transaction Processing). The key aspects are that OLAP is analysis-based and decision-oriented”.

Until recently these systems were known as DSS (Decision Support Systems), but it is now common to refer to them as Business Intelligence (BI) systems. According to (Rud, 2009), “(…) BI is defined as the ability for an organization to take all its capabilities and convert them into knowledge (…)”.

The concept of BI is relatively recent and BI

systems are usually composed of a set of tools that

enable report generation and allow users to extract

OntheGenerationofDynamicBusinessIndicators

391

useful information from the stored data.

BI systems also have a set of associated tasks, which according to (Cardoso, 2011) “(…) can be set into four groups:

• Make predictions based on historical data, based on past performance and current;
• Creation of scenarios that demonstrate the impact of changes to existing variables;
• Allow ad-hoc access to data;
• Analyse in detail the organization, thereby ensuring a deeper knowledge about the same.”

Given the tasks and features listed above, traditional systems could work connected to BI systems yet independently, allowing specific techniques to be applied. According to (Santos and Ramos, 2009), “(…) BI systems have implemented the functionality, scalability and security of existing systems which manage the database for building data warehouses that are analysed with techniques for OLAP, Data Mining and Query Report”.

The system being developed cannot be considered a BI tool, as it does not cover the first two groups identified by (Cardoso, 2011). It covers only the third and fourth: ad-hoc access to data and detailed analysis of the organization.

At the data source level, the system needs a

model that contains information about the present

data and the relationships between objects that

compose this data.

The next section will describe forms of

presentation for indicators.

3 DATA AND DATA TYPE REPRESENTATION

With the information available from the EF, the system must advise users on which common types of graphical representation can be used to view the data. After analysing the data types of the major database vendors, namely Oracle, SQL Server and MySQL, it can be concluded that all kinds of data may be placed in four groups, which we call Numeric, Date and Time, Strings, and Other Types. The last group holds the data that we do not want to represent, such as the “XML”, “Cursor”, “BLOB” and “BFILE” types.
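
The grouping described above could be sketched in Python roughly as follows. This is a minimal illustration, not the system's implementation: the `TypeGroup` enumeration, the `classify_sql_type` function and the (deliberately incomplete) lookup table of vendor type names are assumptions introduced here. Any type that is not listed, including XML, Cursor, BLOB and BFILE, falls into the Other Types group.

```python
from enum import Enum


class TypeGroup(Enum):
    """The four groups of data types named in the text."""
    NUMERIC = "Numeric"
    DATE_TIME = "Date and Time"
    STRING = "Strings"
    OTHER = "Other Types"


# Illustrative and incomplete lookup of vendor column types (an assumption,
# not an exhaustive catalogue of Oracle, SQL Server or MySQL types).
_GROUP_BY_SQL_TYPE = {
    "INT": TypeGroup.NUMERIC,
    "INTEGER": TypeGroup.NUMERIC,
    "NUMBER": TypeGroup.NUMERIC,
    "DECIMAL": TypeGroup.NUMERIC,
    "FLOAT": TypeGroup.NUMERIC,
    "DATE": TypeGroup.DATE_TIME,
    "DATETIME": TypeGroup.DATE_TIME,
    "TIMESTAMP": TypeGroup.DATE_TIME,
    "CHAR": TypeGroup.STRING,
    "VARCHAR": TypeGroup.STRING,
    "VARCHAR2": TypeGroup.STRING,
    "NVARCHAR": TypeGroup.STRING,
}


def classify_sql_type(sql_type: str) -> TypeGroup:
    """Map a column type such as 'VARCHAR(50)' to one of the four groups."""
    # Drop any length or precision suffix, e.g. "(50)" or "(10,2)".
    base = sql_type.split("(", 1)[0].strip().upper()
    # Unlisted types (XML, CURSOR, BLOB, BFILE, ...) are not represented.
    return _GROUP_BY_SQL_TYPE.get(base, TypeGroup.OTHER)
```

Defaulting to Other Types is a conservative choice: a type the lookup does not know is never offered a chart.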

The most common forms of representation of

these groups of information are “Column”, “Lines”,

“Pie”, “Bar” and “Scatter”.

Scatter plots (X, Y) usually show values on both the X and Y axes, while the other graphs usually show values on the Y axis and use “categories” on the X axis.

This information is very important because it allows the system to tell the user which data representations can be selected. Better yet, according to the data types of the selected information, we can tell the user which kinds of graphical representation should not be selected, preventing the user from choosing forms of graphical representation that are not suitable for the selected data.

Initially we want to use simple graphics that can be drawn in 2D, because we need to find an efficient way to help users avoid errors when selecting a data representation.

Those simpler forms of graphical representation can be divided into two groups, depending on the data types involved: Categories vs. Values and Values vs. Values, as shown in Table 1. In the first group, categories are like objects and values are like the object values. The second group is a traditional graph, where values are available for both axes.

Table 1: Forms Representation.

The purpose of this mapping between the existing data types and the forms of graphical representation is to show that the ways to represent data depend on the data types. According to Table 1, some types of data cannot be shown with certain forms of representation; when displayed that way on screen, they are shown incorrectly and therefore lose their meaning.
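
One way such a filter could work is sketched below in Python. The rules encoded here are assumptions derived from the prose (scatter plots take values on both axes, the other forms take categories on the X axis and values on the Y axis); they do not reproduce the actual contents of Table 1. The function name `allowed_charts` and the group labels passed as strings are hypothetical.

```python
from typing import List

# Forms of representation named in the text, split into the two groups.
CATEGORY_VS_VALUE = ("Column", "Lines", "Pie", "Bar")
VALUE_VS_VALUE = ("Scatter",)


def allowed_charts(x_group: str, y_group: str) -> List[str]:
    """Return the chart forms offered for a pair of data type groups.

    Groups are "Numeric", "Date and Time", "Strings" or "Other".
    """
    if "Other" in (x_group, y_group):
        return []  # types the system does not represent
    if y_group != "Numeric":
        return []  # assumption: every form plots numeric values on Y
    if x_group == "Numeric":
        return list(VALUE_VS_VALUE)  # Values vs. Values
    # Assumption: Strings and Date and Time act as categories on X.
    return list(CATEGORY_VS_VALUE)  # Categories vs. Values
```

For example, `allowed_charts("Strings", "Numeric")` offers the four category-based forms, while `allowed_charts("Strings", "Strings")` offers none, so the user cannot pick a chart that would misrepresent the data.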

4 DEVELOPMENT

In the following sections, the main focus is to

describe how we want to solve the problem,

explaining the specifications and highlighting some

implementation details.

KDIR2012-InternationalConferenceonKnowledgeDiscoveryandInformationRetrieval

392

4.1 Requirements Analysis

The requirements for the project are divided into two

different groups: technical and user interface.

4.1.1 Technical Requirements

Regarding technical requirements, the system is expected to be capable of accessing information types at runtime and of validating new data relations that the user may wish to create.

In the EF, the conceptual model, the storage model and the mapping between the two are defined in an .edmx file. This file is updated when either the database or the model changes. According to (Microsoft Corporation, 2011), “the EDM Generator, which is included with .Net Framework, generates the .csdl, .ssdl and .msl files from an existing data source”.

These XML-based files, which describe the conceptual model, the storage model and the mapping, are known as metadata.

Accessing information types at runtime therefore means accessing the metadata represented in the .csdl and .ssdl files, which is loaded into instances of the “System.Data.Metadata.Edm.EdmItemCollection” and “System.Data.Metadata.Edm.StoreItemCollection” classes. These instances are accessible through methods of the “System.Data.Metadata.Edm.MetadataWorkspace” class.

These instances allow the system to understand the structure of the data, thus allowing navigation through all information in order to create queries dynamically, as well as an appropriate representation for the user.
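
The idea of validating a user-created relation against the metadata before building a query can be illustrated with the Python sketch below. It does not use the Entity Framework API: the `ENTITIES` and `RELATIONS` dictionaries are hypothetical stand-ins for what the metadata would expose, and the entity names (Customer, Invoice, Product) and function names are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical metadata: entity name -> property names.
ENTITIES: Dict[str, List[str]] = {
    "Customer": ["Id", "Name", "Region"],
    "Invoice": ["Id", "CustomerId", "Total", "IssuedOn"],
    "Product": ["Id", "Name", "Price"],
}

# Hypothetical relations: (left, right) -> (left key, right key).
RELATIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("Customer", "Invoice"): ("Id", "CustomerId"),
}

Prop = Tuple[str, str]  # (entity name, property name)


def find_relation(a: str, b: str) -> Optional[Tuple[str, str, str, str]]:
    """Return (left, right, left_key, right_key) if a and b are related."""
    for left, right in ((a, b), (b, a)):
        if (left, right) in RELATIONS:
            left_key, right_key = RELATIONS[(left, right)]
            return left, right, left_key, right_key
    return None


def build_query(first: Prop, second: Prop) -> str:
    """Build a SELECT for two properties, refusing unrelated entities."""
    for entity, prop in (first, second):
        if prop not in ENTITIES.get(entity, []):
            raise ValueError(f"unknown property {entity}.{prop}")
    (ent_a, col_a), (ent_b, col_b) = first, second
    columns = f"{ent_a}.{col_a}, {ent_b}.{col_b}"
    if ent_a == ent_b:
        return f"SELECT {columns} FROM {ent_a}"
    relation = find_relation(ent_a, ent_b)
    if relation is None:
        # The model defines no relation: the user may not connect these data.
        raise ValueError(f"{ent_a} and {ent_b} are not related in the model")
    left, right, left_key, right_key = relation
    return (f"SELECT {columns} FROM {left} "
            f"JOIN {right} ON {left}.{left_key} = {right}.{right_key}")
```

Under these assumptions, combining `Customer.Region` with `Invoice.Total` yields a join on the declared keys, whereas combining `Customer.Name` with `Product.Price` is rejected because the model declares no relation between the two entities.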

4.2 User Interface

The second group of requirements defines the user

interface: users will use the application to analyse

data which originated from the data source.

The system should enable its users to create and

display dynamically a set of indicators through

graphical representation, which can be customized

for each user.

The user should be able to select or even create data visualization models at runtime, in a highly configurable way: the user could add a new graphical representation for an indicator, remove certain forms of graphical representation, or modify the parameters of others.

If the user wants to create a new graphical form to represent the indicator, the user should select which properties of the entities to relate. The system will then process the selected data and show the output to the user. The user may add more constraints to this process at any time.

This process of selecting properties from entities should operate in a cycle, summarized in Figure 1, which allows the output shown on the user interface to be refined.

Figure 1: Operating Workflow.

As the user can have a large number of graphics and indicator representations, the idea is to give the user the ability to personalize her/his interface, for example by deciding the order in which the content should appear on the page. The user should be able to minimize items, move items and presentation forms around the page as if it were a design surface, and remove items as needed.

The system will be required to maintain the user’s page customization, so that the visualization settings do not have to be repeated each time the user visits the system. This can be done using Web Parts, a technology that provides an appropriate way to build a modular Web site that can be customized, with dynamic settings, on a per-user basis. This customization is supplied by a provider model, the Personalization Provider. According to (Evjen et al., 2010), “(…) this provider makes associations between the end user viewing the application and any data points stored centrally that are specific to that user”, which is exactly what we need.
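
The kind of per-user association described above can be illustrated with a small Python sketch. It is not the Web Parts or Personalization Provider API: the `LayoutStore` class, its methods and the panel fields (`id`, `minimized`) are hypothetical, and an in-memory dictionary stands in for the central store.

```python
import json
from typing import Dict, List


class LayoutStore:
    """Keeps, for each user, the ordered list of panels on the page."""

    def __init__(self) -> None:
        # user name -> JSON text, as a persistent store would hold it
        self._layouts: Dict[str, str] = {}

    def save(self, user: str, panels: List[dict]) -> None:
        self._layouts[user] = json.dumps(panels)

    def load(self, user: str) -> List[dict]:
        # A user with no saved layout gets an empty page.
        return json.loads(self._layouts.get(user, "[]"))

    def move(self, user: str, panel_id: str, new_index: int) -> None:
        """Move one panel to a new position in the user's layout."""
        panels = self.load(user)
        for index, panel in enumerate(panels):
            if panel["id"] == panel_id:
                panels.insert(new_index, panels.pop(index))
                self.save(user, panels)
                return
        raise KeyError(panel_id)
```

Because each layout is keyed by user, one user's reordering or minimizing of panels never affects what another user sees on the next visit.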

The proposed implementation follows an approach similar to that of traditional providers. According to the same source (Evjen et al., 2010), “a provider is an object that allows for programmatic access to data stores, processes and more”, which in the context of this system means independence from the data model.

OntheGenerationofDynamicBusinessIndicators

393

5 CONCLUSIONS AND FUTURE WORK

In this position paper, we presented a comprehensive

set of criteria for the development of a data

visualization system. This system is based on

Microsoft EF and it has, explicitly or implicitly, a

constrained conceptual data model that describes the

various elements of the problem domain.

The data model represents concepts and also the

relationships between concepts, constraints, and so

on.

Since currently most applications are written on

top of relational databases, they will have to deal

with data represented in a relational form. The

programming paradigm is typically some form of

Object-Oriented Programming (OOP) including

features such as data abstraction, encapsulation and

inheritance, and these features are fundamental to

the flexibility that is intended to characterize this

system.

Typically, a higher-level conceptual model is used during the design phase. That model is not directly executable, so it needs to be translated into a relational form and applied to a logical database schema and to the application code.

With this flexibility guaranteed by EF, our system also differs from many other BI systems, because the great majority run on top of traditional databases: other systems are based on multiple data sources, whereas ours only needs to connect to a schema from EF.

Another feature of our system is the ability to

help users to visualize the information that was

identified as relevant business indicators, based on

the selection and combination of data types or

dynamic relations and constraining the choice of

possible graphical representations.

This feature is also important because it avoids

user errors when selecting the graphical

representation forms for the selected information.

The system that we are developing will use an

approach like the providers approach because, as we

mentioned above, this feature allows the user to

access data via a web browser, in readable form,

without the need to use a complex system.

It is intended that the system will evolve to an

interface that allows the user to create a

configuration of graphical representation forms, as

well as the possibility of adding new types of

graphical representation and mapping them to

existing data types.

ACKNOWLEDGEMENTS

We would like to thank the Polytechnic Institute of

Setúbal, School of Technology of Setúbal, for

supporting the research work reflected in this paper,

presented at KDIR 2012 in the scope of the RETE

project.

REFERENCES

Cardoso, E. (2011, September). Introduction to Business Intelligence. Lisboa, Portugal.

Microsoft Corporation. (2006, June). The ADO.NET Entity Framework Overview. Retrieved April 4, 2012, from MSDN: http://msdn.microsoft.com/en-us/library/aa697427(v=vs.80).aspx

Microsoft Corporation. (2011, August 29). Modeling and Mapping. Retrieved May 5, 2012, from MSDN: http://msdn.microsoft.com/en-us/library/bb896343

Evjen, B., Hanselman, S., & Rader, D. (2010). ASP.NET 4 in C# and VB. Canada: Wrox.

Inmon, W. (2002). Building the Data Warehouse. Wiley.

Rud, O. P. (2009). Business Intelligence Success Factors: Tools for Aligning Your Business in the Global Economy. Hoboken, New Jersey: Wiley & Sons.

Santos, M. Y., & Ramos, I. (2009). Business Intelligence. FCA.

Thomsen, E. (2002). OLAP Solutions - Second Edition. Wiley.

Tufte, E. R. (2001). The Visual Display of Quantitative Information, 2nd Edition. Graphics Press LLC.

KDIR2012-InternationalConferenceonKnowledgeDiscoveryandInformationRetrieval

394