Table 1: Relation PAT_HISTORY: an approximate
representation of Example 1 in TSQL2.

Patient   Symptom    T_start           T_end
John      headache   10/9/2011 0:00    10/9/2011 23:59

At first glance, this representation looks
reasonable. However, we stress that, to capture the
exact content of Example 1, we cannot interpret the
relation PAT_HISTORY above using the usual
semantics adopted by TSQL2 and by most TDB
approaches (this semantics has been specified by
the BCDM model (Jensen & Snodgrass, 1996)). In
this semantics, the so-called snapshot semantics,
the meaning of the tuple in Table 1 would be the
following:
$\forall t\ \big(10/9/2011\ 0{:}00 \le t \le 10/9/2011\ 23{:}59 \rightarrow \mathit{Holds\_at}(\langle \text{John}, \text{headache} \rangle, t)\big)$   (1)
where Holds_at(<a_1,…,a_n>, t_i) is a predicate (that we
have imported from (Galton, 1990)) stating that the
fact <a_1,…,a_n> occurs (is true) at the time t_i. In other
words, in TSQL2, the tuple in Table 1 would mean
that John had headache continuously, in all the time
granules (temporal snapshots) on September 10th. On
the other hand, the intended semantics of Example 1
is different, as shown below:
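$\exists t_{start}\, \exists t_{end}\ (t_{start} \le t_{end}) \wedge \forall t\ \big(t_{start} \le t \le t_{end} \rightarrow \mathit{Holds\_at}(\langle \text{John}, \text{headache} \rangle, t)\big)$   (2)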
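The difference between the two readings can be made concrete with a minimal sketch. It assumes, purely for illustration, that the day is discretized into 24 hourly granules numbered 0 to 23; the function names `snapshot_model` and `existential_models` are hypothetical and do not come from TSQL2 or BCDM. Under reading (1) the tuple admits a single scenario, whereas under reading (2) every convex interval inside the frame is a possible scenario.

```python
from itertools import combinations_with_replacement


def snapshot_model(frame_start, frame_end):
    """Reading (1): the fact holds at every granule of the frame."""
    return frozenset(range(frame_start, frame_end + 1))


def existential_models(frame_start, frame_end):
    """Reading (2): the fact holds throughout some interval
    [t_start, t_end] with t_start <= t_end, located inside the frame.
    Each candidate interval is returned as one possible scenario."""
    granules = range(frame_start, frame_end + 1)
    return [
        frozenset(range(t_start, t_end + 1))
        for t_start, t_end in combinations_with_replacement(granules, 2)
    ]


# Hypothetical discretization: hours 0..23 of September 10th.
models = existential_models(0, 23)
print(len(models))                       # number of candidate intervals
print(snapshot_model(0, 23) in models)   # the full day is just one of them
```

In this sketch the snapshot reading corresponds to only one of the many scenarios admitted by reading (2), which is why the two semantics cannot be used interchangeably.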
Intuitively, Example 1 means that there is an interval
of time (starting at t_start and ending at t_end) during
September 10th in which John had (continuously)
headache. Adopting this new semantics has a deep
impact on the definition of relational algebraic
operators. Indeed, since we demand that a data
model be expressive enough to represent the
results of applying algebraic operators, and
since we want to enforce the above semantics, the
TSQL2 data model is inadequate here. As a simple
example, let us consider intersection. Suppose we
have another relation, PAT_HISTORY’, which is
identical to PAT_HISTORY in Table 1, and we
perform the intersection PAT_HISTORY ∩
PAT_HISTORY’. Trivially, the non-temporal
component of the two tuples is equal. But what
about the intersection of their temporal components?
Remember that the underlying semantics of our
representation is not (1), but is the one shown at
point (2) above. Thus, the tuple <John, headache |
10/9/2011 0:00, 10/9/2011 23:59> in
PAT_HISTORY states that there is a (convex) time
interval in which John had headache, which is
located somewhere in September 10th, and the tuple
in PAT_HISTORY’ behaves in the same way. There
is no support for the conclusion that the time
intervals of the two tuples are the same, or
temporally intersect. Indeed, they may intersect (or
even be the same), but may also be disjoint! Stated
in other words, the two relations may denote two
different episodes of John’s headache, or the same
episode. Due to the intrinsic indeterminacy of the
inputs, both cases are possible. As a consequence,
the output of intersection may be empty, in case the
two episodes are disjoint, or contain an episode
occurring on September 10th otherwise.
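The indeterminacy of the intersection can be checked by brute force. The following is only an illustrative sketch, not the operator definition of the approach: it assumes hourly granules numbered 0 to 23, and the helper names (`possible_intervals`, `overlaps`, `intersection_cardinality_bounds`) are hypothetical. It enumerates every pair of intervals that the two tuples may denote and records whether the pair shares at least one granule.

```python
def possible_intervals(frame_start, frame_end):
    """All convex intervals (t_start, t_end) located inside the frame."""
    return [
        (t_start, t_end)
        for t_start in range(frame_start, frame_end + 1)
        for t_end in range(t_start, frame_end + 1)
    ]


def overlaps(a, b):
    """True if the closed intervals a and b share at least one granule."""
    return a[0] <= b[1] and b[0] <= a[1]


def intersection_cardinality_bounds(frame_a, frame_b):
    """Minimum and maximum number of episodes (0 or 1) that the
    intersection may contain, over all scenarios the frames admit."""
    outcomes = {
        1 if overlaps(a, b) else 0
        for a in possible_intervals(*frame_a)
        for b in possible_intervals(*frame_b)
    }
    return min(outcomes), max(outcomes)


# Both tuples are framed by the whole day (hours 0..23, an assumption).
print(intersection_cardinality_bounds((0, 23), (0, 23)))
```

With identical frames covering the whole day, some scenarios yield disjoint episodes and others yield overlapping ones, so the only thing that can be stated about the result is a lower and an upper bound on the number of episodes.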
However, as already stressed in the introduction,
we aim at providing a data model in which the
output of algebraic queries can be expressed. We
must thus move to a representation different from
the one in Table 1, since the latter cannot
capture the indeterminacy about the output of
intersection described above. Our idea is simple, and
starts from the consideration that the above
indeterminacy can be interpreted as an
indeterminacy about the number of occurrences
(zero or one, in the example) of the described
episode. Thus, we propose to extend the data model
in order to explicitly model the minimum and
maximum number of occurrences of facts.
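As a rough illustration of what such an extended tuple might look like, the sketch below attaches a minimum and a maximum number of occurrences to a fact and its time frame. The class name `CardinalityTuple` and all field names are hypothetical and chosen only for this example; the actual data model and its algebraic operators are not defined here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CardinalityTuple:
    """A fact, the time frame containing its occurrences, and bounds
    on how many occurrences there are (all names are illustrative)."""
    values: tuple
    frame_start: datetime
    frame_end: datetime
    min_card: int
    max_card: int

    def __post_init__(self):
        # Reject malformed frames and inconsistent bounds.
        if self.frame_start > self.frame_end:
            raise ValueError("frame_start must not follow frame_end")
        if not 0 <= self.min_card <= self.max_card:
            raise ValueError("need 0 <= min_card <= max_card")

    @property
    def is_exact(self):
        """True when the number of occurrences is known exactly."""
        return self.min_card == self.max_card


day_start = datetime(2011, 9, 10, 0, 0)
day_end = datetime(2011, 9, 10, 23, 59)

# One headache episode somewhere on September 10th (exactly one occurrence).
stored = CardinalityTuple(("John", "headache"), day_start, day_end, 1, 1)

# A possible way to record the indeterminate intersection outcome:
# zero or one episode within the same frame.
intersected = CardinalityTuple(("John", "headache"), day_start, day_end, 0, 1)

print(stored.is_exact, intersected.is_exact)
```

Keeping the bounds inside the tuple type, rather than as an ordinary user-managed column, mirrors the point that they need dedicated handling by the operators.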
It is worth emphasizing that such a number of
occurrences cannot be treated as a “standard”
numeric attribute, to be managed directly by
users/developers. We will show soon that relational
algebraic operators have to be carefully redefined to
deal correctly with such numbers. Such a definition
must be provided once and for all, and cannot be
delegated to users/developers (analogous
motivations have been provided in Section 1 of the
TSQL2 book (Snodgrass, 1995) for the specialized
treatment of valid time in temporal relational
databases).
Also, the cases in which the exact number of
occurrences of facts is known (see, e.g., Example 1
in the introduction) can be easily modeled, and
constitute a specific case (in which the minimum
and maximum cardinality are equal) of our
general model. Indeed, Example 1 above shows that,
even when the exact input cardinalities are known,
the cardinalities obtained by applying
relational operators may only be bounded by a
minimum and a maximum value. It is worth
stressing that this behavior is not due to our
choice of data model, but is an intrinsic feature
of the phenomena we want to model.
Finally, we stress that our approach is even more
expressive than required by the initial requirements
discussed in the introduction, since it copes with