same semantics can be (correctly) represented as an
employee record referencing one department (but
not vice versa). Also, the shared value is a
department identifier and employee identifier. If we
change the direction of this relationship then we
change the meaning of the connection. Yet, joins are
not able to represent this semantics because all input
relations have equal rights in a join.
Figure 3: Join is symmetric.
In the relational model, a unit of connectivity
(between domains) is one relation and composition
means joining relations. In other words, join allows
to compose or chain relations. Now assume that we
have 10 relations and want to retrieve tuples from
one of them which are related to tuples in another
relation. Formally, we need to build a Cartesian
product of these relations by adding also all relevant
join conditions. This approach is highly unnatural
and very difficult to use in practice (therefore, its use
is quite rare). Why we have to include all input
relations if we want to retrieve records from only
one of them? It is also not obvious what join
conditions to use (especially if we do not have FKs
declared). It is therefore very easy to produce
formally correct but semantically wrong results. And
the reason is that the constraints are propagated
along a sequence of relations connected by common
values. It is probably one of the reasons why the
development of the conception of automatic
reasoning in the unified relation model (URM) failed
(Maier et al., 1984).
It should be noted that there exist a mechanism
of foreign keys (FK) that can solve this problem of
symmetricity of joins. Indeed, a FK declares one
input relation and one output relation which have
different roles. If two relations are used in a join
then this FK declaration can be used as a semantic
specification of our intention in the join operation.
Yet, the use of FKs for this purpose has the
following drawbacks. First, it is not an original
purpose of FKs to describe semantics of joins (FK is
a mechanism of imposing constraints). Second, the
need in an additional mechanism like FK
emphasizes that joins have some limitations. Third,
FKs are not enforced by existing models in general
and they must not be used in the context of joins in
particular. Join is an operation which is used at
query time while FK is a declaration which is used
at design time. Fourth, it can be difficult to
understand how to use FKs in the case of arbitrarily
complex join conditions. Essentially, FK is an
attempt to introduce a mechanism of links but they
have incompatible semantics and therefore their
simultaneous use is quite controversial and eclectic.
The use of FKs in combination with joins is
analogous to introducing constructs for structural
programming along with goto operator. Their
simultaneous use will result in strange mixtures of
different patterns.
2.3 Join is a Cross-cutting Concern
Let us assume that we want to get a list of
employees working at some department. It can be
done by means of the following query:
SELECTE.emp,D.dept,D.location
FROMEmployeesE,DepartmentsD
WHEREE.dept_id=D.dept
An important observation here is that many similar
queries will include the same join condition. In other
words, this same join condition
E.dept_id=D.dept
will appear in quite many queries which involve
these two tables. It is because join conditions
describe the details of how entities are connected in
the model rather than the logic of what needs to be
retrieved.
Such fragments of the source code which scatter
throughout the whole program or query are referred
to as a cross-cutting concern (Kiczales et al., 1997).
The existence of such repeated fragments of code is
an indication of either bad design or impossibility to
modularize their logic due to limitations of the
language. The main negative consequence is that the
same fragment can appear in quite different and
unrelated contexts semantically belonging to
different levels of organization. As a result, the
program or query can become error-prone and
difficult to maintain. The solution of this problem is
to provide a mechanism for modularizing such
repeated fragments in a separate modeling construct.
Ideally, a mechanism of connectivity should
declare how different entities in the whole model are
connected independent of where these connections
will be used. Yet, in join-based queries, both the
logic of the query and the logic of the connections
between relations are described together in the same
construct. It is a typical example of mixing different
concerns. On one hand, the main purpose of the
query is to retrieve employees with the related
department information. It is application-specific
logic and we do not care how these relations are
Employee
String
Departments
dept_id
dept
domains
relations
join