The third bullet has been tackled by creating a
data sampler that builds a histogram of the data and
uses it to partition the data horizontally across the
servers before the loading process starts. This
guarantees that all servers receive a similar amount of
data, so that the full capacity of the appliance is used.
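
As an illustration of the sampling idea, the following minimal Python sketch derives split points from a sample so that each server receives a similar share of the keys (an equi-depth histogram). It is not LeanXcale's sampler: the function names `split_points` and `bucket_sizes`, the sample size, and the skewed key distribution are all assumptions made for the example.

```python
import random
from bisect import bisect_right


def split_points(sample, num_servers):
    """Return num_servers - 1 key boundaries that cut the sorted sample
    into buckets of near-equal size (an equi-depth histogram).
    Hypothetical helper, not part of any LeanXcale API."""
    if num_servers < 1:
        raise ValueError("num_servers must be at least 1")
    if not sample:
        raise ValueError("sample must not be empty")
    ordered = sorted(sample)
    n = len(ordered)
    # Boundary i sits at the i/num_servers quantile of the sample.
    return [ordered[(i * n) // num_servers] for i in range(1, num_servers)]


def bucket_sizes(keys, boundaries):
    """Count how many keys fall into each key range defined by boundaries."""
    sizes = [0] * (len(boundaries) + 1)
    for key in keys:
        sizes[bisect_right(boundaries, key)] += 1
    return sizes


# Small deterministic case: eight keys, four servers.
small = split_points([1, 2, 3, 4, 5, 6, 7, 8], 4)  # -> [3, 5, 7]

# Skewed, made-up key distribution: sample 2% of it, then partition all of it.
rng = random.Random(42)
keys = [int(rng.expovariate(1 / 1000)) for _ in range(100_000)]
sample = rng.sample(keys, 2_000)
boundaries = split_points(sample, 4)
sizes = bucket_sizes(keys, boundaries)
```

Because the boundaries follow the quantiles of the sample rather than fixed-width key intervals, a skewed key distribution still yields partitions of comparable size, which is the property the loading process relies on.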
2 LeanXcale ARCHITECTURE
2.1 Layers
LeanXcale (LeanXcale 2019) is an ultra-scalable
parallel and distributed database manager. It consists of
three layers: storage engine, transactional manager,
and query engine. The storage engine is actually a
parallel-distributed relational key-value data store.
The transactional manager is a parallel-distributed
system with several components. The query engine is
parallel-distributed as well and can scale both OLTP
(each instance handles a subset of the transactions)
and OLAP workloads (multiple instances cooperate
to execute a large analytical query).
2.2 KiVi Storage Manager
KiVi is a parallel-distributed storage manager. One of
its distinguishing features is that it is optimized to run
efficiently on many-core and NUMA architectures
(Ricardo Jiménez-Peris, 2019). Basically, a separate
KiVi server is deployed on each of the cores that are
dedicated to the storage layer.
Each table is horizontally partitioned into regions.
Each region comprises a range of primary keys. The
region is the unit of distribution across servers. When a
row is inserted, it is routed to the server managing the
region to which the row belongs (based on the primary
key range of the horizontal partition).
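
The routing of a row to its region can be sketched as a binary search over the region boundaries. This is a minimal illustration under stated assumptions, not KiVi's implementation: the `RegionMap` class, the server names, and the convention that region i covers keys from its lower boundary (inclusive) up to the next boundary (exclusive) are all hypothetical.

```python
from bisect import bisect_right


class RegionMap:
    """Maps a primary key to the server that manages its region.
    Hypothetical structure for illustration only."""

    def __init__(self, boundaries, servers):
        # boundaries must be sorted; k boundaries define k + 1 regions.
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted")
        if len(servers) != len(boundaries) + 1:
            raise ValueError("need exactly one server per region")
        self.boundaries = list(boundaries)
        self.servers = list(servers)

    def server_for(self, key):
        # bisect_right counts the boundaries <= key, which is exactly
        # the index of the region whose key range contains the key.
        return self.servers[bisect_right(self.boundaries, key)]


regions = RegionMap(
    boundaries=[1000, 2000, 3000],
    servers=["kivi-0", "kivi-1", "kivi-2", "kivi-3"],  # made-up names
)
target = regions.server_for(2500)  # -> "kivi-2"
```

The lookup costs O(log k) for k regions and needs no coordination between servers, since the primary key alone determines the target.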
Client applications access the LeanXcale database
through the SQL interface, via the JDBC driver.
Internally, the query engine accesses KiVi through a
key-value API, which is how it interacts with the
storage layer. However, this API is also available
to client applications directly. In this way, the
LeanXcale database offers a dual interface: key-value
and SQL.
This dual interface has the advantage that,
whenever it is convenient, the overhead of SQL
processing can be avoided by directly using the
key-value interface, which accesses the same
relational data as SQL.
Figure 1: LeanXcale subsystems.
3 LeanXcale ARCHITECTURE
3.1 What Is LeanXcale Database
LeanXcale is an ultra-scalable, operational, full-SQL,
full-ACID distributed database (Ozsu and Valduriez,
2014) with analytical capabilities. The database
system consists of three subsystems:
1. KiVi Storage Engine.
2. Transactional Engine.
3. SQL Query Engine.
3.2 LeanXcale Subsystems
The operational database is a fairly complex system in
terms of the different kinds of components it contains.
It consists of a set of subsystems,
namely the Query Engine (QE), the Transactional
Manager (TM), the Storage Engine (SE), and the
Manager (MG). Some subsystems are homogeneous
and others are heterogeneous.
In a homogeneous subsystem, all instances play
the same role. A heterogeneous subsystem has
several different roles. Each role can have a single
instance or multiple instances. The transactional manager has
the following roles: Commit Sequencer (CS),
Snapshot Server (SS), Conflict Managers (CMs), and
Loggers (LGs). The first two are single-instance,
whilst the last two are multi-instance. The Storage
Engine has two roles, data server (DM) and metadata
server (MS), both multi-instance. The query
engine is homogeneous and multi-instance. There is a
manager (MNG) that is single-instance and
single-threaded. Many of these components can be
replicated to provide high availability, but their nature
does not change. Since replication is an orthogonal
topic, we do not discuss it further.
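
The roles and their cardinalities described above can be summarized as a small data structure. This is only a restatement of the text in Python form, not a LeanXcale configuration format: the `Role` dataclass and the helper functions are hypothetical, and replication is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    subsystem: str
    name: str
    abbreviation: str
    multi_instance: bool  # False means exactly one instance


# Roles and cardinalities as stated in the text (replication ignored).
ROLES = [
    Role("Transactional Manager", "Commit Sequencer", "CS", False),
    Role("Transactional Manager", "Snapshot Server", "SS", False),
    Role("Transactional Manager", "Conflict Manager", "CM", True),
    Role("Transactional Manager", "Logger", "LG", True),
    Role("Storage Engine", "Data server", "DM", True),
    Role("Storage Engine", "Metadata server", "MS", True),
    Role("Query Engine", "Query engine", "QE", True),
    Role("Manager", "Manager", "MNG", False),
]


def is_heterogeneous(subsystem):
    """A subsystem is heterogeneous if it has more than one role."""
    return sum(1 for role in ROLES if role.subsystem == subsystem) > 1


def single_instance_roles():
    """Abbreviations of the roles that run as exactly one instance."""
    return [role.abbreviation for role in ROLES if not role.multi_instance]
```

Under this encoding, `is_heterogeneous("Transactional Manager")` and `is_heterogeneous("Storage Engine")` are true, while the query engine is homogeneous; the single-instance roles are CS, SS, and MNG.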