manager requests on-disk free space to store the
amount of data being flushed by the write gather
cache. Based on the array of chunks returned by the
space management layer, the inode manager stores
the metadata either in the row-column intersection of
the base table associated with the object, or in the
most current header block of the SecureFile object.
The metadata information includes start block
address and length of a chunk as well as the start and
end offsets of the object being mapped to the
chunk. The metadata structures are transactionally
managed similar to relational data and are
recoverable after process, session and instance
failures.
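The chunk metadata described above can be pictured as a sorted map from object offsets to on-disk chunks. The following is a minimal sketch only, not Oracle's implementation: the names `ChunkEntry` and `find_chunk`, the 8 KB block size, and the use of inclusive end offsets are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass

BLOCK_SIZE = 8192  # assumed block size in bytes (hypothetical)


@dataclass(frozen=True)
class ChunkEntry:
    start_block: int   # on-disk start block address of the chunk
    length: int        # chunk length in blocks
    start_offset: int  # first object byte offset mapped to the chunk
    end_offset: int    # last object byte offset mapped to the chunk (inclusive, assumed)


def find_chunk(chunk_map, offset):
    """Return the entry covering `offset`, or None.

    `chunk_map` must be sorted by start_offset.
    """
    starts = [c.start_offset for c in chunk_map]
    i = bisect_right(starts, offset) - 1
    if i >= 0 and chunk_map[i].end_offset >= offset:
        return chunk_map[i]
    return None


# Two chunks mapping the first six blocks of an object.
chunk_map = [
    ChunkEntry(start_block=1000, length=4,
               start_offset=0, end_offset=4 * BLOCK_SIZE - 1),
    ChunkEntry(start_block=5000, length=2,
               start_offset=4 * BLOCK_SIZE, end_offset=6 * BLOCK_SIZE - 1),
]
```

Looking up any byte offset of the object then yields the chunk, and hence the disk address, that holds it; offsets past the last mapped byte return `None`.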
2.2.4 Space Management
The space management layer supports allocation of
sets of variable-sized runs of contiguous data blocks,
or chunks, of up to 64M for on-disk storage of SecureFile
objects. With SecureFile objects being cached in the
Write Gather Cache, the space management layer is
able to meet larger space requests from the inode
manager through more contiguous layout on disk,
therefore providing more efficient read and write
access. Although space metadata is managed in-
memory, the metadata changes are consistent across
transactions, instance failures as well as media
failures.
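As an illustration of how a large request can be satisfied with few, large contiguous chunks, the sketch below allocates from a list of free extents, preferring the largest extent and capping each chunk at 64M. It is a simplified model under stated assumptions: the function name `allocate`, the extent representation, the largest-first policy and the 8 KB block size are hypothetical and are not taken from the paper.

```python
MAX_CHUNK_BYTES = 64 * 1024 * 1024  # 64M chunk limit stated in the text
BLOCK_SIZE = 8192                   # assumed block size in bytes (hypothetical)
MAX_CHUNK_BLOCKS = MAX_CHUNK_BYTES // BLOCK_SIZE


def allocate(free_extents, nbytes):
    """Allocate chunks covering `nbytes` from `free_extents`.

    free_extents: list of (start_block, num_blocks) tuples, mutated in place.
    Returns a list of (start_block, num_blocks) chunks.
    Raises RuntimeError, leaving free_extents unchanged, if space is short.
    """
    needed = -(-nbytes // BLOCK_SIZE)  # ceiling division: blocks required
    if sum(n for _, n in free_extents) < needed:
        raise RuntimeError("not enough free space")
    # Largest extents first, so the request is met with few, large chunks.
    free_extents.sort(key=lambda e: e[1], reverse=True)
    chunks = []
    while needed > 0:
        start, size = free_extents.pop(0)
        take = min(size, needed, MAX_CHUNK_BLOCKS)
        chunks.append((start, take))
        if size > take:
            # Keep the remainder at the front so the next chunk stays adjacent.
            free_extents.insert(0, (start + take, size - take))
        needed -= take
    return chunks
```

A 100 MB request against a single large free extent, for example, comes back as one 64M chunk followed by an adjacent chunk for the remainder.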
Operations such as full overwrites / rewrites,
updates and deletes in SecureFiles follow ‘copy-on-
write’ semantics resulting in de-allocation of space
previously occupied by the offsets affected by the
operation. Space freed during de-allocation is
retained for a certain period of time before it is
reused, in order to preserve read consistency for
queries.
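The retention rule for freed space can be sketched as a free list with a holding queue: a freed chunk becomes reusable only once the retention period has elapsed. This is an illustrative model only. The class name `RetainingFreeList`, the explicit `now` timestamps and the first-in, first-out reuse order are assumptions, not details given in the paper.

```python
from collections import deque


class RetainingFreeList:
    """Free list that holds freed chunks for `retention` time units."""

    def __init__(self, retention):
        self.retention = retention
        self.pending = deque()  # (freed_at, chunk), oldest first
        self.reusable = []      # chunks whose retention period has elapsed

    def free(self, chunk, now):
        # Copy-on-write left this chunk unreferenced by the current version,
        # but older queries may still need to read it.
        self.pending.append((now, chunk))

    def expire(self, now):
        # Move chunks whose retention period has passed to the reusable list.
        while self.pending and now - self.pending[0][0] >= self.retention:
            self.reusable.append(self.pending.popleft()[1])

    def take(self, now):
        """Return a reusable chunk, or None if none has aged out yet."""
        self.expire(now)
        return self.reusable.pop(0) if self.reusable else None
```

With a retention of 10 time units, a chunk freed at time 0 cannot be handed out at time 5 but can be at time 10. The sketch assumes `now` never moves backwards.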
2.2.5 I/O Management
During writes, the Inode Manager communicates the
set of chunks obtained from the space layer as well
as the write gather cache buffers to the I/O Manager.
Based on a user parameter, the I/O Manager either
copies the write gather cache buffers to database
cache buffers or schedules asynchronous disk writes
for the set of chunks.
The I/O Manager supports read-ahead or pre-
fetching data from disk. It keeps track of access
patterns of SecureFile objects and prefetches
chunks before the request is actually made. Read
latency is reduced by overlapping network transfer
with storage I/O.
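One simple way to track access patterns is to detect sequential reads and prefetch the next few chunks once a run is long enough. The sketch below shows that idea only. The paper does not describe the actual heuristic, so the class name `ReadAhead`, the `window` and `threshold` parameters, and the sequential-run policy are all assumptions.

```python
class ReadAhead:
    """Sequential-access detector that suggests chunks to prefetch."""

    def __init__(self, window=2, threshold=2):
        self.window = window        # how many chunks to prefetch ahead
        self.threshold = threshold  # sequential reads needed to trigger
        self.last = None            # index of the previously read chunk
        self.run = 0                # length of the current sequential run

    def on_read(self, chunk_index):
        """Record a read and return the chunk indexes to prefetch."""
        if self.last is not None and chunk_index == self.last + 1:
            self.run += 1
        else:
            self.run = 1  # pattern broken: start a new run
        self.last = chunk_index
        if self.run >= self.threshold:
            first = chunk_index + 1
            return list(range(first, first + self.window))
        return []
```

A caller would issue asynchronous reads for the returned indexes so that the data is already in memory when the client requests it; a random access resets the run and suppresses prefetching.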
3 FEATURES
Because SecureFile objects are stored as first-class
objects within the database, Oracle SecureFiles
inherits most of the data management features of the
Oracle database server, such as transaction support,
read consistency and data durability, that
traditional filesystems do not provide.
3.1 Transactions and Read Consistency
Oracle SecureFiles is a transactional data store.
Operations of Oracle SecureFiles generate undo
records for relational data as well as metadata
operations in the delta update, inode and space
management components. SecureFile objects
follow 'copy-on-write' semantics on data
manipulation operations, which removes the
need to store previous images for rollback
purposes. The same 'copy-on-write' semantics
provide read consistency by allowing SecureFile
segments to retain previous versions of SecureFile
objects for a certain period.
A query on a SecureFile object issued at a point in
time within the retention period is guaranteed to
return the consistent version of the object as of
that point in time.
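The read-consistency guarantee can be illustrated with a versioned object in which every write adds a new version and a query selects the latest version committed at or before its point in time. This is a conceptual sketch, not the SecureFiles implementation: the class `VersionedObject`, the integer timestamps and the error behaviour are assumptions, and pruning of expired versions is omitted.

```python
from bisect import bisect_right


class VersionedObject:
    """Copy-on-write object that keeps earlier versions for point-in-time reads."""

    def __init__(self, retention):
        self.retention = retention
        self.times = []     # commit times, in ascending order
        self.versions = []  # versions[i] was committed at times[i]

    def write(self, data, now):
        # Copy-on-write: append a new version, leave older ones untouched.
        self.times.append(now)
        self.versions.append(data)

    def read_as_of(self, query_time, now):
        """Return the version that was current at `query_time`."""
        if now - query_time > self.retention:
            raise LookupError("query time is outside the retention period")
        i = bisect_right(self.times, query_time) - 1
        if i < 0:
            raise LookupError("object did not exist at query time")
        return self.versions[i]
```

A query issued as of time 15 against an object written at times 10 and 20 sees the first version, even if it runs after the second write has committed.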
3.2 Data Durability
The Oracle SecureFiles design supports a range
of data durability options. Users can choose either
to stage writes of SecureFile object buffers in the
database buffer cache or to write SecureFile object
buffers directly to the underlying storage. Direct
writes prevent
pollution of buffer cache for large I/Os. Direct write
operations can also be logged for media recovery
purposes. The accompanying relational data, inode
metadata and on-disk space metadata changes
modify Oracle data blocks in the buffer cache itself
and are logged in Oracle Redo logs.
4 PERFORMANCE EVALUATION
The evaluation experiment simulates a real-world
DICOM application consisting of digital diagnostic
images accompanied by patient metadata. We
compare the read and write throughput of SecureFiles
to that of an NFSv3 filesystem. In both cases, patient
metadata is stored in the Oracle database. In the
filesystem case, the images are stored on Ext3 file
servers that are accessed using NFSv3. In case of
ICSOFT 2008 - International Conference on Software and Data Technologies