Authors:
Nir Soffer (1) and Erez Waisbard (2)
Affiliations:
(1) IBM, Givataim, Israel; (2) CyberArk, Petach Tikva, Israel
Keyword(s):
Integrity Verification, Hash Functions, Storage Virtualization, Sparse Disks, Parallel Computation.
Abstract:
Verifying the integrity of files during transfer is a fundamental operation, critical to ensuring data reliability and security. It is accomplished by having both the sender and the receiver compute a hash value from the file’s contents and comparing the two values. This process becomes prohibitively slow for large files, even for sparse disk images in which significant portions of the file may be unallocated. We introduce blkhash, the first hash construction tailored specifically for optimizing hash computation performance on sparse disk images. Our approach addresses the inefficiencies of traditional hashing algorithms by significantly reducing the computational overhead associated with unallocated areas within the file. Moreover, blkhash implements a parallel computation strategy that leverages multiple cores, further enhancing efficiency and scalability. We have implemented the blkhash construction and conducted extensive performance evaluations to assess its efficacy. Our results demonstrate remarkable improvements in hash computation speed, outperforming state-of-the-art hash functions by up to four orders of magnitude. This substantial acceleration offers immense potential for use cases requiring rapid verification of large virtual disk images, particularly in virtualization and software-defined storage.
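The abstract does not spell out the construction, so the following is only an illustrative sketch of the general idea it describes, not the actual blkhash algorithm or API. It assumes a block-based scheme: the input is split into fixed-size blocks, each block is hashed independently (which allows parallel work), all-zero blocks reuse a digest computed once, and the block digests are combined into one outer digest. The names `BLOCK_SIZE` and `sparse_aware_hash`, the choice of SHA-256, and the way the length is mixed in are all assumptions made for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical block size, chosen only for this sketch.
BLOCK_SIZE = 64 * 1024


def sparse_aware_hash(data: bytes, workers: int = 4) -> str:
    """Hash `data` block by block, reusing one digest for all-zero blocks.

    Illustrative only: this is not the blkhash construction itself.
    """
    zero_block = bytes(BLOCK_SIZE)
    # Computed once and reused for every full block of zeros.
    zero_digest = hashlib.sha256(zero_block).digest()

    def block_digest(block: bytes) -> bytes:
        # Comparing against a zero buffer is much cheaper than hashing.
        # A short final block never equals zero_block, so it is hashed.
        if block == zero_block:
            return zero_digest
        return hashlib.sha256(block).digest()

    blocks = (data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE))

    outer = hashlib.sha256()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the outer digest
        # does not depend on how the work was scheduled.
        for digest in pool.map(block_digest, blocks):
            outer.update(digest)

    # Mix in the total length so inputs of different sizes differ.
    outer.update(len(data).to_bytes(8, "little"))
    return outer.hexdigest()
```

In this sketch the zero blocks are still read and compared in memory, so it only saves hashing work. A tool operating on real sparse images could presumably learn which extents are unallocated from the file system or storage layer and avoid reading them at all.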