Authors: Qiang Xue 1; Sakti Pramanik 1; Gang Qian 2 and Qiang Zhu 3
        
        
Affiliations: 1 Michigan State University, United States; 2 University of Central Oklahoma, United States; 3 The University of Michigan, United States
        
        
        
        
        
Keyword(s): Hybrid Digital tree, indexing, string databases, prefix searches, substring searches.
        
        
            
Related Ontology Subjects/Areas/Topics: Coupling and Integrating Heterogeneous Data Sources; Databases and Information Systems Integration; Enterprise Information Systems
                    
            
        
        
            
Abstract: There is an increasing demand for efficient indexing techniques to support queries on large string databases. In this paper, a hybrid RAM/disk-based index structure, called the Hybrid Digital tree (HD-tree), is proposed. The HD-tree keeps internal nodes in RAM to minimize the number of disk I/Os, while maintaining leaf nodes on disk to maximize the tree's ability to index large databases. Experimental results using real data show that the HD-tree outperforms the Prefix B-tree for prefix and substring searches. In particular, for distinctive random queries in the experiments, the average number of disk I/Os was reduced by a factor of two to three, while the running time was reduced by an order of magnitude.
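
The abstract does not describe the HD-tree's node layout, so the sketch below only illustrates the general idea it states: internal nodes held in RAM, leaf nodes fetched from disk, and a prefix search that pays one I/O per leaf it touches. Everything here is an assumption made for illustration and not the authors' design: the names `SimulatedDisk` and `ToyHybridTrie`, the fixed `depth` parameter, and the use of a Python dict to stand in for disk pages. Substring search is not covered.

```python
class SimulatedDisk:
    """Hypothetical stand-in for disk pages; counts reads to expose I/O cost."""

    def __init__(self):
        self.pages = {}
        self.reads = 0

    def write(self, page_id, strings):
        self.pages[page_id] = sorted(strings)

    def read(self, page_id):
        self.reads += 1  # every leaf fetch counts as one disk I/O
        return self.pages[page_id]


class ToyHybridTrie:
    """Toy index: an in-memory trie over the first `depth` characters,
    with each leaf bucket stored on the simulated disk (an assumption,
    not the HD-tree's actual structure)."""

    def __init__(self, strings, depth=2):
        self.depth = depth
        self.disk = SimulatedDisk()
        self.root = {}  # internal nodes: nested dicts kept in RAM

        # Group strings by their first `depth` characters.
        buckets = {}
        for s in strings:
            buckets.setdefault(s[:depth], []).append(s)

        # Build the in-memory path for each group and write its bucket to disk.
        for page_id, (key, bucket) in enumerate(sorted(buckets.items())):
            node = self.root
            for ch in key:
                node = node.setdefault(ch, {})
            node[None] = page_id  # leaf pointer stored under the None key
            self.disk.write(page_id, bucket)

    def prefix_search(self, prefix):
        # Walk internal nodes in RAM; this part costs no disk I/O.
        node = self.root
        for ch in prefix[:self.depth]:
            node = node.get(ch)
            if node is None:
                return []

        # Read only the leaf pages reachable below the matched node.
        results = []
        stack = [node]
        while stack:
            current = stack.pop()
            for key, child in current.items():
                if key is None:
                    page = self.disk.read(child)
                    results.extend(s for s in page if s.startswith(prefix))
                else:
                    stack.append(child)
        return sorted(results)


words = ["apple", "apply", "apt", "banana", "band", "bandit", "cat"]
index = ToyHybridTrie(words, depth=2)
hits = index.prefix_search("ban")
print(hits, "disk reads:", index.disk.reads)
```

In this toy setup, a prefix at least `depth` characters long resolves to a single leaf page, so the in-memory walk decides which page to read before any I/O happens. That is the effect the abstract attributes to keeping internal nodes in RAM, though the real HD-tree may organize its nodes quite differently.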