cess operates in a binary reduction fashion. The total number of
steps involved in the registration process is bounded by
$\log_2(\max(O1, O2))$, where $\max(O1, O2)$ denotes the maximum of
O1 and O2. In this case, the number of steps in the reduction process
is $\log_2(8) = 3$.
The figure depicts a three-step reduction operation.
In Step 1, four sets of operations occur in parallel (refer to
Figure 3(b)). Specifically, entries 0 and 1 participate in the first
set of comparisons. The similarity pattern corresponds to type 1,
which is marked on the arch from source to sink.
The second set of operations comprises entries 2 and 3. Note that
both entries are unshaded, which indicates a similarity pattern of
type 0, as marked on the arch from source to sink. The third and
fourth sets of operations comprise entries 4 and 5, and 6 and 7,
respectively, and their similarity patterns correspond to type 1 and
type 0, respectively. After Step 1, the pattern table is updated with
the similarity pattern (1, 0, 1, 0).
In Step 2, two sets of operations occur in parallel. In the first
set, entries 0-1 form the sink and entries 2-3 form the source. The
similarity pattern corresponds to type 2, as marked on the arch from
source to sink. In the second set, the participating entries are 4-7,
wherein entries 4-5 form the sink and entries 6-7 form the source.
The similarity pattern corresponds to type 2, as marked on the arch.
After Step 2, the pattern table is updated with the similarity
pattern (2, 2).
In Step 3, one set of operations takes place, in which all the
entries participate. Entries 0-3 form the sink and entries 4-7 form
the source. The similarity pattern corresponds to type 3, which is
marked on the arch from source to sink. After Step 3, the pattern
table is updated with the similarity pattern (3). With this step, the
final pattern table is as follows: {(1, 0, 1, 0), (2, 2), (3)}.
Listing 1 depicts the methodology to register the
similarity distribution pattern.
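
The three steps above can be traced with a short sequential sketch. It is written in host-side C rather than as a GPU kernel, and it rests on assumptions that are not stated in the text: the input is a hypothetical array `mismatch` with one flag per entry (nonzero where O1 and O2 differ), the type numbering is inferred from the branches of Listing 1 (0: sink and source both match, 1: only the source differs, 2: only the sink differs, 3: both differ), and a merged block is treated as differing whenever its type is nonzero. The names `reduction_steps`, `pattern_type`, and `register_pattern`, as well as the size limits, are illustrative only.

```c
#include <assert.h>

#define MAX_ENTRIES 64  /* assumed upper bound on entries for this sketch */
#define MAX_STEPS   6   /* log2(MAX_ENTRIES) */

/* Number of reduction steps for n entries; n is assumed to be a power of two. */
static int reduction_steps(int n)
{
    int steps = 0;
    assert(n >= 2 && n <= MAX_ENTRIES && (n & (n - 1)) == 0);
    while (n > 1) {
        n /= 2;
        steps++;
    }
    return steps;
}

/* Type of one comparison between a sink (lower block) and a source (upper
 * block): 0 both match, 1 only source differs, 2 only sink differs,
 * 3 both differ. This numbering is inferred, not quoted from the paper. */
static int pattern_type(int sink_differs, int source_differs)
{
    return (sink_differs ? 2 : 0) + (source_differs ? 1 : 0);
}

/* Fills table[step][k] with the type of the k-th comparison of each step
 * and returns the number of steps performed. */
static int register_pattern(const int *mismatch, int n,
                            int table[MAX_STEPS][MAX_ENTRIES / 2])
{
    int differs[MAX_ENTRIES];
    int steps = reduction_steps(n);
    int width = n;

    for (int e = 0; e < n; e++)
        differs[e] = (mismatch[e] != 0);

    for (int s = 0; s < steps; s++) {
        width /= 2;  /* each step halves the number of comparisons */
        for (int k = 0; k < width; k++) {
            int type = pattern_type(differs[2 * k], differs[2 * k + 1]);
            table[s][k] = type;
            differs[k] = (type != 0);  /* merged block feeds the next step */
        }
    }
    return steps;
}
```

With the hypothetical flags {0, 1, 0, 0, 0, 1, 0, 0}, i.e., only entries 1 and 5 differ, this sketch produces the rows (1, 0, 1, 0), (2, 2), and (3), which agrees with the pattern table derived above. It does not model the compaction of `XORMatch` and `sizeAT` performed in Listing 1.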
Classification of Similarity Patterns. Next, we discuss some of the
common similarity patterns from a delta encoding perspective.
Intuitively, if O1 and O2 are identical, then the similarity pattern
consists of all 0’s and is denoted as
{(0, 0, 0, 0), (0, 0), (0)}.
If O1 and O2 are completely different, then the similarity pattern
consists of all 3’s and is denoted as
{(3, 3, 3, 3), (3, 3), (3)}.
Consider another similarity pattern {(0, 0, X, X), (0, X), (X)},
where X=(1, 2, 3). This similarity pattern represents a scenario
where the first half of O1 is identical to the first half of O2. Note
that this follows from observing the similarity pattern emerging out
of the first step, which is {(0, 0, X, X)}.
Likewise, another similarity pattern {(X, X, 0, 0), (X, 0), (X)},
where X=(1, 2, 3), indicates that the bottom halves of O1 and O2 are
identical.
Such patterns indicate that there exists a significant amount of
contiguity in the similarity. Contiguous similarity patterns
represent scenarios that are highly favorable for delta encoding,
because the overhead resulting from delta encoding is limited by the
contiguous nature of the similarity (or dissimilarity) in the objects
being considered. Such favorable patterns are also referred to as
positive patterns.
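
The contiguous cases described above can be recognized from the first-step row of the pattern table alone. The following C sketch is one possible way to do so; the enum, the function names, and the choice to inspect only the first row are assumptions made for illustration and are not taken from the paper.

```c
#include <assert.h>

/* Hypothetical labels for the cases discussed in the text. */
enum pattern_class {
    PATTERN_IDENTICAL,              /* first-step row is all 0's       */
    PATTERN_COMPLETELY_DIFFERENT,   /* first-step row is all 3's       */
    PATTERN_FIRST_HALF_IDENTICAL,   /* e.g. (0, 0, X, X)               */
    PATTERN_SECOND_HALF_IDENTICAL,  /* e.g. (X, X, 0, 0)               */
    PATTERN_OTHER
};

/* Returns 1 when row[begin..end-1] all equal value, otherwise 0. */
static int all_equal(const int *row, int begin, int end, int value)
{
    for (int k = begin; k < end; k++)
        if (row[k] != value)
            return 0;
    return 1;
}

/* Classifies the first-step row, which holds m comparison types (m even). */
static enum pattern_class classify_first_step(const int *row, int m)
{
    assert(m >= 2 && m % 2 == 0);
    if (all_equal(row, 0, m, 0))
        return PATTERN_IDENTICAL;
    if (all_equal(row, 0, m, 3))
        return PATTERN_COMPLETELY_DIFFERENT;
    if (all_equal(row, 0, m / 2, 0))
        return PATTERN_FIRST_HALF_IDENTICAL;
    if (all_equal(row, m / 2, m, 0))
        return PATTERN_SECOND_HALF_IDENTICAL;
    return PATTERN_OTHER;
}
```

For example, the row (0, 0, 1, 3) would be reported as having an identical first half, whereas the row (1, 0, 1, 0) from the worked example falls into the remaining category.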
Listing 1: Register-Pattern-GPU.
{
    int step;
    int maxSteps = (int)log2((float)max(O1, O2));  // number of reduction steps
    int maxThreads = max(O1, O2) / 2;              // comparisons in the first step
    int i, j, TID;
    int Start;
    if (blockIdx.x < num_BLOCKS) {
        if (threadIdx.x < maxThreads) {
            for (i = 0; i < maxSteps; i++) {
                if (threadIdx.x < maxThreads) {
                    TID = threadIdx.x;
                    if (XORMatch[TID + 1] == 0) {
                        if (XORMatch[TID] == 0) {
                            Sim_Pattern[i][TID] = 0;  // both entries match
                        } else {
                            Sim_Pattern[i][TID] = 2;  // only entry TID differs
                        }
                    } else {
                        Start = 0;
                        if (XORMatch[TID] == 0) {
                            Sim_Pattern[i][TID] = 1;  // only entry TID + 1 differs
                            Start = 0;
                        } else {
                            Sim_Pattern[i][TID] = 3;  // both entries differ
                            Start = sizeAT[TID];
                            // shift the sizeAT[TID + 1] following entries down by one
                            for (j = 0; j < sizeAT[TID + 1]; j++) {
                                XORMatch[TID + j] = XORMatch[(TID + 1) + j];
                            }
                            sizeAT[TID] += sizeAT[TID + 1];
                        }
                    }
                }
                maxThreads = maxThreads / 2;  // half as many comparisons next step
                __threadfence_system();       // publish writes before the next step
            }
        }
    }
    return;
}
Consider a pattern such as {(Y, Y, Y, Y), (3, 3), (3)}, where
Y=(1, 2), which indicates that the similarity is non-contiguous. Such
patterns represent worst-case scenarios and are unfavorable for delta
encoding from an overhead perspective. Such unfavorable patterns are
also referred to as negative patterns.
We argue that negative (unfavorable) patterns are not likely to
benefit the cause of delta encoding. This is due to the diverging
nature of the overhead accruing. The task of finding a delta encoding
for objects that exhibit such negative pattern similarity should be
abandoned in favor of other, more meaningful computations.
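
As a rough illustration of how such a decision might be automated, the C sketch below tests whether a pattern table has the negative shape {(Y, ..., Y), (3, ...), ..., (3)} with Y equal to 1 or 2. The flat storage layout (rows concatenated from the first step to the last), the function names, and the use of this test as the sole abandonment criterion are assumptions of the sketch, not something the paper specifies.

```c
#include <assert.h>

/* flat holds the rows of the pattern table back to back: n/2 types for
 * step 1, n/4 for step 2, and so on down to a single type, for a total
 * of n - 1 values. n (number of entries) is assumed to be a power of two. */
static int is_negative_pattern(const int *flat, int n)
{
    assert(n >= 4 && (n & (n - 1)) == 0);
    int first_row = n / 2;

    /* Step 1: every comparison has exactly one differing entry (type 1 or 2). */
    for (int k = 0; k < first_row; k++)
        if (flat[k] != 1 && flat[k] != 2)
            return 0;

    /* All later steps: both sides differ (type 3). */
    for (int k = first_row; k < n - 1; k++)
        if (flat[k] != 3)
            return 0;

    return 1;
}

/* Hypothetical policy: skip delta encoding when the pattern is negative. */
static int should_attempt_delta(const int *flat, int n)
{
    return !is_negative_pattern(flat, n);
}
```

Under these assumptions, the table {(1, 2, 1, 2), (3, 3), (3)} is flagged as negative, while the table {(1, 0, 1, 0), (2, 2), (3)} from the worked example is not.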