also made use of a user-supplied parameter for the maximum allowed length of the motion vector, which enabled us to control the magnitude of independent local motions. A temporary flow field was created and uniformly filled with this vector. The mask of this component only, together with a copy of the ith flow field, was transformed according to this uniform flow. This moved the component mask and prepared the concatenation of the corresponding region of the flow field. The concatenation was finished by pasting this region into the ith flow and adding the chosen flow vector to each vector inside the region. Note that more complex foreground movements may be obtained by substituting any smooth flow field for the uniform one; in that case the corresponding local vectors should be added instead of the constant chosen one. Finally, a new foreground mask was created by merging all the individually moved masks.
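The following C++ fragment sketches this concatenation step for a single component; the data structures and the integer-valued shift are simplifications chosen for illustration only, and the same flow composition also underlies the preparation units described below.

#include <vector>

struct Vec2 { float x, y; };

struct FlowField {
    int w, h;
    std::vector<Vec2> v;                      // dense field, row-major
    Vec2& at(int x, int y) { return v[y * w + x]; }
};

// Concatenate a uniform motion (ux, uy) of one foreground component into
// the ith flow field: the component mask and a copy of the flow are
// shifted by the motion, the shifted flow region is pasted back, and the
// motion vector is added to every vector inside it. An integer shift is
// assumed for brevity; a real implementation would interpolate.
void concatenateUniformMotion(FlowField& flow, std::vector<bool>& mask,
                              int ux, int uy)
{
    FlowField copy = flow;                    // copy of the ith flow
    std::vector<bool> moved(mask.size(), false);
    for (int y = 0; y < flow.h; ++y)
        for (int x = 0; x < flow.w; ++x) {
            int sx = x - ux, sy = y - uy;     // position before the move
            if (sx < 0 || sx >= flow.w || sy < 0 || sy >= flow.h) continue;
            if (!mask[sy * flow.w + sx]) continue;
            moved[y * flow.w + x] = true;     // move the component mask
            Vec2 c = copy.at(sx, sy);         // paste the shifted region...
            flow.at(x, y) = { c.x + ux, c.y + uy };  // ...and add the vector
        }
    mask.swap(moved);                         // mask of the moved component
}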
In the foreground preparation unit, similarly to the background preparation, another special flow field was used. It was again concatenated with the current ith flow, and the result was stored for the next iteration of the framework. A further copy of the sample image was then transformed according to this special flow field in order to position the foreground texture.
In the ith frame unit, the foreground regions were extracted from the transformed copy of the sample image. The extraction was driven by the new foreground mask, which was dilated (extended) beforehand solely for this purpose. Finally, the ith image was completed by the weighted insertion (for details refer to (Ulman, 2005)) of the extracted foreground into the artificially generated background. The weights were computed by thresholding the distance-transformed foreground mask (we used the distance transform of (Saito and Toriwaki, 1994)). An illustration of the whole process is shown in Fig. 4.
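A possible form of this insertion is sketched below in C++; the linear fade controlled by a threshold T is one common choice and stands in for the exact weighting of (Ulman, 2005), which we do not reproduce here.

#include <algorithm>
#include <cstddef>
#include <vector>

// Weighted insertion of the extracted foreground into the artificial
// background: the distance transform of the foreground mask is clipped
// at a threshold T, so the weights fade linearly towards the border.
void insertForeground(std::vector<float>& image,      // background, modified in place
                      const std::vector<float>& fg,   // extracted foreground
                      const std::vector<float>& dist, // distance transform of the mask
                      float T)                        // fade width in pixels
{
    for (std::size_t i = 0; i < image.size(); ++i) {
        float w = std::min(dist[i] / T, 1.0f);        // 0 at border, 1 deep inside
        image[i] = w * fg[i] + (1.0f - w) * image[i];
    }
}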
4 RESULTS
We implemented and tested the presented framework in C++, in two versions. The first version created only image pairs, while the second created arbitrarily long image sequences. The first version was implemented with both backward and forward transformations. We observed that for 2D images the forward variant was up to two orders of magnitude slower than the backward variant. Therefore, the second version was based solely on the backward transformation. The program required less than 5 minutes on a Pentium 4 at 2.6 GHz to compute a sequence of 50 images with 10 independent foreground regions.
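This gap is not surprising: the backward transformation visits every output pixel exactly once and interpolates in the source image, whereas the forward variant has to splat source pixels and subsequently fill holes. The following sketch of a backward 2D warp with bilinear interpolation illustrates the former; it is our own simplified illustration, not the actual implementation.

#include <cmath>
#include <vector>

// Backward transformation: each output pixel reads its source position
// from the flow field and interpolates bilinearly; out-of-range pixels
// are simply zeroed.
void backwardWarp(const std::vector<float>& src, std::vector<float>& dst,
                  const std::vector<float>& fx, const std::vector<float>& fy,
                  int w, int h)
{
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float sx = x + fx[y * w + x];     // source position under the flow
            float sy = y + fy[y * w + x];
            int x0 = (int)std::floor(sx), y0 = (int)std::floor(sy);
            if (x0 < 0 || y0 < 0 || x0 + 1 >= w || y0 + 1 >= h) {
                dst[y * w + x] = 0.0f;
                continue;
            }
            float ax = sx - x0, ay = sy - y0; // bilinear weights
            const float* s = &src[y0 * w + x0];
            dst[y * w + x] = (1 - ay) * ((1 - ax) * s[0] + ax * s[1])
                           +      ay  * ((1 - ax) * s[w] + ax * s[w + 1]);
        }
}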
The generator was tested on several different 2D real-world images and on one real-world 3D image. All generated images were inspected. The framework generates the last image of each sequence as a replacement for the sample image. Thus, we computed the correlation coefficient (Corr.), the average absolute difference (Avg. diff.) and the root mean squared (RMS) difference between the two. The results are summarized in Table 1. The generator achieved a minimal correlation value of 0.98. This quantitatively supports our observation that the generated images are very close to their originals. The suggested framework also guarantees exactly one transformation of the sample image; hence, the quality of the foreground texture is the best possible throughout the sequence. Refer to Fig. 5 for an example of 3 images from a 50-image-long sequence. A noticeable improvement was also observed when the artificial background of the 3D image was formed in a slice-by-slice manner, see rows C and D in Table 1. In the case of row D, separate random pools and mean values were used for each slice of the 3D image.
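For reference, the three measures are defined in the usual way: for the generated image g, the sample image s, their mean intensities and N pixels x,

\[
\mathrm{Corr} = \frac{\sum_x \bigl(g(x)-\bar g\bigr)\bigl(s(x)-\bar s\bigr)}
                     {\sqrt{\sum_x \bigl(g(x)-\bar g\bigr)^2}\,
                      \sqrt{\sum_x \bigl(s(x)-\bar s\bigr)^2}}, \quad
\text{Avg.\ diff} = \frac{1}{N}\sum_x \bigl|g(x)-s(x)\bigr|, \quad
\mathrm{RMS} = \sqrt{\frac{1}{N}\sum_x \bigl(g(x)-s(x)\bigr)^2}.
\]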
An inappropriately created foreground mask may emphasize the borders of the extracted foreground when it is inserted into the artificial background. The weighted foreground insertion was observed to give visually better results. Table 1 quantitatively supports this claim: merging the foreground components according to a twice-dilated foreground mask was comparable to plainly overlaying the foreground components according to unmodified masks.
The use of a user-supplied movement mask prevented the foreground components from moving into regions where they were not supposed to appear, e.g., outside the cell. Such masks are simple to create, for example by extending the foreground mask in the demanded directions. The generated sequences then became even more realistic. Nevertheless, the randomness of the components' movements prevented their movements from being consistent over time. Pre-programming the movements would enable this consistency; clearly, the movement mask would not be necessary in that case.
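One simple way to enforce such a mask is to reject any candidate motion vector that would push a pixel of the shifted component outside the movement mask; random candidates are drawn until one passes the test. The sketch below is purely illustrative, as the exact test is not specified here.

#include <vector>

// Returns true if shifting the component mask by (ux, uy) keeps every
// component pixel inside the user-supplied movement mask.
bool moveAllowed(const std::vector<bool>& component,
                 const std::vector<bool>& movementMask,
                 int w, int h, int ux, int uy)
{
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            if (!component[y * w + x]) continue;
            int nx = x + ux, ny = y + uy;
            if (nx < 0 || nx >= w || ny < 0 || ny >= h) return false;
            if (!movementMask[ny * w + nx]) return false;
        }
    return true;
}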
5 CONCLUSION
We have described a framework for generating time-lapse pseudo-real images together with unbiased flow fields. The aim was to automatically generate large datasets that allow for automatic evaluation of methods for optical flow computation. However, one may also discard the generated flow fields and use just the image sequence.
The framework allows for the synthesis of 2D and 3D image sequences of arbitrary length. By supplying a real-world sample image and carefully created masks for the foreground and background, we could force im-