Virtual Flattening of a Clothing Surface by Integrating Geodesic Distances from Different Three-dimensional Views

Yasuyo Kita and Nobuyuki Kita

Intelligent Systems Research Institute, National Institute of Advanced Industrial Science and Technology (AIST),

Tsukuba, Japan

Keywords:

Geodesic Distance, Virtual Flattening, Clothes Handling, Recognition of Deformable Objects, Robot Vision.

Abstract:

We propose a method of virtually flattening a largely deformed surface using three-dimensional images taken from different directions. In a previous paper (Kita and Kita, 2016), we proposed a method of virtually flattening a surface from a 3D depth image by calculating geodesic lines, which are the shortest paths between two points on an arbitrary curved surface. Although that work showed the promise of the approach, only gently curved surfaces could be flattened because the surface was observed from a single direction. To apply the method to a wider range of surfaces, including sharply curved ones, we extend it to integrate three-dimensional depth images taken from different directions. This is done by combining the equations obtained from each observation through the surface points observed in common in different observations, and by solving all the equations simultaneously. Experiments using actual clothing items demonstrate the effect of the integration.

1 INTRODUCTION

Robots need to act more flexibly according to circumstances when their activities are extended from factories to the daily lives of people. To that end, computer vision plays an important role in recognizing various objects. In particular, the recognition of deformable everyday objects presents difficulties that differ from those presented by the rigid objects that have long been studied in factory automation. The recognition of clothing items for the automatic handling of clothing is a typical example.

The complex self-occlusion that accompanies the large deformation of clothing items makes the task of recognizing the items challenging. Figure 1 shows a clothing item being handled by a robot. It is not easy to determine the clothing type (e.g., trousers) or to detect the best position to grasp next (e.g., the corner of the waist) from such deformed shapes. Therefore, many studies on clothing recognition aimed at automatic handling have tried to first spread the clothing item to reduce the level of self-occlusion (F. Osawa and Kamiya, 2007) (Hu and Kita, 2015) (D. Triantafyllou and Aspragathos, 2016) (A. Doumanoglou, 2014). However, selecting proper positions to grasp for good spreading is another difficult recognition problem.

Figure 1: Our goal of obtaining the clothing shape flattened on a two-dimensional plane from the observation of a three-dimensionally deformed clothing item.

Meanwhile, a person can often imagine the flattened shape from a three-dimensionally deformed shape. Our aim is to realize this virtual flattening function by transforming the three-dimensional (3D) surface into a two-dimensional (2D) shape, as shown in Fig. 1. Hereafter, we refer to the shape after virtual flattening as the flattened view. In a previous paper (Kita and Kita, 2016), we proposed a method of virtually flattening a clothing surface onto a 2D plane from a 3D depth image of the surface. The method formulates the flattening of the observed 3D surface as a problem of solving simultaneous equations given by sets of geodesic distances, where a geodesic distance is the length of the shortest path between two points on an arbitrary curved surface. However, because only one depth image was considered as the input in that paper, the target was limited to cases in which the whole surface can be observed from one direction. In reality, most clothing items can be seen only partially when they are held in the air by one hand, and so do not satisfy this condition. The present paper extends the method to virtually flatten a sharply curved surface by integrating views taken from different directions.

Kita, Y. and Kita, N.
Virtual Flattening of a Clothing Surface by Integrating Geodesic Distances from Different Three-dimensional Views.
DOI: 10.5220/0007410305410547
In Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2019), pages 541-547
ISBN: 978-989-758-354-4

Section 2 introduces the basic flattening method proposed in the previous paper. Section 3 explains a method of integrating partial observations of a deformed surface to flatten the whole surface. Experimental results using actual clothing items are presented in Section 4. Finally, the results and future topics are discussed in Section 5.

2 BASIC FLATTENING METHOD

2.1 Calculation of the Geodesic Distance

Many methods of calculating the geodesic line use finite element meshes (P. Bose, 2011) (Zhong and Xu, 2006) or a voxel representation (R. Grossmann and Kimmel, 2002). However, both mesh-based and voxel-based methods assume uniformly dense 3D data of objects, which is not always available for observation data in the real world. To calculate a geodesic line directly from 3D point clouds obtained by a range sensor or stereo cameras, we adopt an approach proposed by Kawashima et al. (T. Kawashima, 1999) that calculates geodesic lines in a mesh-free way.

To calculate the geodesic line between two points on a surface, $P_s$ and $P_e$, we set $M$ nodes on the surface to represent the line $P_sP_e = p_1 \ldots p_M$, where $p_1 = P_s$, $p_M = P_e$ and $p_i = (x_i, y_i, z_i)$, as shown in Fig. 2. The problem of obtaining the geodesic distance of $P_sP_e$ can be set as minimizing $L_{total}$:

$$L_{total} = \sum_{i=1}^{M-1} L_i, \tag{1}$$

$$L_i = \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2 + (z_{i+1} - z_i)^2}.$$

We here represent the surface as $z(\mathbf{x})$, $\mathbf{x} = (x, y)$. To solve this minimization problem, $z(\mathbf{x})$ must be determined continuously at any place on the surface from the discrete observed 3D points of the surface. For this purpose, the surface around the position $\mathbf{x} = (x, y)$ is approximated by a local surface function with continuous partial derivatives using the element-free Galerkin method (T. Belytschko and Gu, 1994). Specifically, $z(\mathbf{x})$ is approximated by a polynomial function:

$$z(\mathbf{x}) = \mathbf{P}^T(\mathbf{x})\,\mathbf{a}(\mathbf{x}), \tag{2}$$

$$\mathbf{P}^T(\mathbf{x}) = (1, x, y), \quad \mathbf{a}^T(\mathbf{x}) = (a_0(\mathbf{x}), a_1(\mathbf{x}), a_2(\mathbf{x})).$$

Figure 2: Direct calculation of the geodesic line from a 3D point cloud through surface interpolation.

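The coefficient vector $\mathbf{a}(\mathbf{x})$ is locally determined at each $\mathbf{x}$ by minimizing the weighted difference of $z(\mathbf{x})$ from the observed points in the vicinity of $\mathbf{x}$:

$$J = \sum_{l=1}^{N_{\mathbf{x}}} w(r_l)\,\bigl(z(\mathbf{x}_l) - z_l\bigr)^2, \tag{3}$$

$$r_l = |\mathbf{x} - \mathbf{x}_l|,$$

where $w(r)$ is a weight function defined by the distance between the target point, $\mathbf{x}$, and the observed points $\mathbf{x}_l$ $(l = 1, \ldots, N_{\mathbf{x}})$ that lie within a fixed distance of $\mathbf{x}$. A fourth-order spline function is used as the weight function in this paper.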

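The depth and normal at $\mathbf{x}$ are calculated using the resultant $\mathbf{a}(\mathbf{x})$:

$$z(\mathbf{x}) = a_0(\mathbf{x}) + a_1(\mathbf{x})\,x + a_2(\mathbf{x})\,y, \tag{4}$$

$$\mathbf{n}(\mathbf{x}) = \bigl(-a_1(\mathbf{x})/D,\; -a_2(\mathbf{x})/D,\; 1/D\bigr), \tag{5}$$

$$D = \sqrt{a_1(\mathbf{x})^2 + a_2(\mathbf{x})^2 + 1}.$$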
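The local fit of Eqs. (2) to (5) can be illustrated with a minimal sketch that solves the weighted least-squares problem at a single query position. The function names, the support radius parameter `r_max`, the coefficient indexing `a[0], a[1], a[2]` and the particular quartic spline $w(s) = 1 - 6s^2 + 8s^3 - 3s^4$ are assumptions made for this example; the text states only that a fourth-order spline is used as the weight.

```python
import numpy as np

def quartic_spline_weight(r, r_max):
    """Quartic spline weight: 1 at r = 0, falling smoothly to 0 at r = r_max (assumed form)."""
    s = np.clip(r / r_max, 0.0, 1.0)
    return 1.0 - 6.0 * s**2 + 8.0 * s**3 - 3.0 * s**4

def mls_plane(x, y, points, r_max):
    """Fit z = a0 + a1*x + a2*y around (x, y) by weighted least squares.

    points: (N, 3) array of observed (x_l, y_l, z_l).
    Returns the coefficients a, the interpolated depth z(x, y) and a unit normal.
    """
    r = np.hypot(points[:, 0] - x, points[:, 1] - y)
    near = r < r_max                               # observed points inside the support
    w = quartic_spline_weight(r[near], r_max)
    basis = np.column_stack([np.ones(near.sum()), points[near, 0], points[near, 1]])
    # Normal equations of J = sum_l w_l * (basis_l . a - z_l)^2
    lhs = basis.T @ (w[:, None] * basis)
    rhs = basis.T @ (w * points[near, 2])
    a = np.linalg.solve(lhs, rhs)
    z = a[0] + a[1] * x + a[2] * y
    d = np.sqrt(a[1]**2 + a[2]**2 + 1.0)
    normal = np.array([-a[1], -a[2], 1.0]) / d     # upward unit normal of the fitted plane
    return a, z, normal
```

The solve fails if fewer than three non-collinear points fall inside the support, so in practice `r_max` would have to be chosen relative to the point density.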

Figure 2 shows an example of synthetic observed points sampled from the surface $z = 25 - (x - 5)^2$ (red crosses). The green points in Fig. 2 show an example of surface interpolation along the line $y = -x + 10$.

Because $L_{total}$ includes the term $z(\mathbf{x})$ determined using Eq. (2), it is a complicated function of $\mathbf{x}$. To minimize this function stably, we use a zero-length spring analogy. Specifically, we assume that each segment $p_ip_{i+1}$ is a spring connecting two nodes, $p_i$ and $p_{i+1}$, with a basis length (i.e., the natural length when no force is applied) of zero and a spring constant $k$. The total length of all springs, $L_{total}$, is a minimum when the spring system is at equilibrium.

VISAPP 2019 - 14th International Conference on Computer Vision Theory and Applications


Because each node is constrained to move on the surface, when a force $\mathbf{F}$ is exerted on a node, $p_i$, only the component of $\mathbf{F}$ along the surface, $\tilde{\mathbf{F}}$, affects the movement of $p_i$:

$$\tilde{\mathbf{F}} = \mathbf{F} - (\mathbf{F} \cdot \mathbf{n}_i)\,\mathbf{n}_i, \tag{6}$$

where $\mathbf{n}_i = (n_{x_i}, n_{y_i}, n_{z_i})$ is a unit vector in the normal direction at $p_i$. The blue points in Fig. 2 show the geodesic line obtained after the convergence of the successive approximation, using the green line as the initial position.
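The zero-length spring relaxation with the tangential projection of Eq. (6) can be sketched as follows. To keep the example self-contained and checkable, it uses an analytic half-cylinder as a hypothetical test surface (whose geodesics are known in closed form) rather than the interpolated surface of Eq. (2); the function names, step size and iteration count are also assumptions of this sketch.

```python
import numpy as np

R = 5.0  # radius of the hypothetical half-cylinder z = sqrt(R^2 - (x - 5)^2)

def surface_z(x, y):
    return np.sqrt(R**2 - (x - 5.0)**2)

def surface_normal(x, y):
    """Unit normals (-dz/dx, -dz/dy, 1) / D of the test surface at arrays x, y."""
    dzdx = -(x - 5.0) / surface_z(x, y)
    n = np.stack([-dzdx, np.zeros_like(x), np.ones_like(x)], axis=1)
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def polyline_length(p):
    return np.linalg.norm(np.diff(p, axis=0), axis=1).sum()

def relax_geodesic(start_xy, end_xy, n_nodes=40, k=1.0, step=0.25, n_iter=6000):
    """Relax a chain of zero-length springs whose nodes are constrained to the surface."""
    t = np.linspace(0.0, 1.0, n_nodes)
    xy = np.outer(1.0 - t, start_xy) + np.outer(t, end_xy)   # straight line in (x, y)
    p = np.column_stack([xy, surface_z(xy[:, 0], xy[:, 1])])  # lifted onto the surface
    initial = p.copy()
    for _ in range(n_iter):
        inner = p[1:-1]                                       # end nodes stay fixed
        force = k * (p[:-2] - inner) + k * (p[2:] - inner)    # pull of both neighbours
        n = surface_normal(inner[:, 0], inner[:, 1])
        tangential = force - np.sum(force * n, axis=1, keepdims=True) * n  # Eq. (6)
        new_xy = inner[:, :2] + step * tangential[:, :2]
        p[1:-1, :2] = new_xy
        p[1:-1, 2] = surface_z(new_xy[:, 0], new_xy[:, 1])    # put nodes back on the surface
    return initial, p
```

Calling `relax_geodesic(np.array([0.5, 0.0]), np.array([9.5, 10.0]))` returns the initial and relaxed node chains; their lengths can be compared with `polyline_length`.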

2.2 Calculation of the Flattened View

We assume that a clothing surface can be flattened onto a 2D plane, $(u, v)$. If we then consider $N$ points on the clothing surface, $P_i(x_i, y_i, z_i)$, $i = 1, \ldots, N$, the flattening can be formulated as the problem of calculating the 2D coordinates of $P_i$ when the surface is flattened on the plane, $(u_i, v_i)$. The geodesic distance between $P_i$ and $P_j$, $G_{i,j}$, is invariant regardless of the surface deformation and equals the Euclidean distance between $(u_i, v_i)$ and $(u_j, v_j)$ on the flattened 2D surface:

$$\sqrt{(u_i - u_j)^2 + (v_i - v_j)^2} = G_{i,j}. \tag{7}$$

Although the number of equations of the form of Eq. (7) is $N(N-1)/2$ if we consider all combinations of $N$ points, not all of them are required as long as the number of equations related to each point is more than two, which is the number of unknowns for the point. By representing the use/disuse of $G_{i,j}$ as $B(i, j) = \{1, 0\}$, the flattening becomes the problem of minimizing

$$H(\mathbf{u}, \mathbf{v}) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} B(i, j) \left( \sqrt{(u_i - u_j)^2 + (v_i - v_j)^2} - G_{i,j} \right)^2. \tag{8}$$

The solution is then obtained by solving $2N$ simultaneous equations, where the two equations for each $P_i$ are

$$\frac{\partial H(\mathbf{u}, \mathbf{v})}{\partial u_i} = 0, \quad \frac{\partial H(\mathbf{u}, \mathbf{v})}{\partial v_i} = 0. \tag{9}$$
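One simple way to reach a point where the conditions of Eq. (9) hold is plain gradient descent on $H$, sketched below. This is an illustrative solver, not necessarily the one used by the authors. In this sketch the selector $B(i, j)$ is encoded by storing NaN in `G` for unused pairs, and `flatten`, `lr` and `n_iter` are hypothetical names and settings.

```python
import numpy as np

def flatten(G, uv0, lr=0.05, n_iter=3000):
    """Minimise H of Eq. (8) by gradient descent.

    G:   (N, N) symmetric array of target distances; NaN marks unused pairs (B = 0).
    uv0: (N, 2) initial 2D coordinates.
    """
    uv = np.array(uv0, dtype=float)
    i_idx, j_idx = np.nonzero(np.triu(~np.isnan(G), k=1))    # pairs with i < j and B = 1
    target = G[i_idx, j_idx]
    for _ in range(n_iter):
        diff = uv[i_idx] - uv[j_idx]
        dist = np.linalg.norm(diff, axis=1)
        # d/d(uv_i) of (dist - target)^2 = 2 * (dist - target) * diff / dist
        coef = 2.0 * (dist - target) / np.maximum(dist, 1e-12)
        grad = np.zeros_like(uv)
        np.add.at(grad, i_idx, coef[:, None] * diff)
        np.add.at(grad, j_idx, -coef[:, None] * diff)
        uv -= lr * grad
    return uv
```

Because $H$ depends only on pairwise distances, the result is determined only up to a rigid motion (and reflection) of the plane, so the initial values fix the pose of the flattened view.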

3 INTEGRATION OF MULTIPLE

OBSERVATIONS

We assume the item is held in the air by a robot hand and can be observed from different directions. Although we plan to integrate several views in the future, in this paper we deal with the integration of two views taken from largely different directions (separated by about 90 to 120 degrees) as a first step. We assume that the views are selected so that each covers the parts occluded in the other. Figure 3(a) shows an example of a pair of such views.

Figure 3: Selection of surface points from observation data: (a) observation images; (b) 3D depth images; (c) selection of surface points; (d) point correspondence.

Figure 4: Example of a geodesic line calculated from the 3D point cloud when observing a clothing surface.

The surface points commonly observed in the two 3D images are used to combine the two sets of simultaneous equations in $(u, v)$ that are obtained from each observation. Although we plan to detect such point correspondences automatically using 2D/3D characteristic points, we currently give the correspondences manually. The steps below describe the concrete procedure for combining two observations that together cover one whole side of a clothing item by complementing each other.

1. Point Setting

Figure 3(b) shows an example of 3D observation data, where the observed 3D points are illustrated with grey dots. Basically, we choose points from the boundary of the observed clothing region, $P_{i_m}(x_{i_m}, y_{i_m}, z_{i_m})$, $i_m = 1, \ldots, N_m$, as shown by the blue points in the left figure of Fig. 3(c). Then, from the region observed in both observation images, characteristic points that correspond between the two images are selected and substituted in, as shown by the green points on the right of Fig. 3(c). Figure 3(d) shows an example of selected surface points for the observation of Fig. 3(a), with the letters indicating the correspondence between images.

2. Calculation of Geodesic Lines

Geodesic lines are calculated in each observation. Pairs of points for calculating geodesic distances are selected such that the two points have similar heights. This is because folds on the surface occur mainly in the vertical direction owing to the effect of gravity. Figure 4 shows an example of the calculation of a geodesic line. An initial line, $p_1 \ldots p_M$, is set between $P_{i_m}$ and $P_{j_m}$ through the uniform sampling of $(x, y)$ from the line between $(x_{i_m}, y_{i_m})$ and $(x_{j_m}, y_{j_m})$ and calculating $z(x, y)$ as described in Section 2.1. The blue line in Fig. 4 shows an example of the initial line. By minimizing Eq. (1), the geodesic line between $P_{i_m}$ and $P_{j_m}$ is calculated as shown by the red line.

The geodesic distance between $P_{i_m}$ and $P_{j_m}$ of observation $m$ (in this paper, $m = 1, 2$) is stored in the array $G_m[i_m][j_m]$, where $G_m[i_m][j_m] = G_m[j_m][i_m]$. Besides the geodesic distance, to maintain the local shape, the Euclidean distance between each pair of neighboring points, $P_{i_m}$ and $P_{i_m+1(2)}$, is recorded in $G_m[i_m][i_m + 1(2)]$.
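The bookkeeping of this step can be sketched as below for one observation. Several details are assumptions of the sketch rather than statements of the text: the second coordinate is treated as the height, `height_tol` and `neighbor_span` are hypothetical parameters, the geodesic solver is passed in as a callable `geodesic_fn`, unused pairs are stored as NaN, neighbouring points always receive the Euclidean value, and the point list is not wrapped around at its ends.

```python
import numpy as np

def build_distance_array(points, geodesic_fn, height_tol, neighbor_span=2):
    """Fill the symmetric distance array G_m for one observation.

    points:      (N, 3) array of selected surface points (x, y, z), y taken as height.
    geodesic_fn: callable(p, q) -> geodesic distance between two surface points.
    """
    n = len(points)
    G = np.full((n, n), np.nan)                    # NaN = pair not used
    # Geodesic distances for pairs of similar height
    for i in range(n):
        for j in range(i + 1, n):
            if abs(points[i, 1] - points[j, 1]) <= height_tol:
                G[i, j] = G[j, i] = geodesic_fn(points[i], points[j])
    # Euclidean distances between neighbouring points keep the local shape
    for i in range(n):
        for offset in range(1, neighbor_span + 1):
            j = i + offset
            if j < n:
                G[i, j] = G[j, i] = np.linalg.norm(points[i] - points[j])
    return G
```

Any geodesic routine with the signature `geodesic_fn(p, q)` can be plugged in, for example one built on the spring relaxation of Section 2.1.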

3. Integration of Geodesic Distances

After all $G_m$ are calculated, the arrays are integrated into one array, $G[i][j]$, by merging the points observed in common among the observations. During this process, the $x_{i_m}$ and $y_{i_m}$ coordinates of each $P_{i_m}$ are recorded as the initial values of $u$ and $v$, $u_i^0$ and $v_i^0$, respectively. At this time, the $x_{i_m}$ and $y_{i_m}$ coordinates of the $m \geq 2$ observations are translated and rotated in the $x$-$y$ plane so that the average coordinates and direction of the corresponding points coincide with those of the other observation. Figure 5 shows an example, where three points (A, B and C) on the fold are selected as the corresponding points of the two observations. In Fig. 5(b), points from observation $m = 2$ are set using the original $x_{i_2}$ and $y_{i_2}$. Figure 5(c) shows their positions after the 2D transformation.
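The merging and the 2D alignment described above can be sketched as follows. The alignment here is a least-squares rigid fit (translation plus rotation) on the corresponding points, which is one possible reading of "average coordinates and direction coincide" and not necessarily the authors' exact procedure. The function names, the `common` dictionary, the NaN encoding of unused pairs and the rule that observation 1 takes precedence when both observations give a value are all assumptions.

```python
import numpy as np

def align_2d(src_xy, src_common, dst_common):
    """Rotate and translate src_xy so that src_common best matches dst_common."""
    mu_s, mu_d = src_common.mean(axis=0), dst_common.mean(axis=0)
    a, b = src_common - mu_s, dst_common - mu_d
    # Least-squares rotation angle taking the centred source points onto the targets
    theta = np.arctan2(np.sum(a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]), np.sum(a * b))
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return (src_xy - mu_s) @ rot.T + mu_d

def merge_distance_arrays(G1, G2, common):
    """Merge two per-observation arrays into one array G.

    common: dict mapping a point index of observation 2 to the index of the
            same physical point in observation 1.
    Returns G and the map from observation-2 indices to merged indices.
    """
    n1, n2 = len(G1), len(G2)
    index2, next_id = {}, n1
    for j in range(n2):
        if j in common:
            index2[j] = common[j]                 # shared point keeps its index
        else:
            index2[j] = next_id                   # new point gets a fresh index
            next_id += 1
    G = np.full((next_id, next_id), np.nan)
    G[:n1, :n1] = G1
    for a in range(n2):
        for b in range(n2):
            i, j = index2[a], index2[b]
            if not np.isnan(G2[a, b]) and np.isnan(G[i, j]):
                G[i, j] = G2[a, b]                # observation 1 takes precedence
    return G, index2
```

The aligned coordinates of observation 2, together with the coordinates of observation 1, would then serve as the initial values for the flattening.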

Figure 5: Example of the initial location for merging: (a) point correspondence; (b) $(x_{i_m}, y_{i_m})$; (c) $(u_i^0, v_i^0)$.

4. Calculation of the Flattened View

The flattened view represented by $(u, v)$ is calculated via the minimization of

$$H'(\mathbf{u}, \mathbf{v}) = \sum_{i} \sum_{j > i} B(i, j) \left( \sqrt{(u_i - u_j)^2 + (v_i - v_j)^2} - G[i][j] \right)^2, \tag{10}$$

where the sums run over the merged points and $B(i, j) = \{1, 0\}$ represents the use/disuse of the pair of $i$ and $j$. We solve this minimization using a spring analogy by setting a spring with a basis length of $G[i][j]$ between each pair with $B(i, j) = 1$.

4 EXPERIMENTS

Experiments were conducted using two long-sleeve shirts and two pairs of trousers. To make the situation similar to practical applications, each item was hung by a robot hand after the robot had picked it up from a desk and had grasped its lowest part. The target item was recorded by a trinocular stereo vision system (Ueshiba, 2006) while the robot rotated the item about the vertical axis through the holding position. Two different views were manually selected from the sequence of 3D data recorded during the rotation. Three to five corresponding points between the different views were given manually.

Figure 6 shows the result for the long-sleeve shirt (LS1) in Fig. 3. The red lines in Fig. 6(a) show geodesic lines calculated from the 3D observation data. Figures 6(b) and (c) show the initial state set at $(u_i^0, v_i^0)$, obtained as described in Section 3, and the resultant flattened view, respectively. Although there are small zigzags owing to the inaccuracy of the 3D observation data along the region boundary, the overall shape of the long-sleeve shirt when flattened on a table was well estimated by the virtual flattening process.

Figure 6: Experimental results of the virtual flattening of a long-sleeve shirt (LS1): (a) geodesic lines; (b) initial state of (u,v); (c) final state of (u,v).

Figure 7 shows the results for the other three items, LS2, TR1 and TR2. By combining two complementary views, approximate flattened views of the whole items were obtained. Notably, in the case of the trousers, the acute fold of one leg could be unfolded by this virtual flattening. In the case of TR2 (Fig. 7(c)), however, one leg was flattened in the wrong direction and with a shorter length. This resulted from the large, deep creases marked by orange circles in Fig. 7(c), where some hidden parts were not measured in either observation. The parts hidden by such deep creases would be difficult to observe from any direction until some kind of action removes the creases.

The virtual flattened view can be directly matched with the canonical shape of clothing items. This is a great advantage for the analysis of largely deformable objects. To examine this merit, we conducted a preliminary experiment of category classification. Figure 8(a) shows clothing model images, $I$, calculated from one of the typical shapes of each category, where higher values represent a higher possibility of the contour of each category. When comparing the virtual flattened view with the model images, the size of the flattened view is normalized so that the longest lengths of the view and of the typical model shape become the same. The matching criterion $R$ is then calculated as

$$R = \frac{1}{N_c} \sum_{n=1}^{N_c} I(i_n, j_n), \tag{11}$$

where $N_c$ is the number of contour points of the flattened view and $(i_n, j_n)$ $(n = 1, \ldots, N_c)$ denotes the coordinates of the contour points on the image. The contour is placed on the model image so that its centroid coincides with the center of the image and is then rotated; the highest $R$ over the rotation is selected as the matching score of the category. The resultant matching scores are summarized in Table 1, where the higher score for each item is marked with an asterisk. All four views are classified into the correct category, with margins of 0.09 to 0.25. Figure 8(b) shows the matching results, where each view is superposed on the model image with the higher $R$.

Table 1: Matching score with the canonical model images of each category.

                           LS1     LS2     TR1     TR2
Long-sleeve shirts (LS)    0.65*   0.78*   0.43    0.47
Trousers (TR)              0.56    0.53    0.57*   0.57*

Table 2: Accuracy of the length measured for the flattened shape.

                          W                          L
Data             Result   GT     Error      Result   GT     Error
                 (cm)     (cm)   (%)        (cm)     (cm)   (%)
LS1 (Fig. 6)     35.6     36     1.1        34.2     35     2.3
LS2 (Fig. 7a)    35.6     39     8.7        38.1     40     4.8
TR1 (Fig. 7b)    25.7     25     2.8        64.2     62     3.5
TR2 (Fig. 7c)    26.0     27     3.7        55.8     67     16.7
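The matching procedure can be sketched as follows. The sketch assumes that $R$ is the average of the model image values at the contour points, that contour coordinates are (row, column) pairs rounded to the nearest pixel, that points falling outside the image contribute zero, and that rotations are sampled every 5 degrees; none of these details is stated in the text, and the size normalization is omitted.

```python
import numpy as np

def matching_score(model, contour_ij, angles_deg=range(0, 360, 5)):
    """Highest mean model-image value over rotations of the contour.

    model:      2D array I, higher values = more likely contour location.
    contour_ij: (N, 2) array of (row, column) contour coordinates.
    """
    h, w = model.shape
    center = np.array([(h - 1) / 2.0, (w - 1) / 2.0])
    rel = contour_ij - contour_ij.mean(axis=0)       # centroid moved to the origin
    best = 0.0
    for deg in angles_deg:
        t = np.deg2rad(deg)
        rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
        ij = np.rint(rel @ rot.T + center).astype(int)
        inside = (ij[:, 0] >= 0) & (ij[:, 0] < h) & (ij[:, 1] >= 0) & (ij[:, 1] < w)
        values = np.zeros(len(ij))
        values[inside] = model[ij[inside, 0], ij[inside, 1]]
        best = max(best, values.mean())
    return best
```

A category would then be chosen by evaluating `matching_score` against each category's model image and taking the category with the highest score.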

To examine the accuracy of the virtual flattening, we manually measured the width (W) and length (L) of the resultant flattened views, as marked by thin orange lines in Fig. 6 and Fig. 7. The comparison of those lengths with the actual lengths (GT: ground truth) is summarized in Table 2. Most errors are about 1–2 cm, or below 5%; the exceptions are the width of LS2 (3.4 cm, 8.7%) and the length of TR2 (11.2 cm, 16.7%), the latter owing to the flattening error noted above. Combining this with the categorization result, the system can describe the item as, for example, “the observed item is a long-sleeve shirt with a width of about 35 cm and a body length of about 34 cm” for the case of Fig. 6.
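As a quick check of the arithmetic behind Table 2, the sketch below recomputes the percentage errors, assuming that in each group the first number is the measured result, the second is the ground truth, and the error is taken relative to the ground truth. Under that assumption the computed values agree with the tabulated ones to within rounding. The dictionary layout and function name are illustrative only.

```python
# (W result, W ground truth, L result, L ground truth) in cm, copied from Table 2
measurements = {
    "LS1": (35.6, 36, 34.2, 35),
    "LS2": (35.6, 39, 38.1, 40),
    "TR1": (25.7, 25, 64.2, 62),
    "TR2": (26.0, 27, 55.8, 67),
}

def relative_error_percent(result, ground_truth):
    """Absolute difference relative to the ground truth, in percent."""
    return abs(result - ground_truth) / ground_truth * 100.0

errors = {
    name: (relative_error_percent(w, w_gt), relative_error_percent(l, l_gt))
    for name, (w, w_gt, l, l_gt) in measurements.items()
}
```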


Figure 7: Experimental results of the virtual ﬂattening of long-sleeve shirts and trousers: (a) LS2; (b) TR1; (c) TR2.

5 CONCLUSION

We proposed a method of virtually flattening a clothing surface onto a 2D plane from 3D depth images of the surface taken from different directions. On the basis that the geodesic distances between surface points equal the 2D distances between the points on the flattened plane, the method realizes flattening by solving simultaneous equations of the geodesic distances among the surface points. Through points observed in common in different observations, the equations derived from the observations are integrated and solved simultaneously to flatten the whole surface.

Experimental results for actual clothing items show that the proposed method gives a good estimate of the shape that a clothing surface takes when flattened on a table, using two 3D observations of the deformed surface. The recognition of clothing items in such a shape is much easier than that of items in an arbitrary shape. Accordingly, this allows a robot to recognize a clothing item during an ordinary handling procedure, greatly benefiting the automatic handling of clothing items. Category classification using the resultant views demonstrated the good prospects of using the virtual flattened view as a mediator between canonical models and deformed items.

Figure 8: Category classification using the flattened views: (a) model images; (b) selected model.

Although the corresponding points between different observations are given manually at present, automatic point correspondence should be possible using characteristic points of color (or intensity) features and/or the particular 3D shape. As future work, we plan to extend the method to integrate more observations, which will allow us to use more reliable 3D data for all parts of the surface.

ACKNOWLEDGEMENTS

The authors thank Dr. Y. Kawai, Dr. Y. Domae, Mr. T. Ueshiba and Mr. J. Hu for their support of this research. This work was supported by a Grant-in-Aid for Scientific Research, KAKENHI (16H02885).

REFERENCES

A. Doumanoglou, A. Kargakos, T.-K. K. S. M. (2014). Autonomous active recognition and unfolding of clothes using random decision forests and probabilistic planning. In International Conference in Robotics and Automation (ICRA) 2014, pages 987–993.

D. Triantafyllou, I. Mariolis, A. K. S. M. and Aspragathos, N. (2016). A geometric approach to robotic unfolding of garments. Robotics and Autonomous Systems, Vol. 75, pp. 233–243.

F. Osawa, H. S. and Kamiya, Y. (2007). Unfolding of massive laundry and classification types by dual manipulator. Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 11, No. 5, pp. 457–463.

Hu, J. and Kita, Y. (2015). Classification of the category of clothing item after bringing it into limited shapes. In Proc. of International Conference on Humanoid Robots 2015, pages 588–594.

Kita, Y. and Kita, N. (2016). Virtual flattening of clothing item held in the air. In Proc. of 23rd International Conference on Pattern Recognition, pages 2771–2777.

P. Bose, A. Maheshwari, C. S. a. S. W. (2011). A survey of geodesic paths on 3D surfaces. Computational Geometry, Vol. 44, pp. 486–498.

R. Grossmann, N. K. and Kimmel, R. (2002). Computational surface flattening: a voxel-based approach. IEEE Trans. on Pattern Anal. and Machine Intelli., Vol. 24, No. 4.

T. Belytschko, Y. Y. L. and Gu, L. (1994). Element-free Galerkin methods. International Journal for Numerical Methods in Engineering, Vol. 37, No. 2, pp. 229–256.

T. Kawashima, S. Yabashi, H. K. Y. (1999). Meshless method for searching geodesic line by using moving least squares interpolation. In Research Report on Membrane Structures, pages 1–6.

Ueshiba, T. (2006). An efficient implementation technique of bidirectional matching for real-time trinocular stereo vision. In Proc. of 18th Int. Conf. on Pattern Recognition, pages 1076–1079.

Zhong, Y. and Xu, B. (2006). A physically based method for triangulated surface flattening. Computer-Aided Design, Vol. 38, pp. 1062–1073.
