Braid Hairstyle Recognition based on CNNs

Chao Sun and Won-Sook Lee

EECS, University of Ottawa, Ottawa, ON, Canada

{csun014, wslee}@uottawa.ca

Keywords:

Braid Hairstyle Recognition, Convolutional Neural Networks.

Abstract:

In this paper, we present a novel braid hairstyle recognition system based on Convolutional Neural Networks (CNNs). We first build a hairstyle patch dataset composed of braid hairstyle patches and non-braid hairstyle patches (straight, curly, and kinky hairstyle patches). We then train our hairstyle recognition system via transfer learning on a pre-trained CNN model in order to extract the features of different hairstyles. Our hairstyle recognition CNN model achieves an accuracy of 92.7% on the image patch dataset. The CNN model is then used to perform braid hairstyle detection and recognition in full-hair images. The experimental results show that the CNN model trained at the patch level can successfully detect and recognize braid hairstyles at the image level.

1 INTRODUCTION

Hairstyle, which can help to express a unique personality, is considered one of the most important features of a human being in the real world. Moreover, in computer games and animated films, different hairstyles give virtual characters different identities. However, hairstyle recognition remains a challenging task due to the characteristics of hair (e.g. texture, color, etc.), the variety of appearances under different environments (e.g. lighting conditions), and the countless combinations of different hairstyles.

Most researchers who work on 3D hair modelling examine the characteristics of hair based on single-view or multiple-view hair images and try to obtain structural information about the hair strands (e.g. the orientation of hair strands). For certain hairstyles, such as the straight hairstyle, this kind of information is relatively easy to obtain since straight hair strands share the same direction. However, for more complex hairstyles, such as the braid hairstyle, the corresponding recognition procedure is more challenging and is usually performed by a human. Thus, an automatic braid hairstyle recognition system is needed in order to facilitate the hair modelling procedure.

The main challenges for braid hairstyle recognition are:

• The braid hairstyle spans a diverse range of appearances in the real world, so it is very difficult to recognize using hand-designed image features. Examples of braid hairstyles are shown in Figure 1: the "french braid", the "reverse french braid", the "fishtail braid", and the "four-strand braid".

• The braid hairstyle often co-exists with other hairstyles, and the hair strands usually share a similar appearance. As shown in Figure 2, the hair image contains three different hairstyles: a straight hairstyle (indicated by the blue stroke), a curly hairstyle (indicated by the yellow stroke), and a braid (indicated by the green stroke) that lies between those two regions. The only difference is the structure or pattern formed by the hair strands.

• The boundaries between the braid hairstyle and other hairstyles are difficult to detect. As shown in Figure 1, the hair strands gradually merge into the braid region and become a part of the braid.

Figure 1: Different braid hairstyles.

Figure 2: The combination of different hairstyles.

Sun C. and Lee W.

Braid Hairstyle Recognition based on CNNs.

DOI: 10.5220/0006169805480555

In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017), pages 548-555

ISBN: 978-989-758-225-7

A braid hairstyle is defined as two to four hair strands interlacing with each other to form a complex structure or pattern. Since the braid hairstyle is composed of certain repeated patterns, its most distinctive pattern lies in the interlacing area. Thus, if we can detect the interlacing pattern in the hair images, we can locate the braid area in the full hair images.

Since the braid hairstyle usually co-exists with other hairstyles, it is reasonable to develop a recognition system that learns features from both braid and non-braid hairstyles and separates them based on those features. Thus, we include three other hairstyles in our system: the straight hairstyle (hair is normally straight and does not hold a curl), the curly hairstyle (hair contains spirals or inwardly curved forms, or has a definite "S" pattern), and the kinky hairstyle (hair is tightly coiled with a less visible curl pattern).

Due to the characteristics of different hairstyles, traditional image processing methods usually fail to extract structural features from hair images directly (e.g. for the kinky hairstyle). Thus, we leverage the strength of Convolutional Neural Networks (CNNs) to automatically learn the features of different hairstyles. CNNs are usually trained on large-scale image datasets (e.g. ImageNet (Deng et al., 2009)); however, our hairstyle patch dataset is relatively small. There are four hairstyle classes, and each class has approximately 1000 image patches, including 800 patches for training and 200 patches for testing. When dealing with a small dataset, which is realistic in real-world use cases, overfitting is the main problem we need to avoid. Thus, we apply transfer learning via a pre-trained CNN with a final layer retrained on our own hairstyle dataset to learn features for different hairstyles.

To sum up, the contributions of this paper are:

• A novel hairstyle recognition system that can detect the unique features of the braid hairstyle and recognize the braid hairstyle in full hair images.

• A strategy of patch-level feature learning and image-level recognition that facilitates the recognition of complex hairstyles. The hairstyle recognition system can be applied to front-view, side-view, and back-view hair images.

2 RELATED WORK

2.1 Hair Recognition in Human Identification

Researchers use human hair as a supplementary feature for human identification.

Yacoob and Davis estimated a set of attributes (e.g. length, volume, surface area, dominant color, coloring, etc.) of the head hair from a single image. They developed algorithms and associated metrics that enable detection, representation, and comparison of the hair of different subjects. Their experimental results showed that the hair attributes can improve human identification results (Yacoob and Davis, 2006). Their work provided important information for hair detection and description by introducing the hair attributes; however, since the purpose of their work is human identification, the images in their experiments are all frontal face images with hair regions.

Dass et al. used an unsupervised learning method to discover distinct hairstyles, namely the whole hair regions, from a large number of frontal face images. Their learning method involved clustering of hair regions without assuming any predetermined number of clusters. For each hairstyle region cluster, they generated a style template, which is a probability mask indicating the probability of hair at a certain position of a facial image. The templates are subsequently used to recognize the hairstyle of a person: the hair distribution of the person is compared with the templates. In their experiments, they collected male and female face images randomly from the Internet. Clustering on these selected images resulted in five clusters of hairstyles. Probability masks were generated for the five hairstyles, and the classification accuracy is 75.62% (Dass et al., 2013). All their experiments are based on frontal face images and an unsupervised learning algorithm (clustering), and they focus on the recognition of complete hair regions from face regions rather than the detailed partition of different hairstyles inside the hair region. To sum up, more complicated hairstyles, especially in back-view or side-view images, have not been explored. Furthermore, supervised learning algorithms can also be used in hairstyle recognition in order to provide high-level features and more reliable recognition results.

Figure 3: Braid hairstyle recognition system overview.

2.2 3D Braid Modelling

In the area of hair modelling, researchers have obtained 3D braid models by fitting the captured 3D point cloud of a hair braid to pre-generated 3D braid models. Hu et al. proposed a data-driven method to automatically reconstruct braided hairstyles from input data obtained from an RGB-D camera. They produced a database of 3D braid patches and used a robust random sampling approach for data fitting. The experimental results demonstrated that simple equipment is sufficient to effectively capture a wide range of braids with distinct shapes and structures (Hu et al., 2014).

2.3 Materials Recognition

Research on material recognition has usually applied hand-designed image features to classify different materials.

Liu et al. proposed an augmented Latent Dirichlet Allocation (aLDA) model to combine a rich set of low-level and mid-level features under a Bayesian generative framework and learn an optimal combination of features. Their experimental results show that the system performs material recognition reasonably well on a challenging material database, outperforming state-of-the-art material/texture recognition systems (Liu et al., 2010).

Hu et al. empirically studied material recognition of real-world objects using a rich set of local features. They applied the Kernel Descriptor framework and extended the set of descriptors to include material-motivated attributes using variances of gradient orientation and magnitude. Large-Margin Nearest Neighbor learning was used for a 30-fold dimension reduction. They also introduced two new datasets using ImageNet and macro photos (Hu et al., 2011).

Qi et al. introduced the Pairwise Transform Invariance (PTI) principle, proposed a novel Pairwise Rotation Invariant Co-occurrence Local Binary Pattern (PRICoLBP) feature, and further extended it to incorporate multi-scale, multi-orientation, and multi-channel information. Their experiments demonstrated that PRICoLBP is efficient and effective, and offers a well-balanced tradeoff between discriminative power and robustness (Qi et al., 2014).

Cimpoi et al. identified a rich vocabulary of forty-seven texture terms and used them to describe a large dataset of patterns collected in the wild. The resulting Describable Textures Dataset (DTD) serves as the basis for seeking the best texture representation for recognizing describable texture attributes in images. They applied the Improved Fisher Vector (IFV) to texture recognition. The experimental results showed that their method outperformed other specialized texture descriptors on established material recognition benchmarks (FMD and KTHTIPS-2) (Cimpoi et al., 2013).

VISAPP 2017 - International Conference on Computer Vision Theory and Applications

Bell et al. introduced a new, large-scale, open dataset of materials in the wild, the Materials in Context Database (MINC), and combined this dataset with deep learning to achieve material recognition and segmentation of images in the wild. For material classification on MINC, they achieved 85.2% mean class accuracy. They combined these trained CNN classifiers with a fully connected conditional random field (CRF) to predict the material at every pixel in an image, achieving 73.1% mean class accuracy (Bell et al., 2014).

The differences between material recognition and hairstyle recognition are:

• Material recognition emphasizes the recognition of different classes of materials, for example telling hair from skin. Thus, it is an inter-class recognition problem.

• Our braid hairstyle recognition focuses on distinguishing hairstyle structures inside the hair class. The differences between hairstyles are caused not only by the characteristics of different hair fibres, but also by the structures that the hair strands form. Thus, it is more like an intra-class recognition problem.

2.4 Convolutional Neural Networks

Convolutional Neural Networks (CNNs) have been widely adopted in classification and segmentation tasks, including object recognition (Krizhevsky et al., 2012) and hair region detection (Chai et al., 2016), and have been demonstrated to outperform traditional classification and segmentation systems. CNNs usually require a large amount of training data in order to reach their best performance and avoid overfitting. However, for our braid hairstyle detection and recognition system, only a small amount of training data is available. In order to avoid overfitting, we use a CNN trained on a larger dataset from a related domain (ImageNet). Trained on a large dataset, the CNN can learn useful features and leverage them to reach a better accuracy than methods that rely only on the small dataset. We then perform an additional training step using our own data to fine-tune the trained network weights. The model in our system is the Inception V3 network with a final layer retrained on our own hairstyle patch dataset.

Figure 4: Hairstyle patches.

3 BRAID HAIRSTYLE RECOGNITION

The overview of the braid hairstyle recognition sys-

tem is shown in Figure 3.

All hairstyle images used in our system were downloaded from the Internet. The hairstyle images cover different hair colors, lengths, volumes, etc. In addition, the images are captured from different points of view, including front-view, side-view, and back-view hairstyle images. We avoid very small hairstyle images, since their quality is relatively low and the details of the hair structure tend to be vague. We also reduce the size of very high-resolution hairstyle images. As a result, the average width of the hair region is in the range of 450 to 600 pixels.

The hairstyle images are then separated into the following two categories. Note that there is no overlap between the two sets.

• Dataset-I: Hair images that are cropped into hairstyle patches to form the dataset for training the hairstyle recognition system.

• Dataset-II: Hair images that are used to perform full-image hairstyle recognition.


3.1 Training Procedure

In order to prepare the hairstyle patch dataset for training the braid hairstyle recognition model, we manually crop hairstyle patches from Dataset-I and label them. During the cropping procedure, we need to control the size of the cropping window in order to preserve the distinctive structures of the braid hairstyles. Given the characteristics of the braid hairstyle, if the cropping window is very small, the image patches lose the ability to represent the unique interlacing structure, and every image patch looks like the straight hairstyle. On the other hand, if the cropping window is very large, it may contain several different hairstyles and make recognition difficult. Thus, instead of using a fixed-size window for hairstyle patch cropping, we make the size of the cropping window adjustable in order to capture the unique braid structure.

After the cropping procedure, we resize each hairstyle image patch to 50 pixels × 50 pixels. Sample hairstyle patches are shown in Figure 4. The first row shows the braid hairstyle patches; the remaining rows show the non-braid hairstyle patches: the straight hairstyle patches (second row), the curly hairstyle patches (third row), and the kinky hairstyle patches (last row).

We then separate the hairstyle patches into a training dataset and a testing dataset to train the braid hairstyle recognition model. The details of the training and testing datasets are shown in Table 1.

As shown in Table 1, our training and testing datasets contain only a small number of hairstyle patches. In order to prevent overfitting and help the hairstyle recognition model generalize better, we make the most of our few training examples by "augmenting" the hairstyle image patches via a number of random transformations: rotation, vertical shift, horizontal shift, shearing, and horizontal flip. The augmentation results are shown in Figure 5: the first row is the original braid patch, and the second to sixth rows are augmented braid patches. The augmentation procedure preserves the basic structure of the braid while increasing the diversity of the braids by changing the direction of the braid, modifying its width, etc. Since all the augmented patches can be found in real-world hairstyles, the augmented results are reasonable. During the training stage, we apply the random transformations and normalization operations to our hairstyle image patch dataset to generate augmented hairstyle image patches and their corresponding labels.

Table 1: Hairstyle patch dataset.

Hairstyle   Index   # Training   # Testing
Braid       1       800          200
Straight    2       800          200
Curly       3       800          200
Kinky       4       800          200

Figure 5: Data augmentation results (braid hairstyle).
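The random transformations listed above can be expressed as a single affine map applied about the patch centre. The following is a minimal sketch under stated assumptions, not the implementation used in this paper: the parameter ranges (`max_angle`, `max_shear`, `max_shift`), the order of composition, and all function names are hypothetical, since the text does not specify them.

```python
import math
import random


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_matrix(angle_deg, shear_deg, tx, ty, flip, size=50):
    """Compose flip, shear, rotation and shift about the centre of a size x size patch."""
    c = (size - 1) / 2.0
    t = math.radians(angle_deg)
    s = math.radians(shear_deg)
    to_origin = [[1, 0, -c], [0, 1, -c], [0, 0, 1]]
    mirror = [[-1 if flip else 1, 0, 0], [0, 1, 0], [0, 0, 1]]  # horizontal flip
    shear = [[1, math.tan(s), 0], [0, 1, 0], [0, 0, 1]]
    rotate = [[math.cos(t), -math.sin(t), 0],
              [math.sin(t), math.cos(t), 0],
              [0, 0, 1]]
    shift_back = [[1, 0, c + tx], [0, 1, c + ty], [0, 0, 1]]
    m = to_origin
    for step in (mirror, shear, rotate, shift_back):
        m = matmul(step, m)  # apply each step after the previous ones
    return m


def transform_point(m, x, y):
    """Map a pixel coordinate (x, y) through the affine matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def random_params(rng, max_angle=20.0, max_shear=10.0, max_shift=5.0):
    """Draw one random transformation; the ranges are assumed values."""
    return {
        "angle_deg": rng.uniform(-max_angle, max_angle),
        "shear_deg": rng.uniform(-max_shear, max_shear),
        "tx": rng.uniform(-max_shift, max_shift),
        "ty": rng.uniform(-max_shift, max_shift),
        "flip": rng.random() < 0.5,
    }


rng = random.Random(0)
params = random_params(rng)
matrix = affine_matrix(**params)
```

In practice each augmented patch would be produced by resampling the 50 × 50 patch through such a matrix; image libraries provide this resampling, so the sketch only covers how the parameters combine.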

After obtaining the hairstyle patch dataset, we apply the Inception v3 network (Szegedy et al., 2016) with a final layer retrained on it. The original Inception v3 network is trained on ImageNet (Deng et al., 2009), which provides sufficient knowledge of real-world objects. We add a final layer retrained on our own hairstyle dataset to learn features for different hairstyles. Our hairstyle recognition system reaches an accuracy of 92.7% on the hairstyle patch dataset.
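Retraining only the final layer amounts to fitting a softmax classifier on feature vectors produced by the frozen, pre-trained network. The sketch below illustrates that idea in plain Python under loud assumptions: the four-dimensional toy vectors are hypothetical stand-ins for the features that Inception v3 would produce, and the learning rate, epoch count, and function names are illustrative rather than taken from this paper.

```python
import math
import random

CLASSES = ["braid", "straight", "curly", "kinky"]  # indices 1-4 in Table 1


def softmax(logits):
    """Convert raw class logits into scores that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores(weights, biases, x):
    """Class scores of the final layer for one frozen feature vector x."""
    return softmax([sum(w * v for w, v in zip(row, x)) + b
                    for row, b in zip(weights, biases)])


def train_final_layer(samples, n_classes, epochs=50, lr=0.5):
    """Fit only the last (softmax) layer with plain SGD on cross-entropy."""
    dim = len(samples[0][0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in samples:
            probs = scores(weights, biases, x)
            for k in range(n_classes):
                grad = probs[k] - (1.0 if k == y else 0.0)
                biases[k] -= lr * grad
                for j in range(dim):
                    weights[k][j] -= lr * grad * x[j]
    return weights, biases


# Toy "bottleneck features": one noisy prototype direction per class (assumed data).
rng = random.Random(0)
samples = []
for k in range(len(CLASSES)):
    for _ in range(10):
        x = [(1.0 if j == k else 0.0) + rng.uniform(-0.1, 0.1) for j in range(4)]
        samples.append((x, k))
rng.shuffle(samples)

weights, biases = train_final_layer(samples, len(CLASSES))
probe = scores(weights, biases, [1.0, 0.0, 0.0, 0.0])
predicted = CLASSES[probe.index(max(probe))]
```

Because the earlier layers stay fixed, only `weights` and `biases` are learned, which keeps the number of trainable parameters small relative to the 800 training patches per class.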

3.2 Full-image based Braid Hairstyle Recognition

3.2.1 Hair Region Mask Generation

During full-image braid hairstyle detection and recognition, the input images of our system are selected from Dataset-II, described above. These hair images contain both hair regions and non-hair regions (e.g. faces, backgrounds, etc.). We manually select points on the boundary of the hair region to generate the hair mask and obtain the hair region; the results are shown in Figure 6.

Figure 6: Hair region mask.

We apply the sliding window method inside the

hair region. The size of the sliding window is W pix-

els × H pixels (e.g. W = 50 pixels and H = 50 pix-

els). The stride of the window is S pixels (e.g. S = 15

pixels). Then we can obtain the hairstyle prediction

for every window patch.
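The window placement described above can be sketched as follows. This is an illustrative sketch only: the rule that a window is kept when its centre pixel lies inside the hair mask is an assumption (the text only states that the window slides inside the hair region), and the function name and the toy mask are hypothetical.

```python
def sliding_windows(mask, win_w=50, win_h=50, stride=15):
    """Yield (x, y) top-left corners of windows whose centre lies in the hair mask.

    mask is a list of rows of booleans (True = hair pixel).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    for y in range(0, height - win_h + 1, stride):
        for x in range(0, width - win_w + 1, stride):
            centre_row = y + win_h // 2
            centre_col = x + win_w // 2
            if mask[centre_row][centre_col]:  # assumed inclusion rule
                yield x, y


# Toy mask, 100 pixels wide and 80 pixels high, with hair in the left half only.
mask = [[col < 50 for col in range(100)] for _ in range(80)]
windows = list(sliding_windows(mask))
```

Each returned corner identifies one W × H patch that would be passed to the patch classifier.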

For each hairstyle patch, the braid hairstyle recognition system provides the class labels and the corresponding scores $(label_n, score_n)$, where $n$ is the label index in Table 1 and the scores over the four classes satisfy $\sum_{n=1}^{4} score_n = 1$. Although our system aims to detect and recognize the braid hairstyle, we keep the labels and scores for all hairstyles. Because adjacent windows overlap, the scores and labels of the overlapping regions must be updated; the procedure is shown in Figure 7. The red window indicates the original patch, and the blue and green windows indicate the current patch when the sliding window moves 15 pixels horizontally and 15 pixels vertically, respectively. We compare the original score of the overlapping region with the current score. If the current score (e.g. 0.994485) is less than the original score (0.996278), we keep the original label (straight) and score (0.996278) for the overlapping part; otherwise, we update the score (e.g. 0.998383) and the corresponding label (straight) according to the score and label of the current window.
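The update rule for overlapping windows can be read as keeping, for every pixel, the label of the highest-scoring window that covers it. The sketch below follows that reading; the per-pixel score and label maps, the function name, and the assignment of the two example scores to the horizontally and vertically shifted windows are assumptions made for illustration.

```python
def update_scores(score_map, label_map, x, y, win_w, win_h, label, score):
    """Overwrite a pixel's label and score only if the current window scores higher."""
    for row in range(y, y + win_h):
        for col in range(x, x + win_w):
            if score > score_map[row][col]:
                score_map[row][col] = score
                label_map[row][col] = label


HEIGHT, WIDTH = 65, 65
score_map = [[0.0] * WIDTH for _ in range(HEIGHT)]
label_map = [[None] * WIDTH for _ in range(HEIGHT)]

# Original window, then one shifted 15 pixels right and one shifted 15 pixels down.
update_scores(score_map, label_map, 0, 0, 50, 50, "straight", 0.996278)
update_scores(score_map, label_map, 15, 0, 50, 50, "straight", 0.994485)
update_scores(score_map, label_map, 0, 15, 50, 50, "straight", 0.998383)
```

With these values, the overlap with the right-shifted window keeps the original score, while the overlap with the down-shifted window takes the higher current score.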

After the score and label updating procedure, we compare the score of each pixel with a predefined threshold value (threshold = 0.88). If the score is larger than the threshold, we accept the recognition result; otherwise, we reject it.
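As a small illustration of the thresholding step, the sketch below rejects every pixel whose score does not exceed the threshold and then extracts a braid mask from the accepted labels. The map layout (lists of rows) and the function names are hypothetical; only the threshold value 0.88 comes from the text.

```python
def apply_threshold(score_map, label_map, threshold=0.88):
    """Keep a pixel's label only if its score is strictly above the threshold."""
    return [[label if score > threshold else None
             for score, label in zip(score_row, label_row)]
            for score_row, label_row in zip(score_map, label_map)]


def braid_mask(accepted, target="braid"):
    """Boolean mask of the pixels accepted as the target hairstyle."""
    return [[label == target for label in row] for row in accepted]


toy_scores = [[0.95, 0.50], [0.88, 0.99]]
toy_labels = [["braid", "braid"], ["curly", "straight"]]
accepted = apply_threshold(toy_scores, toy_labels)
highlight = braid_mask(accepted)
```

A score exactly equal to the threshold is rejected here, matching the "larger than" wording of the text.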

3.3 Experiment Results

We conduct experiments on full hair images selected from Dataset-II.

As shown in Figure 9, our system can detect the braid region in a full hair image. The first column shows the original full hair images, the second column shows the hair region masks, the third column shows the hair region images, and the last column shows the braid hairstyle recognition results inside the hair regions. The braid hairstyle regions are highlighted in green.

Figure 7: Hairstyle label and score update.

Figure 8: The braid hairstyle recognition results.

In the first row of Figure 9, the size of the full hair image is 458 pixels × 504 pixels. The size of the sliding window is 50 pixels × 50 pixels, and the stride of the sliding window is 25 pixels.

In the second row of Figure 9, the size of the full hair image is 517 pixels × 678 pixels. The size of the sliding window is 50 pixels × 50 pixels, and the stride of the sliding window is 25 pixels. Three curly hairstyle patches are recognized as braid hairstyle, as shown in Figure 10. These patches contain patterns that are very similar to the interlacing strand structure of the braid hairstyle. The results indicate that braid hairstyle recognition is more difficult than the recognition of other hairstyles.

In the third row of Figure 9, the size of the full hair image is 653 pixels × 1129 pixels. The size of the sliding window is 60 pixels × 60 pixels, and the stride of the sliding window is 30 pixels. Since the braid in this hair image is simpler than in the other full hair images, a slightly larger sliding window contains more information for braid recognition.

Figure 9: Braid hairstyles recognition results.

Figure 10: Mis-classified hair patches.

The "fishtail" braid recognition results are shown in the fourth row of Figure 9. The size of the full hair image is 488 pixels × 763 pixels. The size of the sliding window is 60 pixels × 60 pixels, and the stride of the sliding window is 30 pixels.

The experiment results indicate that the braid

hairstyle recognition system can successfully recog-

nize braid hairstyle in full hair images.

4 CONCLUSIONS AND FUTURE WORK

In this paper, we present a novel braid hairstyle recognition system. We leverage the power of pre-trained Convolutional Neural Networks to learn the features of the braid hairstyle as well as non-braid hairstyles. Because our dataset is small, data augmentation techniques and transfer learning are applied to deal with the problem of overfitting. The experimental results show that our system is capable of recognizing four basic hairstyles (braid, straight, curly, and kinky), although we focus on recognizing the braid hairstyle in this paper. Moreover, the strategy of training at the patch level and performing recognition at the image level facilitates the recognition procedure for complex hairstyles. In addition, since the system is based on image patches, it can be used to recognize hairstyles not only in front-view hair images, but also in side-view and back-view hair images.

In the future, we need to enlarge our dataset to include more braid hairstyles. Furthermore, we need to include spatial information as global information in order to eliminate mis-classified patches.

REFERENCES

Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., and Susstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell.

Bell, S., Upchurch, P., Snavely, N., and Bala, K. (2014). Material recognition in the wild with the materials in context database. CoRR, abs/1412.0623.

Chai, M., Shao, T., Wu, H., Weng, Y., and Zhou, K. (2016). Autohair: Fully automatic hair modeling from a single image. ACM Trans. Graph.

Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., and Vedaldi, A. (2013). Describing textures in the wild. CoRR, abs/1311.3618.

Dass, J., Sharma, M., Hassan, E., and Ghosh, H. (2013). A density based method for automatic hairstyle discovery and recognition. In Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013 Fourth National Conference on.

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). ImageNet: A Large-Scale Hierarchical Image Database. In CVPR09.

Hu, D., Bo, L., and Ren, X. (2011). Toward robust material recognition for everyday objects. In BMVC, pages 1–11.

Hu, L., Ma, C., Luo, L., Wei, L.-Y., and Li, H. (2014). Capturing braided hairstyles. ACM Trans. Graph.

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.

Liu, C., Sharan, L., Adelson, E. H., and Rosenholtz, R. (2010). Exploring features in a Bayesian framework for material recognition. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 239–246.

Qi, X., Xiao, R., Li, C. G., Qiao, Y., Guo, J., and Tang, X. (2014). Pairwise rotation invariant co-occurrence local binary pattern. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11):2199–2213.

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

Yacoob, Y. and Davis, L. S. (2006). Detection and analysis of hair. IEEE Trans. Pattern Anal. Mach. Intell.
