Comparative Analysis of Store Clustering Techniques in the Retail Industry

Kanika Agarwal, Prateek Jain and Mamta Rajnayak

Accenture Digital, Accenture Private Solution Limited, India

Keywords: Store Clustering, Self Organizing Maps, Gaussian Mixture Models, Fuzzy C-means.

Abstract: Many offline retailers in European markets are currently exploring different store designs to address local demands and to gain a competitive edge. There is significant demand in this industry to use analytics as a key pillar for making informed, store-centric strategic decisions. The main objective of this case study is to propose a robust store clustering mechanism which will help the business to understand its stores better and frame store-centric marketing strategies with the aim of maximizing revenue. This paper evaluates four advanced analytics-based clustering techniques, namely Hierarchical clustering, Self Organizing Maps, Gaussian Mixture Models, and Fuzzy C-means. These techniques are used for clustering the offline stores of a global retailer across four European markets. The results from these four techniques are compared and presented in this paper.

1 INTRODUCTION

Over the last decade, there has been steady growth in the European retail market. Retailers have developed different store designs across the markets to cater to local customer preferences and to gain competitive advantage. There has also been significant demand for analytics to shift from a traditional descriptive approach to a more predictive or prescriptive one.

According to a report by Nielsen (Watkins, 2014), there has been a shift in the transaction and purchase patterns of convenience stores: store visits have increased, but spending per visit has decreased. Customer lifestyles have also changed; for instance, people nowadays prefer fresh and healthy products. The availability of contactless payment methods and self-checkouts also has a positive impact on store footfall. Analyzing these factors would help the retailer in maximizing profit and optimizing inventory.

The retailers are concerned with the following

business problems.

1. How are the various stores performing? Which

stores have the maximum potential to grow?

2. What is the customer footfall? What is the

average spending per transaction?

3. What kinds of products are purchased the

most? Is it tobacco, coffee, grocery or any

other category?

4. What are the top performing manufacturers

and brands?

5. What types of customers visit the stores? What

are their preferences?

6. How responsive is the store to promotions

such as discount coupons, meal deal offers

etc.?

7. How accessible is the store? Is a parking facility

available, and is the store well connected?

8. What are the store firmographics: store size,

store layout, store design?

This paper is designed to address these business

problems and propose a strategic point of view to

retailers with an end objective to be more profitable

and competitive in the market.

The retailer considered in this paper operates in many European countries, including Germany, the Netherlands, Austria, Poland, the United Kingdom and Switzerland. It has more than 5000 store outlets and a customer base of millions across these geographies. It offers a wide product portfolio: groceries, tobacco, drinks, fast food, packaged food etc. This retail organisation wants to leverage the power of analytics to better understand its retail store business, with the aim of staying ahead of its peers. To achieve this aim, the organisation needs to better understand the markets in which it operates and to have a personalised local view of the retail stores within these markets. Hence, store segmentation is proposed to cater to these business requirements. Given the complexity of the data and market dynamics, it is imperative to apply sophisticated clustering techniques that address the limitations of traditional techniques like K-means and agglomerative clustering.

Agarwal, K., Jain, P. and Rajnayak, M.
Comparative Analysis of Store Clustering Techniques in the Retail Industry.
DOI: 10.5220/0007917500650073
In Proceedings of the 8th International Conference on Data Science, Technology and Applications (DATA 2019), pages 65-73
ISBN: 978-989-758-377-3
Copyright © 2019 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

This paper proposes the use of advanced machine learning techniques like Self Organizing Maps (SOM), Gaussian Mixture Models (GMM) and Fuzzy C-means (FCM) for clustering offline stores in different European markets. The results of these techniques are also compared with those of a legacy clustering technique, hierarchical clustering, to prepare a comparative analysis for each market.

This paper is structured as follows. Section 2

presents the related literature available in this domain.

Section 3 describes the different data sources,

variables and techniques used in the analysis. Section

4 presents the comparative results of the techniques

applied across different markets and the paper is

concluded in Section 5.

2 RELATED WORKS

Researchers have proposed various clustering algorithms for retail applications and have reported the resulting clusters in the literature.

Researchers have classified internet retail sites for an e-commerce company. Using 35 observable attributes of internet retail stores, hierarchical clustering is applied to classify the stores into five distinct web catalog interface categories: superstores, promotional stores, plain sales stores, one-page stores, and product listings. The classified online stores differ primarily on three dimensions: size, service offerings, and interface quality (Spiller and Lohse, 1997).

Researchers have analyzed the data of a supermarket chain which has 73 stores in Turkey. Store-related data such as store size, number of competitors nearby, and trade area demographics like the distribution of population by age and marital status are used for the segmentation. Hierarchical clustering is applied, and an effective target marketing strategy is designed for each store segment (Bilgic, Kantardzic, and Cakir, 2015).

Researchers have applied artificial neural networks (ANNs) as an alternative means of segmenting customers in the retail space. The Hopfield–Kagmar (HK) clustering algorithm, an ANN technique based on Hopfield networks, is compared with K-means clustering algorithms. Purchase behavior such as the total number of orders, days since first purchase, the number of credit cards etc. is used for profiling the customers. The results indicate that ANNs could be more useful to retailers for segmentation because they provide a more homogeneous segmentation solution than K-means clustering algorithms and are less sensitive to initial starting conditions (Boone and Roehm, 2002).

Researchers have applied three clustering techniques, namely K-means clustering, Mountain clustering, and Subtractive clustering, to a dataset for the medical diagnosis of heart disease. It is observed that K-means outperformed the others in cases where many dimensions are present, while Mountain clustering is suitable only for problems with two or three dimensions (Hammouda and Karray, 2002).

Most of these papers apply hard clustering techniques like K-means and hierarchical clustering, and most of them address customer segmentation rather than store segmentation. The research that does address store segmentation is predominantly focused on the online channel rather than the traditional offline channel. Furthermore, the attributes used for store clustering are mostly related to firmographics, customer demographics or competitor information. In this paper, store clustering is performed for a retail organisation using attributes related to purchase pattern, transaction pattern, customer behaviour and store dimensions. Both a hard clustering technique (hierarchical clustering) and soft clustering techniques (Self Organizing Maps (SOM), Gaussian Mixture Models (GMM) and Fuzzy C-means (FCM)) are applied to cluster stores in four different European markets. A comparative study of the results derived from these techniques for the different markets is presented in this paper.

3 DATA AND METHODOLOGY

The retailer considered is a UK-based multinational organization offering convenience retail services to consumers. The company operates through various channels: some of the stores are owned and operated by the company itself, while others are owned and operated by a franchisee or a dealer.

In this paper, data is analyzed for four different European markets. The time frame considered for the analysis is one year. The data sources used are transaction data, product data, store data, loyalty data, and competitor data. The transaction data captures attributes like transaction date, sales, quantity etc. The product dataset captures attributes like product description, category description, brand etc. The store data records dimensions like store size, location, operating channel etc. The loyalty data captures information related to the purchase behavior of customers using the loyalty card, methods of payment, discounts, point redemption etc. The competitor data includes the competitors' pricing attributes. Together, the datasets contain millions of transactions and close to a hundred raw variables.

Table 1: Description of some of the variables captured in the dataset.

Variable name          Data type    Description
Transaction id         Varchar      Unique id associated with each transaction.
Transaction date       Timestamp    Time at which the transaction is recorded.
Product id             Varchar      Unique id of the product purchased.
Store id               Varchar      Unique id of the store in which the product is sold.
Sales                  Numeric      Sales value of the product.
Quantity               Numeric      Quantity in which the product is sold.
Product description    Varchar      Description of the product sold.
Category description   Varchar      Description of the category the product belongs to, such as tobacco, drinks etc.
Operating channel      Varchar      Flag to identify whether the store is owned by the company or not.
Location               Varchar      Indicates whether the store is located centrally or in the countryside.

3.1 Data Wrangling

To perform store clustering, the data must be represented at a store level. So, after collating the datasets, all the variables are rolled up to the store level. Depending on the nature of the variable, aggregation methods like sum, count, max and min are applied. For example, sales and quantity are summed, whereas for transactions a distinct count is calculated. Many derived variables are also created, such as spending per category, average price, and sales by month, day of the week and time of the day. This leads to the creation of around 400 variables for each store. This set of variables provides a holistic view of the stores and captures dimensions related to demographics, firmographics, transaction pattern, purchase pattern etc.
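The roll-up described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layout, the field names and the sample values are hypothetical, and the paper's own pipeline (which works on millions of transactions) is not shown in the source.

```python
from collections import defaultdict

# Hypothetical line-level rows: (transaction_id, store_id, sales, quantity).
transactions = [
    ("T1", "S1", 4.50, 2),
    ("T1", "S1", 1.20, 1),
    ("T2", "S1", 3.00, 1),
    ("T3", "S2", 7.80, 3),
]

def roll_up_to_store(rows):
    """Aggregate line-level rows into one record per store."""
    sales = defaultdict(float)
    quantity = defaultdict(int)
    txn_ids = defaultdict(set)
    for txn_id, store_id, sale, qty in rows:
        sales[store_id] += sale        # sum for sales
        quantity[store_id] += qty      # sum for quantity
        txn_ids[store_id].add(txn_id)  # distinct count for transactions
    return {
        store: {
            "total_sales": sales[store],
            "total_quantity": quantity[store],
            "transactions": len(txn_ids[store]),
            # Example of a derived variable: average spend per transaction.
            "sales_per_transaction": sales[store] / len(txn_ids[store]),
        }
        for store in sales
    }

store_level = roll_up_to_store(transactions)
```

Each store ends up with one row of aggregated and derived variables, which is the shape the clustering step needs.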

In order to ensure that quality data is used for clustering, a cleansing procedure is applied. The process is as follows.

1. A univariate analysis is conducted to calculate the percentile distribution (0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99), the count of missing values etc.

2. As per the nature of the variable, missing value imputation techniques like replacement with the mean/median/mode etc. are applied.

3. Variables with a significant share of missing values are excluded from the analysis.

4. Variables that show little variability are also removed.

5. The last step is outlier treatment. The treatment depends on the distribution of the variable: for some variables the 95th percentile value is used to replace outliers at the upper end, and for others a different threshold is applied.
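As a rough illustration of steps 2 and 5, the sketch below imputes missing values with the median and caps the upper tail at the 95th percentile. The function names, the linear-interpolation percentile rule and the sample values are assumptions made for this example; the paper does not specify them.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def percentile(values, q):
    """q-th quantile (0 <= q <= 1), interpolating linearly between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def cap_upper(values, q=0.95):
    """Replace values above the q-th quantile with that quantile."""
    cap = percentile(values, q)
    return [min(v, cap) for v in values]

raw = [10.0, 12.0, None, 11.0, 13.0, 250.0]  # hypothetical store-level variable
filled = impute_median(raw)
capped = cap_upper(filled, q=0.95)
```

The threshold `q` is left as a parameter because, as noted above, different variables may need different cut-offs.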

Not all stores are considered for the analysis. Only stores that are owned by the company and that were operational for more than 80% of the time period are taken into account.

Conducting clustering on 400 variables is neither efficient nor feasible, so the next step is the selection of relevant variables. To do this, a variable clustering technique is applied using the ClustOfVar package in R. Hierarchical clustering is applied to group variables that are strongly related to each other. The algorithm is explained in detail in section 3.2.4; the only difference is that here it is applied to group variables, whereas in section 3.2.4 it is applied to group stores. Once the variables are grouped into clusters, a loading is attached to each variable. From each cluster, some variables are selected based on the loading value and business inputs. Around 30 variables are shortlisted for use in the final clustering process.

3.2 Clustering Techniques

There are two kinds of clustering techniques: hard clustering and soft clustering. In hard clustering, a data point belongs to exactly one cluster. In soft clustering, a data point has a probability (or degree) of belonging to each of the clusters. K-means and hierarchical clustering fall under hard clustering, while Self Organizing Maps (SOM), Gaussian Mixture Models (GMM) and Fuzzy C-means (FCM) are treated here as soft clustering techniques.

In this section, the hard clustering technique (hierarchical clustering) and the soft clustering techniques (SOM, GMM, FCM) are explained in detail.

3.2.1 Self Organizing Maps

This is a type of artificial neural network which works on the principle of reducing high-dimensional data into a low-dimensional space. The technique maintains the spatial relationship between the data. The process followed by SOM is as follows.

1. The very first step is the specification of the grid space as hexagonal or rectangular. For example, the grid space for 6 clusters could be 2x3, 1x6, 6x1 or 3x2. In Figure 1, it is rectangular 2x3.

2. Once the grid is selected, each cluster/node in the grid is assigned a random weight vector. The dimension of a node is equal to the number of variables in the data. For example, in Figure 2, Node 1 has 3 weight dimensions corresponding to the 3 variables ($X_1$, $X_2$, $X_3$) in the data.

3. In each iteration, an observation is randomly selected, and a distance metric is calculated with respect to all the nodes, as shown in Figure 2.

4. The node (cluster) with the minimum distance is assigned to the observation.

5. As this happens, the whole grid moves closer to the observation, as shown in Figure 3. The movement depends on the learning rate specified in the model.

6. The weights of the nodes are adjusted.

7. This completes an iteration for one observation (steps 3-6).

8. In the next iteration, another observation is selected and passed through the above steps.

9. The process is repeated until all the observations are assigned a cluster and a convergence criterion is met.

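The equation used for updating the weights is as follows.

$$W(t+1) = W(t) + \theta(t)\,L(t)\,\bigl(x(t) - W(t)\bigr) \qquad (1)$$

where t is the time step, W(t) is the weight vector of a node at time t, x(t) is the observation selected at time t, L(t) is the learning rate factor at time t, and θ(t) is the neighbourhood function at time t.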

The fine-tuning parameters for SOM are the number of clusters, the dimensions of the grid space, and the learning rate, which determines the rate at which the nodes' weights are updated. For the analysis, the kohonen package in R is used. SOM is particularly powerful when it comes to visualizing the clusters across different dimensions.
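The training loop described above can be sketched in plain Python as follows. This is a minimal illustration, not the kohonen implementation used in the paper: the Gaussian neighbourhood function, the exponential decay of the learning rate and radius, the default parameter values and the sample data are all assumptions made for the example.

```python
import math
import random

def update_weight(w, x, lr, theta):
    """Move one node's weights towards observation x by a fraction theta * lr."""
    return [wi + theta * lr * (xi - wi) for wi, xi in zip(w, x)]

def train_som(data, rows=2, cols=3, n_iter=500, lr0=0.5, radius0=1.5, seed=0):
    """Train a small rectangular SOM; returns (grid positions, node weights)."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]  # random start
    for t in range(n_iter):
        decay = math.exp(-t / n_iter)          # assumed decay schedule
        lr, radius = lr0 * decay, radius0 * decay
        x = rng.choice(data)                   # pick a random observation
        dists = [math.dist(w, x) for w in weights]
        bmu = dists.index(min(dists))          # closest node wins
        for k, pos in enumerate(grid):         # pull the winner and its grid neighbours
            theta = math.exp(-math.dist(pos, grid[bmu]) ** 2 / (2 * radius ** 2))
            weights[k] = update_weight(weights[k], x, lr, theta)
    return grid, weights

def assign(weights, x):
    """Cluster (node index) of an observation: the node with minimum distance."""
    dists = [math.dist(w, x) for w in weights]
    return dists.index(min(dists))

# Hypothetical stores described by 3 scaled variables.
stores = [[0.10, 0.20, 0.10], [0.15, 0.25, 0.10], [0.90, 0.80, 0.95], [0.85, 0.90, 0.90]]
grid, weights = train_som(stores)
labels = [assign(weights, s) for s in stores]
```

Because the neighbourhood weight shrinks with grid distance, nodes that are adjacent on the grid end up with similar weight vectors, which is what preserves the spatial relationship mentioned above.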

Figure 1: This figure shows the grid 2x3 (on the left) and

the set of observations (on the right). (Source mentioned in

the references section).

Figure 2: This figure shows the calculation of distance for

observation with 3 dimensions. (Source mentioned in the

references section).

Figure 3: This figure shows how the grid moves when a

cluster is assigned to observation. (Source mentioned in the

references section).

3.2.2 Gaussian Mixture Models (GMM)

This technique is a probabilistic approach to clustering. A GMM is a mixture of K Gaussian components, that is, a weighted average of K Gaussian (normal) distributions. The technique is based on the Expectation Maximisation algorithm and works in the following way.

1. For each cluster, an initial mean and standard deviation are allocated. In Figure 4, there are two clusters, each with a normal distribution, with mean and standard deviation ($\mu_a$, $\sigma_a$) and ($\mu_b$, $\sigma_b$).

2. Then, for each observation, the probability of belonging to each of these 2 clusters is calculated using equation 2. In Figure 5, the two colors per observation show the probability attached to the corresponding distribution.

3. Using these probabilities, the mean and standard deviation of the clusters are re-estimated as shown in equation 4 and equation 5.

4. The process is repeated until convergence is achieved. Figure 6 shows how the final distribution changes over various iterations.

Figure 4: This figure shows the initial distribution of two

clusters. (Source mentioned in the references section).

Figure 5: This figure shows the probability assigned to each

observation based on the parameters of the distribution.

(Source mentioned in the references section).

Figure 6: This figure shows the result after multiple

iterations. (Source mentioned in the references section).

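The equations used in GMM are as follows.

$$P(b \mid x_i) = \frac{P(x_i \mid b)\,P(b)}{P(x_i \mid b)\,P(b) + P(x_i \mid a)\,P(a)} \qquad (2)$$

$$P(x_i \mid b) = \frac{1}{\sqrt{2\pi\sigma_b^2}} \exp\!\left(-\frac{(x_i - \mu_b)^2}{2\sigma_b^2}\right) \qquad (3)$$

$$\mu_b = \frac{b_1 x_1 + b_2 x_2 + \cdots + b_n x_n}{b_1 + b_2 + \cdots + b_n} \qquad (4)$$

$$\sigma_b^2 = \frac{b_1 (x_1 - \mu_b)^2 + \cdots + b_n (x_n - \mu_b)^2}{b_1 + \cdots + b_n} \qquad (5)$$

Here $x_i$ is the ith observation, $b_i = P(b \mid x_i)$ is the probability that the ith observation belongs to the second cluster, $\mu_b$ is the mean of the second cluster, and $\sigma_b$ is the standard deviation of the second cluster. The corresponding quantities for the first cluster (a) are obtained in the same way.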

The optimal number of clusters is chosen based on the Akaike Information Criterion and the Bayesian Information Criterion. The mclust package in R is used for conducting the exercise.
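The expectation and maximisation steps above can be sketched for the one-dimensional, two-cluster case as follows. This is an illustrative sketch only, not the mclust implementation: the starting values, the fixed number of iterations, the update of the cluster priors and the small floor on the standard deviation are assumptions made for the example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density of x under a cluster with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def em_two_clusters(xs, mu_a, sigma_a, mu_b, sigma_b, n_iter=50):
    """EM for a 1-D mixture of two Gaussians (clusters a and b)."""
    p_a = p_b = 0.5  # assumed equal starting priors
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to cluster b.
        b = []
        for x in xs:
            like_a = gaussian_pdf(x, mu_a, sigma_a) * p_a
            like_b = gaussian_pdf(x, mu_b, sigma_b) * p_b
            b.append(like_b / (like_a + like_b))
        a = [1.0 - bi for bi in b]
        # M-step: probability-weighted mean and standard deviation per cluster.
        mu_a = sum(ai * x for ai, x in zip(a, xs)) / sum(a)
        mu_b = sum(bi * x for bi, x in zip(b, xs)) / sum(b)
        var_a = sum(ai * (x - mu_a) ** 2 for ai, x in zip(a, xs)) / sum(a)
        var_b = sum(bi * (x - mu_b) ** 2 for bi, x in zip(b, xs)) / sum(b)
        sigma_a = max(math.sqrt(var_a), 1e-6)  # floor avoids a zero-width cluster
        sigma_b = max(math.sqrt(var_b), 1e-6)
        p_a = sum(a) / len(xs)
        p_b = 1.0 - p_a
    return (mu_a, sigma_a), (mu_b, sigma_b), b

# Hypothetical 1-D observations forming two groups.
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
(mu_a, sigma_a), (mu_b, sigma_b), resp_b = em_two_clusters(xs, 0.0, 1.0, 6.0, 1.0)
```

The returned `resp_b` values are the soft assignments: each observation keeps a probability for both clusters rather than a single label.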

3.2.3 Fuzzy C-Means (FCM)

This technique is similar to K-means; however, here every observation has a degree of belonging to all the clusters. The process for clustering is as follows.

1. Cluster centers are created randomly based on the number of clusters.

2. The Euclidean distance between the observations and the cluster centroids is calculated.

3. Then, the membership matrix is generated using equation 6.

4. After this, the centroids are updated using equation 7.

5. Steps 2 to 4 are repeated until the convergence criterion shown in equation 8 is met. The value of epsilon should be between 0 and 1.

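The equations are as follows.

$$U_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}} \qquad (6)$$

$$c_j = \frac{\sum_{i=1}^{N} U_{ij}^{m}\, x_i}{\sum_{i=1}^{N} U_{ij}^{m}} \qquad (7)$$

$$\max_{i,j} \bigl| U_{ij}^{(r+1)} - U_{ij}^{(r)} \bigr| < \varepsilon \qquad (8)$$

where $U_{ij}$ = membership of the ith observation in the jth cluster, m = fuzziness exponent, C = number of clusters, $c_j$ = jth cluster centre, $x_i$ = ith observation, N = number of observations, r = iteration number, and ε = convergence threshold.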

The fine-tuning parameters here are the number of clusters and the fuzziness exponent m, whose value should be greater than one. For this exercise, the fclust package in R is used.
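A compact sketch of the Fuzzy C-means loop is given below. It is not the fclust implementation: for reproducibility the starting centres are passed in rather than drawn at random, and the handling of an observation that coincides with a centre, the default parameter values and the sample data are assumptions made for this example.

```python
import math

def memberships(points, centers, m):
    """Membership of every point in every cluster (rows sum to 1)."""
    power = 2.0 / (m - 1.0)
    U = []
    for x in points:
        d = [math.dist(x, c) for c in centers]
        if 0.0 in d:  # point sits exactly on a centre: full membership there
            hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((dj / dk) ** power for dk in d) for dj in d]
        U.append(row)
    return U

def update_centers(points, U, m):
    """Centres as means of the points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(U[0])):
        w = [row[j] ** m for row in U]
        centers.append([sum(wi * p[k] for wi, p in zip(w, points)) / sum(w) for k in range(dim)])
    return centers

def fuzzy_c_means(points, centers, m=2.0, eps=1e-4, max_iter=100):
    U = memberships(points, centers, m)
    for _ in range(max_iter):
        centers = update_centers(points, U, m)
        U_new = memberships(points, centers, m)
        change = max(abs(a - b) for ra, rb in zip(U_new, U) for a, b in zip(ra, rb))
        U = U_new
        if change < eps:  # largest membership change below epsilon: stop
            break
    return centers, U

# Hypothetical 2-D observations forming two groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, U = fuzzy_c_means(points, centers=[(1, 1), (9, 9)])
```

A larger `m` makes the memberships softer (closer to equal across clusters), while values of `m` close to one push the result towards a hard K-means-like assignment.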

3.2.4 Hierarchical Clustering

In hierarchical clustering, the bottom-up (agglomerative) clustering approach is applied. The process is as follows.

1. Each observation is considered as a single cluster.

2. The distance between every pair of observations is calculated and stored in a distance matrix. The distance between clusters can be calculated using complete linkage, average linkage etc.

3. The pair of clusters closest to each other is merged and, as a result, the number of clusters is reduced by 1 in each step.

4. Steps 2 and 3 are repeated until all the points are part of one big cluster.

At the end of the process, a dendrogram is created as shown in Figure 7. This helps to identify the optimal number of clusters. The hclust function in R is used for the analysis.
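The bottom-up merging can be sketched as below. This is a naive illustration rather than what hclust does internally: it uses complete linkage, recomputes distances at every step, and stops once a requested number of clusters remains, which corresponds to cutting the dendrogram at that level. The function names and sample points are assumptions made for the example.

```python
import math

def complete_linkage(a, b, points):
    """Distance between two clusters: the largest pairwise distance between members."""
    return max(math.dist(points[i], points[j]) for i in a for j in b)

def agglomerate(points, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # each observation starts alone
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = complete_linkage(clusters[p], clusters[q], points)
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]   # merge the closest pair
        del clusters[q]
    return clusters

# Hypothetical 2-D observations.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters = agglomerate(points, n_clusters=3)
```

Running the loop all the way down to a single cluster and recording each merge distance would give the information plotted in a dendrogram.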

Figure 7: This figure shows a dendrogram. The line depicts

the point at which dendrogram is cut.

4 IMPLEMENTATION AND RESULTS

In section 3, the clustering modelling exercise is

discussed. This section describes the different steps

that are performed after the clustering modelling task

is completed.

4.1 Validation

Several iterations are performed, and many parameters are considered to arrive at the final iteration. Some of the validation steps are as follows.

1. The number of clusters formed is decided based on statistical as well as business inputs. Some of the statistical techniques that are used to identify the optimal set of clusters are dendrograms, heatmaps etc. The number of clusters formed lay in the range of 3-5, depending on the market and technique.

2. The minimum number of stores per cluster is set to 30.

3. The following parameters are compared across iterations.

Table 2: Metrics compared.

Technique       Metrics
Hierarchical    Dunn Index, Silhouette coefficient
SOM             Neighbour distance, Training progress
GMM             Akaike Information Criterion, Bayesian Information Criterion
FCM             Coefficient of Variation

4. All the clusters formed have some distinct features, which ensures that stores within a cluster are homogeneous and stores across clusters are heterogeneous.
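As an illustration of one of the metrics listed above, the sketch below computes a Dunn index, taken here as the smallest distance between points of different clusters divided by the largest distance between points of the same cluster. Several variants of the index exist; this particular definition, the function name and the sample data are assumptions made for the example, and the paper does not state which variant was used.

```python
import math
from itertools import combinations

def dunn_index(clusters):
    """Minimum between-cluster distance divided by maximum within-cluster diameter."""
    min_between = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        math.dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_between / max_diameter

# Hypothetical clustering of four 2-D observations into two clusters.
clusters = [[(0, 0), (0, 1)], [(5, 0), (5, 1)]]
score = dunn_index(clusters)
```

Higher values indicate compact, well-separated clusters, in line with the homogeneity and heterogeneity requirement in step 4.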

4.2 Profiling

There are two levels of profiling that are performed

during this exercise.

1. Basic profiling: All the modelling variables that are used for clustering are considered and their variation across the clusters is captured. For numerical variables the mean is considered, and for categorical variables the frequency is considered.

2. Advanced profiling: Other variables that are relevant to the business, apart from the modelling variables, are considered and their variation across the clusters is captured in the same way as described for basic profiling. This helps in personifying the clusters and capturing the differentiating attributes of each cluster. For instance, as shown in Figure 8, cluster 3 has the maximum sales whereas cluster 4 has the minimum sales. Cluster 3 also has the largest number of stores (110) and, because of that, the largest customer base. Spending per transaction is another attribute that is used to differentiate clusters: it is higher in cluster 1 than in the others. Also, each cluster is dominant in at least one category. For example, category 1 has the maximum sales share in cluster 1, whereas category 6 is dominant in cluster 2. This insight would help category managers to better understand the clusters and design strategies and promotions. The distribution of store designs within a cluster is also captured. For instance, the stores of clusters 3 and 4 mostly have layout Z, whereas those of clusters 1 and 2 mostly have layout Y. This information helps in better understanding store attributes.

Figure 8: This figure shows the store profiling for one market using GMM.

KPIs                                          Cluster1    Cluster2    Cluster3    Cluster4    Average Value

Store KPIs
Number of Sites                               88          47          110         38          283
Share of Sites, %                             31%         17%         39%         13%
Total Sales                                   2,518 m     968 m       3,449 m     681 m       1,904 m
Sales Share                                   33%         13%         45%         9%
Customer Count                                501,200     230,345     631,134     239,234     400,478
Loyal Transactions/Overall Transactions (%)   44%         74%         56%         63%
Points Redeemed/Points Issued (%)             56%         61%         54%         72%

KPIs Per Store & Per Store/Month, Absolute Values
Transactions (Per Store)                      165,130     157,152     173,600     335,215     178,514
Transactions (Per Month/Per Store)            14,056      13,125      14,524      28,338      15,025
Sales (Per Store)                             3,602,804   2,643,445   3,139,645   5,792,097   3,369,966
Sales (Per Month/Per Store)                   305,429     220,771     262,638     489,562     283,424
Units Per Transaction                         2.5         2.2         2.9         2.5         2.6
Sales Per Transaction                         21.8        16.8        18.1        17.3        18.9

KPIs Per Store & Per Store/Month, Indices
Transactions (Per Store)                      93          88          97          188
Transactions (Per Month/Per Store)            94          87          97          189
Sales (Per Store)                             107         78          93          172
Sales (Per Month/Per Store)                   108         78          93          173
Units Per Transaction                         95          83          110         95
Sales Per Transaction                         116         89          96          92

Category Average Sales Per Site/Per Month
Category 1                                    111,565     16,992      20,806      107,820     64,296
Category 2                                    13,415      14,314      11,891      18,195      14,454
Category 3                                    21,930      11,645      15,668      83,616      33,215
Category 4                                    26,609      7,242       79,306      89,497      50,664
Category 5                                    21,669      16,307      18,651      50,751      26,844
Category 6                                    10,386      78,652      8,474       22,911      30,106
Category 7                                    2,713       1,842       2,025       7,974       3,638
Category 8                                    25,314      15,156      24,258      40,784      26,378

Category Average Sales Per Site/Per Month, % Share
Category 1                                    48%         10%         11%         26%
Category 2                                    6%          9%          7%          4%
Category 3                                    9%          7%          9%          20%
Category 4                                    11%         4%          44%         21%
Category 5                                    9%          10%         10%         12%
Category 6                                    4%          49%         5%          5%
Category 7                                    1%          1%          1%          2%
Category 8                                    11%         9%          13%         10%

Store Size - Small                            31%         25%         23%         23%
Store Size - Medium                           34%         29%         44%         34%
Store Size - Large                            35%         46%         33%         43%
Store Layout - X                              23%         27%         25%         35%
Store Layout - Y                              41%         50%         35%         24%
Store Layout - Z                              36%         23%         40%         41%

Figure 9: This figure shows the quarterly migration for all the techniques across all the markets.

4.3 Business Recommendations

The profiling helped in providing business recommendations related to the following business problems.

1. Identifying the key categories for the stores in order to make strategic decisions. Category 4 is dominant in cluster 3, indicating that the stores belonging to cluster 3 should focus more on category 4.

2. For each cluster, an index can be created using dimensions like average spend per transaction, average units per transaction etc. These index scores can then be leveraged to identify the categories with the maximum growth potential in each cluster.

3. Identifying the top performing stores. Cluster 3 has the maximum sales share, but per-store sales are highest for cluster 4, indicating that cluster 4 stores on average performed better than the others.

4. Capturing customer preferences across stores. For instance, cluster 2 has the highest share of loyalty transactions (74%), followed by cluster 4 (63%). However, the loyalty points redemption rate is highest in cluster 4 (72%), which suggests that promotions are most effective for cluster 4 stores.

5. Understanding store firmographics to optimize the product portfolio. Cluster 1 has the highest share of small stores, whereas cluster 3 is made up mostly of medium-sized stores and clusters 2 and 4 are mostly made up of large stores. This information would help in space optimization planning for each cluster type.
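The index scores mentioned in point 2 can be illustrated with a short sketch. It assumes that an index is the cluster value divided by the all-store average, multiplied by 100; the paper does not state the formula, but this assumption reproduces the "Transactions (Per Store)" indices reported in Figure 8. The function and variable names are hypothetical.

```python
def index_vs_average(value, average):
    """Express a cluster KPI as an index where 100 equals the all-store average."""
    return round(100 * value / average)

# Transactions per store by cluster and the overall average, as listed in Figure 8.
transactions_per_store = {
    "Cluster1": 165130,
    "Cluster2": 157152,
    "Cluster3": 173600,
    "Cluster4": 335215,
}
average_transactions = 178514

indices = {
    cluster: index_vs_average(value, average_transactions)
    for cluster, value in transactions_per_store.items()
}
print(indices)  # {'Cluster1': 93, 'Cluster2': 88, 'Cluster3': 97, 'Cluster4': 188}
```

An index above 100 marks a cluster that over-indexes on the KPI relative to the average store, which makes clusters of very different sizes directly comparable.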

4.4 Scoring

The clustering techniques used above are unsupervised learning algorithms, which means that there is no dependent variable in the modelling exercise. When a new store enters a market, these algorithms cannot be applied directly to assign the new store to one of the existing clusters. To overcome this, supervised machine learning techniques such as Random Forest or Support Vector Machines are applied. Here, the independent variables are chosen from the set of clustering modelling variables and the dependent variable is the cluster mapping of each store. Hence, this is a classic use case of multinomial (multi-class) classification. Once the prediction model is built, it is used to score the existing and new stores at a set frequency (quarterly, semi-annually or annually).

Data shown in Figure 9 (quarterly migration by market and technique):

Market  Technique     Q2 '17 - Q3 '17  Q3 '17 - Q4 '17  Q4 '17 - Q1 '18  Q1 '18 - Q2 '18  Average
1       Hierarchical  11.8%            13.6%            11.0%            8.5%             11%
1       FCM           14.3%            15.8%            7.4%             11.0%            12%
1       GMM           12.2%            12.9%            7.8%             8.8%             10%
1       SOM           10.4%            10.0%            6.4%             4.2%             8%
2       Hierarchical  7.8%             10.9%            7.8%             9.3%             9%
2       FCM           3.9%             5.6%             7.4%             8.4%             6%
2       GMM           2.8%             5.5%             6.8%             8.8%             6%
2       SOM           4.2%             3.9%             4.6%             4.2%             4%
3       Hierarchical  7.6%             7.2%             4.9%             4.9%             6%
3       FCM           3.4%             5.1%             6.9%             6.8%             6%
3       GMM           2.3%             3.2%             3.6%             4.0%             3%
3       SOM           3.7%             3.4%             4.1%             3.7%             4%
4       Hierarchical  2.1%             1.5%             1.1%             3.8%             2%
4       FCM           3.5%             1.6%             1.1%             2.9%             2%
4       GMM           2.6%             3.5%             4.5%             3.4%             4%
4       SOM           4.2%             1.7%             1.7%             3.3%             3%

4.5 Migration

To check the robustness of the model, migration across quarters is calculated. For this, store-level data is prepared for 5 quarters. The stores in each quarter are scored using the prediction model built at the earlier stage. For example, each store in Q2 and Q3 of 2017 is scored (allocated a cluster). Migration is then calculated across quarters: it is the number of stores that changed cluster between two quarters divided by the total number of stores common to both quarters. As shown in Figure 9, in market 1 the migration from Q2'17 to Q3'17 using hierarchical clustering is 11.8%. This means that the cluster allotment changed for 11.8% of the stores between Q2 and Q3. Lower migration implies that the model is more robust. Hence, quarterly migration is considered one of the most important criteria for choosing the best technique.
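The migration measure defined above translates directly into code. The sketch below is a minimal illustration; the function name, the dictionary layout (store id mapped to cluster label) and the sample assignments are hypothetical.

```python
def migration_rate(prev_quarter, next_quarter):
    """Share of common stores whose cluster changed between two quarters."""
    common = prev_quarter.keys() & next_quarter.keys()
    if not common:
        raise ValueError("the two quarters have no stores in common")
    moved = sum(1 for store in common if prev_quarter[store] != next_quarter[store])
    return moved / len(common)

# Hypothetical cluster assignments (store id -> cluster) for two quarters.
q2 = {"S1": 1, "S2": 2, "S3": 3, "S4": 1}
q3 = {"S1": 1, "S2": 3, "S3": 3, "S5": 2}

rate = migration_rate(q2, q3)  # S4 and S5 are ignored: not present in both quarters
```

Restricting the denominator to common stores keeps openings and closures from being counted as migrations.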

As shown in Figure 9, SOM performed the best for market 1 and market 2, with an average migration across quarters of about 7.8% and 4.2% respectively. GMM is the best technique for market 3, with an average migration of 3.3%. Hierarchical clustering performed the best for market 4, with an average migration of 2.1%; however, the results from Fuzzy C-means are close. Different techniques performed differently in each market.

5 CONCLUSIONS

The paper considers four clustering techniques, namely Hierarchical Clustering, Self Organizing Maps (SOM), Gaussian Mixture Models (GMM) and Fuzzy C-means (FCM). The techniques are applied to the retail database to cluster stores with similar profiles together. Each technique has a different approach to clustering. The main parameter used by the retailer to measure the effectiveness of the clusters is quarterly migration. It is noticed that no technique is the best for all the markets. SOM performed best in two markets, while GMM and Hierarchical clustering each outperformed the other techniques in one market. So, it is concluded that it is difficult to generalize one technique as the best suited for a store clustering exercise. The data and the features determine which technique is to be applied. From this exercise, it is recommended that different clustering techniques be tried and the one with the best results be selected.

ACKNOWLEDGEMENTS

The authors would like to thank three reviewers who

assisted in reviewing the content and improving the

quality of the paper.

REFERENCES

Bilgic, E., Kantardzic, M. and Cakir, O. (2015). Retail Store Segmentation for Target Marketing.

Spiller, P. and Lohse, G. (1997). A classification of internet retail stores. International Journal of Electronic Commerce, 2(2), pp.29-56.

Boone, D. and Roehm, M. (2002). Retail segmentation using artificial neural networks. International Journal of Research in Marketing, 19(3), pp.287-301.

Hammouda, K. and Karray, F. (2002). A comparative study of data clustering techniques. Tools of Intelligent Systems Design.

Lavrenko, V. (2014). EM algorithm: how it works. [image]. Available at: https://www.youtube.com/watch?v=REypj2sy_5U&t=338s

Algobeans.com, (2017). Self Organizing Maps Tutorials. [image]. Available at: https://algobeans.com/2017/11/02/self-organizing-map/

Superdatascience.com, (2018). The Ultimate Guide to Self Organizing Maps. [image]. Available at: https://www.superdatascience.com/blogs/the-ultimate-guide-to-self-organizing-maps-soms

Chavent, M., Kuentz, V., Liquet, B. and Saracco, J. (2011). ClustOfVar: An R package for clustering of variables. Journal of Statistical Software, 55(2).

Watkins, M. (2014). The rise of modern convenience store in Europe. Available at: https://www.nielsen.com/eu/en/insights/reports/2014/the-rise-of-the-modern-convenience-store-in-europe.html