Supporting Taxonomy Development and Evolution by Means of

Crowdsourcing

Binh Vu and Matthias Hemmje

FernUniversität in Hagen, Hagen, Germany

Keywords: Taxonomy, Taxonomy Development, Taxonomy Management, Crowdsourcing, Knowledge, Knowledge

Management, Information Overload, Ecosystem Portal.

Abstract: Information overload continues to be a challenge. By dividing the material into many different small subsets,

classification based on a taxonomy makes data exploration and retrieval faster and more accurate. Instead of

having to know the exact keywords that describe the knowledge resource, users can browse and search for

them by selecting the categories that the resource is most likely to belong. Nevertheless, developing

taxonomies is not an easy task. It requires the authors to have a certain amount of knowledge in the domain.

Furthermore, the workload will increase as any new taxonomy needs to be frequently updated to remain

relevant and useful. To combat these problems, this paper proposes another approach to crowdsource

taxonomy development and evolution. We describe in this paper the concept of this approach along with

different types of evaluations targeting on the one hand to demonstrate the feasibility of the approach and the

usability of the initial prototype as well as on the other hand the quality and effectiveness of the chosen method.

1 INTRODUCTION AND

MOTIVATION

Today’s internet is a big source of information

available in the form of content and also more explicit

forms of knowledge resources. Every day there is a

huge amount of such content and knowledge resource

data moving on the internet. Most companies in the

U.S. in 2005 have at least 100 TB of such data stored.

They estimate that by 2020, 40 Zettabytes (43 trillion

Gigabytes) will be created, an increase of 300 times

from 2005 (The Four V's of Big Data, 2005). Such

data not only needs to be indexed, but also the index

terms should be unique and descriptive. Otherwise, an

indexer would have to classify documents, which are

from the same topic, to various categories, despite the

fact that these categories may have the same or very

similar meaning. This would make searching and

comparing results afterward more difficult. A

taxonomy, in this case, can be a source of a unique

and well-controlled vocabulary. It is a hierarchy of

agreed-on terms, which later can be used for indexing

or classifying documents. This means, with the

support of taxonomy, classification consistency can

be achieved (Vu et al., 2018).

The development of a new taxonomy is usually

done by knowledge workers and domain experts.

While providing many benefits and advantages, it

also has problems. New approaches involving the

crowd in the development and management of

taxonomies were introduced to overcome these

constraints using the “wisdom of the crowd”

(Karampinas & Triantafillou, 2012).

By using “the power of the crowd”, one can

achieve definitions of taxonomy terms and relations

that no person or organization alone can achieve. One

example of crowdsourcing is the knowledge resource

Wikipedia, which is considered as one of the world’s

largest crowdsourcing projects. It was initially an

English-language encyclopedia. Today, Wikipedia

has more than 40 million articles in 301 different

languages. All of them were written by the crowd

through a model of content editing by means of web-

based applications, called a wiki (Wikipedia, n.d.).

With crowdsourcing, human resources only need

to work when they want, when they need to, as much

as they need to and for whomever they like, and to

choose the activities that they will do. This makes

them happier compared to traditional types of

employing human resources. Moreover, people’s

goods can be shared to lower their expenses and avoid

waste due to collaborative consumption (Andro,

2018).

Vu, B. and Hemmje, M.

Supporting Taxonomy Development and Evolution by Means of Crowdsourcing.

DOI: 10.5220/0008348003510358

In Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019), pages 351-358

ISBN: 978-989-758-382-7

351

From a company’s point of view, by applying

crowdsourcing, they ideally receive work results in

much higher quality, quantity, and at a lower cost in

less time. The work is done for free as the crowd

workers hope to be compensated by the crowds, e.g.,

in a crowdcontest or other forms of social incentives

not necessarily completely excluding later payment.

In addition, a company would ideally benefit from a

large number of proposals while having only a few

individuals to compensate for a much lower overall

cost than that of traditional human resource

employments (Andro, 2018).

In this paper, we want to introduce existing

approaches to develop and manage taxonomy, as well

as point out their challenges. To overcome these

problems, we purpose another method of applying

crowdsourcing in taxonomy development, evolution,

and management.

2 PROBLEM STATEMENT

The combination of taxonomy development and

crowdsourcing poses additional challenges. A part of

these challenges was mentioned in the authors’

previous publication (Vu et al., 2018)

Developing a taxonomy involves many people,

such as IT staff, corporate knowledge workers,

departmental publishers, etc (Kon & Hoey, 2005).

The more people are working together, the more

problems it can potentially generate. On another

hand, working alone can get us surrounded by

information and knowledge resources that only

support one point of view and forget other

alternatives.

Furthermore, things always change. To reflect,

e.g., the changing needs in knowledge domain

concept and resource modeling, taxonomies need to

be maintained frequently. Without maintenance and

governance, and especially a tool to manage version

and ownership, taxonomies can be drifting away from

business and organizational information needs

(Lambe, 2007).

Storing and processing taxonomy representations

potentially requires a lot of computational and storage

resources. Therefore, we need to consider how to

organize the taxonomy in the database in such a way

that it requires less space and is fast to retrieve.

One primary problem of crowdsourcing is how to

motivate the crowd. Each individual engaged in

crowdsourcing has their motivation. The motivation

to participate in crowdsourcing is not very different

from the motivation to participate in blogging,

creating open-source software, etc. (Brabham, 2013).

Some do it for fun and recognition. Some do it for

financial reward. The problem is not every

organization has the ability to provide all these

incentives to the crowd.

The next problem is the quality of results

produced by the crowd. Although one requirement for

“the wisdom of crowds” is diversity, there is always

unskilled, unrelated, insufficient people in the crowd.

Compared to experts, cheap (sometimes, free) labor

is likely to produce less quality work. Therefore, we

need to either lower the complexity of the task or find

a skilled crowd, which is not always easy (Eskenazi

et al., 2013)

Crowdsourcing is difficult to manage. Not only it

needs more resources for management but also bears

challenges in security and privacy. It is hard to keep

a project secret when it involves many people

working on it from everywhere. Furthermore,

collaboration generates personal data, which need to

be handled carefully. All of this adds more problems

to the management process, which was already

difficult (Bar & Maheswaran, 2014).

3 STATE OF THE ART

In this chapter, we provide an overview of a selection

of important fundamental concepts that are related to

knowledge resource management, taxonomy

management, and crowdsourcing. Furthermore,

relevant approaches using social tagging and

applying crowdsourcing in forming a term corpus or

creating hierarchical relationships between terms will

be mentioned.

3.1 Knowledge Management

Knowledge Management (KM), like most complex

things, has many different definitions. Depending on

the nature of the scientific area, the definition of KM

might have a different meaning. Nevertheless, what

Devenport and Prusak wrote in their book “Working

Knowledge: How Organizations Manage What They

Know” was agreed and cited the most: “Knowledge

management draws from existing resources that your

organization may already have in place good

information systems management, organizational

change management, and human resources

management practices. If you've got a good library, a

textual database system, or even effective education

programs, your company is probably already doing

something that might be called knowledge

management” (Davenport & Prusak, 1998).

KEOD 2019 - 11th International Conference on Knowledge Engineering and Ontology Development

352

The Content and Knowledge Management

Ecosystem Portal (KM-EP), has been developed to

provide powerful web-based tools for managing

knowledge resources and content (Vu, 2015). Figure

1 presents KM-EP’s architecture, which consists of

five subsystems:

 Information Retrieval Subsystem (IRS) indexes

contents and lets the user search for them in a

quick manner.

 Learning Management Subsystem (LMS) helps,

e.g., a course creator, who is not an expert of the

KM-EP and the underlying Learning

Management System - Moodle, to create and

manage courses.

 Content and Knowledge Management Subsystem

(CKMS) manages contents and knowledge

resources. It allows users to create, edit, remove

and rate different type of contents in the

ecosystem.

 User Management Subsystem (UMS) manages

users, groups of users, authentication, and access

control for all subsystems.

 Storage Management Subsystem (SMS) preserves

the integrity of the digital file and its metadata for

the lifetime of an asset. (Vu et al., 2018)

Figure 1: KM-EP architecture (Vu, 2015).

In the context of the research and development

work presented in this paper, an additional taxonomy

management system was developed as part of the

KM-EP. The system allows domain experts to create

new taxonomies. These taxonomies later will be used

for supporting the classification, searching, and

browsing of content and knowledge resources in the

KM-EP. Nevertheless, without the support of

crowdsourcing, only a given group of people had

access rights to modify existing taxonomies. Normal

users could not access the system and therefore, were

not able to create their own taxonomy. Furthermore,

there is no option to add information about the authors

of a taxonomy, and additional properties such as

descriptions or keywords.

3.2 Taxonomy Development

The term “taxonomy” has a very broad meaning and

is being used in many areas, from psychology,

biology to computer science.

In her book “The Accidental Taxonomist”,

Hedden see taxonomy in a broad sense as “any means

of organizing concepts of knowledge” and in a

broader sense as “a knowledge organization system

or knowledge organization structure” (Hedden,

2010). The term “knowledge organization systems”

was mentioned in 2000 by Hodge as a synonym for

taxonomy. There are various types of knowledge

organization systems, which include (1) term lists,

such as authority files, glossaries, dictionaries, and

gazetteers, (2) classifications and categories, such as

subject headings, classification schemes, taxonomies,

and categorization schemes and (3) relationship lists,

such as thesauri, semantic networks, and ontologies

(Hodge, 2000).

The development of a new taxonomy is usually

done using the Delphi method, which is a technique

to obtain the most reliable consensus of opinions of a

group of experts through a series of intensive

questionnaires interspersed with controlled opinion

feedback (Dalkey & Helmer, 1963). The traditional

method of using knowledge workers and experts for

reviewing, while providing many benefits and

advantages, has its own problems. One example is

that experts are not always available. They have other

jobs to do, and rounds of reviewing take too much

time from their schedule. Another example is that

people tend to ignore disagreements. The result is,

e.g., a poor design decision which is ignored during

reevaluation and not getting fixed.

3.3 Crowdsourcing and

Crowdknowledge

“Under the right circumstances, groups are

remarkably intelligent and are often smarter than the

smartest people in them. Groups do not need to be

dominated by exceptionally intelligent people in

order to be smart” wrote James Surowiecki in his

book, The Wisdom of Crowds. By putting together a

big enough and diverse enough group of people, we

can produce decisions better than experts. Therefore,

chasing the expert for answers is a mistake. Group's

decisions will, over time, be intellectually superior to

Supporting Taxonomy Development and Evolution by Means of Crowdsourcing

353

the isolated individual, no matter how smart or well-

informed he is (Surowiecki, 2005).

According to Estellés-Arolas and Guevara, there

are 40 definitions for the concepts of crowdsourcing

that come from 32 distinct articles published from

2006 to 2011 (Estellés-Arolas & Guevara, 2012). The

term was created by Jeff Howe in his article “The Rise

of Crowdsourcing” in 2006. It a combination of

“crowd” and “outsourcing” and can be described as

“the act of taking work once performed within an

organization and outsourcing it to the general public

through an open call for participants” (Ridge, 2014).

Crowdvoting is, e.g., one type of crowdsourcing.

Its objective is to know the opinions of the crowd

regarding specific issues or products. Here, people are

giving their opinions and vote on a certain topic

(Simon, Pechuan, & Estelles-Miguel, 2015)

(Jimenez-Crespo, 2017) (Kitchens & Crane, 2014)

(Turban, King, Lee, Liang, & Turban, 2015).

3.4 Related Works

The concept of crowdsourcing is fairly new.

Nevertheless, the idea of crowdsourcing taxonomy

development and evolution was already applied in

scientific publications. There are two steps involving

in the process of developing a new taxonomy:

forming a term corpus and creating hierarchical

relationships between terms. Crowdsourcing can be

used in either one of these steps or in both of them.

The work of forming a term corpus using

crowdsourcing in the first step was introduced by the

mean of social tagging and folksonomy. Popular

tagging systems, which were mentioned the most in

scientific publications, are social bookmarking

website Delicious and photo-sharing site Flickr.

They have features that allow the user to add tags to

existing contents, in contrast to stricter systems like

libraries where a book will have exactly one proper

call number based on content (Heymann & Garcia-

Molina, 2006). These tags together form a

folksonomy and can be used as terms for the

developing taxonomy.

Nevertheless, folksonomy has its disadvantages.

There is no control of synonymy and homonymy,

there are many formats for dates and a lot of typing

and orthographic errors. Tags can also contain words

from different languages or even compound words

consisting of more than two words or a mixture of

languages. Combining all tags from a system, we can

find many words that have the same meaning or same

words but in different forms, e.g., “bag” vs “bags”,

“computer science” vs “computer_science” and

“computer-science” (Peters & Stock, 2007).

Besides the approach using folksonomy, there are

other methods to create a term corpus for taxonomy

without using crowdsourcing, such as extracting

words with top term frequency - inverse document

frequency score (Brooks & Montanez, 2006) or get

words or phrases in the top-ranked documents that

commonly co-occur with each other across many of

the passages (Sanderson & Croft, 1999).

From the terms’ corpus created in the first step,

the creators form hierarchical relationships between

terms and get the final result as a new taxonomy in

the second step. One method is to apply an algorithm

to grow deeper, bushier tree by merging saplings

created by different users, called SAP

(Plangprasopchok, Lerman, & Getoor, 2010).

Another method was introduced by Heymann and

Garcia-Molina (Heymann & Garcia-Molina, 2006).

Their idea is to convert tag into tag vectors and

calculate the similarity between tags using the cosine

similarity between tag vectors. The end product is a

tag similarity graph where each tag is represented by

a vertex, and two vertices are connected by an edge if

the similarity of the nodes they represent is above

some set threshold.

Liu et al. computes a generality score for each tag,

then use agglomerative hierarchical clustering

approach to generate the concept hierarchy (Liu,

Fang, & Zhang, 2010). Their algorithm has the same

principle as Heymann’s. Tags are sorted by their

score in descending order. In this case, it is the

generality score. Then the algorithm tries to find the

parent node in the taxonomy tree for each tag. If it

cannot be found, the tag is added as a child of the root.

Agglomerative Hierarchical Clustering (AHC) is

frequently used to build the hierarchy of tags. It relies

on how similar / distant two nodes are in building a

hierarchy. Li et al. proposed an enhance AHC

framework by skipping the error-prone step of

calculating each tag’s generality and integrating a

topic model to capture thematic correlations among

tags (Li, et al., 2012).

An interesting approach was introduced by

Karampinas and Triantafillou (Karampinas &

Triantafillou, 2012). Rather than calculating the

similarity score between two tags, they use the crowd

to annotate parent-children relationships between

tags. An algorithm, called “CrowdTaxonomy”, was

introduced to grow the taxonomy tree based on the

crowd’s annotations. The algorithm is called on every

vote. This method includes the crowd in both steps of

the taxonomy development process. The crowd is

used to form a terms corpus by the mean of social

tagging, and they vote to annotate pair between two

terms. Hierarchical relationships are built based on

KEOD 2019 - 11th International Conference on Knowledge Engineering and Ontology Development

354

their annotations. The work of Karampinas and

Triantafillou showed that the crowd could provide

high-quality input in terms of completeness and

correctness that leads to the emerging of good quality

taxonomies.

Nevertheless, the mentioned approaches are only

possible if the system already has a large number of

tags that can be used to form the term corpus. It is

difficult to develop a new taxonomy if there are no

tagged contents in the system. Furthermore, the

existing tags also needed to cover the topic of the

taxonomy that is being developed. If there are

mistakes or missing terms, the system administrators

need to correct them themselves, which is missing the

point of replacing the work of experts in the

development of taxonomy.

4 CROWDSOURCING

TAXONOMY DEVELOPMENT

In this paper, we want to apply another method to

crowdsource taxonomy development and evolution

with the support of the KM-EP. From all the

taxonomies that were created from a seed taxonomy,

the one that has the highest score (highest rated) will

be chosen. It will replace the current seed to be used

for classification, searching or navigation in the

system. Furthermore, it will act as the seed for further

expansion of the taxonomy’s evolution tree in the

next round. Figure 2 presents an example of the

evolution of a taxonomy following our approach.

The example from the figure below describes the

case where we have a simple taxonomy A as the seed.

From this taxonomy, we have different taxonomies

(A1, A2, A3, and A4) that were created by different

users using crowdsourcing. These taxonomies were

rated by the system’s users (crowdvoting). Taxonomy

A1 had the highest score and became the next official

base. Taxonomy A1 was used for content

classification, searching, and navigation in the system

until next round. In the next round, users cloned

taxonomy A1 and updated it as they find suitable. The

created taxonomies got rated again by other users, and

the taxonomy that had the highest score (A1C)

became the next seed. The process repeats as long as

the administrators allow it.

Figure 2: An Example of the Taxonomy Evolution with

Support of Crowdsourcing.

To support the evolution process, a version

control system based on Git (Git, n.d.) was

implemented. This system allows the user to save the

current state of the taxonomy, check history with

detail about taken snapshots. This is a great method

to keep track of taxonomy builds. The crowd is able

to identify which version is currently in development,

what are the changes etc. Furthermore, the user can

reset a taxonomy to a previous state or replace a

taxonomy with one of its clones. This is a crucial

feature for debugging error, which always happens in

the development of taxonomy. Caching mechanisms

were also added to increase the processing speed and

reduce the computational resource needed.

Combining with fast and efficient algorithms,

thousands of terms can be retrieved in a matter of

milliseconds.

5 EVALUATION

The newly developed Taxonomy Manager prototype

was deployed in several R&D projects as part of the

KM-EP. The goals in evaluating this prototype are

first to evaluate the feasibility, usability, and

efficiency of the user experience of the implemented

prototype based on user’s direct feedback that was

collected by means of questionnaires after initially

working with the prototype. Secondly, the goal was

to evaluate the introduced approach of crowdsourcing

taxonomy from a more effectiveness point of view.

We want to test if this proof-of-concept can be used

successfully in reality.

Supporting Taxonomy Development and Evolution by Means of Crowdsourcing

355

In the concept of the EU-funded R&D project

RAGE (RAGE, n.d.), the first goal of evaluating the

feasibility, usability, and efficiency was achieved.

Teams from different work packages of the project

were working together to create and develop a new

taxonomy about the domain of Applied Gaming using

the implemented Taxonomy Manager. To evaluate

the aspect related to the usage of the Taxonomy

Manager, an evaluation questionnaire was created by

the authors. Members of the consortium and external

game developers as well were contacted by project

members to participate in the evaluation of the

taxonomy manager. The questionnaire was combined

of questions that are related to the usability,

usefulness and user interface of the prototype, quality

of the tutorial, experience of the participants, quality

of the system’s features, such as version control,

export, and import. The result of this evaluation was

published in 2018 by the authors (Vu, et al., 2018).

Figure 3 provides an overview of the detailed results

obtained from the evaluation categories, with 0 is the

lowest score, and 7 is the highest.

Figure 3: Mean scores of all evaluation categories.

Overall, the scores show that participants, in

general, appreciate using the taxonomy manager and

its’ features, but also that it needed some

improvements in the tutorial. Nevertheless, this

evaluation does not say anything about the usefulness

of the prototype in the sense of the quality of the

taxonomy that was created by the users.

Therefore, to achieve the next goal of evaluating

the qualitative effectiveness of the tool in terms of the

quality of the work on the taxonomy, a second

evaluation is now planned. The general concept of

this evaluation is to let experts and the crowd do the

same task then compare the result and see if the crowd

is really doing a similar good or even better job than

the experts. In this second evaluation that is presented

as a concept in this paper, we chose to use IAB’s

Quality Assurance Guidelines (QAG) Taxonomy as

the initial expert taxonomy. The Interactive

Advertising Bureau (IAB) is one of the most

influential organizations in the online advertising

business and, currently, brings together more than

650 leading companies in the industry that control

86% of the U.S. market. Today IAB has become a

standard for content classification, especially in fields

with strong ties to the digital economy and new social

media (Filippis, 2018). The Quality Assurance

Guidelines Taxonomy was created in 2011 by IAB

Networks and Exchanges Committee as part of the

Quality Assurance Guidelines (QAG) Program. This

taxonomy has 2 tiers. The first tier is made of 24

categories, and the second tier has 361 sub-categories.

Table 1 presents category “Automotive” as part of the

IAB’s Quality Assurance Guidelines Taxonomy.

Table 1: Category "Automotive" and its sub-categories.

Automotive

Auto Parts

Auto Repair

Buying/Selling Cars

Car Culture

Certified Pre-Owned

Convertible

Coupe

Crossover

Diesel

Electric Vehicle

Hatchback

Hybrid

Luxury

Minivan

Motorcycles

Off-Road Vehicles

Performance Vehicles

Pickup

Road-Side Assistance

Sedan

Trucks & Accessories

Vintage Cars

Since then, IAB’s Taxonomy and Mapping

Working Group have been working on the QAG

Taxonomy with the goal of to create an enhanced and

more powerful taxonomy, enabling content creators

to more accurately and consistently describe content,

facilitating more relevant advertising and providing a

higher quality and more granular foundation for data

analysis (Flood & Agnew, 2017). As a result, a new

version of the QAG Taxonomy was introduced in

2017 called “IAB Tech Lab Content Taxonomy

Version 2.0”. The new Content Taxonomy has more

than 400 new site content classifications across 29 tier

1 categories (Content Taxonomy, n.d.). We choose to

evaluate this taxonomy as the new version of the

expert taxonomy.

For the second evaluation, an experiment

guideline for the participants was prepared. In this

guideline, information related to the experiment, such

as the introduction of taxonomy, goals of the

evaluation, introduction of IAB and the QAG

taxonomy, was described. Furthermore, the tasks and

an example of how they need to be done were also

presented. Finally, information on how to report the

result was given in the guideline.

In this second evaluation, deliberate participants

from the crowd will be given two tasks. In the first

task, all the changes that the experts made to upgrade

KEOD 2019 - 11th International Conference on Knowledge Engineering and Ontology Development

356

the initial expert taxonomy (Quality Assurance

Guidelines Taxonomy) to the new version of the

expert taxonomy (IAB Tech Lab Content Taxonomy

Version 2.0), such as adding new terms, renaming

terms, deleting terms and moving terms into groups

are given as a task to each member of the crowd. We

ask them to redo each change at a position they find

comfort in the initial taxonomy and see if they can re-

create the same new taxonomy as the experts did. The

purpose of this task is to evaluate the crowd’s

qualitative performance against the expert taxonomy

evolution. This can be considered as a benchmark

against the global “expert-based truth”.

After the experiment, the result will be collected

and analyzed. We compare the similarity between the

taxonomy created by the crowd and the new version

of the expert taxonomy (Karampinas & Triantafillou,

2012). Let S

be the set of all parent-children pairs in

the expert taxonomy and S

be the set of all parent-

children pairs in the crowd taxonomy. We have:

 





⋂







(1)

 





⋂









(2)

2∗

 ∗ 

  



(3)

Precision measures the exactness of the crowd

taxonomy evolution, and recall is a measure of the

completeness of the crowd taxonomy evolution while

F-score measures the accuracy of the evaluation. The

result will lead to the answer to the question “are the

crowd’s taxonomy evolution actions as good as those

of the experts?”.

In the second task, we show both initial taxonomy

and the new taxonomy of the expert to each member

of the crowd and ask them to answer some questions,

such as “what has been changed”, “do you agree”, “if

not, what would you change and why”. From the

result, we take the taxonomies that the crowd made in

the last question and give it to another group of user

and experts along with the initial taxonomy and new

taxonomy created by IAB. We let the group vote for

the best taxonomy and see if it is the one created by

the crowd or the experts. In this task, we hope to be

able to show which group provided a better taxonomy

and validate if the crowd is truly better than the

experts.

It is worth to mention that the second evaluation

that we described above has not been completed and

is considered as future work.

6 CONCLUSIONS AND

OUTLOOK

In this paper, we have described the concept of

knowledge as well as knowledge management and the

content and knowledge management ecosystem

portal KM-EP. Furthermore, we presented

crowdsourcing and new approach of applying

crowdsourcing in the development of taxonomy.

In result, a taxonomy management system was

implemented as a component of the KM-EP. The new

component allows the crowd to create and manage

taxonomy and its structure. The Delphi method was

replaced by crowdsourcing and crowdvoting, where

users have the ability to vote for each taxonomy. With

the support of version control, taxonomy evolution

will be faster, more efficient and agile.

Finally, an evaluation was conducted in the scope

of the EU-funded project RAGE. This evaluation

validates if the implemented prototype fulfils all the

requirements and how it performs. The outcome

proved the importance, usefulness and usability of the

implemented taxonomy management system.

Another evaluation aiming at a qualitative

comparison of expert-based taxonomy evolution with

crowd-based taxonomy solution was described and

planned. Due to time limitation, only a small set of

the crowd can be organized, but a big enough and

diverse enough crowd can be gathered in the near

future for a better evaluation result. This might be

done by using the user-base of the RAGE KM-EP,

which is growing by the success of the project.

REFERENCES

Andro, M. (2018). Digital Libraries and Crowdsourcing

(Computer Engineering: Digital Tools and Uses Set).

Wiley-ISTE.

Bar, A. R., & Maheswaran, M. (2014). Confidentiality and

Integrity in Crowdsourcing Systems. Springer.

Brabham, D. C. (2013). Crowdsourcing. The MIT Press.

Brooks, C. H., & Montanez, N. (2006). Improved

Annotation of the Blogosphere via Autotagging and

Hierarchical Clustering. Proceedings of the 15th

international conference on World Wide Web (pp. 625-

632). ACM.

Content Taxonomy. (n.d.). Retrieved from https://www.iab

.com/guidelines/taxonomy/

Dalkey, N., & Helmer, O. (1963). An Experimental

Application of the Delphi Method to the Use of Experts.

Management Science, 458-467.

Davenport, T. H., & Prusak, L. (1998). Working

Knowledge: How Organizations Manage What They

Know. Harvard Business Press.

Supporting Taxonomy Development and Evolution by Means of Crowdsourcing

357

Eskenazi, M., Levow, G.-A., Meng, H., Parent, G., &

Suendermann, D. (2013). Crowdsourcing for Speech

Processing: Applications to Data Collection,

Transcription and Assessment. Wiley.

Estellés-Arolas, E., & Guevara, F. G. (2012). Towards an

Integrated Crowdsourcing Definition. Journal of

Information Science.

Filippis, L. D. (2018, March 19). Updated version of the

IAB model in the Deep Categorization API. Retrieved

from Meaning Cloud: https://www.meaningcloud.com/

blog/updated-iab-model-meaningcloud

Flood, K., & Agnew, N. (2017, November 30). IAB Tech

Lab Announces Final Content Taxonomy v2 Ready for

Adoption. Retrieved from IAB Tech Lab:

https://iabtechlab.com/blog/iab-tech-lab-announces-

final-content-taxonomy-v2-ready-for-adoption/

Git. (n.d.). Retrieved from https://git-scm.com/

Hedden, H. (2010). The Accidental Taxonomist.

Information Today Inc.

Heymann, P., & Garcia-Molina, H. (2006). Collaborative

Creation of Communal Hierarchical Taxonomies in

Social Tagging Systems. InfoLab Technical Report.

Hodge, G. (2000). Systems of Knowledge Organization for

Digital Libraries: Beyond Traditional Authority Files.

Council on Library and Information Resources.

Jimenez-Crespo, M. A. (2017). Crowdsourcing and Online

Collaborative Translations: Expanding the Limits of

Translation Studies. John Benjamins Publishing Co.

Karampinas, D., & Triantafillou, P. (2012). Crowdsourcing

Taxonomies. The Semantic Web, 545-559.

Kitchens, F., & Crane, C. (2014). Customersourcing: to Pay

or be Paid. 27th Bled eConference. Bled, Slovenia.

Kon, H., & Hoey, M. (2005). Leveraging collective

knowledge. The 14th ACM international conference on

Information and knowledge management (pp. 560-

567). Bremen, Germany: ACM New York, NY, USA.

Lambe, P. (2007). Organising Knowledge: Taxonomies,

Knowledge and Organisational Effectiveness. Chandos

Publishing.

Li, X., Wang, H., Yin, G., Wang, T., Yang, C., Yu, Y., &

Tang, D. (2012). Inducing Taxonomy from Tags: An

Agglomerative Hierarchical Clustering Framework.

International Conference on Advanced Data Mining

and Applications (pp. 66-77). Springer.

Liu, K., Fang, B., & Zhang, W. (2010). Ontology

Emergence from Folksonomies. Proceedings of the

19th ACM international conference on Information and

knowledge management (pp. 1109-1118). Toronto, ON,

Canada: ACM New York, NY, USA.

Peters, I., & Stock, W. G. (2007). Folksonomy and

Information Retrieval. Proceedings of the American

Society for Information Science and Technology, (pp.

1-28).

Plangprasopchok, A., Lerman, K., & Getoor, L. (2010).

Growing a Tree in the Forest Constructing

Folksonomies by Integrating Structured Metadata.

Proceedings of the 16th ACM SIGKDD international

conference on Knowledge discovery and data mining

(pp. 949-958). ACM.

RAGE. (n.d.). Retrieved from http://rageproject.eu/

Ridge, M. (2014). Crowdsourcing our Cultural Heritage.

Ashgate Publishing, Ltd.

Sanderson, M., & Croft, W. B. (1999). Deriving Concept

Hierarchies from Text. Proceeding SIGIR '99

Proceedings of the 22nd annual international ACM

SIGIR conference on Research and development in

information retrieval (pp. 206-213). ACM.

Simon, F. J., Pechuan, I. G., & Estelles-Miguel, S. (2015).

Advances in Crowdsourcing. Springer International

Publishing AG.

Surowiecki, J. (2005). The Wisdom of Crowds. Anchor.

The Four V's of Big Data. (2005). Retrieved from IBM Big

Data & Analytics Hub: http://www.ibmbigdatahub.

com/infographic/four-vs-big-data

Turban, E., King, D., Lee, J. K., Liang, T.-P., & Turban, D.

C. (2015). Electronic Commerce: A Managerial and

Social Networks Perspective. Springer.

Vu, B. (2015). Realizing an Applied Gaming Ecosystem -

Extending an Education Portal Suite towards an

Ecosystem Portal. TU Darmstadt.

Vu, B., Mertens, J., Gaisbachgrabner, K., Fuchs, M., &

Hemmje, M. (2018). Supporting Taxonomy

Management and Evolution in a Web-based

Knowledge Management System. HCI 2018. Belfast,

UK.

Wikipedia. (n.d.). Retrieved from https://en.wikipedia.org/

wiki/Wikipedia

KEOD 2019 - 11th International Conference on Knowledge Engineering and Ontology Development

358