Generation of Multiple Choice Questions Including Panoramic
Information using Linked Data
Fumika Okuhara, Yuichi Sei, Yasuyuki Tahara and Akihiko Ohsuga
Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo, Japan
Keywords:
Panoramic Information, Linked Data, Multiple Choice Question, Semantic Web.
Abstract:
In recent years, almost all subjects require students to learn panoramic information. Because there is a need
for cross-curriculum learning that relates subject areas to one another, multiple-choice questions
that include panoramic information are useful for learners. A question including panoramic information refers to content
that includes transversely related information and helps respondents grasp knowledge as a whole. However,
manually generating and collecting appropriate multiple-choice questions is costly for questioners and learners.
Therefore, in this research, we propose a method for the automatic generation of multiple-choice questions
including panoramic information using Linked Data. Linked Data is graph-structured data that links structured
data sets, and it is used as a technology for data integration and utilization. Some attempts have been made to use
Linked Data as a resource for creating teaching material, and the possibility of using Semantic Web technology
in education has been verified. In this paper, we aim to realize a system for automatically generating two types
of multiple-choice questions by implementing an approach to generating questions and choices. The evaluation
method for the generated questions and choices involves setting indicators for each evaluation item, such
as validity and the degree of inclusion of panoramic information.
1 INTRODUCTION
Each school has a curriculum, and students learn based on it. Regarding the importance of curriculum management, the Central Council for Education mentioned the need for "improving educational activities based on a cross-curriculum perspective" (Ministry of Education, Culture, Sports, Science and Technology, 2015). In other words, panoramic learning is necessary across all subjects.
The multiple-choice-question format is widely used for qualification exams, certification examinations, and the like. This format is useful because it enables the quick, easy, and objective scoring of large-scale exams. In addition, because completing these exams simply involves "choosing the correct answer from the choices," responding to each question is a highly efficient process (Ikegami, 2015). Therefore, the format is suitable for testing a large number of people, helping learners demonstrate their knowledge of a wide range of fields easily, and enabling questioners to pose a wide range of questions across units.
Based on the above, multiple-choice questions including panoramic information are useful for both learners and questioners. However, manually generating and collecting appropriate multiple-choice questions is costly. In this paper, we propose a method for automatically generating questions including panoramic information from a given curriculum, together with a method for evaluating them.
With our proposed method, Linked Data is used as a knowledge base, and questions and incorrect choices for a given correct answer are generated based on the curriculum. Figure 1 displays an output image of the proposed system. In this system, the output is generated by taking a curriculum and the selection of a unit as input, extracting the keywords included in the curriculum, selecting the keywords to set as Answers, and composing questions and incorrect choices from Wikipedia data. The output is the set of components of a multiple-choice question. By using Wikipedia data in a format that expresses the relationships between data items, a question can include data relevant to the answer together with the relationships between them, and can thus be one that includes panoramic information. In the current research study, to evaluate this system, requirements were set for the generation of questions and choices, and evaluation experiments were conducted based on the corresponding evaluation items. These experiments included indices of the degree of inclusion of panoramic information, and in the results, the questions made by the proposed method tended to include more panoramic information than those made by the comparison method.
The article is organized as follows: Section 2 reviews related research; Sections 3 and 4 present the purpose of this research and the proposed method; Sections 5 and 6 describe the implementation, the evaluation method, and the results of the evaluation experiments; Section 7 presents the discussion; and finally, Section 8 states the conclusion and future work.
Figure 1: An example of the output of multiple-choice questions generated by the proposed system.
2 RELATED RESEARCH
Linked Data, an existing technology, has been used for the generation of questions. Linked Data is structured graph data proposed by Tim Berners-Lee; data sets are linked with one another using Web mechanisms. Linked Open Data (hereinafter referred to as LOD) is Linked Data that is published on the Web. LOD represents a data format that anyone can freely use, with various kinds of open data being linked with each other through municipalities or institutions. The LOD cloud (http://lod-cloud.net), representing links between available LOD, contained more than 1,000 data sets as of May 2018. Activities related to LOD are being carried out in various fields. Among them is DBpedia (http://dbpedia.org), which turns the well-known Wikipedia data into a Linked Data format. DBpedia Japanese (http://ja.dbpedia.org), a Japanese version, also exists, turning the information in the InfoBox of Japanese Wikipedia into a Linked Data format.
One research study (Iijima et al., 2016) proposed a method of presenting unexpected connections between multiple data sets by using Linked Data technology, which can be applied to a recommendation system. (Maillot et al., 2014) presented a method for extracting a targeted subpart of resource description framework (RDF) bases, driven by a list of selected resources called the seed. In addition, a research study in the Semantic Web field (Demarchi et al., 2018) proposed an implementation that allows agents to access ontologies available on the Web so as to update their beliefs based on significant content. A case study of an educational quiz is also presented, in which the ontology information is used to formulate questions and to validate the answers obtained.
Several attempts have been made to use Linked
Data as a resource for generating teaching materials
in the education field.
ASSESS (Bühmann et al., 2015) attempted to generate questions in several formats in the specific field of general knowledge using LOD. With this approach, it is possible to generate questions in natural language by summarizing an entity based on DBpedia and verbalizing the RDF. In addition, the choices corresponding to each question format are generated using LOD. Papasalouros et al. also presented a method of generating multiple-choice questions in natural language from Semantic Web Rule Language rules, which are interpreted so that if the antecedent holds, then the consequent must also hold; in the simple form, antecedent → consequent (Papasalouros et al., 2008) (Zoumpatianos et al., 2011). Rocha et al. attempted to generate questions whose resources were relevant to a specific domain or topic from a dataset (Rocha et al., 2018), and Afzal et al. presented the generation of questions about the important concepts in a domain by relying on an unsupervised relation extraction approach to extract semantic relations (Afzal and Mitkov, 2014).
Furthermore, researchers in one study built an evaluation model of the incorrect choices created for multiple-choice questions (Pho et al., 2015). The model was built to enable the automatic evaluation of the quality of incorrect choices that the author manually created. The model focuses on the syntactic and semantic similarity between the choices, treating them as elements related to the quality of these choices. Meanwhile, (Patra and Saha, 2018) considered the closeness between the key and the possible distractors by using Web information in their proposed system for automatic named entity distractor generation. Another research study involved the generation of a historical ontology that used LOD to generate history questions (Jouault et al., 2016). In this research, a question-setting system based on a learning scenario was included. Specifically, Graesser's classification method, which classifies multiple knowledge bases and question formats, was used. Statistical data, such as the degree of difficulty of each test item, is used when constructing an examination test from a large number and a wide range of questions. In other words, when evaluating learning achievement through an examination test, it is necessary to set items at difficulty levels consistent with the preset passing marks. As a study on the difficulty level of an examination test, (Ikeda et al., 2013) proposed a difficulty-level estimation method focusing on the similarity between the question pattern and the choices of a multiple-choice question, and evaluated it based on the difficulty parameter of item response theory (IRT).
In these related works, the possibility of using Semantic Web technology for generating test questions has been verified. However, these proposed systems can generate only uniform questions and choices for keywords. As a result, the multiple-choice questions feature simple content, such as "What is {person's name}'s birthplace?" and "Which work was made by {person's name}?", containing only one or two facts about the answer in the question sentences. In addition, the questions may cover only a narrow field, since the resources for the questions are composed of highly relevant contents. Therefore, in this research, we aimed to generate questions including more panoramic information.
3 PURPOSE
The purpose of this research is to propose a method for the automatic generation of multiple-choice questions including panoramic information. Panoramic information means comprehensive information that gives us a macro perspective, through which we can look down on the whole of the learning subjects. A question including panoramic information refers to content that includes transversely related information and helps respondents grasp knowledge as a whole.
In the use scenario of the question generation system, both a person who sets examination questions and learners can be seen as users. The exam preparer may be able to reduce costs, such as the time and effort of creating test questions, by using a system into which a curriculum, including evaluation items, is input. In addition, learners' use scenarios include self-study and practice test questions.
Figure 1 shows an example of the output of multiple-choice questions generated by this method. In the system, the output is generated by taking a curriculum and the selection of a unit as input, extracting the keywords included in the curriculum, selecting the keywords to set as Answers, and composing questions and incorrect choices from Wikipedia data. The output is the set of components of a multiple-choice question: the "Question," which is the test question itself; the "Answer," which is the correct answer choice; and the "Distractors," which are the incorrect choices. In this research, a graph is taken as a Question and referred to as the "Question Graph." From the linked structure of the graph and the question sentence, it can be understood that the answer satisfies several relationships: it is a philosopher studied in the class, a person connected to the peripatetic school, one whose notable ideas include syllogism, a person of the era of ancient philosophy, and a person linked to Socrates by an influence relation. When selecting the vocabulary item that satisfies all these relationships from the five choices, "Aristotle" becomes the correct answer, and the remaining choices are incorrect.
4 PROPOSED METHOD
This section explains the approaches to generating the Question Graph and distractors for an arbitrary answer. Incidentally, DBpedia, DBpedia Japanese, and similar resources can be used as the knowledge base.
4.1 Approaches to Generating the Question Graph
The Question Graph is generated by searching the triple structures around an answer, regarding an RDF graph that visualizes the relationships between the acquired information as the question sentence. In other words, a Question Graph is a test question format in which the element corresponding to the answer is hidden in the graph and must be guessed from the surrounding words and their properties. The graph itself is defined as the question. Since the matter asked about in the Question Graph is expressed by the data around the Answer, there is no need to put it into document form; as shown in Figure 1, it is presented as the "Question Graph," a graph serving as the Question. There is existing research that proposes an algorithm to search all subgraph structures among multiple data sets (Fionda and Pirrò, 2017).
In our approach, a search method considering the following requirements is devised.
4.1.1 Requirements of Generating the Question Graph

The requirements to be satisfied by the Question Graph are set as follows. In particular, the items marked (Mandatory) are essential requirements.

Requirements of the Question Graph
(1) (Mandatory) Each node consists of keywords.
(2) (Mandatory) Ensure the connectivity of information around the Answer.
(3) Include as much panoramic information as possible in the entire graph.
(4) The scale of the graph should allow readers to grasp all its content.
(5) The number of vocabulary words corresponding to the answer is extremely small.
4.1.2 Method of Generating a Question Graph

We devised a method to generate a Question Graph. The method below assumes an RDF graph based on DBpedia.

Question Generation Algorithm for the Single Selection Form

The following describes the method of generating Question Graphs for single-choice questions. In this format, the graph has one Answer node, and respondents select one answer from the choices.

The graph for an answer is generated by extending the link structures of the data around the Answer: the search visits the neighbor nodes of the answer node, then repeatedly proceeds from those neighbors to their own neighbors. The link structure between nodes is formed by the directed links IN and OUT. While searching the link structures, the search range and the number of searches are restricted in advance in order to keep the graph at a manageable scale, per requirement (4). Regarding the number of searches, with neighbor nodes reached one hop at a time from the answer node, the number of hops is defined as the search depth h. For the search range, the number of IN and OUT directed link structures followed at the same depth h is defined as the search width w. This method generates a Question Graph with a scale satisfying the restrictions of h and w. In particular, considering the degree of inclusion of panoramic information per requirement (3), we propose a search method in which every node adopted after the answer node moves strictly farther from the origin.

We devised the following Algorithms 1 and 2 as the basic algorithms for generating Question Graphs for answers.
Algorithm 1: Main.
Input: KG G, Answer, depth h, width w
Output: KG
1: N_S = get_far_nodes(Answer, {}, G, h, w)
2: M_S = get_all_links(N_S)
3: return (N_S, M_S)
Algorithm 2: get_far_nodes.
Input: Node target, Set of ancestor nodes Ancestors, KG G, depth h, width w
Output: Set of nodes
1: N = {target}
2: if |Ancestors| == h then
3:   return N
4: end if
5: count = 0
6: for direction ∈ {IN, OUT} do
7:   B = neighbors(target, direction)
8:   while count < w AND 0 < |B| do
9:     flg = True
10:    n = argmax_{n' ∈ B} dist(target, n')
11:    B = B \ {n}
12:    for n_j ∈ Ancestors do
13:      if dist(n_j, n) < dist(n_j, target) then
14:        flg = False
15:      end if
16:    end for
17:    if flg then
18:      count = count + 1
19:      N = N ∪ get_far_nodes(n, Ancestors ∪ {target}, G, h, w)
20:    end if
21:  end while
22:  count = 0
23: end for
24: return N
The Main algorithm acquires the set of all nodes carrying information around the Answer, then acquires the link structures among those nodes and returns this information as a subgraph. As input, the algorithm is given the knowledge graph (KG) G, the correct Answer, the search depth h, and the search width w; the subgraph over the knowledge base is the output. The subgraph consists of the set of all nodes N_S and the set of link structures between nodes M_S.
The get_far_nodes algorithm returns all the nodes in the subgraph relating to the target node. The graph for the Answer is finally obtained by specifying the Answer as the initial target. From lines 6 to 23, the graph is extended by searching the neighbor nodes along each directed link of the target. To obtain the neighbors of the target, the function neighbors, which returns the neighbor node set, is defined and used in line 7. In line 10, the neighbor node whose distance from the target is maximal is selected using the function dist, which gives the distance between two specified nodes. In line 12, each ancestor's distance to the candidate neighbor is compared with its distance to the target, and this check is repeated recursively up to the search depth h. This ensures that, on any path from the Answer node to a neighbor node, every step moves farther away. Finally, the set of all nodes is returned, and the subgraph consisting of these nodes and their link structures is obtained as the Question Graph.

Based on this algorithm, we aim to generate a graph with a compact scale and a high degree of panoramic information inclusion.
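As a concrete illustration, the following Python sketch implements Algorithms 1 and 2. The neighbors, dist, and get_all_links functions are placeholders standing in for the knowledge-base queries and the word2vec-based distance described in Section 5.1.

def get_all_links(nodes):
    # Placeholder: query the knowledge base for all link structures (M_S)
    # among the given nodes.
    return set()

def neighbors(G, node, direction):
    # Placeholder: IN/OUT neighbors of `node` in KG `G`, here modeled as a
    # set of (source, direction, destination) triples.
    return {dst for (src, d, dst) in G if src == node and d == direction}

def dist(a, b):
    # Placeholder for the word2vec-based distance of Section 5.1.
    return abs(hash(a) - hash(b)) % 100

def get_far_nodes(target, ancestors, G, h, w):
    # Algorithm 2: collect nodes around `target`, adopting only neighbors
    # that are no closer to any ancestor than the target itself.
    nodes = {target}
    if len(ancestors) == h:                 # depth limit reached
        return nodes
    for direction in ("IN", "OUT"):
        count = 0
        candidates = set(neighbors(G, target, direction))
        while count < w and candidates:
            n = max(candidates, key=lambda x: dist(target, x))  # farthest first
            candidates.remove(n)
            if all(dist(a, n) >= dist(a, target) for a in ancestors):
                count += 1
                nodes |= get_far_nodes(n, ancestors | {target}, G, h, w)
    return nodes

def main(G, answer, h, w):
    # Algorithm 1: the Question Graph as (node set N_S, link set M_S).
    n_s = get_far_nodes(answer, set(), G, h, w)
    return n_s, get_all_links(n_s)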
Question Generation Algorithm for the Multiple Selection Form

The following describes the method for generating Question Graphs in the multiple-answer question form. Because this question format requires answering with multiple choices, it must be guaranteed that two or more words correspond to Answer nodes in the graph. Therefore, following requirement (5), the number of words corresponding to the answer is adjusted.

In this approach, the vocabulary corresponding to the Answer node must reach at least the specified number nA at depth h = 1. First, if there are nA words in the knowledge base corresponding to nodes that share all the neighbor nodes acquired by the neighbors function, these are adopted as the neighbor nodes of the graph. The algorithm keeps the panoramic degree as high as possible by checking candidates in descending order of distance. After that, the combinations of all nodes adjacent to the Answer are taken, and the multiple words that can be applied to the Answer node are acquired. For h = 2 and later, the graph is completed by connecting neighbor nodes with the get_far_nodes procedure.
In addition to the above basic algorithm, we also consider how common each vocabulary item is. For the weighting measure, we used the Balanced Corpus of Contemporary Written Japanese (BCCWJ) by the Corpus Development Center, NINJAL (http://pj.ninjal.ac.jp/corpus_center/bccwj/). The weight of vocabulary in the textbook sub-corpus (OT) was set to 1.0; the history and social science classifications of the library and publication sub-corpora (LB and PB) were set to 0.5; and all other vocabulary was set to 0.0. This weight was added within the dist function so that vocabulary present in the corpus is more easily adopted. In addition, at depth h = 1, a node is preferentially adopted when vocabulary of corpus OT appears in the query result from the knowledge base.
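As a sketch of this weighting (the exact combination with the base distance is our assumption, and corpus_index is a hypothetical lookup from a word to its sub-corpora), the weight can be added to the distance so that corpus vocabulary is preferred by the farthest-first search:

# Sub-corpus weights taken from the text above; words absent from BCCWJ
# default to 0.0.
CORPUS_WEIGHT = {"OT": 1.0, "LB": 0.5, "PB": 0.5}

def corpus_weight(word, corpus_index):
    # corpus_index: hypothetical mapping word -> sub-corpora containing it
    return max((CORPUS_WEIGHT.get(c, 0.0) for c in corpus_index.get(word, ())),
               default=0.0)

def weighted_dist(a, b, base_dist, corpus_index):
    # A larger value makes the farthest-first search of Algorithm 2 adopt
    # the weighted word more readily.
    return base_dist(a, b) + corpus_weight(b, corpus_index)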
Through the above basic procedures, a Question Graph for the answer is generated.
4.2 Approach to Generating Distractors
Since distractors are nodes that do not correspond to the answer in the Question Graph, they can be generated by searching for nodes that do not satisfy all of the link structures with the answer (even if they satisfy some of them). Below, we describe the method of generating distractors using the Question Graph generated above.
4.2.1 Requirements of Generating Distractors
The requirements to be satisfied by distractors are set as follows.

Requirements of Distractors
(1) (Mandatory) Each node consists of keywords.
(2) (Mandatory) It is an incorrect answer to the Question.
(3) Avoid words clearly recognizable as incorrect answers.
4.2.2 Method of Generating Distractors
Following requirement (3), in order to generate distractors that are not clearly recognizable as incorrect answers, a vocabulary set similar to the Answer should be selected. Words similar to the answer can be assumed to have link structures in the knowledge base similar to those of the Answer. Accordingly, distractor candidates are generated from the link structures adjacent to the answer, using the Question Graph generated by the above method.

First, the set of link structures adjacent to the answer is extracted from the Question Graph. We regard as distractor candidates the words corresponding to nodes that satisfy the remaining link structures only when one or more links are deleted from the extracted set.
Furthermore, referring to the acquisition method of (Pho et al., 2015), the candidates are narrowed down to words belonging to the same class as the answer on DBpedia. If the answer belongs to more than one class, the most specific class C is used; that is, the answer is a direct instance of class C but not an instance of any of its subclasses. Finally, the candidates corresponding to the objects obtained as properties of class C are adopted.
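A sketch of this candidate enumeration follows; query_nodes_matching is a hypothetical helper that runs a SPARQL query returning all resources satisfying a given set of link patterns.

def distractor_candidates(answer_links, query_nodes_matching, answer):
    # answer_links: link structures adjacent to the Answer, e.g. a list of
    # (property, direction, neighbor) triples. A candidate satisfies the set
    # obtained by deleting one link, so it fails on at least that link.
    candidates = {}
    for i in range(len(answer_links)):
        remaining = answer_links[:i] + answer_links[i + 1:]
        for node in query_nodes_matching(remaining):
            if node != answer:
                candidates.setdefault(node, set()).add(i)  # dropped link index
    return candidates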
As the method of selecting choices from the candidates, we adopt those with the smallest difference in distance to the neighbor nodes and the largest degree of popularity. Figure 2 shows the generation steps.
Figure 2: Generation steps of distractors based on the method.
5 IMPLEMENTATION
The implementation of the generation of Question Graphs and distractors by the above approach is described below. We use DBpedia Japanese as the knowledge base and set the SPARQL endpoint to "http://ja.dbpedia.org/sparql/".

In this research, we set social studies subjects (geography, history, civics, etc.) as the domain of the questions. In addition, we selected words defined as article titles or categories on Wikipedia in advance and used them as keywords for answers. In this implementation, no curriculum is set, and the vocabulary on DBpedia is used as the keywords.
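For reference, a minimal query against this endpoint with the SPARQLWrapper library (used in Section 6.3) might look as follows; the resource chosen is only an illustration.

from SPARQLWrapper import SPARQLWrapper, JSON

# Fetch the OUT links of one resource on DBpedia Japanese.
sparql = SPARQLWrapper("http://ja.dbpedia.org/sparql/")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    SELECT ?p ?o WHERE {
        <http://ja.dbpedia.org/resource/ソクラテス> ?p ?o .
    } LIMIT 50
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["p"]["value"], row["o"]["value"])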
5.1 Generating the Question Graph
As described in the previous section, a Question Graph is generated by collecting information around the answer via SPARQL queries and then visualizing the RDF graph.
In defining the dist function of Algorithm 2 above, we use similarity values based on the pre-learned word2vec model of (Suzuki et al., 2016) (http://www.cl.ecei.tohoku.ac.jp/~m-suzuki/jawiki_vector/) as an index for comparing words syntactically and semantically.
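Assuming the pre-learned vectors load into gensim (the file name below is hypothetical), the dist function can be sketched as:

from gensim.models import KeyedVectors

# Hypothetical local copy of the pre-learned Japanese Wikipedia vectors.
model = KeyedVectors.load_word2vec_format("jawiki_vector.txt", binary=False)

def dist(a, b):
    # Larger for semantically remote words; out-of-vocabulary words are
    # treated as maximally distant.
    try:
        return 1.0 - model.similarity(a, b)   # cosine similarity
    except KeyError:
        return 2.0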
In the visualization of the graphs, the words around the answer and the relationships between them are drawn using Graphviz.
For example, Figure 3 shows the Question Graph generated based on the RDF graph for Answer = "Socrates". Here, the Answer node is red, vocabulary from corpus OT is blue, vocabulary from corpora LB and PB is green, and other items are grey.
5.2 Generating Distractors
As in the above approach, distractors are generated by obtaining instances belonging to the same class C as the answer, using the link structures with the answer node.

Table 1 summarizes the distractor candidates generated for Answer = "Socrates".
Table 1: Distractors for Answer = "Socrates".
Links | Answer/distractor candidates | {total}
(0,1,2,3,4,5) | Socrates | {1}
(3,4) | Heraclitus | {1}
(0) | Diogenes (Cynic school) | {1}
(1) | Immanuel Kant | {1}
(2) | John Stuart Mill | {1}
(3) | Anaxagoras | {1}
6 EVALUATION
For the generated Question Graphs and distractors, the evaluation methods corresponding to each of the above requirements are described below. In this evaluation, the search depth of the Question Graph was set to h = 2 and the width to w = 2. We selected appropriate Answers; however, for the evaluation of distractors, we deliberately set Answers whose correct answers the subjects were not expected to know.
6.1 Evaluation Method for Question Graph Generation

Evaluation items for the Question Graph requirements are listed below.
Figure 3: The Question Graph for Answer = "Socrates".
Question Graph evaluation items
1. Consistency [Requirement (2)]: the Answer corresponds to the correct answer.
2. Degree of inclusion of panoramic information [Requirement (3)]: the degree of crossing classes of each node; the degree of crossing units in the curriculum; the degree of crossing time.
3. Specificity [Requirement (5)]: the smallness of the number of words corresponding to the Answer.
4. Readability [Requirement (4)]: the compactness of the graph scale.
The following two evaluation experiments were conducted: one on the second item, the degree of inclusion of panoramic information, and one on the third item, specificity.
6.1.1 Evaluation Method for Degree of Inclusion of Panoramic Information

Since a specific curriculum is not set, only the first and third sub-items (the degree of crossing classes and the degree of crossing time) are implemented. The first sub-item is an index of how widely the vocabulary of the generated Question Graph crosses the classes in the knowledge base. In this case, we examined the classes to which each vocabulary item belongs, from the class "Thing" downward, over all classes on DBpedia. In the evaluation, only nodes whose classes can be acquired from DBpedia are included in the class calculation.
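A sketch of this class computation, reusing a SPARQLWrapper object configured as in Section 5, is:

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def classes_of(resource, sparql):
    # DBpedia ontology classes of one resource (descendants of owl:Thing).
    sparql.setQuery(f"""
        SELECT DISTINCT ?c WHERE {{
            <{resource}> <{RDF_TYPE}> ?c .
            FILTER(STRSTARTS(STR(?c), "http://dbpedia.org/ontology/"))
        }}""")
    rows = sparql.query().convert()["results"]["bindings"]
    return {r["c"]["value"] for r in rows}

def crossing_classes(nodes, sparql):
    # Degree of crossing classes: distinct classes over all graph nodes.
    all_classes = set()
    for n in nodes:
        all_classes |= classes_of(n, sparql)
    return len(all_classes)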
6.1.2 Specificity Evaluation Method
For the generated graph, the number of words corresponding to the answer node is also evaluated. It expresses how few alternative correct answers the Question Graph admits when the choices are ignored. The number of corresponding words is determined by searching for nodes that have exactly the same link structures as the Answer.
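A sketch of this check: one SPARQL query built from all of the Answer's link structures lists every resource that satisfies them simultaneously, ideally the Answer alone.

def alternative_answers(answer_links, sparql):
    # answer_links: (property_uri, direction, node_uri) triples of the Answer.
    patterns = []
    for p, direction, n in answer_links:
        if direction == "OUT":
            patterns.append(f"?x <{p}> <{n}> .")
        else:                        # IN link: the neighbor points at ?x
            patterns.append(f"<{n}> <{p}> ?x .")
    sparql.setQuery("SELECT DISTINCT ?x WHERE { " + " ".join(patterns) + " }")
    rows = sparql.query().convert()["results"]["bindings"]
    return [r["x"]["value"] for r in rows]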
6.2 Evaluation Method for Generating Distractors

Evaluation items for the requirements of distractors are listed below.

Distractor evaluation items
1. It is an incorrect answer to the Question Graph [Requirement (2)].
2. The validity for respondents who do not know the correct answer is at most one over the number of choices [Requirement (3)].
3. The magnitude of similarity to the answer [Requirement (3)].

In particular, for the second and third evaluation items, the following evaluation experiments were conducted.
6.2.1 Comparison of Validities by Experimental Subjects

By measuring the validity of the actually generated questions in subject experiments, we verified whether the validity was at most one over the number of choices.
6.2.2 Comparison of Similarity between the Answer and Choices

We compared the similarities of the obtained distractors to the answer. From the syntactic and semantic aspects, the following similarity-comparison indexes are considered.

Syntactic similarity: For both the answer and the distractor candidates, compare the parts of speech and the composition patterns of the words. In this case, we used CaboCha (https://taku910.github.io/cabocha/) as a parsing tool and compared both dependency and part of speech, verifying whether they matched.

Semantic similarity: Possible methods include comparison based on the indexes of the evaluation model in (Pho et al., 2015); comparison by item analysis in (Mitkov et al., 2009); comparison by vocabulary type {Person, Location, Organization}; comparison by "DBpedia entity," a semantic index using the entities assigned to vocabulary on DBpedia; and calculation and comparison of "wup similarity," a semantic index using distances in the WordNet vocabulary hierarchy. For the time being, similarity was calculated with the pre-learned word2vec model.
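As a sketch, the word2vec index of this comparison can be computed with the model loaded in Section 5.1 (gensim 4.x KeyedVectors assumed):

def mean_similarity(answer, distractors, model):
    # Average cosine similarity between the Answer and the distractors,
    # skipping out-of-vocabulary words.
    sims = [model.similarity(answer, d) for d in distractors
            if answer in model.key_to_index and d in model.key_to_index]
    return sum(sims) / len(sims) if sims else 0.0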
6.3 Results
For the execution environment, we used a MacBook Pro running macOS High Sierra. In this system, SPARQL queries were issued from Python using the SPARQLWrapper library (https://rdflib.github.io/sparqlwrapper/). We also measured the execution time of question generation for 10 set Answers. Question Graph generation required 76.66 seconds to read the pre-learned word2vec model once at start-up, and generating the Question Graph for each subsequent question took 13.09 seconds. Generating the distractors took 3.575 seconds per question on average.

In the experiment, a Question Graph, three Distractors, and an Answer were set as one question set. In addition, we made 10 questions in the single-answer format (hereinafter referred to as "single-answer form") and 5 questions in the multiple-answer format (hereinafter referred to as "multiple-answer form").
6.3.1 Results of Question Graph: Evaluation Method for Degree of Inclusion of Panoramic Information

For the evaluation of the degree of inclusion of panoramic information, we summarized the number of classes in each graph as the degree of crossing classes in Table 2.
Table 2: The degree of crossing classes (single-answer form).
Q. | nodes (proposal) | nodes (random) | classes (proposal) | classes (random)
1 | 21 | 21 | 7 | 5
2 | 21 | 17 | 5 | 3
3 | 21 | 21 | 6 | 5
4 | 21 | 21 | 6 | 7
5 | 21 | 21 | 3 | 8
6 | 21 | 21 | 8 | 5
7 | 21 | 19 | 6 | 7
8 | 21 | 21 | 6 | 3
9 | 21 | 21 | 3 | 3
10 | 21 | 21 | 6 | 3
Ave. | 21 | 20.4 | 5.6 | 4.9
6.3.2 Results of Question Graph: Specificity Evaluation Method

For the 10 Question Graphs generated in single-answer form, we queried DBpedia for the vocabulary set corresponding to the Answer node, and in every case the result was empty except for the Answer itself. In multiple-answer form, with two answers specified, every graph had exactly two words corresponding to the Answer nodes. Therefore, the specificity evaluation item was satisfied in this experiment.
6.3.3 Results of Distractors: Comparison of Validities by Experimental Subjects

The results of the subject experiment are as follows. Table 3 and Figure 4 summarize the results obtained from 37 subjects, including the validity for the single-answer form. Similarly, for the multiple-answer form, the results of 23 respondents were summarized.
6.3.4 Comparison of Similarity between the Answer and Choices

Table 4 shows the similarity comparison by syntactic pattern (pattern) and by the pre-learned word2vec model (word2vec) for distractors generated with the dist function (proposal) and by random generation (random), where random generation selects the neighbors of each target node at random from all nodes linked to that node.
Table 3: The validity in the experiment (single-answer form).
Q. | validity [%] | selectivity of distractors, descending [%]: D1 | D2 | D3
1 | 64.9 | 24.3 | 8.09 | 2.67
2 | 21.6 | 59.4 | 5.36 | 3.56
3 | 21.6 | 48.6 | 18.9 | 10.8
4 | 13.5 | 70.3 | 13.5 | 2.67
5 | 40.5 | 32.5 | 18.9 | 8.09
6 | 13.6 | 54.0 | 21.6 | 10.8
7 | 91.9 | 8.1 | 0.00 | 0.00
8 | 56.8 | 35.1 | 5.41 | 2.69
9 | 21.6 | 40.5 | 29.7 | 8.09
10 | 18.9 | 35.1 | 29.7 | 16.2
Figure 4: Questionnaire: "Scale of Question Graphs".
In the table, the "pattern" value counts a distractor as 1 if its syntactic pattern matches the answer's, and "word2vec" indicates the similarity with the Answer. Each value is the average of the values per question; that is, it is the similarity value per distractor.
Table 4: Comparison of similarity between the Answer and distractors generated by the proposed method and at random (single-answer form).
Q. | pattern (proposal) | pattern (random) | word2vec (proposal) | word2vec (random)
1 | 2 | 2 | .613 | .330
2 | 2 | 2 | .305 | .248
3 | 2 | 1 | .451 | .364
4 | 3 | 1 | .359 | .325
5 | 3 | 2 | .801 | .369
6 | 2 | 0 | .438 | .534
7 | 3 | 0 | .354 | .179
8 | 2 | 1 | .557 | .360
9 | 0 | 2 | .475 | .463
10 | 0 | 2 | .671 | .510
Ave. | 1.9 | 1.3 | .503 | .368
7 DISCUSSION
In terms of implementation, we set the vocabulary on DBpedia as the learned keywords without setting a curriculum, so it was clear that mandatory requirements (1), "each node consists of keywords," and (2), "ensure the connectivity of information around the Answer," were satisfied by the Question Graph. Conversely, if a curriculum is set and its keywords exist in the Linked Data being used, the requirements are still satisfied and the method can be applied. If the answer is a highly specialized vocabulary item, there is a theoretical possibility that no link structures are found and the generation fails. However, the major vocabulary items that appear in examinations, and their relationships, appear to be largely covered by LOD such as DBpedia. Regarding requirement (3), "the degree of inclusion of panoramic information," the proposal exceeded the random baseline by an average of 0.7 classes (as seen in Table 2).

From this, with respect to the index of the number of classes, the degree of panoramic information in the experiment was greater with the proposed method than with random generation. Regarding the "scale of the graph" in requirement (4), the questionnaire results of the subject experiments in Figure 4 show that a rating of 4 out of 5 was the most frequent, so the subjects tended to judge the graph scale as large. Finally, regarding "the number of words corresponding to the Answer node" in requirement (5), the requirement was satisfied, since in the 15 graphs no words other than the answers corresponded to the answer nodes.
Regarding distractors, mandatory requirements (1), "each node consists of keywords," and (2), "incorrect answer to the question," are satisfied by the generation approach in the experiment. Regarding the second evaluation item of requirement (3), "avoid words that are clearly recognized as incorrect answers," in the case where no subject knows the correct answer, it is desirable that the correct answer rate for each question be at most one over the number of choices, i.e., 1/4 = 25%. By this criterion, six questions in single-answer form satisfied the index, while Questions 1, 5, 7, and 8 did not. Among the four unsatisfied examples, in Q.7 only one of the three distractors was ever selected, so it conspicuously failed to satisfy the requirement. Regarding the multiple-answer format, the difference in selectivity between distractors was less than 13%, so the selectivity was not biased. Next, in Table 4, the random results are the evaluation of distractors generated by randomly extracting words belonging to the same class as the answer on DBpedia. On these indexes, the proposal tended to produce distractors more similar to the Answer than random generation did.
In this study, we also conducted a questionnaire comparing our questions with existing ones (The Japanese History Aptitude Testing Foundation, 2017) (The Japanese History Aptitude Testing Foundation, 2018), answered by 14 holders of teachers' licenses. For the five single-answer forms, we obtained responses mainly on the "degree of panoramic information." In Questionnaire 1, an average of 72.8% answered that the degree of panoramic information of the proposed questions was higher than that of the random ones. In addition, Figures 5 and 6 show the evaluation results for the degree of panoramic information and of the content. Differences were observed not only in the degree of panoramic information but also in the overall impression of the content.
8 CONCLUSION
In this paper, we proposed a method of generating multiple-choice questions including panoramic information. Future prospects are listed below.

In the proposed method, we considered the distance between nodes to generate a graph including panoramic information, but we did not consider the meaning of the links between nodes. To generate deliberately targeted test questions, not only the nodes but also the types of links should be considered. In the evaluation of the degree of inclusion of panoramic information, we evaluated based on items such as the degree of crossing classes and the degree of crossing time, but a full evaluation was impossible with these indexes alone. Therefore, we should review the current evaluation indexes, clarify the definition of the degree of panoramic information, and set up an evaluation index based on it before conducting further experiments. It is also necessary to analyze how this index influences the degree over the whole graph. As a positive result, the distractors generated by the proposed method tended to be more similar to the answer than the random ones under the pattern and word2vec evaluation indexes.
In the generation of distractors, if the class to which an answer belongs is not unique, only one kind of candidate, apparently different from the answer, may be generated, so this process should be improved. Also, to deal with synonyms between choices, we will establish a verification phase using WordNet (https://wordnet.princeton.edu).

Also, as a new question form applying this proposal, combination questions are considered. Combination questions often appear in the Japanese history and world history examinations of the National Center Test for University Admissions (https://www.dnc.ac.jp/center/); in this format, respondents choose combinations of the answers to different sub-questions from the choices. We expect there to be demand for this format.

In the future, we aim to improve the method of automatically generating questions considering panoramic information by reviewing the approaches and the evaluation methods for Question Graphs and distractors.
ACKNOWLEDGEMENTS
This work was supported by JSPS KAKENHI Grant Numbers JP16K00419, JP16K12411, JP17H04705, JP18H03229, JP18H03340 and JP18K19835.
REFERENCES
Afzal, N. and Mitkov, R. (2014). Automatic generation of multiple choice questions using dependency-based semantic relations. Soft Computing, 18(7):1269–1281.
Bühmann, L., Usbeck, R., and Ngomo, A.-C. N. (2015). ASSESS - automatic self-assessment using linked data. In International Semantic Web Conference, pages 76–89. Springer.
Demarchi, F., Santos, E. R., and Silveira, R. A. (2018). Integration between agents and remote ontologies for the use of content on the semantic web. In Proceedings of the 10th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, pages 125–132. INSTICC, SciTePress.
Fionda, V. and Pirrò, G. (2017). Meta structures in knowledge graphs. In International Semantic Web Conference, pages 296–312. Springer.
Iijima, T., Kawamura, T., Sei, Y., Tahara, Y., and Ohsuga, A. (2016). Sake selection support application for countryside tourism. In Transactions on Large-Scale Data- and Knowledge-Centered Systems XXVII, pages 19–30. Springer.
Ikeda, S., Takagi, T., Takagi, M., and Teshigawara, Y. (2013). A proposal and evaluation of a method of estimating the difficulty of items focused on item types and similarity of choices. Journal of Information Processing, 54(1):33–44.
Ikegami, M. (2015). Comparison of strategy use between different item format types on multiple-choice grammar test. Studies in Language and Literature, 35(1):55–72. (in Japanese).
Jouault, C., Seta, K., and Hayashi, Y. (2016). Content-dependent question generation using LOD for history learning in open learning space. New Generation Computing, 34(4):367–394.
Maillot, P., Raimbault, T., Genest, D., and Loiseau, S. (2014). Targeted linked-data extractor. In Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, pages 336–341. INSTICC, SciTePress.
Ministry of Education, Culture, Sports, Science and Technology (2015). The 100th primary and secondary education subcommittee, handouts 1-1-4: Necessary measures to realize the philosophy of government guidelines for teaching. http://www.mext.go.jp/b_menu/shingi/chukyo/chukyo3/siryo/attach/1364319.htm.
Mitkov, R., Ha, L. A., Varga, A., and Rello, L. (2009). Semantic similarity of distractors in multiple-choice tests: extrinsic evaluation. In Proceedings of the Workshop on Geometrical Models of Natural Language Semantics, pages 49–56. Association for Computational Linguistics.
Papasalouros, A., Kanaris, K., and Kotis, K. (2008). Automatic generation of multiple choice questions from domain ontologies. In e-Learning, pages 427–434. Citeseer.
Patra, R. and Saha, S. K. (2018). A hybrid approach for automatic generation of named entity distractors for multiple choice questions. Education and Information Technologies, pages 1–21.
Pho, V.-M., Ligozat, A.-L., and Grau, B. (2015). Distractor quality evaluation in multiple choice questions. In International Conference on Artificial Intelligence in Education, pages 377–386. Springer.
Rocha, O. R., Zucker, C. F., and Giboin, A. (2018). Extraction of relevant resources and questions from DBpedia to automatically generate quizzes on specific domains. In International Conference on Intelligent Tutoring Systems, pages 380–385. Springer.
Suzuki, M., Matsuda, K., Sekine, S., Okazaki, N., and Inui, K. (2016). Neural joint learning for classifying Wikipedia articles into fine-grained named entity types. In Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation: Posters, pages 535–544.
The Japanese History Aptitude Testing Foundation (2017). The Japanese History Aptitude Testing in 2016, the 35th time, question collection of all the grades, pp. 29–67.
The Japanese History Aptitude Testing Foundation (2018). The Japanese History Aptitude Testing in 2017, the 36th time, question collection of all the grades, pp. 33–75.
Zoumpatianos, K., Papasalouros, A., and Kotis, K. (2011). Automated transformation of SWRL rules into multiple-choice questions. In FLAIRS Conference, volume 11, pages 570–575.
A Questionnaire Results in the Evaluation Experiment

Figure 5: Questionnaire 2: "Comparison of differences in degree of panoramic information".
Figure 6: Questionnaire 3: "Comparison of differences in degree of information contents".