devil, spirit, method, glory, etc. might also reflect a cultural difference and thus per-
sonal familiarity and the availability of context. Even among the judges in the current
study, we observed contrastive ratings for words like town, field, carbon, pattern,
moral, humor, and theory. This suggests that personal experience and intuition might
play a more important role than objective factors in judgements of concreteness.
A potential limitation of our current categorisation of the dictionary definitions is
that abstract concepts might be defined by genus and differentiae more often than
expected. For instance, one meaning of “mercy” is “a disposition to be kind and for-
giving”, and one meaning of “illusion” is “an erroneous mental representation”. This
may be an artifact of WordNet definitions, since WordNet places each sense in a
hierarchy of hyponymy relations, which covers both concrete and abstract concepts.
Words like “disposition” and “representation” are nevertheless abstract even when
they are the genus terms for other words. To address this, we plan to check against other
dictionaries and explore possible ways to deal with various kinds of genus terms, in order to
refine the concreteness index induced from definition categories.
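The difficulty with genus terms can be illustrated with a minimal sketch. The heuristic below is hypothetical and is not the categorisation scheme used in this study: it assumes a noun-phrase definition that opens with an article, and it treats the last word before an assumed set of post-modifier markers as the genus term. The function name and the marker list are illustrative choices only.

```python
# Hypothetical surface heuristic, NOT the categorisation used in the study.
ARTICLES = {"a", "an", "the"}
# Assumed markers that open a post-modifying differentia.
POSTMODIFIER_MARKERS = {"to", "that", "which", "who", "of", "for", "in", "with"}


def split_genus_differentiae(definition):
    """Split an article-initial definition into (genus, differentiae).

    Returns None when the definition does not start with an article,
    i.e. when this crude heuristic does not apply.
    """
    tokens = definition.lower().split()
    if len(tokens) < 2 or tokens[0] not in ARTICLES:
        return None
    body = tokens[1:]
    # Position of the first post-modifier marker, or the end of the definition.
    cut = next(
        (i for i, tok in enumerate(body) if tok in POSTMODIFIER_MARKERS),
        len(body),
    )
    if cut == 0:
        return None
    genus = body[cut - 1]  # head noun assumed to be the last word before the marker
    differentiae = " ".join(body[:cut - 1] + body[cut:])
    return genus, differentiae


# The two definitions quoted above.
print(split_genus_differentiae("a disposition to be kind and forgiving"))
print(split_genus_differentiae("an erroneous mental representation"))
```

Both definitions pass this surface test, with "disposition" and "representation" extracted as genus terms, even though the defined words are abstract. A refinement would therefore need some way to assess the concreteness of the genus term itself.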
In the current study, our human ratings on lexical and sense concreteness came
from non-native speakers of English. Although we found a high degree of agreement
between their ratings and those of native speakers, the cultural difference may have
influenced the familiarity of the raters with the word samples and thus the context
availability associated with individual words.
Also, in the current study, we have focused on only one possible source of
external evidence for lexical concreteness, namely dictionary definition
styles. Given that human ratings on concreteness may result from the interaction of
many factors, including word frequency, context availability, imageability and access
to sensory referents, it will be appropriate for us to draw on other sources of
external evidence, such as word association norms, authentic linguistic context
from corpus data and domain information, for a more realistic and complete model of
lexical concreteness. Hence, apart from refining our analysis and categorisation of
definition styles based on more dictionaries, as pointed out above, our next steps will
focus on the extension toward other data sources for modelling the concreteness dis-
tinction and simulating the concreteness index. This will also be investigated in rela-
tion to the various competing theories on why abstract words are harder to understand,
thus drawing from both psycholinguistic findings and existing language resources to
achieve a cognitively plausible computational simulation of the concrete-abstract
distinction. Moreover, further studies will be conducted to examine the effect of
lexical and sense concreteness on the information demand of automatic word sense
disambiguation and the use of concreteness for indicating potentially confusable
senses for better evaluation of disambiguation performance, as suggested in [8].
7 Conclusions
In this paper we have reported on our preliminary study on simulating human judge-
ments on the concreteness or abstractness of words. We have analysed and catego-
rised dictionary definitions from their surface syntactic forms, which is assumed to