Conversely, skills can get rusty over time if a person
stops learning. This process is referred to as
“forgetting”. The expertise modeling algorithm
processes the expertise mentions incrementally, in
chronological order. New expertise topics are
inserted into the expertise model and their weights
are increased by reinforcement and decreased by
forgetting.
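The incremental update can be illustrated with a minimal sketch. It is not the update rule used by the algorithm, which is not specified at this point in the text. It assumes, purely for illustration, that forgetting is an exponential decay with a fixed half-life and that each mention adds a fixed increment. The names `update_model`, `HALF_LIFE_YEARS`, and `REINFORCEMENT`, and their values, are hypothetical.

```python
from typing import Dict, Iterable, Tuple

HALF_LIFE_YEARS = 5.0  # assumed forgetting half-life (hypothetical value)
REINFORCEMENT = 1.0    # assumed weight gained per mention (hypothetical value)


def update_model(mentions: Iterable[Tuple[int, str]]) -> Dict[str, float]:
    """Build topic weights from (year, topic) mentions in chronological order."""
    weights: Dict[str, float] = {}
    last_year = None
    for year, topic in sorted(mentions):
        if last_year is not None and year > last_year:
            # Forgetting: decay every known topic for the elapsed time.
            decay = 0.5 ** ((year - last_year) / HALF_LIFE_YEARS)
            for known in weights:
                weights[known] *= decay
        # Reinforcement: insert a new topic or strengthen an existing one.
        weights[topic] = weights.get(topic, 0.0) + REINFORCEMENT
        last_year = year
    return weights
```

Under these assumptions, a topic mentioned only once long ago ends up with a lower weight than one mentioned repeatedly or recently.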
2 RELATED WORK
The related work is mainly in the area of expertise
recommendation or expert finding, which attempts to
find the right person with the appropriate skills and
knowledge. This is useful for many purposes,
including problem solving, question answering, and
collaboration. A significant amount of research has
been generated in the Information Retrieval
community (Smirnova and Balog, 2011; Balog et
al., 2007; Balog et al., 2009; Liebregts and Bogers,
2009). This line of research focuses on content-
based algorithms, similar to document search. These
algorithms identify experts based on the content of
documents that they are associated with (Liebregts
and Bogers, 2009; Serdyukov et al., 2007). While
these approaches have been very effective in finding
the most knowledgeable people on a given topic
based on a large collection of documents from an
enterprise or the internet, it is not clear how they can
be used to assess expertise on multiple topics
based on a single resume.
3 THE REMA ALGORITHM
The REMA algorithm is shown in Figure 1. An
input resume is first parsed into expertise mentions.
The mentions are evaluated by Natural Language
Processing (NLP) tools to extract expertise topics.
These topics are then processed by REMA’s
expertise model adaptation component to generate
the expertise model. This algorithm is an extension
of our user modelling algorithm RAMA
(Reinforcement and Aging Modeling Algorithm),
described in detail elsewhere (Li and Alonso, 2014;
Li and Alonso, 2012; Alonso et al., 2010).
Figure 1: REMA algorithm diagram.
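The data flow described above (resume, then mentions, then topics, then model) can be sketched as a small skeleton. This is an illustrative outline only: the `Mention` record, the `run_pipeline` function, and the three callables passed to it are hypothetical names, and the real parsing, NLP, and adaptation components are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Mention:
    text: str     # phrase or statement taken from the resume
    section: str  # resume section the mention came from
    year: int     # date of the learning event


def run_pipeline(
    resume: str,
    parse: Callable[[str], List[Mention]],
    extract_topics: Callable[[Mention], List[str]],
    adapt: Callable[[Dict[str, float], Mention, List[str]], None],
) -> Dict[str, float]:
    """Chain the three stages: parse mentions, extract topics, adapt the model."""
    model: Dict[str, float] = {}
    # Mentions are fed to the adaptation stage oldest first.
    for mention in sorted(parse(resume), key=lambda m: m.year):
        adapt(model, mention, extract_topics(mention))
    return model
```

Passing the stages in as functions keeps the skeleton independent of any particular parser or NLP toolkit.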
3.1 Expertise Mentions
Expertise mentions are phrases or statements in the
resume that indicate significant learning events. For
example, a resume may mention a paper on
databases in a certain year in the publication section.
When parsing the expertise mentions, the associated
resume section and the date of the event are captured
because they are important indicators of level of
expertise. REMA uses a source relevance parameter
to register the fact that expertise mentions in
different parts of a resume carry different
significance. For example, a mention in a patent or
publication section should indicate more expertise
than one in an experience or education section.
Even within the same section, mentions originating
from different sources may carry different
significance. For example, within the publication
section, mentions of a book or journal paper are
more indicative of expertise than those of a
conference paper. The date of the mentioned event
reflects the recency of the learning. In other words,
skills or expertise acquired more recently are more
up-to-date and less likely to be forgotten. We use
regular expressions and GATE to extract the date and
time information from the expertise mention.
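As a rough illustration of the two signals captured for each mention, the sketch below pairs a source relevance lookup with a regular-expression year extractor. The numeric relevance values are hypothetical; only their ordering follows the text. The year pattern is an assumed stand-in for the regular expressions mentioned above, the GATE step is not reproduced, and taking the latest year in a mention is also an assumption.

```python
import re
from typing import Optional

# Hypothetical relevance values; only their ordering follows the text.
SOURCE_RELEVANCE = {
    ("publication", "book"): 1.0,
    ("publication", "journal"): 1.0,
    ("publication", "conference"): 0.8,
    ("publication", None): 0.8,
    ("patent", None): 1.0,
    ("experience", None): 0.5,
    ("education", None): 0.5,
}

# Assumed pattern: a four-digit year starting with 19 or 20.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def source_relevance(section: str, source: Optional[str] = None) -> float:
    """Look up a mention's relevance, falling back to the section default."""
    if (section, source) in SOURCE_RELEVANCE:
        return SOURCE_RELEVANCE[(section, source)]
    return SOURCE_RELEVANCE.get((section, None), 0.0)


def extract_year(mention: str) -> Optional[int]:
    """Return the latest four-digit year found in the mention, if any."""
    years = [int(m.group(0)) for m in YEAR_PATTERN.finditer(mention)]
    return max(years) if years else None
```

A mention with no recognizable year yields `None`, leaving it to the caller to decide how to treat undated events.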
3.2 Expertise Topics
Expertise topics are terms indicating skills or
expertise such as database or machine learning.
They are extracted from expertise mentions using
NLP tools. In particular, Apache Lucene®¹ is used
to extract simple terms from text. WordNet®² is
used to identify nouns. GATE is used to
extract noun chunks and named entities. The OpenCalais
web service is used to extract expertise-related tags
including "Industry Term", "Technology", and
"Programming Language". Relationships between
¹ Apache Lucene is a registered trademark of the Apache
Software Foundation within the United States and/or
other countries.
² WordNet is a registered trademark of the Trustees of
Princeton University within the United States and/or
other countries.