Using PageRank for Detecting the Attraction between Participants

and Topics in a Conversation

Costin Chiru, Traian Rebedea and Adriana Erbaru

University Politehnica of Bucharest, Department of Computer Science and Engineering,

313 Splaiul Independetei, Bucharest, Romania

Keywords: CSCL, Natural Language Processing, Participant Assessment, PageRank, Online Conversations.

Abstract: In this paper we present a novel approach that uses the well-known PageRank algorithm for assessing multi-

threaded chat conversations. As online conversations can be modelled as directed graphs, we have

investigated a method for allowing a real-time analysis of the conversation using PageRank by computing

the ranks of the utterances based on the explicit and implicit links available in the discussion. This model

has been also extended to offer a method for computing connections between the debated topics and the chat

participants and between each of the debated topics in the conversation, called the participant-topic and the

topic-topic attraction. The results presented in this paper are promising, but also reflect several important

differences between the existent offline analysis tools for chats and the PageRank method.

1 INTRODUCTION

Chat conversations (instant messaging) represent

nowadays one of the most popular methods of

exchanging ideas online. The easiness in learning

how to use chats and the high efficiency in

transferring the information, promoted chats as one

of the favourite environments for Computer

Supported Collaborative Learning (CSCL) tasks

requiring online and synchronous textual

interactions among participants (Stahl, 2006; Stahl,

2009). Due to this fact, it has been largely adopted in

CSCL activities and it has been enhanced with

functionalities specific to these tasks such as the

explicit referencing mechanism and the whiteboard

facility present in ConcertChat (Muhlpfordt and

Wessner, 2005). Still, in spite of its popularity and

of the huge quantity of data that is exchanged

through chats, there are very few application aimed

at analyzing this type of content (Chiru et. al, 2011;

Rebedea et. al, 2011). More than that, the existing

applications are built starting from a semantic

analysis (Chiru et. al, 2011; Rebedea et. al, 2011)

but the analysis takes far too much time to be used

as a real-time process and can only be applied

offline, at the end of the conversation. Therefore, we

have been searching for a different method to

analyze these conversations faster and got

influenced by the algorithms which are used by

search engines that have to analyze huge quantities

of data in a very short time. Thus, we reached the

conclusion that if the PageRank algorithm (Page et.

al, 1998) could be adapted for chats, this method

could be applied online (displaying the results of the

processing as the conversation unfolds) and

interactive (to signal what threads should be debated

more and involving people who contributed less on

specific threads), this way improving the learning

process and enhancing the participants' innovation.

To achieve this, we started from PolyCAFe

(Rebedea et. al, 2011), a system that is using

innovative methods for analysing CSCL chat

transcripts, helping both computer-assisted learning

and the tutors in evaluating the discussions. This

system analyzes chat logs using Natural Language

Processing (NLP), Latent Semantic Analysis (LSA)

and Social Network Analysis (SNA) techniques in

order to identify the most important utterances from

the conversation (in terms of their content and of the

participants’ involvement in the discussion).

Therefore, this paper presents an extension of

PolyCAFe’s functionality, trying to enhance it with

the ability to analyze CSCL chat sessions in real-

time. The first step of our analysis consists of

detecting the important utterances from the

conversation using the PageRank algorithm. Once

these utterances are identified, we use the PageRank

294

Chiru C., Rebedea T. and Erbaru A..

Using PageRank for Detecting the Attraction between Participants and Topics in a Conversation.

DOI: 10.5220/0004798202940301

In Proceedings of the 10th International Conference on Web Information Systems and Technologies (WEBIST-2014), pages 294-301

ISBN: 978-989-758-023-9

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

algorithm for detecting what is the attraction of each

participant towards the debated topics and, at the

same time, what is the probability of a topic to

follow another topic within the conversation.

Most of the existing approaches for analyzing

text using social network analysis (SNA) tools are

oriented towards systems that own explicit

referencing tools, such as forums or blogs. The

reason behind this orientation is the ease in

constructing the participant social network based on

the order in which the messages are sent and on the

recipient of the message. Still, there are a few

systems that intended to apply SNA tools to chat

conversations. One such tool was built by

Sundararajan (2010) for analyzing the content

published by the participants to 8 different courses

in order to observe how the respect and influence

earned by each participant influences their efforts to

"collaborate, learn new and conceptual knowledge"

and their satisfaction regarding the courses outcome.

Unfortunately, the author does not mention whether

this analysis is done manually or automatically.

Moreover, a regular SNA method is used for

evaluating the participants from the perspective of

their centrality, betweenness, in-degree, out-degree,

etc. in the network, which represent only

quantitative data. On the other hand, we are rather

interested in what the participants communicate

(what are the topics they know or they are interested

on) and in the interaction patterns between different

concepts that are debated in the conversation, which

is part of a qualitative evaluation of the participants,

topics and the conversation as a whole.

A more similar approach was undertaken by

Tuulos and Tirri (2004). The authors present a semi-

supervised system that uses a combination of topic

modelling and SNA to improve the information

retrieval from chat conversations. For their analysis,

they have used conversations taken from

SearchIRC.com which allowed them to use simple

heuristics in order to identify to whom each

utterance is addressed (and therefore to build the

social network). For this participant network the in-

degree, out-degree and PageRank of each participant

are determined. After that, the authors use some

existing conversations to detect the probabilities of

words to appear in conversations about different

topics, so that when they analyze new conversations

to be able to use these probabilities. Finally, they

evaluate the use of each of the SNA technique in

improving the information retrieval, considering as

baseline the results provided by the topic modelling.

Still, this approach gives them two advantages: first

of all they know both how many and what topics

should be present in the conversation (therefore

knowing what represents off-topic and being able to

discard that part); secondly, they have chosen the

topics from different topics (Bible, C++, Philosophy,

Physics, Politics, Win2000) thus simplifying the task

of identifying to what topic a given concept

corresponds. In our approach neither of these facts

can be exploited: since our system does not have a

learning phase, it gives the possibility to analyze

texts debating about any topics, without being

limited to the ones that were learnt (thus providing

generality in use). At the same time, it can be used to

distinguish between concepts that are from the same

or similar conceptual area. The examples presented

in this paper contain concepts from a single domain

(Human Computer Interaction) especially to prove

that the approach works even at this level, without

requiring that different topics to be debated in the

same conversation.

The paper continues with a short overview of the

PageRank algorithm. Then, we present the

application that has been developed and several

results that have been obtained by employing the

PageRank method adapted for CSCL chats. The

paper ends with an analysis of these results and with

our conclusions regarding the improvement of the

results’ quality.

2 OVERVIEW OF THE

PAGERANK ALGORITHM

Because previous researches have modelled an

online conversation as a graph with implicit and

explicit links between utterances (Rebedea et. al,

2011), we have started to consider that the PageRank

algorithm (Page et. al, 1998) may be a candidate for

the conversation graph analysis. PageRank is an

algorithm that was initially designed for the analysis

of a set of web pages in order to extract the relative

importance of each page from the considered set of

web pages (Page et. al, 1998). The algorithm

expresses the probability that a web surfer will be

able to “find” the considered page within a limited

number of steps (clicking on the links from one page

to another). It is a customization of a “random walk”

in a graph, which in turn is modelled as a Markov

chain in which the states are pages, and the

transitions, which are all equally probable, are the

links between pages.

The formal definition given in the initial paper

describing PageRank (Page et. al, 1998) was: if u is

a web page; Fu (forward links), the pages referred

UsingPageRankforDetectingtheAttractionbetweenParticipantsandTopicsinaConversation

295

by u; || the number of forward links; Bu

(backward links) the ones that refer u, c a

normalization constant and E(u) a source of rank to

make up for the rank sinks (such as cycles) with no

out-edges, than the value of the rank R(u) can be

computed using:









∗













∈



∗







(1)

In order to compute the vector R(u), one starts from

the square matrix (we’ll call it A) having the web

pages on the rows and columns and ,

1/ if there is a link from page  to page  or 0

otherwise. If R is the a vector of scores over the web

pages, then we can write , which can

be re-written as 



1



because the

values of PR are normalized and therefore

‖



‖





1. That means that R is the eigenvector of 1

and the method should also try to maximize the

value of c (Page et. al, 1998). Subsequent research

showed that the optimal value for c should be 0.85

(Brin and Page, 1998).

The value of R can be obtained in an iterative

manner, starting from a vector of values over the

web pages (S) that can have any values (could be the

vector E(u)), using the following iterative algorithm

(Page et. al, 1998):





←

:





←



←

‖





‖





‖





‖







←





←

‖









‖





(2)

, where  is a factor for increasing the convergence

rate and for maintaining

‖



‖



1, while10



A web page will have a high PageRank if the

sum of the webpages’ PageRank that refers it is

large. This property covers two possible cases: when

a page has many other pages referring it, or when it

is referred by pages with high PageRank.

The PageRank algorithm has proved to be

suitable not only for Google’s rank of web pages,

but also for other tasks in various domains: replacing

the ISI factor with a new formula based on the

PageRank Algorithm (Bollen, Rodriguez and Van de

Sompel, 2006), ranking academic doctoral programs

based on their records of placing their graduates in

faculty positions (Schmidt and Chingos, 2007),

predicting how many people (pedestrians or

vehicles) come to the individual spaces or streets

(Jiang, 2006), performing Word Sense

Disambiguation (Navigli and Lapata, 2010), etc.

Thus, we hoped that it could also work for chat

analysis especially as a conversation can be seen as a

graph of links between utterances where discourse

flows in a similar manner to the importance of the

web pages.

In order to use this approach, we considered that

each utterance from the chat conversation represents

a different document and in order to simulate the

forward and backward links, we used explicit and

implicit links (details will be provided in the next

section). Thus, we managed to develop a method for

very fast identification of the important utterances

from a chat, of the major threads of discussion, and

of the participants’ attraction towards these threads.

3 PAGERANK FOR AUTOMATIC

ASSESSMENT OF CSCL

CHATS

As we have already mentioned, our application starts

from PolyCAFe project and uses some of its

features:

 Detection of the areas of high collaborative

discourse from a chat;

 Evaluation of the collaboration of each participant

in the discussion (based on multiple criteria);

 Graphical representation of the results.

3.1 Pagerank for Conversation

Analysis

We consider that PageRank is appropriate for chat

evaluation as this problem is very similar to the

original problem for which it was initially designed.

The sparsity in the chat is (in our opinion) similar to

the sparsity of relevant content from the web.

Therefore, we consider that the probability of a

person to "land" on a specific page from the web is

similar to the probability of a participant to reply to

a given utterance, while the links between different

pages are well simulated by the semantic

connections represented by the repetitions of the

same word and by the explicit "reply to" links.

Practically, this probability of a participant to reply

to a given utterance can be considered the "rank of

an utterance" (and it highlights the importance of

that specific utterance in the conversation).

The first step in applying the PageRank

Algorithm was the identification of the links that

exist between chat utterances. Since the pre-

processing part of our application was borrowed

from PolyCAFe, we also kept the input format of the

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

296

chats, which allowed the existence of explicit links

(references provided by the chat participants to

specify to which previous utterance their answer is

addressed).

Besides the explicit links, one can also encounter

the situation when two or more utterances contain

concepts that are strongly related and therefore their

authors consider that there is not necessary to

provide an explicit link. We considered that this

situation is a special case of connection (an implicit

link) and tried to identify it in order to augment the

number of explicit connections (that was insufficient

for our purposes). Therefore, we considered that

words repetitions (Chiru et. al, 2011) are example of

such links. If a term appears in an utterance, all lines

that follow and contain that term are considered

implicit links to the initial utterance. Given the

nature of the algorithm, the two types of links that

we consider (explicit and implicit) have equal

weight.

Once we detect all these links, we build utterance

chains (which can be interpreted as discussion

threads since they debate the same concepts) starting

from these links using the DFS (Depth First Search)

Algorithm, thus finding all the existing separate

chains. They are needed for determining the

attraction between two different threads (topics).

The steps that should be followed in order to

determine the threads are:

1. Identify all the utterances that are not referenced

(neither by explicit nor by implicit links) – these

utterances are probably off-topics and therefore

they are ignored;

2. All the remaining utterances are considered to be

roots for the DFS Algorithm;

3. From each of these utterances (considered in the

order they appear in chat) we start a function

(implementing DFS) to detect the threads that

can be built starting from that utterance;

4. Each function will return a thread of utterances.

The next step is to create the transition matrix

corresponding to the chat utterances by considering

the links identified between them. The explicit and

implicit links between two utterances will provide a

value of 1 in the matrix, while the remaining

elements are set to 0 (meaning that there is no

connection between the corresponding two

utterances). Once this matrix is built, it needs to be

normalized with respect to the sum of the elements

from each column.

Finally, the values of the PageRank algorithm for

the given matrix are obtained using the power

method implementation provided by the JAMA

library - A Java Matrix Package (Hicklin et. al, n.d.).

The operations made for the detection of the

eigenvalues and the eigenvectors are:

1. Apply the eig method, which decomposes the

matrix in two other matrices: a matrix D

containing the eigenvalues and a matrix V

containing the eigenvectors;

2. The maximum (dominant) eigenvalue from the

diagonal matrix D is determined and its index is

stored;

3. The dominant vector is the column from the

matrix V having the index identified in the

previous step;

4. The values from this vector are normalized with

respect to the sum of its elements;

5. The final values (the PageRank) are the values

obtained for the normalized eigenvector vd

(utterance[i].rank = vd[i]).

Once these values are determined, one can

evaluate the participant-topic attraction and the

topic-topic attraction as described in the following

sections.

3.2 Participant-topic Attraction

The participant-topic attraction defines the

participants’ drive to get involved in the discussion

of a given concept therefore proving its interest or

knowledge related to that concept. To determine this

factor, we have used the values of the participant’s

utterances containing the words that define the

considered topic.

If p represents a participant and t a topic, then the

attraction between p and t is given by the following

formula:





,











∗



,

∈

∈

(3)

In order to highlight this method, we provide an

example that proves how the above formula works.

For this, we have made the simplifying assumption

that all the utterances belong to the same participant.

Utt: < debated topics > utt value

u1: t t y x 0.5 = rank(u1)

u2: t y z x 0.3 = rank(u2)

u3: t t t z 0.2 = rank(u3)

Using formula (1), the following results will be

returned (2):





,



2∗



1







2





(4)

UsingPageRankforDetectingtheAttractionbetweenParticipantsandTopicsinaConversation

297

3∗



3







,







1







2







,







1







2







,







2







3



In the end, all these values are normalized.

3.3 Topic-Topic Attraction

The topic-topic attraction defines the probability of

having a specific topic following another topic in the

flow of a conversation. To determine it, we use the

utterance chains taking into account both the

frequency of each topic and the case when they co-

occur in the same utterance (topics are very closely

related) or occur separately (more loosely).

Therefore, we extract the threads corresponding

to the two topics and build a matrix for each chain.

This matrix reflects the debating of those topics in

the utterances and their corresponding values within

that chain.

The relationship between the topics and the

utterances is reflected by the matrix that is built as

follows:

1. We build the chain – topic matrix (ct), 

represents the value of the  topic in the 

utterance where

a. 0 if the  topic is not debated in

the utterance ;

b. 

_

#_

if the -th topic is

debated in the utterance .

2. After filling in the matrix, we apply formula (5)

for each topic t

and t

∑







∈





∑















∈



∑













∈



∈



|



(5)

Below we present an example of matrix (6) for

determining the topic – topic attraction for the

following chain: u5 → u4 → u3 → u2 → u1

utt: <debated topics >

u1 t1 t2 t3

u2 : t1 t5

u3 : t2 t4

u4 : t1 t3

u5 : t1 t2

Then, the attraction between topic t1 and t2 topic is

given by (5) by applying the formula (3), with the

matrix from (4), obtaining formula (7).

3.4 User Interface

The user interface allows the input file selection, and

afterwards the content of this file is analyzed and the

results are displayed in tabular form (see Fig. 1).

The left part of the GUI presents the values for

participant – topic attraction for the selected

participant, while the right part gives the values of

the topic – topic attraction for the selected topic.

Besides the values obtained for the participant –

topic attraction and the topic-topic attraction, the

application also outputs the most important

utterances from the chats computed using their

PageRank.

The results proved to be much stricter comparing

to the results obtained using PolyCafe system or

provided by the human reviewers (Gold-Standard).

This is due to the fact that only very few utterances

have a PageRank greater than 0.

In order to provide an example, we present a part

of the utterances evaluated as being important by

PageRank algorithm. We will use the same chat for

which we presented the examples from the

participant-topic and topic-topic attraction examples.

The automatic analysis performed with PolyCAFe

has identified 132 important utterances (out of 430)

as important. From these, the PageRank algorithm

also identified 17 utterances as (see Table 1). The

values for the ranks computed by PageRank may

seem pretty low, but this is what usually happens

when applying the algorithm on any graph.

Besides these 17 utterances that were considered

important by both PolyCAFe and the PageRank

method presented in this paper, the latter has

identified another 28 turns that were not considered

important by the former. In order to account for

these utterances, we analysed PolyCAFe’s results in

order to discover a possible explanation. At a careful

analysis, we observed that the 28 extra utterances

were marked by PolyCAFe as being continuations

of other utterances. Therefore, it is possible that

PolyCAFe did not consider these utterances to be













1



/3 



1



/3 



1



/3 0 0





2



/2 0 0 0 



2



0



3



/20



3



/2 0





4



/20



4



/2 0 0





5



/2 



5



/2 0 0 0







(6)





1,2









1



∗2/32/24/25

2/23/2

(7)

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

298

Figure 1: Application Graphical User Interface.

Table 1: The utterances identified by our algorithm that receive a grade higher than 8 by PolyCAFe.

Utt.

PageRank

score

PolyCAFe

score

Utterance Content

169 0.002 10.07

yes, they have wikis that are publicly available, with public information, for the

everyday user that takes an intrest in that company's products

167 0.002 10.01

all major companies have wikis for their technologies. most people like to search

wikis cause they provide accurate and easy to access information. Also, that way

our database servers won't be so used

404 0.202 10.01

Indeed. Our companies image will grow if we have a forum, a blog, a wiki and a

cool web-site that customers or developers can use

310 0.004 9.93

the only problem that still remains is that we need someone to check wiki articles,

blog and forums posts so that classified data does not accidentally reach a "public"

area

348 0.004 9.28

A svn is an open-source revision control system. Users can work on a version of the

application code and commit it. If two users are working on the same thing when

they commit a merge is made with the 2 versions

308 0.001 9.03

we can use a person or a team of people to handle the wiki posts, forums posts,

wave documents and all the other important stuff

331 0.002 8.93

I mean everyone of our employees knows how to use a wiki, forum blog and chat,

and google wave has a extremely friendly interface

314 0.003 8.85 but how can you use a filter for a forum?

399 0.009 8.79

well i think chat is important for our employees, it helps them talk and colaborate,

spare time by not meeting in conferences that much, and be on track with all theit

colleagues are doing

312 0.002 8.78 We can use filters for that firewalls. That can save some money

388 0.047 8.45 not necesarly computers,you can change acounts

333 0.001 8.4 evrybody can use a chat , forum , bog or something like that

423 0.015 8.32 good night everyone, and thank you for your collaboration:)

341 0.004 8.21 We can also use a SVN for our code. What do you think of this?

376 0.042 8.12 this could make them loose time...

373 0.029 8.08 they will use another machine. the restrictions will be only for certain computers

387 0.025 8.07

for every function you don't remember in a programming language, you will have to

move to other computer to find out... but it's ok ... it wouldn't be a big problem i

guess

UsingPageRankforDetectingtheAttractionbetweenParticipantsandTopicsinaConversation

299

important, since the same ideas were present in the

previous utterances, but PageRank, through its

nature, favours this kind of utterances since being

identified as continuations it means that they have

links from other utterances that were considered

important in the past and therefore they receive a

part of these utterances’ rank.

For a better evaluation of our method, we asked

30 students from the Human-Computer Interaction

class to annotate 3 different chat conversations with

the most important utterances. Thus each of the three

chat conversations was evaluated by 10 different

students. We computed the inter-rater agreement

using Fleiss' Kappa for m raters and we have

obtained the values for Kappa 0.133 for the first

chat, 0.142 for the second and 0.177 for the third

one, while the p-values were always 0.000. These

results show how difficult this task is even for

humans. When we computed the results obtained by

our method with the gold standard results provided

by the annotators, we obtained the values of kappa

0.085 for the first chat, 0.0882 for the second and

0.0894 for the last one. These results are below those

of the raters, but we have remarked that if we

discard the last 15% of the utterances in all chat

conversations, where the PageRank accumulated too

much, the results are much better: 0.131 for the first

chat, 0.128 for the second and 0.173 for the last chat

conversation. These results are closer to the inter-

rater agreement and highlight that we should add a

decaying factor for utterances that are closer to the

end of the discussion (as they have fewer out-going

links and thus the rank tends to accumulate in them).

4 INTERPRETATION

OF RESULTS

There are several important observations that can be

drawn up based on the results that we have obtained

and analyzed. First of all, the computed ranks for the

utterances are rarely different from 0, this fact being

generated by several reasons:

 Most of the chats contain very few explicit links

(they seem to be ignored quite often by the chat

participants) – we have observed a direct

dependence between the number of explicit links

from the chats and the number of utterances

having non-zero values after applying the

PageRank algorithm.

 The PageRank algorithm determines the

utterances’ ranks as a random walk in the graph

of utterances. The significance of these values is

the probability to get to a certain utterance after a

number of steps that goes to infinity. Therefore,

once the algorithm gets to a (relatively small) set

of utterances (lines / columns from the transition

matrix), it will be very difficult to get out of that

set (in the context of random walk) and so the

remaining values will tend to be 0.

 The PageRank Algorithm is designed for the

web, where a lot of links exist between different

resources (therefore creating large chains of

links, most of the times having a lot of cycles),

while in the chats the utterance chains are usually

short and rarely having such cycles. Besides, the

topics repetitions might not be synchronized with

the explicit references, therefore not leading to

cycles.

 In the current version of the system, we proposed

equal importance for the explicit and implicit

links, which can lead to determining a value too

high / low for some utterances, depending on the

number of words used in that utterance and in the

one to which it is linked.

Secondly, most utterances having values greater

than 0 are positioned at the end of the discussion.

This happens due to lack of explicit links, and

therefore those utterances accumulate a very high

score due to topics repetition, which propagates

through the chat from the beginning until the end. A

solution is needed to link these utterances to other

ones in the chat.

Finally, there are some utterances considered

significant by the PageRank algorithm, but not by

other algorithms or by human evaluators. The reason

is the same as for the previous observation: some

utterances (that are not very significant in terms of

conversation) may receive high values because of

the rank accumulation over time from other

utterances that contain the same topics.

There are a couple of solutions that could be

tried in order to alleviate the presented problems. A

first solution might be the detection of dialog acts

and adjacency pairs, since the main problem of the

proposed method is the lack of explicit links. This

way, one can detect the dialog acts that are present

in the chat (question - answer, agreement –

disagreement, greetings and so on) very quickly and

to use these links as additional explicit links.

Another possibility is to use LSA or lexical chains to

find out more semantic connections between

utterances.

Another solution to avoid reaching too many

zeroes for the computed ranks is to use the Iterative

Method instead of the Power Method for computing

the PageRank values. This way, one is not

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

300

constrained to apply the algorithm until convergence

(after an infinite number of steps), but can stop after

a limited number of steps, so that fewer utterances

reach a zero-value influence.

Finally, in order to be able to discriminate

between the importance of explicit and implicit

links, one can use a generalized algorithm based on

Markov chains having different values for different

link types (explicit or implicit links).

5 CONCLUSIONS

To sum up, our current adaptation of the PageRank

algorithm for online conversations (using only the

explicit and implicit links given by the topics

repetitions) is not powerful enough to provide results

that have the desired accuracy compared with other

solutions that analyse the discussions offline. The

main explanation is that there are not enough

explicit links added by the participants during the

discussion and using only repetitions for detecting

implicit links does not build a graph that is dense

enough. However, the method is much faster and it

can be used online and in real-time for the dynamic

evaluation of multi-threaded discussions involving

multiple participants.

Moreover, in our opinion the assumptions made

in this paper are novel for the analysis of online

discussions and they have not been used to assess

the importance/rank of an utterance in an online

discussion although PageRank follows from

previous work in citation analysis (where the links

between papers are made explicit by authors). The

preliminary results also support the use of PageRank

to compute the most important utterances in a multi-

party online conversation, but several improvements

of this method need to be investigated in order to

achieve similar results to the current state of the art

methods that also employ linguistic analysis.

ACKNOWLEDGEMENTS

This research was supported by project No.264207,

ERRIC-Empowering Romanian Research on

Intelligent Information Technologies/FP7-REGPOT-

2010-1.

REFERENCES

Bollen, J., Rodriguez, M. A. and Van de Sompel, H. 2006.

Journal Status. In: Scientometrics 69 (3), pp. 669-687.

Brin, S.; Page, L. 1998. The anatomy of a large-scale

hypertextual Web search engine. In: Computer

Networks and ISDN Systems 30: 107–117.

Chiru, C., Cojocaru, V. Trausan-Matu, S., Rebedea, T. and

Mihaila, D. 2011. Repetition and Rhythmicity Based

Assessment for Chat Conversation. In: ISMIS 2011,

LNCS 6804, Springer, pp 513-523.

Hicklin, J., Moler, C., Webb, P., Boisvert, R., Miller, B.,

Pozo, R., Remington, K. Jama: a Java matrix package.

(http://math.nist.gov/javanumerics/jama/ - accessed

16/08/2012).

Jiang, B. 2006. Ranking spaces for predicting human

movement in an urban environment. In: International

Journal of Geographical Information Science 23 (7),

pp. 823–837.

Muhlpfordt, M. and Wessner, M. 2005. Explicit

referencing in chat supports collaborative learning.

Paper presented at the Proceedings of CSCL 2005.

Navigli, R. and Lapata, M. 2010. An Experimental Study

of Graph Connectivity for Unsupervised Word Sense

Disambiguation. In: IEEE TPAMI, 32(4), IEEE Press,

pp. 678–692.

Page, L., Brin, S., Motwani, R., Winograd, T. 1998. The

PageRank Citation Ranking: Bringing Order to the

Web, Technical Report. Stanford InfoLab.

Rebedea, T., Dascălu, M., Trausan-Matu, Armitt, G., and

Chiru, C. 2011. Automatic Assessment of

Collaborative Chat Conversations with PolyCAFe. In:

Proceedings of ECTEL 2011, LNCS 6964, Springer,

pp. 299-312.

Schmidt, B. M. and Chingos, M. M. 2007. Ranking

Doctoral Programs by Placement: A New Method. In:

PS: Political Science and Politics 40, pp. 523–529.

Stahl, G. 2006. Group cognition. Computer support for

building collaborative knowledge. Cambridge: MIT

Press.

Stahl, G. 2009. Studying Virtual Math Teams. New York:

Springer.

Tuulos, V. H. and Tirri, H., 2004. Combining topic

models and social networks for chat data mining. In:

Proceedings of WI’04, pp. 206–213.

Sundararajan, B., 2010. Emergence of the Most

Knowledgeable Other (MKO): Social Network

Analysis of Chat and Bulletin Board Conversations in

a CSCL System. Electronic Journal of E-Learning,

8(2), pp. 191-207.

UsingPageRankforDetectingtheAttractionbetweenParticipantsandTopicsinaConversation

301