correlation relationship, that is, the impact of changes 
in one attribute of data on other attributes. In this
study, it is mainly used to analyze the correlation
between college students' Internet behavior and
academic performance.
2.3  Classification of Students' Network 
Behaviors 
2.3.1  URL Based Keyword Acquisition 
User behavior classification is the process of grouping
users according to the categories of web pages they
prefer to browse. The collected log data contain the
URLs of the web pages each user visited. Keywords are
extracted from these URLs to build a list of topics for
the pages the user visited. From this topic list,
categories of network behavior can be derived that
correspond to the website categories.
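As an illustration of the keyword step, the following
minimal Python sketch maps visited URLs to topics by
substring matching. The dictionary holds only a few
keywords copied from Table 1. The names TOPIC_KEYWORDS,
url_topic and topic_list are hypothetical, and both the
first-match rule and the matching against the whole
lowercased URL are assumptions, not details given in
the text.

```python
# A small excerpt of the Table 1 keyword lists (the full lists are longer).
TOPIC_KEYWORDS = {
    "Game": ["gamersky", "4399", "17173", "douyu"],
    "Video": ["youku", "v.qq", "mgtv", "acfun"],
    "Study": ["cnki", "dict.youdao", "icourse163", "mooc"],
    "Shopping": ["taobao", "dangdang", "suning"],
}


def url_topic(url):
    """Return the first topic with a keyword occurring in the URL, else None.

    Assumption: topics are tried in dictionary order and the first hit wins.
    """
    text = url.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def topic_list(urls):
    """Map each visited URL to a topic, skipping URLs that match no keyword."""
    topics = (url_topic(url) for url in urls)
    return [topic for topic in topics if topic is not None]


visited = [
    "http://www.youku.com/v_show/1",
    "https://dict.youdao.com/w/test",
    "http://example.org/",
    "http://www.4399.com/flash",
]
print(topic_list(visited))
```

URLs that match no keyword are simply dropped here; in
practice they would be the cases handed to a classifier.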
Web pages are classified by a method based on URL
keyword extraction. The most direct way to obtain the
topic of a page from its URL string is to match the URL
against a manually labeled website category directory.
Such a directory collects websites from the Internet
and files each one under the corresponding topic.
Matching against the directory is the most direct and
accurate way to obtain the topic of a URL, but manual
labeling consumes a great deal of manpower and time, so
the amount of labeled data is very limited. To solve
this problem, a web page classifier is designed: a
classification algorithm based on the N-gram language
model is combined with URL category directory matching
to determine the URL topic, so that the URL of every
web page is mapped to its corresponding topic. This
improves the efficiency and accuracy of topic
determination and yields more comprehensive user
behavior classification information.
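The text does not give the details of the N-gram
classifier, so the following Python code is only a
sketch of one common formulation, under stated
assumptions: one character-level trigram language model
per topic, trained on URLs labeled through the category
directory, with add-one smoothing, and prediction by the
highest log-likelihood. The class name, the order n = 3,
the padding symbols and the toy training URLs are all
hypothetical.

```python
import math
from collections import Counter, defaultdict


class NGramUrlClassifier:
    """Per-topic character n-gram language models with add-one smoothing."""

    def __init__(self, n=3):
        self.n = n  # assumed order; the text does not specify one
        self.gram_counts = defaultdict(Counter)     # topic -> n-gram counts
        self.context_counts = defaultdict(Counter)  # topic -> (n-1)-gram counts
        self.alphabet = set()                       # characters seen in training

    def _grams(self, url):
        # Pad so that the first characters also have a full-length context.
        padded = "^" * (self.n - 1) + url.lower() + "$"
        return [padded[i:i + self.n] for i in range(len(padded) - self.n + 1)]

    def fit(self, labeled_urls):
        for url, topic in labeled_urls:
            for gram in self._grams(url):
                self.gram_counts[topic][gram] += 1
                self.context_counts[topic][gram[:-1]] += 1
                self.alphabet.add(gram[-1])
        return self

    def log_likelihood(self, url, topic):
        # P(char | context) = (count(context + char) + 1) / (count(context) + V)
        v = len(self.alphabet) + 1  # +1 reserves mass for unseen characters
        grams = self.gram_counts[topic]
        contexts = self.context_counts[topic]
        return sum(
            math.log((grams[g] + 1) / (contexts[g[:-1]] + v))
            for g in self._grams(url)
        )

    def predict(self, url):
        return max(self.gram_counts, key=lambda t: self.log_likelihood(url, t))


# Toy training set standing in for URLs labeled via the category directory.
labeled = [
    ("www.youku.com", "Video"),
    ("v.youku.com/show", "Video"),
    ("video.example.com", "Video"),
    ("www.taobao.com", "Shopping"),
    ("item.taobao.com/item", "Shopping"),
    ("shop.taobao.com", "Shopping"),
]
clf = NGramUrlClassifier().fit(labeled)
print(clf.predict("m.youku.com"), clf.predict("cart.taobao.com"))
```

A classifier of this kind can assign a topic to URLs
that are absent from the directory, which is the gap the
manual labeling leaves open.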
2.3.2  User Behavior Classification 
Information Representation 
After the topic information of the web pages visited by
the user has been obtained, the topic list is
transformed into user behavior classification
information, which serves as input to the user behavior
classification model. The number of occurrences of each
topic in the topic list is counted, giving a pair
$(t_i, c_i)$ composed of topic $t_i$ and its frequency
$c_i$. All the resulting pairs are collected into a list
$\{(t_1, c_1), (t_2, c_2), \ldots, (t_n, c_n)\}$, which
is the user interest information. It serves as input to
the construction of the college student interest set.
When a keyword corresponding to a topic appears in a
URL, the URL is mapped to that topic; part of the URL
keyword statistics is shown in Table 1.

Table 1: URL keywords for web page categories.
Topic            Keywords
Game             gamersky, game, 4399, 7k7k, 17173, ali213,
                 yy, douyu, egame
Social Network   extshort.weixin.qq, weibo, btrace.qq, weibo,
                 tieba.baidu, jiayuan, tianya, zhihu
Contact          music.163, kugou, y.qq, fm.taihe, xiami,
                 kuwo, yinyuetai, changba, music
Video            policy.video.iqiyi, video.ptqy, video,
                 ixigua, v.qq, haokan.baidu, youku, v.baidu,
                 mgtv, acfun
Study            wps, cnki, dict.youdao, wpscdn, flashapp,
                 chinaz, processon, dxzy163, icourse163, mooc
Science          Ludashi, windowsupdate, apple, idianshijia,
                 sandai, duba, ubuntu, zol, ithome
Load             Download, sz-download.weiyun,
                 ardownload.adobe, download.hongbaoshu
Read             xxsy, zongheng, qidian, read, faloo, qidian,
                 novel, jjwxc, lrts.me, zongheng, ximalaya
Search           Baidu.sohu, news.sina, candian, guancha,
                 mil.ifeng, huanqiu, junshi.china, yahoo,
                 sogou
Shopping         taobao, alibaba, alipay, dangdang, suning,
                 mogu, 1688, mi
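The counting step described in this subsection can be
sketched in a few lines of Python. The function name
topic_frequency_pairs and the sample topic list are
hypothetical; sorting the pairs by descending frequency
is an added convenience, not something the text
requires.

```python
from collections import Counter


def topic_frequency_pairs(topics):
    """Turn a topic list into a list of (topic, count) pairs.

    Pairs are ordered by descending count; ties keep first-seen order.
    """
    return Counter(topics).most_common()


topics = ["Video", "Game", "Video", "Study", "Video", "Game"]
print(topic_frequency_pairs(topics))
# [('Video', 3), ('Game', 2), ('Study', 1)]
```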
2.3.3  The Classification and Construction 
Process of College Students' Network 
Behavior 
First, the list of URLs accessed by each user is
obtained, and the topic of each URL is obtained by
extracting URL keywords as described above, which
yields the user behavior information. Then, the
classification model of college students' network
behavior is constructed by the user behavior
classification algorithm. The construction process is
divided into four steps:
Step 1: Extract the original information. Extract the
URLs accessed by users, count the number of visits to
each URL, and generate a pair of URL and visit count,
(URL, counts).
Step 2: Obtain keyword information. Use the
keyword acquisition method based on URL feature
extraction to get the topic information of each URL, that