the least squares (OLS) method, which showed that
those with higher education degrees had higher
returns to education.
From the above studies, it can be tentatively
concluded that higher education and wage returns are
strongly and positively related. Most existing studies
examine the relationship between years of education
and wage returns. Some literature addresses the
relationship between education levels and labour
market wages in China, but most of it compares wage
returns between primary and tertiary education. With
the reform of higher education in China in recent
years, more and more people have gained access to
higher education, which has become universal, and the
wage income of those who have received higher
education is significantly higher than that of those
who have received only primary education. More
attention should therefore be paid to the relationship
between higher education and wage returns. However,
the quality of higher education in China is classified
according to differing standards, and existing studies
do not take China's actual national context into
account. This paper therefore classifies higher
education into four levels that reflect the actual
situation in China: college, undergraduate, master and
doctor. In addition, the traditional 'education-income'
model does not account for the endogeneity of
education, so this paper uses an instrumental
variables approach to correct for it.
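To illustrate why an instrumental variables correction matters, the following is a minimal sketch of two-stage least squares on synthetic data. It is not the estimation carried out in this paper: the instrument, the unobserved "ability" term and every numeric value are hypothetical and chosen only so that education is endogenous by construction.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_stage_least_squares(y, endog, instrument):
    """2SLS with one endogenous regressor, one instrument and a constant."""
    const = np.ones(len(y))
    Z = np.column_stack([const, instrument])
    # Stage 1: project the endogenous regressor on the instrument.
    endog_hat = Z @ ols(Z, endog)
    # Stage 2: regress the outcome on the fitted values from stage 1.
    X_hat = np.column_stack([const, endog_hat])
    return ols(X_hat, y)

# Synthetic data (all values hypothetical).
rng = np.random.default_rng(0)
n = 20_000
ability = rng.normal(size=n)        # unobserved, drives both education and wages
instrument = rng.normal(size=n)     # hypothetical instrument, unrelated to ability
educ = 12 + instrument + ability + rng.normal(size=n)
lnwage = 1.0 + 0.08 * educ + 0.5 * ability + rng.normal(scale=0.3, size=n)

naive_coef = ols(np.column_stack([np.ones(n), educ]), lnwage)
iv_coef = two_stage_least_squares(lnwage, educ, instrument)

print("OLS return to education :", round(float(naive_coef[1]), 3))
print("2SLS return to education:", round(float(iv_coef[1]), 3))
```

In this setup the true return is 0.08. Plain OLS overstates it because the omitted ability term is correlated with education, while the 2SLS estimate should lie close to the true value.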
3 DATA AND METHODOLOGY
3.1 Data
The data used in this paper come from the 2018 wave
of the China Labour Force Dynamics Survey, conducted
by the Social Science Research Centre of Sun Yat-sen
University and referred to as CLDS 2018. Launched by
Sun Yat-sen University in 2012, the CLDS is a biennial
tracking survey of urban and rural residents in China.
It covers individuals, households and communities in
almost all provinces of China (except Taiwan Province
and Tibet) and records the education level, employment
and income of the respondents. The 2018 data used here
are cross-sectional. The CLDS uses a rotating panel
design in which the sample is randomly divided into
four sections, each of which is followed for a total
of six years and then updated. The data structure of
the survey can be roughly divided into six layers:
information about the individual's community,
information about the individual's family, basic
information about the individual and his/her parents,
information about the individual's work, information
about the individual's history and some other
information about the individual. The subject of this
study is the relationship between higher education
qualifications and wage returns, and the survey
records the qualifications of individual respondents,
which meets the needs of the study. CLDS 2018 includes
a total of 16,537 respondents; after excluding
observations with missing values, the sample used in
this paper contains 1,480 observations.
3.2 Methodology
The underlying model used in this paper is the Mincer
income equation model, which can be expressed by
the following equation.
$$\mathit{lnwage} = \alpha + \beta_0 E + \beta_1 S + \beta_2\,\mathit{exp} + \beta_3\,\mathit{exp}^2 + \gamma Z + \varepsilon \qquad (1)$$
The terms in equation (1) are defined as follows.
lnwage is the logarithm of the respondent's wage,
where the wage is the 2017 wage level given in the
database. E represents the respondent's level of
higher education, and $\beta_0$ represents the wage
return to the different higher education
qualifications. S is the respondent's number of years
of education, with coefficient $\beta_1$; the database
does not give the years of education directly, so S is
calculated using equation (3). exp represents the
respondent's work experience, with coefficient
$\beta_2$, and $\mathit{exp}^2$ is its squared term,
with coefficient $\beta_3$. Because work experience
cannot be measured directly, the number of years the
respondent has worked is used as a proxy. Z denotes
other control variables and $\varepsilon$ is the error
term. The years of work are likewise not given
directly in CLDS 2018 and are calculated using
equation (4).
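As an illustration of how equation (1) could be estimated by ordinary least squares, the following is a minimal sketch on synthetic data. It makes several assumptions that do not come from the paper: E is coded as an ordinal variable from 1 (college) to 4 (doctor), the control variables Z are omitted, and the "true" coefficients and all simulated values are hypothetical. It does not use the CLDS 2018 data.

```python
import numpy as np

def fit_mincer(lnwage, E, S, exp):
    """OLS fit of lnwage = a + b0*E + b1*S + b2*exp + b3*exp^2 + e.

    The controls Z from equation (1) are left out of this sketch.
    """
    X = np.column_stack([np.ones(len(S)), E, S, exp, exp ** 2])
    coef, *_ = np.linalg.lstsq(X, lnwage, rcond=None)
    names = ["alpha", "beta0_E", "beta1_S", "beta2_exp", "beta3_exp2"]
    return dict(zip(names, coef))

# Synthetic sample (all values hypothetical).
rng = np.random.default_rng(42)
n = 5_000
E = rng.integers(1, 5, size=n)                 # assumed coding: 1=college ... 4=doctor
extra_years = np.array([3, 4, 7, 10])[E - 1]   # assumed nominal years beyond 12
S = 12 + extra_years + rng.integers(-1, 2, size=n)
exp = rng.integers(0, 31, size=n)              # years worked, used as experience
lnwage = (8.0 + 0.10 * E + 0.05 * S + 0.04 * exp - 0.0008 * exp ** 2
          + rng.normal(scale=0.2, size=n))

est = fit_mincer(lnwage, E, S, exp)
for name, value in est.items():
    print(f"{name:>11s}: {value: .4f}")
```

A negative estimate for the squared experience term reproduces the usual concave experience profile. On real data, E and S are strongly correlated, so the precision of $\beta_0$ and $\beta_1$ depends on how much years of education vary within each qualification level.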
age=2018-birth year (2)
S=Highest degree graduation yea