Bi-Response Semiparametric Regression Model based on Spline
Truncated for Estimating Computer based National Exam in West
Nusa Tenggara
Lilik Hidayati¹, I Nyoman Budiantara² and Nur Chamidah³
¹Doctoral Student Majoring in Mathematics and Natural Sciences, Airlangga University, Surabaya, Indonesia
²Department of Statistics, Sepuluh Nopember Institute of Technology, Surabaya, Indonesia
³Department of Mathematics, Faculty of Sciences and Technology, Airlangga University, Surabaya, Indonesia
Kampus C Universitas Airlangga, Mulyorejo, Surabaya 60115, Indonesia
Keywords: Bi-Response Semiparametric Model, Spline Truncated, Computer-Based National Exam.
Abstract: The bi-response semiparametric regression model is a regression model with two response variables and two
components, one parametric and one nonparametric. The purpose of this research is to estimate the parameters of
the bi-response semiparametric regression model based on the truncated spline estimator by using the weighted
least squares method, and then to apply the model to data. The join points of the truncated pieces, that is, the
points at which the behaviour of the curve changes between intervals, are called knots. The best model is
determined by the optimal knot points, which are selected with the generalized cross validation method. The model
is applied to the computer-based national examination scores of West Nusa Tenggara province. Based on the
estimated model with optimal knots, the coefficient of determination ($R^2$) tends to one (i.e., 90%) and the MSE
tends to zero, so the model satisfies the goodness-of-fit criteria.
1 INTRODUCTION
Regression analysis is used to determine the pattern of the functional relationship between a response variable
(y) and a predictor variable (x). If the regression curve is assumed to follow a certain pattern, the model is
called parametric regression; if the regression curve is not assumed to follow a certain pattern, it is called
nonparametric regression. Semiparametric regression is a combination of parametric regression and nonparametric
regression (Wahba, 1990). Previous research on semiparametric regression includes work using a penalized spline
estimator (Bandyopadhyay and Maity, 2011; Tong et al., 2012; Yang and Yang, 2016), a smoothing spline estimator
(Kim, 2013; Chen and Song, 2016), and a truncated spline estimator (Loklomin et al., 2017; Pratiwi et al., 2017).
Bi-response semiparametric regression models have been studied using a local linear estimator (Chamidah and
Rifada, 2016) and a penalized spline estimator (Chamidah and Eridani, 2015). However, previous research on
semiparametric regression with a truncated spline estimator is limited to the uni-response case. The novelty of
this research is therefore the development of theory for estimating the parameters, and the design of a program
algorithm, for the bi-response semiparametric regression model based on a truncated spline estimator, applied to
the Computer Based National Examination (CBNE) data of Vocational High Schools (VHS) in West Nusa Tenggara
Province.
The development of industrial resources gives more attention to vocational education, namely competency-based
education (PP RI No. 41, 2015). Competency-based education is delivered through Vocational High Schools (VHS), so
VHS can be a solution for students who want to acquire a skill by the time they graduate. VHS graduation
standards are determined by the government in collaboration with the Department of Education and Culture (DEC) in
each region (Permen RI No. 3, 2017). For the CBNE to run properly and smoothly, it must be supported by all
parties: the DEC and schools as policy makers, and teachers and students as implementers. According to Mahmud
(1989), the variables that affect student achievement, here the CBNE score, comprise internal and external
factors, among them gender, accreditation value, distance travelled, parental education, report card grades, and
school examination scores. These internal and external factors are the predictor variables (x), while the
response variables (y) are the CBNE scores in Mathematics and in competency skill. Scatter plots of these
variables show that some predictors follow a clear pattern and are treated as the parametric component, while
others follow no particular pattern and are treated as the nonparametric component. A truncated spline estimator
is therefore suitable, because it can handle data whose behaviour changes across sub-intervals.

Hidayati, L., Budiantara, I. and Chamidah, N.
Bi-Response Semiparametric Regression Model based on Spline Truncated for Estimating Computer based National Exam in West Nusa Tenggara.
DOI: 10.5220/0008521903570361
In Proceedings of the International Conference on Mathematics and Islam (ICMIs 2018), pages 357-361
ISBN: 978-989-758-407-7
Copyright © 2020 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
The national average CBNE score of VHS in West Nusa Tenggara Province has, from year to year, been the lowest
among the provinces of Indonesia (Puspendik Balitbang Kemendikbud, 2017). Based on these facts, the 2017 CBNE
scores are suitable for study using the bi-response semiparametric regression model based on the truncated spline
estimator.
2 METHODS
The data used in this study are the 2017 CBNE scores of VHS students in the Department of Computer Network
Engineering (CNE) in West Nusa Tenggara Province. The response variables are the CBNE scores in competency skill
($y_1$) and in Mathematics ($y_2$). The predictor variables are the report card grades in Mathematics ($x_1$) and
competency skill, gender ($x_2$), the accreditation status of the department ($x_3$), the distance from the
student's home to school ($t_1$), the duration of parental education ($t_2$), and the joint school examination
scores in competency skill ($t_3$) and Mathematics.
2.1 Estimating the Parameters of the Semiparametric Bi-Response Regression Model based on Spline Truncated Estimator
1. Assume paired data $(y^{(r)}, x_p, t_m)$ that follow the bi-response semiparametric regression model based on
   the truncated spline estimator:
\[
y_i^{(r)} = \beta_0^{(r)} + \beta_1^{(r)} x_{1i}^{(r)} + \cdots + \beta_p^{(r)} x_{pi}^{(r)} + \sum_{h=1}^{m} f_h^{(r)}\bigl(t_{hi}^{(r)}\bigr) + \varepsilon_i^{(r)} \tag{1}
\]
   with the assumption $\varepsilon_i^{(r)} \sim \mathrm{IIDN}(0, \sigma^2)$, where $i = 1, 2, \ldots, n$;
   $r = 1, 2$; $j = 1, 2, \ldots, p$; and $h = 1, 2, \ldots, m$.
2. Approximate the regression curve of $y_i^{(r)}$ by the bi-response semiparametric regression model based on
   the truncated spline estimator of degree $d$ with $K$ knots (the linear case is $d = 1$):
\[
y_i^{(r)} = \beta_0^{(r)} + \sum_{j=1}^{p} \beta_j^{(r)} x_{ji}^{(r)} + \sum_{h=1}^{m} \left[ \sum_{c=0}^{d} \alpha_{ch}^{(r)} \bigl(t_{hi}^{(r)}\bigr)^{c} + \sum_{k=1}^{K} \alpha_{(d+k)h}^{(r)} \bigl(t_{hi}^{(r)} - K_k^{(r)}\bigr)_+^{d} \right] + \varepsilon_i^{(r)} \tag{2}
\]
   where
\[
(t_i - K_k)_+^{d} = \begin{cases} (t_i - K_k)^{d}, & t_i \ge K_k \\ 0, & t_i < K_k \end{cases}
\]
   The knots $(K_1, K_2, \ldots, K_K)$ are the points at which the behaviour of the function changes between
   sub-intervals.
3. Write equation (2) in matrix form:
\[
y = X\beta + T\alpha + \varepsilon \tag{3}
\]
   where $X$ is the design matrix of the parametric component with parameter vector $\beta$, and $T$ is the
   matrix of the nonparametric component, whose columns hold the polynomial and truncated terms of equation (2),
   with parameter vector $\alpha$. For the two stacked responses, $y$ and $\varepsilon$ are $2n \times 1$ vectors
   and $X$ is a $2n \times 2(p+1)$ matrix.
4. Form a new matrix notation of the bi-response semiparametric regression model for statistical inference
   purposes. Let $C = [X \;\; T]$ and $\theta = (\beta^T, \alpha^T)^T$, so that the regression model can be
   written as
\[
y = C\theta + \varepsilon \tag{4}
\]
5. Obtain the estimate of the parameter $\theta$ using the Weighted Least Squares (WLS) method with the
   following steps:
   a) Form the function $Q(\theta) = (y - C\theta)^T W^{-1} (y - C\theta)$.
   b) Minimize $Q(\theta)$ by solving $\partial Q(\theta) / \partial \theta = 0$.
   c) Obtain the estimate $\hat{\theta}$, i.e. $\hat{\theta} = (C^T W^{-1} C)^{-1} C^T W^{-1} y$.
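As a minimal sketch of steps 2 to 4, the truncated term $(t - K_k)_+^d$ and the combined matrix $C = [X \;\; T]$ can be built as follows for a single response. The function names, the toy data, and the choice to keep only one intercept column (in $X$) are assumptions made for illustration, not the authors' program.

```python
import numpy as np

def truncated_power(t, knot, degree=1):
    """Truncated power function: (t - knot)^degree where t >= knot, else 0."""
    t = np.asarray(t, dtype=float)
    return np.where(t >= knot, (t - knot) ** degree, 0.0)

def spline_basis(t, knots, degree=1):
    """Columns t, ..., t^degree, then one truncated term per knot.

    The constant column is left to the parametric design matrix (assumption).
    """
    t = np.asarray(t, dtype=float)
    poly = [t ** c for c in range(1, degree + 1)]
    trunc = [truncated_power(t, k, degree) for k in knots]
    return np.column_stack(poly + trunc)

# Hypothetical toy data: one dummy parametric predictor, one nonparametric predictor.
x = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

X = np.column_stack([np.ones_like(x), x])   # parametric part with intercept
T = spline_basis(t, knots=[2.0, 4.0])       # linear truncated spline, two knots
C = np.hstack([X, T])                       # C = [X T] as in step 4

print(C.shape)  # (6, 5)
```

Each knot adds one column to $T$, so the number of parameters grows with the number of knots; this is why the knot positions have to be chosen by a criterion rather than added freely.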
2.2 Algorithm and Program for Estimating the Parameters of the Semiparametric Bi-Response Regression Model based on Spline Truncated Estimators
1. Test the correlation between response variables
2. Determine the optimal knot order and knot points based on the minimum Generalized Cross Validation (GCV)
   criterion:
\[
GCV(K_1, \ldots, K_K) = \frac{MSE(K_1, \ldots, K_K)}{\left( n^{-1}\, \mathrm{tr}\left[ I - A(K_1, \ldots, K_K) \right] \right)^{2}} \tag{5}
\]
   where
\[
MSE(K_1, \ldots, K_K) = n^{-1} \sum_{i=1}^{n} \bigl( y_i - \hat{f}(t_i) \bigr)^{2}
\]
   and the matrix $A(K_1, \ldots, K_K)$ is obtained from the equation
\[
\hat{f} = A(K_1, \ldots, K_K)\, y \tag{6}
\]
3. Determine the weight matrix $W$.
4. Estimate the function in equation (1).
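Step 2 can be sketched as a small grid search over candidate knots. The sketch below is an illustration under simplifying assumptions that are not taken from the paper: a single response, one nonparametric predictor, a linear truncated spline, unit weights (so the matrix $A$ in equation (6) is the ordinary least-squares hat matrix), and hypothetical function names and toy data.

```python
import numpy as np
from itertools import combinations

def design(t, knots):
    """Intercept, linear term, and one truncated linear term per knot."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t] + [np.where(t >= k, t - k, 0.0) for k in knots]
    return np.column_stack(cols)

def gcv(y, C):
    """GCV = MSE / (tr(I - A) / n)^2 with f_hat = A y, as in equations (5) and (6)."""
    n = len(y)
    A = C @ np.linalg.pinv(C)              # hat matrix for unit weights (assumption)
    mse = np.mean((y - A @ y) ** 2)
    return mse / (np.trace(np.eye(n) - A) / n) ** 2

def select_knots(t, y, candidates, n_knots):
    """Return the knot combination with the smallest GCV, plus all scores."""
    scores = {ks: gcv(y, design(t, ks)) for ks in combinations(candidates, n_knots)}
    return min(scores, key=scores.get), scores

# Noise-free toy curve whose slope changes at t = 4 (hypothetical data).
t = np.linspace(0.0, 10.0, 41)
y = 1.0 + 0.5 * t + 2.0 * np.where(t >= 4.0, t - 4.0, 0.0)

best, scores = select_knots(t, y, candidates=[2.0, 4.0, 6.0, 8.0], n_knots=1)
print(best)
```

Because the toy curve really does change slope at 4, that candidate reproduces the data almost exactly and receives the smallest GCV. With several knots per predictor, the number of combinations grows quickly, so the candidate grid is usually kept coarse.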
3 RESULT AND DISCUSSION
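Given paired data with the two response variables $y^{(1)}, y^{(2)}$, the data are called bi-response. The
predictor variables of the parametric component are $x_1, x_2, \ldots, x_p$, while the predictor variables of the
nonparametric component are $t_1, t_2, \ldots, t_m$. The bi-response semiparametric truncated spline regression
model, which contains both components as stated in equation (1), can be written in matrix notation as follows:
\[
\begin{pmatrix} y^{(1)} \\ y^{(2)} \end{pmatrix} =
\begin{pmatrix} x^{(1)} & 0 \\ 0 & x^{(2)} \end{pmatrix}
\begin{pmatrix} \beta^{(1)} \\ \beta^{(2)} \end{pmatrix} +
\begin{pmatrix} f^{(1)}(t^{(1)}) \\ f^{(2)}(t^{(2)}) \end{pmatrix} +
\begin{pmatrix} \varepsilon^{(1)} \\ \varepsilon^{(2)} \end{pmatrix}
\]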
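where
\[
y^{(1)} = \bigl( y_1^{(1)}, y_2^{(1)}, \ldots, y_n^{(1)} \bigr)^T, \qquad
y^{(2)} = \bigl( y_1^{(2)}, y_2^{(2)}, \ldots, y_n^{(2)} \bigr)^T
\]
\[
x^{(1)} = \begin{pmatrix}
1 & x_{11}^{(1)} & x_{21}^{(1)} & \cdots & x_{p1}^{(1)} \\
1 & x_{12}^{(1)} & x_{22}^{(1)} & \cdots & x_{p2}^{(1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1n}^{(1)} & x_{2n}^{(1)} & \cdots & x_{pn}^{(1)}
\end{pmatrix}, \qquad
x^{(2)} = \begin{pmatrix}
1 & x_{11}^{(2)} & x_{21}^{(2)} & \cdots & x_{p1}^{(2)} \\
1 & x_{12}^{(2)} & x_{22}^{(2)} & \cdots & x_{p2}^{(2)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1n}^{(2)} & x_{2n}^{(2)} & \cdots & x_{pn}^{(2)}
\end{pmatrix}
\]
\[
\beta^{(1)} = \bigl( \beta_0^{(1)}, \beta_1^{(1)}, \beta_2^{(1)}, \ldots, \beta_p^{(1)} \bigr)^T, \qquad
\beta^{(2)} = \bigl( \beta_0^{(2)}, \beta_1^{(2)}, \beta_2^{(2)}, \ldots, \beta_p^{(2)} \bigr)^T
\]
and, for $r = 1, 2$,
\[
\begin{aligned}
f^{(r)}\bigl(t_{11}^{(r)}\bigr) ={}& \alpha_{01}^{(r)} + \alpha_{11}^{(r)} t_{11}^{(r)} + \alpha_{21}^{(r)} \bigl(t_{11}^{(r)}\bigr)^{2} + \cdots + \alpha_{d1}^{(r)} \bigl(t_{11}^{(r)}\bigr)^{d} \\
& + \alpha_{(d+1)1}^{(r)} \bigl(t_{11}^{(r)} - K_1^{(r)}\bigr)_+^{d} + \alpha_{(d+2)1}^{(r)} \bigl(t_{11}^{(r)} - K_2^{(r)}\bigr)_+^{d} + \cdots + \alpha_{(d+K)1}^{(r)} \bigl(t_{11}^{(r)} - K_K^{(r)}\bigr)_+^{d}
\end{aligned}
\]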
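The random error vector of each equation is:
\[
\varepsilon^{(1)} = \bigl( \varepsilon_1^{(1)}, \varepsilon_2^{(1)}, \ldots, \varepsilon_n^{(1)} \bigr)^T, \qquad
\varepsilon^{(2)} = \bigl( \varepsilon_1^{(2)}, \varepsilon_2^{(2)}, \ldots, \varepsilon_n^{(2)} \bigr)^T
\]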
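The bi-response semiparametric truncated spline regression model can be written in the form of equation (4). The
parameters are estimated with the Weighted Least Squares (WLS) method, that is, by minimizing the goodness-of-fit
criterion of the model in equation (4):
\[
\min_{\theta} Q(\theta) = \min_{\theta} \, (y - C\theta)^T W^{-1} (y - C\theta), \qquad
Q(\theta) = (y - C\theta)^T W^{-1} (y - C\theta)
\]
The estimator of the parameter $\theta$ is obtained by differentiating $Q(\theta)$ with respect to $\theta$ and
setting the derivative to zero. For symmetric $W$,
$\partial Q(\theta) / \partial \theta = -2 C^T W^{-1} y + 2 C^T W^{-1} C \theta$, which gives:
\[
\frac{\partial Q(\theta)}{\partial \theta} = 0 \;\Longrightarrow\; \hat{\theta} = (C^T W^{-1} C)^{-1} C^T W^{-1} y \tag{7}
\]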
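Equation (7) can be illustrated with a short numerical sketch for two stacked responses. The names below, the toy design matrices, and in particular the weight matrix $W = \Sigma \otimes I_n$ (a common choice when the two responses are correlated within each observation) are assumptions for illustration; the paper does not state how $W$ was built.

```python
import numpy as np

def stack_bi_response(y1, C1, y2, C2):
    """Stack two responses: y = (y1', y2')' and C = diag(C1, C2)."""
    y = np.concatenate([y1, y2])
    C = np.block([
        [C1, np.zeros((C1.shape[0], C2.shape[1]))],
        [np.zeros((C2.shape[0], C1.shape[1])), C2],
    ])
    return y, C

def wls_estimate(C, y, W):
    """theta_hat = (C' W^{-1} C)^{-1} C' W^{-1} y, computed with linear solves."""
    W_inv_C = np.linalg.solve(W, C)
    W_inv_y = np.linalg.solve(W, y)
    return np.linalg.solve(C.T @ W_inv_C, C.T @ W_inv_y)

rng = np.random.default_rng(0)
n = 40
C1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # hypothetical design, response 1
C2 = np.column_stack([np.ones(n), rng.normal(size=n)])   # hypothetical design, response 2
theta1 = np.array([1.0, 2.0])
theta2 = np.array([-0.5, 3.0])

# Noise-free responses so the estimate can be checked against the known values.
y, C = stack_bi_response(C1 @ theta1, C1, C2 @ theta2, C2)

Sigma = np.array([[1.0, 0.6], [0.6, 2.0]])   # assumed covariance between the responses
W = np.kron(Sigma, np.eye(n))                # W = Sigma (x) I_n for the stacked errors

theta_hat = wls_estimate(C, y, W)
print(np.round(theta_hat, 6))
```

Using linear solves instead of explicit matrix inverses is a numerical design choice; it computes the same estimator as equation (7).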
Equation (4) of the bi-response semiparametric regression model is then used for statistical inference based on
the truncated spline estimator. The data used to implement the estimation of the bi-response semiparametric
truncated spline regression model are the 2017 CBNE scores of VHS students in the Department of Computer Network
Engineering in West Nusa Tenggara Province. A correlation test was carried out between the two response
variables, namely the Mathematics CBNE score and the competency skill CBNE score, with the following hypotheses:
the null hypothesis ($H_0$) is that the two variables have no linear relationship ($\rho = 0$); the alternative
hypothesis ($H_1$) is that the two variables have a linear relationship ($\rho \neq 0$). The correlation test
gives a p-value < 0.05, so $H_0$ is rejected and it can be concluded that there is a correlation
between responses.
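The correlation test between the two responses can be sketched as follows. The paper does not say which test statistic or software was used, so this sketch swaps in a permutation test of the Pearson correlation; the function name, the number of permutations, and the simulated scores are assumptions.

```python
import numpy as np

def pearson_permutation_test(a, b, n_perm=2000, seed=0):
    """Pearson r and a two-sided permutation p-value for H0: rho = 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)
    r_obs = np.corrcoef(a, b)[0, 1]
    # Shuffling b breaks any pairing with a, which mimics H0.
    r_perm = np.array(
        [np.corrcoef(a, rng.permutation(b))[0, 1] for _ in range(n_perm)]
    )
    p_value = (np.sum(np.abs(r_perm) >= abs(r_obs)) + 1) / (n_perm + 1)
    return r_obs, p_value

# Simulated scores standing in for the two CBNE responses (hypothetical data).
rng = np.random.default_rng(1)
y1 = rng.normal(60.0, 10.0, size=30)
y2 = 0.8 * y1 + rng.normal(0.0, 5.0, size=30)

r, p = pearson_permutation_test(y1, y2)
reject_h0 = p < 0.05
print(f"r = {r:.3f}, p = {p:.4f}, reject H0: {reject_h0}")
```

A significant correlation between the responses is what motivates modelling them jointly rather than fitting two separate uni-response models.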
The next step is to model the 2017 CBNE scores of VHS in West Nusa Tenggara Province using the bi-response
semiparametric truncated spline regression model at each candidate knot point. A knot is a join point at which
the behaviour of the data changes. The optimal knot points are those that give the minimum GCV value. Based on
the analysis, the best model is the bi-response semiparametric regression model based on the truncated spline
estimator with three knots, which gives the minimum GCV of 0.00000691. With the minimum GCV obtained, the next
step is to compute the estimate of the bi-response semiparametric truncated spline regression model with three
knot points, as follows:
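\[
\begin{aligned}
y_1 ={}& 0.24 \pm 3.66 D_1 \pm 5.22 D_2 \pm 1.74 D_3 \pm 0.47 x_1 \\
& + 26.5\, t_1 \pm 7.89\, t_1^{2} \pm 11.19 (t_1 - 2)_+^{1} \pm 3.88 (t_1 - 3)_+^{1} \\
& + 0.64 (t_1 - 5)_+^{1} \pm 53.01\, t_2 \pm 6.26\, t_2^{2} \pm 16.04 (t_2 - 5)_+^{1} \\
& + 15.19 (t_2 - 6)_+^{1} \pm 5.43 (t_2 - 7)_+^{1} \pm 6.01\, t_3 \pm 0.11\, t_3^{2} \\
& \pm 15.57 (t_3 - 66)_+^{1} \pm 27.37 (t_3 - 67)_+^{1} \pm 12.07 (t_3 - 68)_+^{1}
\end{aligned}
\]
\[
\begin{aligned}
y_2 ={}& 136.37 \pm 2.14 D_1 \pm 3.87 D_2 \pm 3.08 D_3 \pm 0.26 x_1 \\
& - 11.09\, t_1 \pm 3.51\, t_1^{2} \pm 6.10 (t_1 - 2)_+^{1} \pm 3.5 (t_1 - 3)_+^{1} \\
& - 1.42 (t_1 - 5)_+^{1} \pm 36.55\, t_2 \pm 4.49\, t_2^{2} \pm 13.31 (t_2 - 5)_+^{1} \\
& + 14.28 (t_2 - 6)_+^{1} \pm 5.59 (t_2 - 7)_+^{1} \pm 0.14\, t_3 \pm 1.08 (t_3 - 83)_+^{1} \\
& \pm 0.84 (t_3 - 84)_+^{1} \pm 0.46 (t_3 - 87)_+^{1}
\end{aligned}
\]
Here $\pm$ marks an operator between terms whose sign (+ or −) is not legible in the source text; the signs shown
as + or − are as printed.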
For the goodness-of-fit criteria, the bi-response semiparametric truncated spline regression model gives an MSE
of 48.92 and an $R^2$ of 0.90. The two best models can be interpreted as follows:
1) Every one-unit increase in the Mathematics and competency skill report card grades results in an increase in
the CBNE score of the corresponding subject.
2) The CBNE score in each subject differs by gender: male students score higher than female students.
3) The school examination scores in the competency skill subject increase with the school accreditation level:
students of VHS with A accreditation score higher than those with B accreditation, who in turn score higher than
those with C accreditation.
4) The school distance variable fluctuates at knots 66, 67, and 68 for both CBNE scores, namely Mathematics CBNE
and competency skill CBNE.
5) Likewise, the parental education variable fluctuates at knots 5, 6, and 7 for both CBNE scores.
6) The Mathematics school examination variable fluctuates at knots 66, 67, and 68 for both CBNE scores.
7) Likewise, the competency skill school examination variable fluctuates at knots 83, 84, and 87 for both CBNE
scores, namely Mathematics and competency skill.
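The two goodness-of-fit criteria reported in this section can be computed as in the sketch below. The function names and the four toy scores are assumptions for illustration. MSE is expressed in squared units of the response, so its size depends on the scale of the scores, whereas $R^2$ is unit-free.

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error between observed and fitted values."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical observed and fitted scores.
y_obs = [50.0, 60.0, 70.0, 80.0]
y_fit = [52.0, 58.0, 71.0, 79.0]

print(mse(y_obs, y_fit))        # 2.5
print(r_squared(y_obs, y_fit))  # approximately 0.98
```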
4 CONCLUSIONS
Based on the two best models, each one-unit increase in the Mathematics and competency skill report card grades
results in an increase in the corresponding CBNE score. By gender, male students score higher than female
students. By school accreditation, schools with A accreditation score higher than those with B and C. The school
distance, parental education, and school examination scores in Mathematics and competency skill fluctuate at
certain knots for each CBNE score. For the estimated bi-response semiparametric regression model, the $R^2$ tends
to one (i.e., 90%) and the MSE tends to zero, so the model satisfies the goodness-of-fit criteria.
ACKNOWLEDGEMENTS
The authors thank the Excellence Scholarship of the Bureau of Planning and Foreign Cooperation, Secretariat
General of the Ministry of Education and Culture, which funded the tuition fees for the doctoral study.
REFERENCES
Bandyopadhyay, S. and Maity, A., 2011. Analysis of Sabine river flow data using semiparametric spline modeling.
Journal of Hydrology, vol. 399, pp. 274-280.
Chamidah, N. and Rifada, M., 2016. Local Linear Estimator in Bi-Response Semiparametric Regression Model for
Estimating Median Growth Charts of Children. Far East Journal of Mathematical Sciences (FJMS), vol. 99, no. 8,
pp. 1233-1244.
Chamidah, N. and Eridani, 2015. Designing of Growth Reference Chart by Using Birespon Semiparametric Regression
Approach Based on P-Spline Estimator. International Journal of Applied Mathematics and Statistics, vol. 53,
no. 3.
Chen, M. and Song, Q., 2016. Semiparametric estimation and forecasting for exogenous log-GARCH models. Journal of
TEST, vol. 25, pp. 93-112.
Hidayati, L. and Budiantara, I. N., 2012. Multivariable Cubic Spline Regression in Score Modeling National Exam.
Proceeding 2nd Basic Science International Conference, s27-s30.
Kim, Y. J., 2013. A partial spline approach for semiparametric estimation of varying-coefficient partially linear
models. Journal of Computational Statistics and Data Analysis, vol. 62, pp. 181-187.
Loklomin, S. B., Budiantara, I. N. and Zain, I., 2017. Factor that influence the Human Development Index in
Moluccas island using Interval Confidence approach for Parameters of Spline Truncated Semiparametric Regression
Model. Proceeding 3rd International Seminar on Science and Technology (ISST).
Mahmud, D., 1989. Psikologi Pendidikan. Jakarta. Depdikbud Dirjen.
Peraturan Pemerintah RI No. 41, 2015. Pembangunan Sumber Daya Industri.
Permen RI No. 3, 2017. Penilaian Hasil Belajar oleh Pemerintah dan Penilaian Hasil Belajar oleh Satuan
Pendidikan.
Pratiwi, D. A., Budiantara, I. N. and Wibowo, W., 2017. Pendekatan Regresi Semiparametrik Spline untuk
Memodelkan Rata-rata Umur Kawin Pertama (UKP) di Provinsi Jawa Timur. Jurnal Sains dan Seni ITS, vol. 6, no. 1,
pp. 129-136.
Puspendik Balitbang Kemendikbud, 2017. Panduan Pemanfaatan Hasil Ujian Nasional Tahun Pelajaran 2016/2017. BSNP
Jakarta.
Tong, T., Wu, and He, X., 2012. Coordinate ascent for penalized semiparametric regression on high-dimensional
panel count data. Journal of Computational Statistics and Data Analysis, vol. 56, pp. 23-33.
Wahba, G., 1990. Spline Models for Observational Data. Society for Industrial and Applied Mathematics,
Philadelphia.
Yang, J. and Yang, H., 2016. A robust penalized estimation for identification in semiparametric additive models.
Statistics and Probability Letters, vol. 110, pp. 268-277.