level, and discriminating power are used as
references in determining the feasibility of the
test items. Test items are accepted if they are
valid, their reliability is at least medium, their
difficulty is medium or difficult, and their
discriminating power is good enough, good, or
extremely good. Test items have to be revised if
they are valid, their reliability is minimal, their
difficulty is easy, and their discriminating power is
sufficient. Test items are rejected if they are not
valid, their reliability is low, their difficulty is easy
or difficult, and their discriminating power is
extremely poor.
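The three rules above can be sketched as a small decision
function. This is only an illustration, not the procedure used
in the study: the function name, the label strings, the "high"
reliability level, and the "unclassified" fallback for
combinations the rules do not mention are all assumptions.

```python
# Sketch of the feasibility rules stated above. The label strings and the
# "unclassified" fallback are assumptions made for illustration only.

ACCEPT_RELIABILITY = {"medium", "high"}  # "at least medium"; "high" is assumed
ACCEPT_DIFFICULTY = {"medium", "difficult"}
ACCEPT_DISCRIMINATION = {"good enough", "good", "extremely good"}


def classify_item(valid, reliability, difficulty, discrimination):
    """Return 'accepted', 'revise', 'rejected', or 'unclassified'."""
    if (valid and reliability in ACCEPT_RELIABILITY
            and difficulty in ACCEPT_DIFFICULTY
            and discrimination in ACCEPT_DISCRIMINATION):
        return "accepted"
    if (valid and reliability == "minimal" and difficulty == "easy"
            and discrimination == "sufficient"):
        return "revise"
    if (not valid and reliability == "low"
            and difficulty in {"easy", "difficult"}
            and discrimination == "extremely poor"):
        return "rejected"
    # The text does not say how other combinations are handled.
    return "unclassified"


if __name__ == "__main__":
    print(classify_item(True, "medium", "medium", "good"))
    print(classify_item(True, "minimal", "easy", "sufficient"))
    print(classify_item(False, "low", "easy", "extremely poor"))
```

Read literally, each rule is a conjunction of four conditions,
so some combinations (for example a valid item of medium
reliability that is easy) match none of them. The sketch
returns "unclassified" for these rather than guessing.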
Based on the feasibility analysis of the test items
of DPBD FPBS UPI in the even semester of
2016/2017, 67% of the test items are feasible,
12% must be revised, and 21% must be
replaced. By course subject group, the linguistics
group has the largest share of questions
categorized as worthy of use, followed by culture,
literature, and learning pedagogy. The analysis of
the language learning test items has provided
a systematic procedure that offers very specific
information about the test items prepared. This
analysis is one of the activities needed to improve
the quality of the test instruments, both the quality
of the overall test and the quality of each test item
as part of the test. The test as an evaluation
instrument is expected to produce objective and
accurate scores. Therefore, it is necessary to make
sure that the tests given to the students are of the
best possible quality.
A good test can be used over and over with a few
changes. If a test has poor quality, it is better to
discard it than to use it to test the students. A test
can be classified as a feasible measurement
instrument if it fulfils the test requirements, which
include validity, reliability, discriminating power,
and a good difficulty level.
4 CONCLUSIONS
Based on the results and discussion above, it can be
concluded that 59% of the test items analysed are
valid, while the remaining 41% are invalid. The
number of valid test items is greater than the
number of invalid ones. This indicates that most of
the test items are capable of measuring student
competence in line with the course subject content,
but the rest are not. The reliability level of the test items
tested is in the medium category. This shows that the
test items have acceptable reliability. Reliability
covers both the accuracy and the stability of the
measurement results, so that if the test items are
administered several times, they will give
consistent results. Based on the level of difficulty,
about 58% of the test items are in the medium
category, 20% are easy, 14% are difficult, 7% are
very easy, and 2% are very difficult. This result
shows that the test items have a good proportion of
difficulty levels. Based on the discriminating power, there
are five categories: 43% of the test items are
considered good enough, 39% are bad, 10% are
good, 6% are extremely bad, and 2% are extremely
good. On average, the discriminating power of the
test items is good enough to discriminate between
the answers of students with high ability and those
of students with low ability. As for feasibility, 67%
of the test items analysed are feasible, 12% must be
revised, and 21% must be replaced. This analysis
finds that the majority of test items meet the
requirements of good-quality test items in terms of
validity, reliability, difficulty, and discriminating
power. Based on the
above conclusions, the test item analysis at DPBD
FPBS UPI shows a reasonably good result. In spite
of this, there are also test items that are invalid,
have no discriminating power, have a
disproportionate difficulty level, or have
unacceptable feasibility. Therefore, the test items of
the learning pedagogy, linguistics, literature, and
culture groups need to be revised or improved in
further tests. The improved test items should be
drawn from course material that the students have
learned and discussed, and should be presented as
high-quality test items. To achieve this, the ability,
carefulness, and experience of the lecturers are
required to improve the feasibility of the test items,
so that the results will be more accurate in
measuring the learning competencies achieved by
the students.
CONAPLIN and ICOLLITE 2017 - Tenth Conference on Applied Linguistics and the Second English Language Teaching and Technology
Conference in collaboration with the First International Conference on Language, Literature, Culture, and Education