Language Testing Assignment
VALIDITY
Compiled by:
1. ELISA DEVI SUSANTI (10.1.01.08.0083)
2. MEDIA ROSALINA P. (10.1.01.08.0158)
3. NIMAS RATRI K. (10.1.01.08.0183)
4. NURUL FADILAH (10.1.01.08.0195)
5. PUTRA ARDANA (10.1.01.08.0212)
6. RIA HADI (10.1.01.08.0220)
7. SITI ROMELAH (10.1.01.08.0257)
8. VENESIA SEPTININDAR (10.1.01.08.0281)
ENGLISH DEPARTMENT
FACULTY OF TEACHER TRAINING AND EDUCATION
UNIVERSITY OF NUSANTARA PGRI KEDIRI
2012
PREFACE
First of all, the writers would like to express their gratitude to God, Allah SWT, who has given His blessings and mercies, so that the writers could finish this paper on “VALIDITY” in language testing.
This paper was written to complete the final assignment of the Language Testing subject. It contains an explanation of validity, its types, and examples of each.
The writers are aware that this paper is far from perfect. We welcome any constructive criticism and suggestions for a better compilation of this paper. Finally, we hope that this paper is useful for its readers.
Kediri, July 19th, 2012
The Writers
VALIDITY
A. WHAT IS VALIDITY?
Validity is arguably the most important criterion for the quality of a test. The term validity refers to whether or not a test measures what it claims to measure. On a test with high validity, the items will be closely linked to the test's intended focus. For many certification and licensure tests, this means that the items will be highly related to a specific job or occupation. If a test has poor validity, then it does not measure the job-related content and competencies it ought to, and there is no justification for using the test results for their intended purpose. There are several ways to estimate the validity of a test, including content validity, construct validity, criterion validity, consequential validity, and face validity. It is vital for a test to be valid in order for its results to be accurately applied and interpreted.
As an example, many recreational activities of high school students involve driving cars. A researcher wanting to measure whether recreational activities have a negative effect on grade point average in high school students might conduct a survey asking how many students drive to school and then attempt to find a correlation between these two factors. Because many students might use their cars for purposes other than or in addition to recreation (e.g., driving to work after school, driving to school rather than walking or taking a bus), this research study might prove invalid. Even if a strong correlation were found between driving and grade point average, driving to school in and of itself would seem to be an invalid measure of recreational activity.
B. TYPES OF VALIDITY
1. Content Validity
While there are several types of validity, the most important type
for most certification and licensure programs is probably that of content
validity. Content validity is a logical process where connections between the
test items and the job-related tasks are established. If a thorough test
development process was followed, a job analysis was properly conducted, an
appropriate set of test specifications were developed, and item writing
guidelines were carefully followed, then the content validity of the test is
likely to be very high. Content validity is typically estimated by gathering a
group of subject matter experts (SMEs) together to review the test items.
Specifically, these SMEs are given the list of content areas specified in the
test blueprint, along with the test items intended to be based on each content
area. The SMEs are then asked to indicate whether or not they agree that each
item is appropriately matched to the content area indicated. Any items that the
SMEs identify as being inadequately matched to the test blueprint, or flawed in
any other way, are either revised or dropped from the test.
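The SME review described above can be tallied quite simply: for each item, compute the proportion of SMEs who agree that it matches its intended content area, and flag low-agreement items for revision. The following is an illustrative sketch only; the item names, ratings, and the 0.8 agreement threshold are hypothetical assumptions, not part of any standard procedure.

```python
# Illustrative sketch: tallying SME judgements of item-to-blueprint match.
# Item names, votes, and the 0.8 threshold are hypothetical examples.

def content_review(ratings, threshold=0.8):
    """For each item, compute the proportion of SMEs who agree that the
    item matches its intended content area; flag items below threshold."""
    flagged = []
    for item, votes in ratings.items():
        agreement = sum(votes) / len(votes)  # votes: 1 = agree, 0 = disagree
        if agreement < threshold:
            flagged.append((item, agreement))
    return flagged

# Five SMEs review three items against the test blueprint.
sme_ratings = {
    "item_1": [1, 1, 1, 1, 1],   # all five SMEs agree -> keep
    "item_2": [1, 1, 1, 1, 0],   # 0.8 agreement -> keep
    "item_3": [1, 0, 0, 1, 0],   # 0.4 agreement -> revise or drop
}

print(content_review(sme_ratings))  # -> [('item_3', 0.4)]
```

Items the SMEs identify as inadequately matched (here, the hypothetical item_3) would then be revised or dropped, as the passage above describes.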
In brief, a test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc., with which it is meant to be concerned. Content validity is important for two reasons:
1. The greater a test's content validity, the more likely it is to be an accurate measure of what it is supposed to measure.
2. A test that lacks content validity is likely to have a harmful backwash effect, since areas which are not tested are likely to become areas ignored in teaching and learning.
2. Construct Validity
A test has construct validity if it demonstrates an association between the test scores and the prediction of a theoretical trait. The word construct refers to any underlying ability which is hypothesized in a theory of language ability. Take a reading test as an example. The ability to read involves a number of sub-abilities, such as the ability to guess the meaning of words from the context in which they are met, to find the main idea of the text, to find implicit information in the text, etc.
Another example is a writing test: a number of sub-abilities are engaged in writing ability, such as organizing a paragraph, punctuation, capitalization, structure, diction, etc. In a speaking test, the sub-abilities concern the performance of the speaker: fluency, accuracy, intonation, expression, etc. In listening, the sub-abilities include guessing the meaning of the text, finding implicit information in the text, etc. If we attempted to measure these abilities in a particular test, then that part of the test would have construct validity only if we were able to demonstrate that we were indeed measuring just those abilities.
3. Criterion Validity
A test is said to have criterion-related validity when the test has demonstrated its effectiveness in predicting criteria, or indicators, of a construct. Criterion validity assesses whether a test reflects a certain set of abilities. For instance, teachers may want to measure students' proficiency in English based on the standard competences and basic competences determined in the syllabus, for example the syllabus for third-grade senior high school students in the second semester. Based on that syllabus, the standard and basic competences for monologue texts cover narrative, explanation, and discussion texts. For the test to have criterion validity, its items must cover all of the abilities in that syllabus.
Another example is when an employer hires new employees based on normal hiring procedures like interviews, education, and experience. This method demonstrates that people who do well on a test will do well on the job, and people with low scores on the test will do poorly on the job. There are two different types of criterion validity:
a) Concurrent Validity occurs when the criterion measures are obtained at the same time as the test scores. This indicates the extent to which the test scores accurately estimate an individual's current state with regard to the criterion. For example, a test that measures levels of depression would be said to have concurrent validity if it measured the levels of depression currently experienced by the test taker.
b) Predictive Validity occurs when the criterion measures are obtained at a time after the test. Examples of tests with predictive validity are career or aptitude tests, which are helpful in determining who is likely to succeed or fail in certain subjects or occupations.
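Criterion-related validity of either type is commonly estimated as a correlation (a "validity coefficient") between test scores and criterion scores. A minimal sketch, using made-up aptitude-test scores and the later course grades the test is assumed to predict:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between test scores and criterion scores.
    A high positive value supports criterion-related validity."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: aptitude-test scores and the course grades the test
# is meant to predict (predictive validity: criterion collected later).
test_scores   = [55, 60, 65, 70, 80, 85, 90]
course_grades = [58, 62, 64, 72, 78, 88, 87]

print(round(pearson_r(test_scores, course_grades), 2))
```

For concurrent validity, the same calculation applies; the only difference is that the criterion scores are collected at the same time as the test rather than afterwards.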
4. Consequential Validity
Consequential-related evidence of validity is concerned with the appropriateness of the intended and unintended outcomes that ensue from an assessment. Such outcomes can include entry into programs or services, such as honor societies, advanced courses, remediation services, or special education services. They can also include promotion to the next grade level, graduation from high school, and admission into post-secondary education. Outcomes can also be affective in nature, influencing student motivation, beliefs, or dispositions.
For example, a struggling student may perform well on an assessment for which the teacher had scaffolded the learning in preparation. The consequence may be a more positive attitude toward the teacher, the subject, and learning more generally. In this example, the consequence is positive and ultimately leads to improved student learning. The teacher may conclude, therefore, that the test has a high degree of consequential validity for this particular student with regard to her sense of self-efficacy for learning. Conversely, the consequences of an assessment may be more insidious. If an assessment is perceived to be unfairly difficult, to assess knowledge or skills that were inadequately taught, or to be administered in such a way that students are unable to demonstrate their true learning, negative perceptions and feelings may be engendered. If in such a situation the teacher had aimed, in part, to foster not only knowledge of but also an appreciation for the scientific method in his students, the assessment may diminish its own validity, because the test itself inadvertently hinders students' acquisition of this important intended outcome of learning.
5. Face Validity
Face validity is a property of a test intended to measure something. It is the validity of a test at face value. In other words, a test can be said to have face validity if it "looks like" it is going to measure what it is supposed to measure. For example, a test which purported to measure pronunciation ability but which did not require the candidate to speak (and there have been some) might be thought to lack face validity.
Some people use the term face validity to refer only to the judgments of observers who are not experts in testing methodologies. For instance, if you have a test that is designed to measure whether children are good spellers, and you ask their parents whether the test is a good test, you are studying the face validity of the test. If you ask an expert in testing spelling, some people would argue that you are not testing face validity. This distinction seems too fine for most applications. Generally, face validity means that the test "looks like" it will work, as opposed to "has been shown to work".
C. THE USE OF VALIDITY
Every effort should be made in constructing tests to
ensure content validity. Where possible, the tests should be validated
empirically against some criterion. Particularly where it is intended to use
indirect testing, reference should be made to the research literature to
confirm that measurement of the relevant underlying constructs has been
demonstrated using the testing techniques that are to be used (this may often
result in disappointment – another reason for favouring direct testing).
Any published test should supply details of its
validation, without which its validity (and suitability) can hardly be judged
by a potential purchaser. Tests for which validity information is not available
should be treated with caution.