rankings, alumni ratings, and the ratings of trained observers (Aleamoni, 1999; Cashin, 1995; Kulik, 2001). However, even here, problems emerge. Finkelstein (1995) reviewed literature suggesting that the instruments will not lead to instructional improvement unless skilled consultants help the faculty interpret the ratings. The literature thus appears to suggest that the instruments are concurrently valid, but that the experience faculty members or students have with these instruments, without expert supervision, is not.
Predictive Validity
The predictive validity of a measurement instrument relates to its usefulness as a predictor of some other characteristic or behavior. The research base here is generally lacking and contradictory, as two examples show. "There is no evidence that the use of teaching ratings improves learning in the long run" (Armstrong, 1998). But Overall and Marsh (1979) stated, "Students of instructors who got student feedback scored higher on achievement tests and assessments of motivation for learning than students of instructors who got no feedback." Many writers maintain, without reservation, that the evaluation instruments can be used to improve instruction (Aleamoni, 1999; Kulik, 2001; Theall & Franklin, 2001), even though Kulik (2001, p. 10) begins his argument by stating, "The catch is that no one knows what measure to use as the criterion of teaching effectiveness." The reported need for consultants in this process (Finkelstein, 1995) further weakens the argument.
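To make the design behind the Overall and Marsh (1979) finding concrete: it reduces to comparing the achievement of students whose instructors received mid-course SET feedback against that of students whose instructors did not. The sketch below is an illustration only; the group sizes, means, and spreads are invented for the example and do not come from any study.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical final-exam scores (0-100): students whose instructors
# received mid-semester SET feedback vs. students whose instructors did not.
feedback_group = rng.normal(loc=78, scale=8, size=120)
no_feedback_group = rng.normal(loc=75, scale=8, size=120)

# A two-sample t-test asks whether the mean achievement difference is
# larger than chance alone would produce -- the predictive-validity question.
t_stat, p_value = stats.ttest_ind(feedback_group, no_feedback_group)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")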
Construct Evaluations: Convergent Validity
Constructs are complex, necessitating several methods of validation. One of these is to establish that the measurement of the construct is associated with independent measures that, according to the underlying structural hypothesis, should be associated with the construct. A major problem with academic measures is highlighted by Garson (2006, p. 2):

"A good construct has a theoretical basis which is translated through clear operational definitions involving measurable indicators. A poor construct may be characterized by lack of theoretical agreement on its content, or by flawed operationalization such that its indicators may be construed as measuring one thing by one researcher and another thing by another researcher."
According to this definition, many researchers would insist that academic measures intrinsically lack construct validity. A validity measure can, however, be approximated. For example, although the definition of "good" teaching can vary, most educators would generally agree that students will learn more from a "good" teacher than from a "bad" teacher. In addition, most observers would conclude that effort and learning are related. Numerous attempts have been made to establish that SET has these relationships, with mixed results. Many researchers have reported a positive relationship between learning and student ratings of instructors (Dowell & Neal, 1982; Marlin & Niss, 1980; Lundsten, 1986; Baird, 1987), while many others have found no relationship, or even a negative one (Attiyeh & Lumsden, 1972; Jake, 1998; Johnson, 2003; Robin & Robin, 1972; Weinberg, Fleisher & Hashimoto, 2007; Yunker & Yunker, 2003). The same pattern appears with rigor: many researchers have found a strong relationship between rigor and SET (Chacko, 1983; Cashin, 1995; Sixbury & Cashin, 1995), while others have found no relationship, or a negative one (Clayson & Haley, 1990; Greenwald & Gillmore, 1997a; Marks, 2000).

The number of articles on this topic is a testament to the difficulty of establishing convergent validity with SET. It also shows why honest researchers can come to such widely different conclusions in the debate about the convergent validity of SET.
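The multi-section design that dominates this literature can be sketched in a few lines. The example below is purely illustrative: it assumes section-level data (mean SET rating and mean score on a common final exam), and the sample size and the strength of the simulated link are arbitrary choices for the example, not findings.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data for 40 sections of one course: mean SET rating (1-5)
# and mean score on a common final exam.
set_ratings = rng.uniform(2.5, 5.0, size=40)
exam_means = 50 + 6 * set_ratings + rng.normal(0, 8, size=40)  # assumed weak link plus noise

# Convergent validity asks whether the construct's measure (SET) moves with
# an independent measure it should theoretically track (learning).
r, p = stats.pearsonr(set_ratings, exam_means)
print(f"r = {r:.2f}, p = {p:.4f}")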
Construct Evaluations: Discriminant Validity
The evaluative measures of a construct should not be related to criteria unrelated to the construct. This is the area that has created the most heated debate over SET.

For example, an instructor should not be able to "buy" good evaluations by giving good grades. Numerous studies have found that grades are not unjustifiably related to the evaluations (Cashin, 1995; Marsh & Dunkin, 1992; Marsh & Roche, 1999, 2000; Kaplan, Mets, & Cook, 2000; see Marsh & Roche (2000) for extensive reviews). Another large group of studies has found that grades are related to the evaluations (Clayson, 2004; Gillmore & Greenwald, 1999; Johnson, 2003; Weinberg, Fleisher, & Hashimoto, 2007; see Clayson, Frost, & Sheffet (2005) for extensive reviews).
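One common way researchers separate a "justified" grade-rating link (grades reflect learning, which legitimately raises ratings) from an "unjustified" one is a partial correlation: the association between grades and SET after learning is held constant. The sketch below illustrates only the computation; the data are simulated, and every coefficient is an assumption made for the example.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 200

# Simulated standardized variables: grades partly reflect learning, and
# ratings (by construction here) respond to both learning and grades.
learning = rng.normal(0, 1, n)
grades = 0.6 * learning + rng.normal(0, 0.8, n)
ratings = 0.5 * learning + 0.3 * grades + rng.normal(0, 0.8, n)

def residuals(y, x):
    # Residuals of y after removing a linear fit on x.
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Partial correlation of grades and ratings with learning held constant.
# Discriminant validity would want this near zero; a sizable value suggests
# grades affect ratings beyond what learning justifies.
r, p = stats.pearsonr(residuals(grades, learning), residuals(ratings, learning))
print(f"partial r = {r:.2f}, p = {p:.4f}")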
Personality unrelated to learning, the so-called "Dr. Fox Effect," should not be related to SET. Many studies and reviews have found personality not to be related to the evaluations (Aleamoni, 1999;