rankings, alumni ratings, and the rating of trained observers (Aleamoni, 1999; Cashin, 1995; Kulik, 2001). However, even here, problems emerge. Finkelstein (1995) reviewed literature suggesting that the instruments will not lead to instructional improvement unless skilled consultants help the faculty interpret the ratings. The literature seems to suggest that the instruments are concurrently valid, but that the concurrent experience that faculty members or students have with these instruments, without expert supervision, is invalid.

Predictive Validity

The predictive validity of a measurement instrument relates to its usefulness as a predictor of some other characteristic or behavior. The research base here is generally lacking and contradictory. These contradictions can be shown by two examples. “There is no evidence that the use of teaching ratings improves learning in the long run” (Armstrong, 1998). But Overall and Marsh (1979) stated, “Students of instructors who got student feedback scored higher on achievement test and assessments of motivation for learning than students of instructors who got no feedback.” Many writers maintain, without reservation, that the evaluation instruments can be utilized to improve instruction (Aleamoni, 1999; Kulik, 2001; Theall & Franklin, 2001), even though Kulik (2001, p. 10) begins his arguments by stating, “The catch is that no one knows what measure to use as the criterion of teaching effectiveness.” The reported need for consultants in this process (Finkelstein, 1995) weakens the argument.

Construct Evaluations: Convergent Validity

Constructs are complex, necessitating several methods of validation. One of them is to establish that the measurement of the construct is associated with independent measures that, according to the underlying structural hypothesis, should be associated with the construct. A major problem with academic measures is highlighted by Garson (2006, p. 2):

“A good construct has a theoretical basis which is translated through clear operational definitions involving measurable indicators. A poor construct may be characterized by lack of theoretical agreement on its content, or by flawed operationalization such that its indicators may be construed as measuring one thing by one researcher and another thing by another researcher.”

According to this definition, many researchers would insist that academic measures intrinsically lack construct validity. A validity measure can, however, be approximated. For example, although the definition of “good” teaching can vary, most educators would generally agree that students will learn more from a “good” teacher than from a “bad” teacher. In addition, most observers would conclude that effort and learning are related. Numerous attempts have been made to establish that SET has these relationships, with mixed results. Many researchers have reported finding a positive relationship between learning and student ratings of instructors (Dowell & Neal, 1982; Marlin & Niss, 1980; Lundsten, 1986; Baird, 1987), while many others have found either no, or even a negative, relationship (Attiyeh & Lumsden, 1972; Jake, 1998; Johnson, 2003; Robin & Robin, 1972; Weinberg, Fleisher & Hashimoto, 2007; Yunker & Yunker, 2003). The same pattern has been found with rigor: many researchers have found a strong relationship between rigor and SET (Chacko, 1983; Cashin, 1995; Sixbury & Cashin, 1995), while others have found no, or a negative, relationship (Clayson & Haley, 1990; Greenwald & Gillmore, 1997a; Marks, 2000).

The number of articles on this topic is a testament to the difficulty of establishing convergent validity with SET. It also shows why honest researchers can come to such widely different conclusions in the debate about the convergent validity of SET.

Construct Evaluations: Discriminant Validity

The evaluative measures of a construct should not be related to criteria unrelated to the construct. This is the area that has created the most heated debate with SET.

For example, an instructor should not be able to “buy” good evaluations by giving good grades. Numerous studies have found that grades are not unjustifiably related to the evaluations (Cashin, 1995; Marsh & Dunkin, 1992; Marsh & Roche, 1999, 2000; Kaplan, Mets, & Cook, 2000; see Marsh & Roche (2000) for extensive reviews). Another large group of studies has found that grades are related to the evaluations (Clayson, 2004; Gillmore & Greenwald, 1999; Johnson, 2003; Weinberg, Fleisher, & Hashimoto, 2007; see Clayson, Frost, & Sheffet (2005) for extensive reviews).

Personality unrelated to learning, the so-called “Dr. Fox Effect,” should not be related to SET. Many studies and reviews have found personality not to be related to the evaluations (Aleamoni, 1999;
