Eric Grosch, Letter to Dr. Morgenstern on LOR - Semmelweis ...
DR. ERIC N GROSCH - A new approach to LOR
Dear Dr. Morgenstern:
I read your article[1] with interest. You wrote:
    In considering a new approach to the pediatric LOR, COMSEP and the APPD welcome the input
    of the broader pediatric community.
Thanks for expressing an interest in receiving my comments, since I'm an internist, not a
pediatrician. I'm pleased to contribute to the dialogue, which I think is important. I apologize in
advance for the length of my text, but I think the topic warrants it. Quoted text is indented and my
own unindented.
The LOR and its close relative, the performance-appraisal, are sacred cows in medical education,
training and job-placement. They purport to provide means of communicating a candidate's traits
among mentors -- the performance-appraisal among mentors within an institution; the LOR from
mentors in one institution to those in another. The approach for each is much the same.
That purpose seems analogous to the medical record in patient-care, which provides a means of
communicating a patient's disease-traits among the patient's physicians. The analogy is false for
reasons I cite, in the chart, below:
Clinical chart | LOR/performance-appraisal
Appraisal and appropriate action at least daily | Appraisal summative, end of rotation, presented usually after the fact, too late for improvement
Goal is improvement of the patient | Goal varies between promotion of the candidate to his elimination from consideration
Relies on objective evidence for decision-making | Often relies on rumor, innuendo, scuttlebutt for decision-making
Documentation is as long as is necessary | Documentation is as brief as possible to save reading time
Documentation is in terms of specific clinical events | Documentation is in terms of unsubstantiated opinions, couched in generalities, of mentors, peers, etc.
On the evidence that I've examined, I believe that there are better ways than the LOR and the
performance-appraisal to accomplish the mission. I don't consider that a flippant belief. I've
arrived at my opposition to the LOR/performance-appraisal through considerable thought, reading
and anecdotal experience, both as an author and subject of LORs. Most of what I say here is in
the public domain and obvious. I get the impression that nobody ever puts it together, so I've
done that, though perhaps incompletely. If you have any objections to what I've said here, please
let me know them.
I divide my reasons into generic sections:
1. Golden Rule: Do unto others as you would have others do unto you. The Golden Rule[2],
alone, should persuade anyone with any insight into the treatment that he would ideally prefer for
himself, that the performance-appraisal/LOR can never work.
2. Comparisons are always odious.
3. Deleterious effect: The idea of performance-appraisal is fundamentally flawed, even
dysfunctional, because of the often deleterious effect it has on those trainees that appraisers rate
as less than the very best, even though quality of performance is a lottery, governed in large part
by random chance. Accordingly, rating people who belong to the system makes no sense.
4. Improper substitute for “where do I stand”: LORs and performance-appraisals serve the
organization or institution, not the individual appraised.
5. Inaccuracy:
a. misapplication of the Likert-scale principle
b. inevitability of rating-inflation
c. popularity-contest
d. mismeasure of “excellence”
e. mentor-inattention: Mentors, who are supposed to do the evaluations and LORs, don't pay
enough attention to their trainees' performance to fulfill that function adequately because their
contact with trainees is minimal and sporadic, so their appraisal of the performance of their
trainees is most often inaccurate and may even reverse the reality on the ground.
f. self-fulfilling prophecy
g. absence of evidence-basis
h. glittering generalities: Even if mentors paid attention to trainees' performance, the rating
systems that they use address only glittering generalities, such as “general medical knowledge,”
require presentation of no supporting evidence and rarely to never address the only index of
work-performance in medicine, namely, clinical outcomes of patients under the trainees' care.
i. distortion from “confidentiality,” under perpetual tension
6. The ultimate goal: communion of “top” talent in “top” institutions
7. Illustrative anecdote which is more typical than it should be
1. Golden Rule[2]
The performance-appraisal and the LOR are exercises in disregarding the needs of others and in
attributing to others the character and nature of objects. The psychic mechanisms that prompt
those in authority to impose performance-appraisal/LOR on others -- what they would not want
for themselves -- are obscure but the most likely reason seems to be that the very act of
imposition may, in and of itself, provide a pleasurable and ego-boosting exercise of arbitrary
authority.
Whatever the psychic mechanisms, the proof of the observation appears in the contrast between
the AMA's consistent endorsement of peer-review for physicians they presume, generically, to be
“bad doctors” and its disparagement of Professional Standards Review Organizations (PSROs).
For example, it is a matter of record that the AMA was a strong supporter of enactment of the
Health Care Quality Improvement Act of 1986 (HCQIA), which codified peer-review provisions
from hospital bylaws into federal law. The AMA initially supported the HCQIA because it
supports peer-review of “bad doctors” but opposed one feature of that act, the National
Practitioner Data Bank (NPDB), and withdrew its support altogether from the HCQIA over that
issue, but it eventually obtained a quid pro quo: NPDB as well as absolute immunity from liability
for hospital-level peer-reviewers, a provision that has led to the proliferation of bad-faith
peer-review.[3]
At the same time, JAMA and other journals have published articles that have impugned the
accuracy of the findings of PSROs, the agents of which may conduct peer-review on any
practicing physician, including one whom the AMA would presume to be a “good doctor.”
Neither JAMA nor any other medical journal has published even one article that has examined the
accuracy or validity of hospital-based peer-review, which the AMA enthusiastically approves
because statutes render such peer-review privileged/confidential.
2. Comparisons are always odious
Farrell[4] noted the emotional effect of sex-reversed beauty-contests among men. The winner
was, of course, ecstatic, enthusiastic and high in self-esteem but the runners-up felt devastated at
the relative rejection and they experienced an epiphany: why women (at least those of runner-up
grade [or less] physical appearance) dislike customary beauty-contests. Comparisons are always
odious because the comparison-game is a zero-sum proposition. Performance-appraisal is an
appraisal of something that the evaluee can control to some extent by his conscious will, as
opposed to appearance, which he can't, so it's marginally less pernicious than a beauty-contest but
not much. The fact remains that the better one individual rates, the worse others do. It's
inescapable.
The performance-appraisal/LOR in most fields, including medical education, always hinges on
comparing one trainee, by various criteria, with others. The comparison may appear as a
class-rank or as a comparison of the subject's rating with an ideal rating, e.g., 6 of a possible 10
points, 4 of a possible 5 points, etc. The message is always: “You don't measure up.”
That message is especially demoralizing to the usual medical trainee, since the very fact that he's
survived to the stage of medical training means that he has already survived very stringent
selection/exclusion filters and thus become accustomed to superlative accolades in his early
education, through schooling and his undergraduate years. Once he is thrown into the midst of similar
high-achievers, the normative performance is likely to be uniformly high and he may rate merely
“average.”
3. Deleterious effect:
Deming lists the performance-appraisal (and, by implication, also the customary LOR) among the
deadly diseases of business-organizations. The same ideas apply, in spades, to medical
organizations and to LORs, which are retrospective, summative performance-appraisals, frozen
and immutable, in perpetuity:
...the deadly diseases...
3. Evaluation of performance, merit rating...Many companies in America have systems by which
everyone...receives from his superior...a rating...(101) Management by objective leads to the same
evil...Management by fear would be a better name...(Deming 1986, 107)
Fair rating is impossible. A common fallacy is the supposition that it is possible to rate people; to
put them in rank order of performance for next year, based on performance last year.
The performance of anybody is the result of a combination of many forces -- the person himself,
the people that he works with, the job, the material that he works on, his equipment, his customer,
his management, his supervision, environmental conditions (noise, confusion, poor food in the
company's cafeteria). (109) These forces...[which]...arise almost entirely from action of the
system...will produce...large differences between people....A man not promoted is unable to
understand why his performance is lower than someone else's. No wonder; his rating was the
result of a lottery. Unfortunately, he takes his rating seriously...(Deming 1986, 110)
The effect is devastating:
It nourishes short-term performance, annihilates long-term planning, builds fear, demolishes
teamwork, nourishes rivalry and politics.
It leaves people bitter, crushed, bruised, battered, desolate, despondent, dejected, feeling inferior,
some even depressed, unfit for work for weeks after receipt of rating, unable to comprehend why
they were inferior. It is unfair, as it ascribes to the people in a group differences that may be
caused totally by the system that they work in.
...what is wrong is that the performance appraisal or merit rating focuses on the end product, at
the end of the stream, not on leadership to help people. This is a way to avoid the problems of
people. A manager becomes, in effect, manager of defects.
The idea of merit rating is alluring. The sound of the words captivates the imagination: pay for
what you get; get what you pay for; motivate people to do their best, for their own good.
The effect is exactly the opposite of what the words promise. Everyone propels himself forward,
or tries to, for his own good, on his own life preserver. The organization is the loser.
Merit rating rewards people that do well in the system. It does not reward attempts to improve
the system. Don't rock the boat.
...a merit rating is meaningless as a predictor of performance, except for someone that falls
outside the limits of differences (102) attributable to the system that the people work in...
Traditional appraisal systems increase the variability of performance of people. The trouble lies in
the implied preciseness of rating schemes...Somebody is rated below average, takes a look at
people that are rated above average; naturally wonders why the difference exists. He tries to
emulate people above average. The result is impairment of performance. (103)
...The problem lies in the difficulty to define a meaningful measure of performance. The only
verifiable measure is a short-term count of some kind...(Deming 1986, 103)
Degeneration to counting. One of the main effects of evaluation of performance is nourishment of
short-term thinking and short-time performance...(103) A man must have something to show. His
superior is forced into numerics. It is easy to count. Counts relieve management of the necessity
to contrive a measure with meaning.
...people that are measured by counting are deprived of pride of workmanship. Number of designs
that an engineer turns out in a period of time would be an example of an index that provides no
chance for pride of workmanship. He dare not take time to study and amend the design just
completed. To do so would decrease his output. (105)
A good rating for work on new product and new service that may generate new business five or
eight years hence, and provide better material living, requires enlightened management. He that
engages in such work would study changes in education, changes in style of living, migration in
and out of urban areas. He would attend meetings of the American Sociological Society, the
Business Section of the American Statistical Association, the American Marketing Association.
He would write professional papers to deliver at such meetings, all of which are necessary for the
planning of product and service of the future. He would not for years have anything to show for
his labors. Meanwhile, in the absence of enlightened management, other people getting good
ratings on short-run projects would leave him behind. (Deming 1986, 106)
Stifling teamwork. Evaluation of performance explains...why it is difficult for staff areas to work
together for the good of the company. They work instead as prima donnas, to the defeat of the
company. Good performance on a team helps the company but leads to less tangible results to
count for the individual. The problem on a team is: who did what?
How could the people in the purchasing department, under the present system of evaluation, take
an interest in improvement of quality of materials for production, service, tools, and other
materials for nonproductive purposes? This would require cooperation with manufacturing. It
would impede productivity in the purchasing department, which is often measured by the number
of contracts negotiated per man-year, without regard to performance of materials or services
purchased. If there be an accomplishment to boast about, the people in manufacturing might get
the credit, not the people in purchasing. Or, it could be the other way around. Thus...teamwork so
highly desirable, can not thrive under the annual rating. Fear grips everyone. Be careful; don't take
a risk; go along.
Heard in a seminar. One gets a good rating for fighting a fire. The result is visible: can be
quantified. If you do it right the first time, you are invisible. You satisfied the requirements. That
is your job. Mess it up, and correct it later, you become a hero.
Two chemists work together on a project, and write up their work as a scientific paper. The paper
is accepted for a meeting in Hamburg...only one of the pair may go to Hamburg to deliver the
paper -- viz., the one with the higher rating. The one with the lower rating vows never again to
work close with anyone else.
Result: every man for himself.
Evaluation of performance nourishes fear. People are afraid to ask questions that might indicate
any possible doubt about the boss's ideas and decisions, or about his logic. The game becomes
one of politics. Keep on the good side of the boss. Anyone that presents another point of view or
asks questions runs the risk of being called disloyal, not a team player, trying to push himself
ahead. Be a yes man.
Top levels of salaries and bonuses are in many American companies sky-high. It is human nature
for a young man to aspire...to...one of these positions. The only chance to reach a high level is by
consistent, unfailing promotion, year after year. The aspiring man's quest is not how to serve the
company with whatever knowledge he has, but how to get a good rating. Miss one raise, you
won't make it: Someone else will. (108)
A man dare not take a risk. Don't change a procedure. Change might not work well. What would
happen to him that changed it? He must guard his own security. It is safer to stay in line.
The manager, under the review system, like the people that he manages, works as an individual
for his own advancement, not for the company. He must make a good showing for himself.
Another Irving Langmuir? Can American history, under handicap of the annual rating, produce
another Irving Langmuir, a Nobel Prize winner, or another W. D. Coolidge? Both these men were
with the General Electric Company. Could the Siemens company produce another Ernst Werner
von Siemens?
...It is worthy of note that the 80 American Nobel prize winners all had tenure, security. They
were answerable only to themselves. (Deming 1986, 109)
“It can't be all bad.”...top management delay[s abolition]...of the annual rating of performance...by
refuge in the...corollary that “It can't be all bad. It put me into this position.”...He reached this
position by coming out on top in every annual rating, at the ruination of the lives of a score of
other men. There is a better way.
Modern Principles of Leadership...will replace the annual performance review. The first step...will
be to provide education in leadership. The annual performance (116) review may then be
abolished. Leadership will take its place...
The annual performance review sneaked in and became popular because it does not require
anyone to face the problems of people. It is easier to rate them; focus on the outcome...Western
industry needs...methods that will improve the outcome. Suggestions follow.
1. Institute education in leadership; obligations, principles, and methods.
2. More careful selection of the people in the first place.[5]
It seems difficult to imagine selecting medical trainees by methods any more careful than current
ones.
3. Better training and education after selection.
4. A leader, instead of being a judge, will be a colleague, counseling and leading his people on a
day-to-day basis, learning from them and with them. Everybody must be on a team to work for
improvement of quality in the four steps of the Shewhart cycle:...
In the absence of numerical data, a leader must make subjective judgment. A leader will spend
hours with every one of his people. They will know what kind of help they need. There will
sometimes be incontrovertible evidence of excellent performance, such as patents, publication of
papers, invitations to give lectures.
People that are on the poor side of the system will require individual help...(Deming 1986, 117)
a. What could be the most important accomplishments of this team? What changes might be
desirable? What data are available? Are new observations needed? If yes, plan a change or test.
Decide how to use the observations.
b. Carry out the change or test decided upon, preferably on a small scale.
c. Observe the effects of the change or test.
d. Study the results. What did we learn? What can we predict?...(Deming 1986, 88)
5. A leader will discover who if any of his people is (a) outside the system on the good side, (b)
outside on the poor side, (c) belonging to the system. The calculations required...are...simple if
numbers are used for measures of performance. Ranking of people...that belong to the system
violates scientific logic and is ruinous as a policy,...
In the absence of numerical data, a leader must make subjective judgment. A leader will spend
hours with every one of his people. They will know what kind of help they need...
People...on the poor side of the system will require individual help....(Deming 1986, 117)...
7. Hold a long interview...three or four hours, at least...not for criticism, but for help
and...everybody['s]...better understanding...
8. Figures on performance should be used not to rank the people...that fall within the system, but
to assist the leader to accomplish improvement of the system...(118)
...Running a company on visible figures alone (counting the money). One can not be successful on
visible figures alone...he that would run his company on visible figures alone will in time have
neither company nor figures.
...the most important figures...are unknown and unknowable..., but successful management must
nevertheless take account of them. Examples.
Fallacies of reward for winning in a lottery. A man in the personnel department of a large<br />
company came forth with an idea, held as brilliant...to reward the top (274) man of the month on<br />
a certain production line (the man that made the lowest proportion defective over the month) with<br />
a citation. There would be a small party on the job in his honor, and he would get half a day off.<br />
This might be a great idea if he were indeed an unusual performer for the month. There were 50<br />
men on the production line.<br />
Do the results of inspection of their work form a statistical system...? If the work of the group<br />
forms a statistical system, then the prize would be merely a lottery...if the top man is a special<br />
cause on the side of low proportion defective, then he is indeed outstanding. He would deserve<br />
recognition, and he could be a focal point for teaching men how to do the job.<br />
There is no harm in a lottery...provided it is called a lottery. To call it an award of merit when the<br />
selection is merely a lottery...is to demoralize the whole force, prize winners included. Everybody<br />
will suppose that there are good reasons for the selection and will be trying to explain and reduce<br />
differences between men. This would be a futile exercise when the only differences are random<br />
deviations, as is the case when the performance of the 50 men form[s] a statistical system.<br />
(Deming 1986 275) [5]<br />
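Deming's distinction between a man who is a special cause and a man who merely belongs to the system<br />
can be illustrated with a short calculation. What follows is a minimal sketch in Python, not Deming's own<br />
worked example: it assumes that every man has the same number of items inspected, it uses conventional<br />
three-sigma limits for a proportion, and it runs on invented counts. The function names and the data are<br />
hypothetical.<br />

```python
import math

def p_chart_limits(defectives, inspected_each):
    """Centre line and three-sigma limits for proportion defective.

    Assumes every worker had the same number of items inspected.
    """
    p_bar = sum(defectives) / (inspected_each * len(defectives))
    sigma = math.sqrt(p_bar * (1 - p_bar) / inspected_each)
    lower = max(0.0, p_bar - 3 * sigma)
    upper = min(1.0, p_bar + 3 * sigma)
    return p_bar, lower, upper

def classify(defectives, inspected_each):
    """Label each worker relative to the limits."""
    _, lower, upper = p_chart_limits(defectives, inspected_each)
    labels = []
    for count in defectives:
        p = count / inspected_each
        if p < lower:
            labels.append("outside, good side")
        elif p > upper:
            labels.append("outside, poor side")
        else:
            labels.append("within system")
    return labels

# Hypothetical month: 50 men, 1000 items inspected for each.
ordinary_month = [45, 50, 55] * 16 + [48, 52]
# Same month, except that the last man produced far fewer defectives.
month_with_special_cause = [45, 50, 55] * 16 + [48, 20]

for name, counts in [("ordinary", ordinary_month),
                     ("special cause", month_with_special_cause)]:
    labels = classify(counts, 1000)
    outside = [i for i, label in enumerate(labels) if label != "within system"]
    print(name, "-> men outside the limits:", outside)
```

In the first invented month, the man with the fewest defectives (45) still falls within the limits, so a prize<br />
for him would be, in Deming's terms, a lottery. In the second, the man with 20 defectives falls outside on the<br />
good side, and only then would recognition reflect a real difference.<br />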
In a similar vein, Ierodiakonou and Vandenbroucke term medicine a stochastic art:<br />
Ancient Greek philosophers thought that medicine was an art with peculiar characteristics, and<br />
they called medicine a stochastic art. A doctor might treat a patient conscientiously according to<br />
all learned precepts; yet the patients' condition might deteriorate. Another patient might be treated<br />
rather carelessly by another doctor; yet the patient might regain full health. Thus, in medicine<br />
there exists unpredictability between means and ends. By contrast with other arts a diligent<br />
execution of the tasks does not guarantee a good outcome, and vice versa...<br />
...we have long witnessed a debate on the right way to measure the quality of medical care: should<br />
we use outcome or process criteria?...For instance, a few years ago, (542) a series of outcome<br />
investigations was started in the USA. Presumably, some administrators had been convinced that<br />
even in health care, quality of performance should be measured according to strict outcome<br />
criteria, as is practised in the Japanese car industry, for example. The simplest outcome measure<br />
was mortality in hospital. Third party payers such as the Health Care Financing Administration,<br />
which administrates Medicare, started to rank hospitals according to mortality rates for specific<br />
procedures. The mere idea sent ripples of alarm through the American Medical Association<br />
(AMA). Do not we intuitively know that medical centres with the highest reputation attract<br />
patients whose illnesses are close to being beyond rescue? Advanced epidemiological techniques<br />
have proved that differences in hospital mortality can be explained away by adjustment for<br />
differences in patient mix. To use outcome as a means to monitor quality would necessitate<br />
continuous evaluation of all individual patient characteristics. This process would be a gigantic<br />
research effort, close to the examination of treatments in randomised controlled trials, and would<br />
defy all realistic efforts at quality assurance. The whole armamentarium of epidemiology and<br />
statistics, such as randomisation, matching, blinding, placebo-procedures, strict selection criteria,<br />
and modelling, aims at mastering the stochastic elements that confound our judgment...[6]<br />
Why should the performance of trainees be deterministic, not stochastic, inasmuch as the<br />
stochastic vagaries of patient-characteristics (case-mix) must influence it, in any individual<br />
instance? The difference is that hospital-administrators have an influential voice in such<br />
decision-making. The individual trainee does not.<br />
Under pervasive fear of rating, the trainee dares not ask a question that he thinks the rater<br />
might think foolish. He has to confine his questions to what he considers “intelligent,” posed<br />
with the purpose of impressing the rater with his insight and wisdom, beyond his years.<br />
The good regard of the rater is the trainee's life-preserver, on a stormy sea of insecurity. The<br />
trainee walks on eggshells, fearing that his every move, his every word is an element in a<br />
cumulative body of chit-marks that may eventually torpedo his reputation. If he should get a black<br />
mark against him, for any reason, and the rater learns of it, the trainee will thereby lose the<br />
rater's support and enter career free-fall. Accordingly, he pre-edits all questions for acceptability<br />
before letting them out of his mouth. If he can't think of a zinger of a question, he'll most likely<br />
stay mum and live in ignorance about a broad variety of subjects, out of fear of asking a “stupid<br />
question.”<br />
MIT, Caltech and other high-prestige institutions have experimented with a pass-fail grading<br />
system because they understood the inherent absurdity of grading-systems, with their arbitrary<br />
cut-off points for each letter-designation. That represented a rejection of the very notion of<br />
grading. I'm not certain of the status at those institutions at the moment. Maybe their graduates<br />
have had difficulty translating their academic performance into terms that other institutions, which<br />
recognize grading, understand, so maybe they've gone back to grading.<br />
4. Improper substitute for “where do I stand”<br />
Coens and Jenkins deplore performance-appraisals and, by implication, also LORs:<br />
At a recent quality conference, a CEO was questioned as to why his organization continued to use<br />
appraisals after shifting to a quality management culture of system and process<br />
improvement...“We think we owe it to people to let them know where they stand.”...(27)<br />
What people really want is access to the knowledge and information that influences the<br />
organization's pay, promotion, and status systems and how these affect or apply to them...People<br />
are insatiably curious about Where do I stand? because, in most organizations, this query is<br />
decided with a maze of unspoken rules, inscrutable political influences and other dynamics of<br />
organizational life. Appraisal is not the system that drives pay, careers, and status; it is an<br />
incidental effect of those dynamic systems. Appraisal is...the paper-shuffling that sanctifies<br />
decisions already made.(28)[7]<br />
The cognate of pay and promotion, in the corporate setting, is gaining acceptance into a “top”<br />
(whatever that might mean) training program, in the medical-educational setting.<br />
Too often, a trainee finds out, to his surprise or shock, where he stands only when he reads his<br />
retrospective performance-appraisal/LOR and, by then, it's too late to do anything about it. LORs<br />
and performance-appraisals have an especially pernicious effect on medical students, at that<br />
vulnerable stage in their development, but they're bad for any trainee and for any person.<br />
5. Inaccuracy:<br />
a. misapplication of the Likert-scale<br />
“Likert” seems an unlikely choice for naming the method, since Likert used the scale in canvassing<br />
members of population-samples to obtain aggregate ratings of their attitudes in his<br />
1932-article[8], the presumed basis of the eponym. Likert, himself, had the good sense not to<br />
apply such rating scales to important matters that could affect people's livelihoods, though his<br />
predecessors already had and his successors still do. Likert[9] credited prior authors, Fechner and<br />
Galton, without citing a reference, for the origination of such questionnaires, circa 1888. Scott<br />
introduced the system to the United States Army in the early part of the last (20th) century[10].<br />
Paterson, an employee of the Scott-Company, described a later adaptation of Scott's<br />
method[11,12] for “objective” evaluation of job-performance, the purpose of interest here.<br />
With no apparent insight into the inherent vagueness of the method, Paterson claimed to<br />
distinguish objective from subjective qualities without doing so:<br />
objective qualities . . . “efficiency,” “originality,” “perseverance,” and “quickness” . . . subjective<br />
qualities . . . “courage,” “cheerfulness” and “kindliness.” . . .[12]<br />
The criteria he cited for rating workers were similarly vague:<br />
Ability to Learn, Quantity of Work, Quality of Work, Industry, Initiative, Co-operativeness,<br />
Knowledge of Work[13]<br />
Strangely, Paterson disregarded the opportunity for evidence-based assessment of the criterion<br />
most amenable to objective evaluation, namely quantity of work, in terms, say, of number of units<br />
the worker produces per unit-time. Instead, the instructions bade the rater give a worker on that<br />
criterion a rating-score, presumably to foster “uniformity”[14] with ratings of other criteria, not<br />
amenable to objective assessment.<br />
Other authors described errors and pitfalls inherent in the method. Thorndike first described the<br />
halo effect as a<br />
. . . constant error toward suffusing ratings of special features with a halo belonging to the<br />
individual as a whole[15]<br />
He found even the most capable rater<br />
unable to treat an individual as a compound of separate qualities and to assign a magnitude of<br />
each . . . in independence of the others.[16]<br />
As a countermeasure to minimize the halo-effect, he exhorted:<br />
. . . the observer should report the evidence, not a rating, and the rating should be given on the<br />
evidence to each quality separately without knowledge of the evidence concerning any other<br />
quality in the same individual.[16]<br />
Thorndike did not explain how a rater could avoid having knowledge of ratings he had given on<br />
other criteria listed on the same form. The “evidence” Thorndike had in mind consisted in vague<br />
descriptive adjectives, similar to those that Paterson cited,[17] that the rater had a general<br />
impression might apply to the ratee.<br />
Kingsbury addressed accuracy:<br />
. . . ratings as ordinarily made are . . . unreliable, and . . . only under what may be called ideally<br />
favorable conditions will they approximate accuracy, even on a scale so gross as one of five<br />
divisions.[18]<br />
Kingsbury enumerated those allegedly ideal conditions:<br />
Ratings, to be reliable, necessitate (1) averaging three independent ratings, each made on an<br />
objective scale; (2) these scales must be comparable and equivalent, made in conference under<br />
expert supervision; (3) the three raters must be competent to rate.[18]<br />
Paterson joined, with slightly different wording, in affirming Kingsbury's ideal conditions (2) and<br />
(3) and in thus implicitly alluding to pitfalls of the method:<br />
. . . Ratings should be accepted and filed for use only from those who have proved themselves<br />
capable of accurately judging human qualities. . . a rating scheme will not work automatically. It<br />
must be closely supervised preferably by trained personnel research workers who must continually<br />
subject the ratings to critical analysis and assist in training executives in proper use of the method.<br />
There is no escape from this requirement.[19]<br />
Paterson and Kingsbury omitted mention of what specifics the training they proposed for the<br />
personnel-research workers should comprise and accomplish but they presumably intended,<br />
among other things, that the trained supervisors should somehow ensure separate evaluation of<br />
labeled traits to exclude Thorndike's halo-effect; then, by averaging, fine-tuning, adjustment and<br />
manipulation of the scores from at least three raters, all of whom knew how to provide accurate<br />
ratings (presumably assessed by the raters' mutual agreement on each candidate's score on each<br />
criterion) obtain a set of ratings consistent with the aggregate global impression each candidate<br />
made on the raters (the candidate's halo). The circularity of the rationale seems inescapable.<br />
Prior to receiving requests to fill out forms consisting of Likert-scale ratings on others'<br />
performance, I had never received any of the extensive training or testing to prove myself<br />
“capable of accurately judging human qualities,” nor, I daresay, has any appraiser of my<br />
performance received such training and testing, to my knowledge. The originators of such forms<br />
seemed to assume that the rating schemes would “work automatically,” contrary to Paterson's<br />
admonition.<br />
Rugg may have had more insight:<br />
. . . The unordered -- yes, the chaotic -- character of the judgments appears, irrespective of what<br />
traits are considered or of what kinds of scales are compared. I now believe that the evidence<br />
establishes the futility of obtaining single “ratings” on point scales of such dynamic qualities as<br />
“intelligence,” “personal qualities,” “general work,” and the like.[20]<br />
Paterson cautioned and predicted:<br />
These rating methods should not be looked upon as perfect or final. Further research is necessary,<br />
and industry will profit . . . as progressive, experimentally minded executives realize the scope of<br />
the problem and engage in the necessary research . . . to develop newer and more reliable methods<br />
than we now possess.[21]<br />
The progress Paterson envisioned has been slow in developing, as the medical-education<br />
evaluation-literature amply shows[22-28]. The Likert-scale remains alive, well and unimproved<br />
since Paterson, Kingsbury and Thorndike fretted over it and tortured it and since Rugg dismissed<br />
it as inherently invalid over eighty years ago.<br />
Rating-criteria in medical education continue to be as vague as Paterson's, e.g., “general medical<br />
knowledge (1-5),” “procedural skill (1-5),” “rapport with patients (1-5),” “rapport with nurses<br />
(1-5),” “overall general impression (1-5)” (the most global “halo”-criterion of all) and the like.<br />
Many,[22-28] though not all,[29,30] current users and discussants of the Likert-scale treat it as an<br />
axiomatically good and self-explanatory scheme.<br />
In current medical-education usage, mentors rate trainees and trainees rate mentors without any<br />
expert supervision -- in disregard of Kingsbury's ideal conditions[18], Paterson's precautions[19]<br />
and Rugg's skepticism[20] -- perhaps in imitation of Likert[8], who may have felt justified in<br />
ignoring the precautions, conditions and invalidity of the method for objective evaluation because<br />
he pursued only subjective attitudes rather than purportedly objective traits. Yet, those who<br />
publish studies based on Likert-scale “data” apply the numerical scores derived as if they were<br />
facts and manipulate them with parametric statistics as if they were not ordinal[22-28].<br />
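The objection to treating ordinal scores as if they were interval measurements can be made concrete.<br />
In the sketch below (Python; the ratings and both numeric codings are invented for illustration), two<br />
trainees each receive five ratings on a five-step scale. Any order-preserving coding of the steps is as<br />
defensible as 1 to 5, yet which trainee has the higher mean depends on the coding chosen, whereas the<br />
comparison of medians does not.<br />

```python
from statistics import mean, median

# Hypothetical ratings from five raters on a five-step ordinal scale.
ratings_a = [3, 3, 4, 4, 4]
ratings_b = [1, 1, 3, 5, 5]

# Two order-preserving ways of turning the steps into numbers.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}

def higher(stat, coding):
    """Return which trainee scores higher under a statistic and a coding."""
    a = stat([coding[r] for r in ratings_a])
    b = stat([coding[r] for r in ratings_b])
    if a == b:
        return "tie"
    return "A" if a > b else "B"

for label, coding in [("equal spacing", equal_spacing),
                      ("stretched top", stretched_top)]:
    print(label, "| mean favours", higher(mean, coding),
          "| median favours", higher(median, coding))
```

With these invented numbers the mean favours trainee A under one coding and trainee B under the other,<br />
while the median favours A under both; the ranking by means is thus an artifact of the numbers assigned<br />
to the steps.<br />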
Guilford[30] appears to equate Likert-scales with formal psychometric tests by including them in<br />
his book, entitled, “Psychometric Methods.” Worse, many seem to follow Thorndike[9] in<br />
attributing traits to ratings, a tendency Tryon deplores even for psychological tests, which are<br />
more formal than Likert-ratings:<br />
The test-trait fallacy [consists in presuming] that test scores provide measures of enduring and<br />
generalized characteristics of the person, called traits. . .<br />
The test-trait fallacy begins with the assumption that test scores are trait measures. The second<br />
assumption is that trait measures are basic properties of the person. It easily follows that test<br />
scores reflect basic properties of the person. . . hence a measurement is reified into a causal force.<br />
. . the unsound logic of drawing inferences about ability on the basis of observed performance is<br />
integral to the test-trait fallacy. . .[32]<br />
Traits are alluring because they are . . . compatible with the stimulus-organism-response paradigm<br />
to which virtually all psychologists subscribe. . . To presume that psychological tests . . . measure<br />
organismic traits and to further presume that such traits are the basic properties that cause<br />
behavior is to place the psychologist in an attractively powerful theoretical and clinical position.<br />
The volume of psychological tests . . . is evidence of their allure for clinicians and researchers<br />
alike.[33]<br />
Authors even apply statistical methods to aggregate number-scores from a group of raters,<br />
compute inter-observer correlations and the like. Literature-approval of Likert-scale “data”<br />
encourages decision-makers to attach unwarranted worth to Likert-scale merit-ratings and<br />
serenely to apply them in life-altering decisions touching subordinate trainees,[22] such as<br />
recommendation for certifying examinations, and employment, and even in promoting<br />
faculty-members[25].<br />
Albanes[34] suggests that “real life” ratings, presumably of qualified physicians, are objective and<br />
based on outcomes, yet Carey[35] asserts that evaluations of physician-faculty must be subjective.<br />
Codman[36] and his spiritual successors[37-41] have called for outcome-based rating of<br />
performance and, by extension, of competence, but physicians and hospitals have pointed out the<br />
deficits of that method and prevented its spread, to date, by citing the multiplicity of factors,<br />
unrelated to institutional or physician-competence, that determine outcome.[42]<br />
The champions of rating attribute two roles to it, evaluative or summative (entailing punitive and<br />
deterrent purposes) and formative.[42] Paterson touted the formative purpose:<br />
I. Rating methods have been developed because of a recognition of the educational value of<br />
ratings . . .<br />
a. . . . on those who make the ratings. . . insures the analysis of subordinates in terms of the traits<br />
essential for success in the work.<br />
b. . . . on the employee. . . encourages self-analysis and provides an incentive for<br />
self-improvement in . . . traits in which he is weakest.[35]<br />
As educational feedback, rating fails to fulfill Ziegenfuss' proposed criteria for adequacy and<br />
efficacy:<br />
. . . the art of feeding back quality-related data is a critical point of quality improvement work. . .<br />
Feedback is effective when the following conditions are met:<br />
1. Clarity of Purpose. Data can be used for development or for rendering judgment (formative<br />
versus summative . . .). . . for . . . organizational development, . . . the purpose is . . . formative . .<br />
. Learning and change to improve processes is the goal. A judgmental purpose (summative) offers<br />
a . . . grade of pass or fail and is designed for accountability. . .[45]<br />
Since “accountability” entails punishment[46], it does not belong in any workplace.[4] In<br />
education, by definition, the only appropriate purpose of feedback is the formative one. The<br />
Likert-rating, in its customary application, succeeds in the summative, punitive goal of criterion 1<br />
but fails in its formative goal.<br />
2. Clear and Specific Data. Data . . . must be . . . relevant to the . . . recipient.[45]<br />
The vague expression, “general medical knowledge, 3” (or any other number) is unclear and<br />
non-specific, so rating fails criterion 2 and is not relevant to the recipient (see criterion 5).<br />
3. Descriptive, Not Evaluative. Useful feedback describes what is happening but does not offer an<br />
evaluative judgment (unless that is the intended purpose). The presenters must not rush to<br />
judgment without some interactive discussion with the audience.[45]<br />
The Likert-scale rating substitutes for relevant evidence and thus fails criterion 3.<br />
4. Timely. How close to the action . . . reviewed are the data describing the events? The golden<br />
rule is quick feedback . . . Old data is useful for historical and longitudinal purposes but is not<br />
supportive of behavior change in the near term.[45]<br />
The team, employed long-term, may tolerate a monthly, quarterly or semi-annual feedback-cycle.<br />
The medical student or other trainee, who often has monthly rotations in clinical departments,<br />
needs a shorter feedback-cycle. Feedback should be continuous; its formal aspect should be at<br />
least weekly, and, preferably, daily. The commonest Likert-rating comes to the ratee's attention as<br />
a summative, end-of-rotation event, delivered too late for him to implement improvement, so it<br />
fails criterion 4.<br />
5. Limited. How great is the scope of the data? . . . tailor and focus the data to fit the specific,<br />
targeted needs of users. . .[45]<br />
A Likert-rating, e.g., “general medical knowledge, 3,” is too vague and global to serve a ratee's<br />
needs. It invites the so-called halo-effect and fails criterion 5.<br />
6. Comparative. . . To leave out comparative information is to deprive the recipients of<br />
knowledge about their progress or lack thereof. . .[45]<br />
Albanes[34] deplored the rater's “failure to discriminate” among trainees in awarding them equal<br />
marks. He thereby pursued a similar goal of making distinctions for distinctions' sake alone and<br />
disregarded the “lottery”-nature of rating people who operate “within the system.”[5]<br />
Kingsbury likewise suggested:<br />
. . . we do have to make distinctions between people . . .<br />
. . . and the rater should realize that it is not so disastrous to make some employees 2 who are not<br />
much worse than some he marks 3, as it is to mark them all alike to avoid seeming to magnify the<br />
difference. . .[47]<br />
As Deming eloquently explains,[5] it's disastrous for an individual to suffer a low rating. A low<br />
rating may be especially crushing to a medical student, accustomed, as such a tender soul often is,<br />
from the experience of a lifetime, to high academic ratings.<br />
If two or more employees or trainees perform equally well and very well, say, 5 of 5, they would<br />
deserve equal marks because equality of their performance reflects truth. The company, to which<br />
marking two or more employees alike, e.g. 5 of 5, may seem disastrous, can stand the gaff more
easily than an individual arbitrarily marked down, despite his best effort, merely to “make<br />
distinctions between people.” Neither Albanes[34] nor Kingsbury[47] justified the need to make<br />
such distinctions. Each presumably considered the principle axiomatic and self-evident.<br />
Since the end-of-rotation Likert-scale rating provides no progressive comparisons and since<br />
mentors might balk at the administrative burden of completing Likert-scale ratings more often<br />
than once monthly, it provides no sequential comparison and fails criterion 6.<br />
7. Participative Interpretation. . . . final . . . analysis can[not] be conducted without audience<br />
involvement. Joint interpretation is consistent with the developmental/formative purpose, as<br />
together we discuss meaning and follow-up action . . .[45]<br />
In the medical-education context, the Likert-scale rater rarely discusses his rating with his ratee(s)<br />
prior to entering it. It comes most often to the recipient's attention as a fait accompli, too late for<br />
him to improve it. The Likert-scale rating fails criterion 7.<br />
8. Safety and Security. Receiving performance feedback is . . . technical and . . . psychological . . .<br />
We need first to have the data correct (technical). . . Presenters must be sensitive to the<br />
psychology of the process and offer language and behavior that protect the recipients.[45]<br />
The Likert-scale rating inherently fails the technical criterion, since it consists of a set of numerical<br />
scores which obscures the evidence that purports to form its basis. Various errors, to wit, the<br />
halo-effect (supra) and tendency toward the mean[48,49] inhere in the Likert-scale.<br />
As applied, it most often fails the psychological criterion since the social-control function, which<br />
Albanes[34] advocated, is crucial to its deterrent/punitive function. To pull a punch at the moment<br />
of delivery would diminish or annihilate the crushing impact the rater can otherwise accomplish.<br />
9. Practical and Action Oriented. To be useful, the data should suggest some followup action and<br />
should be practical enough to be used by professionals in the field. . . [32,50]<br />
Having received a rating of, e.g., “general medical knowledge, 3,” the recipient can discern no<br />
idea from the rating how to improve. The Likert-rating fails criterion 9.<br />
The evidence seems clear that ratings fail all of Ziegenfuss's rational criteria for effective feedback.<br />
b. inevitability of rating-inflation<br />
A universal human conceit holds that everybody's a fool and a moral pervert except for thee and<br />
me and I'm not so sure about thee. The individual expects others to rate him in a manner<br />
consonant with the intrinsic, superlative characteristics that he attributes to himself. When<br />
mentors, in a medical-education setting, rate him harshly, he feels helpless and often nonplussed<br />
and feels an urge to press his raters to improve his rating.<br />
Some years ago, Sissela Bok, philosopher and wife of Derek Bok, former President of Harvard
University, addressed merit-ratings on “fitness-reports” in the US Army. Her context was “lying”<br />
and her example of a liar was the supervisor who rated his subordinates too highly on traits, such<br />
as “leadership,” “appearance,” etc., which are at least as nebulous as entities that raters in<br />
medicine attempt to address, e.g., “general medical knowledge,” “rapport with staff,” etc. They're<br />
all manifestations of the great tendency to generalize from skillful execution of a narrow scope of<br />
activities, such as getting high scores on tests, to global “excellence,” “outstandingness” or<br />
“bestness,” in general.<br />
Bok's description shows that your observation that “excellent” is a third-tier rating has a history:<br />
...Those who rate officers are asked to give them scores of “outstanding,” “superior,” “excellent,”<br />
“effective,” “marginal,” and “inadequate.” Raters know...that those who are ranked anything less<br />
than “outstanding” (say “superior” or “excellent”) are then at a great disadvantage, and become<br />
likely candidates for discharge...superficial verbal harmlessness combines with the harsh realities<br />
of the competition for advancement and job retention to produce an inflated set of standards to<br />
which most feel bound to conform. (Bok 73)<br />
...The US Army tried to scale down evaluations by publishing the evaluation report...cited. It<br />
suggested mean scores for the different ranks, but few felt free to follow these means in individual<br />
cases, for fear of hurting the persons being rated. As a result, the suggested mean scores once<br />
again lost all value. (Bok 74) [51]<br />
Professor Bok wrote as a member of the establishment. LORs and performance-evaluations<br />
cause little to no worry to her and her husband, who have made it to the top of the academic<br />
heap, from which pinnacle they may comment on us, here below:<br />
In elite . . . organizations, the evaluation model tends to be elitism. Two lines of argument are<br />
involved. First, since the organizations have selected the best people, evaluation of performance is<br />
irrelevant. After all, if the best people could not succeed, who could do better? Second, since the<br />
quality of the organizations and their output is determined primarily by the quality of their<br />
people, attention to system, methods, or management is inconsequential. It follows that, if the<br />
organizations already have the best people, “the opportunities for increased productivity in them<br />
are small and come slowly.” Finally, . . . elitism tends to create self-perpetuating closed circles<br />
whose members are exempt from review except by peers within.<br />
Converting work problems into people problems is a process of denying organizational<br />
accountability. It is a process of establishing a hierarchy of special privilege and immunity to rank<br />
with the hierarchy of authority. It is a process of maintaining the status quo; it denies both the<br />
need for change and the possibility. (27)[52]<br />
Accordingly, Professor Bok focused on the “lies” perpetrated to help the plebeian but omitted any<br />
mention of organizational dishonesty: the rumor-grapevines, chiefly by telephone, which leave no<br />
paper-trail, and which circumvent and subvert the normal channels of committed, transparent,<br />
written communication, to which subjects of ratings on LORs might obtain access.[53]<br />
Personnel-managers use such underhanded means, which evade legal liability for defamation of
character, to find out from former employers “what applicants are really like.”<br />
You wrote in your article[1] in nearly identical terms of the inevitable tendency toward<br />
rating-inflation, your “hierarchy of superlatives,” the Lake Wobegon effect, in which everybody is<br />
“above average,” and the tendency toward rating-fragmentation to permit raters to distinguish the<br />
"is one of the finest medical students of the year," "...one of the best medical students I have ever<br />
worked with," "richly deserves the honors awarded in the rotation," or "receives my highest<br />
recommendation",[54] from among the best and those who are the very best in the past year, the<br />
best ever, etc., etc. Speer et al cited grade-inflation in internal medicine as well:<br />
. . . a significant number of clerkship directors (43%) felt that we are unable to appropriately<br />
identify students with failing performances. The implication for our ability to certify students as<br />
clinically competent is concerning. . . (116)[55]<br />
That's evidently not their concern. They express more concern with labeling trainees clinically<br />
incompetent.<br />
. . . faculty were the key to both the cause and solution. (116)[55]<br />
That is a truer statement than Speer et al perhaps realized, though faculty would probably prefer<br />
to blame the trainee-victims.<br />
Yet, clinical medicine simply doesn't contain tasks of sufficient sophistication that trainees could<br />
perform that would enable a trainee to distinguish himself from his fellows to the extent<br />
depicted in all the finely nuanced and ever mounting expressions of enthusiasm. The difficulty<br />
would be quite similar to the difficulty of rating a patient in similar terms, according to his<br />
response to treatment. Objectively, he either gets better, stays the same or gets worse. It's difficult<br />
to imagine that an evaluator of patients could find rational criteria for appraising a patient's<br />
recovery as “excellent,” “outstanding,” “one of the best on the ward,” “one of the best in the past<br />
year,” “the best ever,” etc. If a rater can't do it for a patient, how can he do it for a trainee?<br />
Gould attributed the fallacy of confusing objects with labels to John Stuart Mill:<br />
The tendency has always been strong to believe that whatever received a name must be an entity<br />
or being, having an independent existence of its own. And if no real entity answering to the name<br />
could be found, men did not for that reason suppose that none existed, but imagined that it was<br />
something peculiarly abstruse and mysterious.[56]<br />
Gould cited the fallacy in noting that Binet, originator of IQ, intended none of the social elitism to<br />
which it has given rise.[56] Such reification of jargon is a prominent feature also of rating<br />
practice.<br />
c. popularity-contest<br />
What feats of clinical derring-do can a trainee, at any level, perform that would make him so much
better than any of his contemporaries that he would qualify for such sterling and distinctive<br />
accolades as "is one of the finest medical students of the year," "is one of the best medical<br />
students I have ever worked with," "richly deserves the honors awarded in the rotation," or<br />
"receives my highest recommendation",[54] in contradistinction to his fellows, whose<br />
performance might rate a mere “excellent”?<br />
Did pediatric resident A miraculously heal a girl with Friedreich's ataxia so she never progressed<br />
and even achieved a normal gait? If so, how did he do it? By Divine Intervention? By Black<br />
Magic? By weird science? Miracle-healing, if accomplished, would obviously exceed customary<br />
expectations and be well above the upper control-limit of performance that Deming defines as “within<br />
the system.” Miracle-healing may thus warrant the highest accolades but even outstanding<br />
residents rarely to never perform it.<br />
The only realistic answer that comes to my mind is that the highly regarded trainee manufactures<br />
his high regard by ingratiating himself, through force of intrinsic personality or insidious, political<br />
means, into the rater's favor. The rater then comes to like the trainee personally so much that he's<br />
willing to go out on a limb for him with various superlative terms of enthusiasm, presumably<br />
assuming his performance be at least adequate. In other words, rating of trainees for<br />
personnel-records and the LOR are popularity-contests. Who has the bubbliest personality? Who<br />
is the most “well liked?”[57]<br />
Such a system could select for those who go along to get along and who may rate pleasing their<br />
administrative superiors to enhance the chance of their own advancement as more important than<br />
performing what's right for a patient, perhaps contrary to the will of their superiors. Such disregard<br />
of objectively correct performance may lead to deterioration of quality of patient-care, ostensibly<br />
the opposite of rational goals for a health-care system.<br />
d. mismeasure of “excellence”<br />
Mere prattle without practice doesn't necessarily transfer well to good real-world outcomes.<br />
Howard Zinn spoke of “the best and the brightest”:<br />
The New York Times did a survey of high-school students to see how much history they knew.<br />
They do this every few years. They do a survey of young people to prove how dumb they are and<br />
to prove how smart are the givers of the tests and so they gave this test to high-school seniors and<br />
corroborated what they thought. Young people don't know anything about history. They asked<br />
questions like, “Who was the President during the War of 1812?” “Who was the President during<br />
the Mexican War?”...We're in a great quiz-culture...“What came first the Homestead Act or the<br />
Civil Service Act?” You recognize questions like that because those are the questions that appear<br />
on tests which enable you to get into graduate-school. You can go very far if you know enough of<br />
those answers. You'll be Phi Beta Kappa. You'll become an advisor to the President of the United<br />
States. You remember the book, The Best and the Brightest, which was precisely about that<br />
point, that the people surrounding the President were...the people who got the highest scores.<br />
They were Phi Beta Kappa and they were the architects of the War in Vietnam.[58]
Holman cited an analogous problem related to inflated self-esteem, the ‘excellence’ deception in<br />
medicine.[59]<br />
Simpson addressed the examination-system but his remarks apply at least as well to any rating<br />
system:<br />
...the traditional examination system...achieves...pseudo-precision, for it has chosen the accurate<br />
measurement of the barely relevant in preference to the less precise measurement of the most<br />
highly relevant...our cultural bias towards believing that anything expressed in numbers must be<br />
significantly more true than the same thing expressed in words...allows the student to accumulate<br />
a sequence of numerical ascriptions and grades, often of very dubious reliability and<br />
validity...added together and averaged to help us guess at whether he is fit to leave medical<br />
school. This is as logical as making a pre-operative surgical assessment by adding and averaging<br />
your patient's haemoglobin, potassium, urea and blood sugar levels. It produces results...of little<br />
or no predictive validity and...neither tell the student who has passed the exam why he has done<br />
well (so that we can be reasonably sure he can do it again) nor tell the student who has failed<br />
anything of much use to him in avoiding further failure...[60]<br />
e. mentor-inattention<br />
The descriptions of how recipients of LORs perpetrate Mill's reification-fallacy in an attempt to<br />
attach specific meanings to various phrases that the phrases themselves don't necessarily<br />
denote[61] seem especially anomalous in a context in which the author may be a<br />
department-chairman who may even concede that he has never had any contact with the trainees,<br />
about whom he has a duty to write LORs, and who may lack even what Albanes called<br />
The episodic, fragmented, and...small amount of contact that clinical faculty have with<br />
students...(Albanes 653)[34]<br />
Albanes claimed that that circumstance<br />
...leaves them [raters] reluctant to make ratings that would call attention to students' performance<br />
deficits...(Albanes 653)[34]<br />
In those circumstances, faculty-members' reluctance to make ratings of any sort, at all, would<br />
bespeak their simple honesty. Yet, somehow, most faculty-members, whether in good conscience<br />
or not, rate their students and other trainees after clinical rotations and later when they write<br />
LORs for them.<br />
Kefalides affirms faculty-expectations and complains that insurance-rules newly require<br />
faculty-members to take care of patients and thus provide golden opportunities for clinical<br />
teaching, which he seems to disparage.[62]<br />
Cydulka et al present time-consuming, close observation of trainees as a startling new<br />
departure.[63]
In the industrial setting, in which TQM arose, nobody could ever confuse the manufactured<br />
product with the worker whose efforts produce it. In another article,[64] Albanes did just that. He<br />
attempted to apply TQM to medical education but, in the process, he conflated students as human<br />
beings with students as objects, products of the education-process, and got his ideas twisted. As a<br />
result, in one section of his article, grading is good, while in another, it's bad. The very fact that<br />
Academic Medicine published his article indicates the likelihood that the thinking, among many<br />
academics, about rating and evaluating the performance of trainees and others is confused.<br />
f. self-fulfilling prophecy<br />
Bosk noted:<br />
One striking feature of the clinical judgment of residents is how easily the whole process may turn<br />
into self-fulfilling prophesy (sic)....good reputations exercise a protective or deviance-reducing<br />
effect while bad ones generate a destructive or deviance-amplifying one. If a resident is considered<br />
trustworthy, monitoring by attendings is decreased. Therefore, deficiencies are less likely to be<br />
discovered. Conversely, if a resident is suspect, monitoring increases. Convinced that they are<br />
there for the finding, an attending is more likely to find evidence of sloppy work. When found,<br />
these only increase surveillance, which again increases the probability of mistakes. Clearly<br />
suspicion does not create residents who are unfit -- after all, something creates the suspicion.<br />
Nonetheless, being suspect is for a resident a very vulnerable and demoralizing position. Not only<br />
that, being above suspicion gives a fair amount of protection, especially when mistakes need not<br />
be seen as innocent error. Given these dynamics, it is not surprising that those who fall on the<br />
short end of evaluation (or their attorneys) often characterize it as arbitrary and capricious.[65]<br />
Strangely, when a physician so abuses ancillary personnel that they lose their self-confidence in an<br />
analogous manner, he becomes a “disruptive physician,”[66] fit only for expulsion. Yet, in the<br />
setting of medical education, such abuse is tolerable, even customary.<br />
g. Absence of evidence-basis<br />
Rating/evaluation is particularly vulnerable to charges of resting on an inadequate evidence-basis:<br />
...In perusing the folders of the residents in the training program that I studied, I found only one<br />
evaluation that mentioned a specific incident. This leads me to suspect that residents who are<br />
dismissed from programs could easily argue that their “due-process rights” were violated, which<br />
raises a very thorny issue. Surgery to a large degree rests on peer trust, and it is unclear what<br />
degree of formal, concrete evaluation is consistent with that trust. (12)[67]<br />
Two pediatric core-curricula have come out for emergency-pediatrics,[68,69] one for pediatric<br />
interventional cardiology,[70] one core-content inventory for adult emergency-medicine[71] and a<br />
retrospective inventory of diagnoses encountered in internal-medicine residency.[72]<br />
Other core-curricula may exist in other specialties, yet, in no specialty, do recommendations for<br />
the LOR relate in any manner to specific elements of any defined core-curriculum. If the LOR is
supposed to reflect job-performance, what justification is there for omitting any mention of<br />
job-performance criteria, delineated in national core-curricula, core-content statements or<br />
otherwise?<br />
In all the literature on LORs and evaluations, none that I've seen suggests including the cumulative<br />
statistics on clinical outcomes of patients under the care of the subject of the LOR. Yet, without<br />
such evidence of actual job-performance, in terms of numbers and proportions of patients saved,<br />
lost and improved, the rest is nothing.<br />
The medical literature is replete with accounts of physicians' inaccurate performance-appraisals of<br />
their colleagues and of trainees.[73-86] Those accounts render the idea of entrusting<br />
performance-appraisal of anyone to physicians patently absurd.<br />
Perhaps the most concrete, objectively verifiable category is “procedural skills.” The trainee either<br />
succeeds at the lumbar puncture by obtaining CSF or not, succeeds in intubating a patient or not.<br />
No performance-evaluation I've ever seen has any space devoted to citing the specific number of<br />
procedures that the mentor observed the subject performing, far less a score-card that documents<br />
how many he performed successfully and in how many he failed. What would be the distinction in<br />
a rating of 3/10 vs. a rating of 7/10 in the category, “procedural skills?” One might imagine that<br />
the evaluee succeeded in 30% or 70%, respectively, of the procedures he performed during a<br />
clinical rotation. Did a month-long rotation provide even ten opportunities for each of, say, three<br />
trainees, to perform lumbar punctures or intubations? It seems unlikely.<br />
If the trainee's score was low, where is the documentation of the help that the mentor provided to<br />
the trainee to improve his performance? I've never seen it and the proposed standard Letter of<br />
Recommendation (SLOR), in emergency-medicine, omits mention of anything like it.[54,63]<br />
Where is the documentation of the progress in the trainee's score during the month? Was his score<br />
2 of 10 at the beginning of his rotation and 8 of 10 at the end? I've never seen anything like that,<br />
either, possibly because a trainee may have one opportunity to perform one clinical procedure in a<br />
month, if he's lucky. The customary evaluation is post hoc, delivered as a summative accolade or<br />
condemnation, long after the trainee can do anything about his scores.<br />
What was the quality of his performance? What did he do to succeed in the procedure, if he<br />
succeeded? Did he fracture teeth of patients he intubated? If so, how many each? How many of<br />
the patients on whom he performed a lumbar puncture required a blood-patch afterwards to stem<br />
post-procedure CSF-leakage? I've never seen any such evaluation in writing.<br />
How many of the procedures that the evaluee performed did the mentor personally observe?<br />
SLOR has no space for any such entry.[54,63] Is the rating based on a “general impression” of the<br />
evaluee's procedural skill, as an intrinsic trait, derived from rumor? If so, upon what specific<br />
evidence or criteria did the evaluator base the score that he assigned?<br />
Is one of the mentor's considerations his own anxiety over giving the evaluee a big head? Did the<br />
evaluee, rather, need a low score to give him a harsh dose of “reality?” If so, on what evidentiary
criteria did the evaluator base his concept of “reality,” such that a low rating would give the<br />
evaluee a dose thereof and, in some sense (what sense?) improve the evaluee's outlook? Did the<br />
evaluator apply his dose of reality to all evaluees consistently? If not, why not? Did he condemn<br />
those whom he personally disliked (perhaps because they asked him tough questions to which the<br />
mentor felt embarrassed at not knowing the answers) and favor those whom he personally liked<br />
(perhaps because they never asked him any tough questions)? If he applied his dose of reality<br />
consistently, without regard to the evaluee's actual performance (which the evaluator may never<br />
have observed -- my consistent experience, throughout “training”), isn't that practice arbitrary,<br />
unreasonable and capricious, i.e., a manifestation of chaos and irrationality, in a setting where<br />
rational thought is supposed to prevail?<br />
Most important, what does the rating score tell the relevant candidate about what he should do to<br />
improve his performance?<br />
One might argue that success in procedures, like medicine itself, is a stochastic matter,[5,6] i.e.,<br />
that some procedures fail even in the best of hands and some succeed even in the worst of hands,<br />
in whatever sense of “best” and “worst” one might choose to apply. I would reply that that's<br />
correct. Success in procedures is, at least to some extent, what Deming terms a lottery,[5] no<br />
question. Given that truism, what's the point of making “procedural skills” a ratable category, in<br />
the first place?<br />
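A little arithmetic makes the lottery point concrete. The sketch below is illustrative only: it assumes<br />
(my assumption, not a finding from any cited source) that each attempt at a procedure is an independent<br />
trial with a fixed chance of success, and it invents a hypothetical trainee whose true success-rate is 50%<br />
and who gets ten attempts. It then asks how often chance alone would hand that trainee a score as low as<br />
3/10 or as high as 7/10, the two ratings contrasted above.<br />

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n independent attempts,
    each succeeding with probability p (binomial model, an assumption)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent attempts."""
    return 1.0 - prob_at_most(k - 1, n, p)


# Hypothetical trainee: true success-rate 50%, ten attempts (both invented).
n_attempts, true_rate = 10, 0.5
low = prob_at_most(3, n_attempts, true_rate)    # scores 3/10 or worse
high = prob_at_least(7, n_attempts, true_rate)  # scores 7/10 or better
print(f"P(3 or fewer of 10) = {low:.4f}")
print(f"P(7 or more of 10)  = {high:.4f}")
```

Under those assumed figures, each outcome has a probability of about 17%, so two trainees of identical<br />
skill could plausibly receive 3/10 and 7/10 by luck alone, and with fewer than ten attempts the spread<br />
only widens.<br />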
h. glittering generalities<br />
Greenburg et al wrote, in relation to LORs:<br />
Brevity and generality...come across as distinctly negative features, causing the reader to<br />
wonder...whether the writer actually knows the applicant. (197)[87]<br />
Yet, the evaluation-criteria, upon which LORs are most often based, rely upon brevity and<br />
generality, presumably on the assumption that evaluators' general opinions of candidates reflect the<br />
truth. No evidence supports that proposition and my personal observation is that it is false. If<br />
brevity and generality be negative features of a LOR, how can the same features be acceptable in<br />
the underlying evaluation-criteria?<br />
Bosk terms vague indices of “quality” of the candidate, such as “general medical knowledge,”<br />
“rapport with staff” and the like, “essentially-contested concepts.”[67] They are summative,<br />
glittering generalities, intended to make the evaluation-form brief, that have no necessary<br />
evidentiary relation to the subject-physician's actual performance, to his clinical acumen or to<br />
clinical outcomes of his patients.<br />
In the field of academic emergency-medicine, Harwood et al referred to various elements of<br />
evaluative jargon:<br />
Of the applicants submitting SLORs to our EM residency program, 49% or more received the<br />
superlative response in the categories of "commitment," "work ethic," and "personality." In
contrast, only 35% of the applicants received the superlative response regarding their "differential<br />
diagnosis ability." The "global assessment" operated similarly, with 37% of the applicants<br />
receiving the superlative response. The least common superlative response was the "match<br />
rating," with only 23% of the applicants receiving a "guaranteed match."<br />
These data can serve as a reference for both interpreting and writing SLORs. The data show that<br />
EM applicants least commonly receive the superlative response in the categories of "differential<br />
diagnosis ability," "global assessment," and "match rating," making these key categories for<br />
residency selection committees. These results suggest that authors can justifiably evaluate most<br />
applicants in the highest categories of personal traits, but that they should be more discerning with<br />
assessing "differential diagnosis ability," "global assessment," and "match rating."[88]<br />
Harwood et al seem to pretend that ratings are objective facts, rather than what they are:<br />
subjective appraisals based on the author's claimed but unverifiable (and probably negligible)<br />
familiarity with the trainee.<br />
In the foregoing passage, Harwood et al urged institutional authors of SLORs (standardized<br />
letters of reference) to manipulate their performance-appraisals in various sections of the SLOR,<br />
on the premise that the evidentiary basis of such appraisals doesn't matter, with a view to pandering<br />
to the selection-committees for emergency-medicine residencies and manipulating the outcomes<br />
of their deliberations over trainee-selection. Harwood et al seem to ignore the possibility that<br />
fewer authors rate trainees highly in the glittering-generality categories of "differential diagnosis<br />
ability," "global assessment," and "match rating" than in the glittering-generality categories,<br />
"commitment," "work ethic," and "personality" because the authors could be inappropriately<br />
ungenerous with their ratings in the first three categories, most likely because those categories are<br />
the clinically oriented ones and authors would very likely believe that they weren't performing<br />
their watchdog/gatekeeper function properly (to keep bad doctors from practicing<br />
emergency-medicine) unless they had condemned a certain quota of trainee-candidates with each<br />
batch that left their respective institutions. Heaton depicts that practice in terms of what he calls<br />
the “basic process”:<br />
The Basic Process<br />
The basic process of an individual in a hierarchy is to avoid mistakes. . . individuals are rated by<br />
their errors, for their tasks are predetermined. There is no premium for achievement outside<br />
assigned hierarchical tasks but there are penalties for every shortfall from perfection.<br />
The normal distribution in a hierarchy includes a percentage of failures, so grading on a curve<br />
means that students making the most mistakes are given failing grades. . . When failing students<br />
are eliminated, those next above them succeed to the failing category. The rule of thumb is for<br />
one-third to leave between the fifth and twelfth grades, . . . The next third become<br />
failure-threatened, declining in rank regardless of effort or improvement. Apprehension then<br />
blocks learning so there can only be unskilled repetition. Thus this middle third is taught<br />
submission and place within the hierarchy. . . (32)
Is it true or false that for every winner there has to be a loser? False – there has to be a continuing<br />
supply of losers if a winner is to keep on winning. In schools, grading on a curve . . . means that<br />
the A student needs an F student at the other end of the normal distribution; then annually or<br />
more often, when the F student is eliminated or drops out, another student must be pushed into<br />
the failing position. . . companies seem to survive only by establishing a large pool of marginal<br />
workers who can be picked up when needed and dropped when business is slow. . .<br />
. . . Schools in exclusive suburbs do not produce so many failures . . . Instead they assume their<br />
students are mostly in the upper half of a normal (59) distribution. . . there are schools which<br />
assume their students are mostly in the lower half of a normal distribution. In one vocational high<br />
school in New York, no teacher could give a grade above C without special approval by the<br />
principal. In a ghetto high school a department head told me that only one student in a . . . class of<br />
twenty was capable of learning. I knew the students were capable and interested, but sure enough,<br />
nineteen dropped out and failed. . . grading in schools is a process that produces failures and<br />
accomplishes rejecting.<br />
Winners are custom-made, but losers are mass-produced. . . (62)[52]<br />
Pursuing a similar line of “reasoning,” raters in medical education may believe that they can<br />
enhance the reputation and credibility of their respective institutions by making a big show of<br />
being “tough graders” and the clinically oriented rating criteria are the most attractive targets for<br />
that sort of behavior.<br />
i. distortion from “confidentiality,” under perpetual tension.<br />
In both editorial peer-review and performance-appraisal/LOR, the thesis is that the rater cannot<br />
deliver an “honest and accurate”[89] rating unless he labors under the protection of<br />
“confidentiality,”[61,89,90] meaning that everybody, except for the subject, gets to see the rating.<br />
Decades of organizational oppression, in which ratees had to tolerate the intolerable, finally<br />
prompted the US Congress to enact the enlightened Buckley Amendment, a federal law that<br />
requires schools that receive federal funding to make student records available for viewing by<br />
parents and the students themselves if they are 18 or older.[89,91] Nevertheless, even though<br />
federal law mandates that the trainee should be able to see his rating, those in medical education<br />
prefer the old oppression. They recommend that the organization should compel the trainee to<br />
“waive” his legal right, under the Buckley Amendment, to see his rating, in the interest of<br />
“honesty,”[1,90] “authenticity,”[89] and “objectivity” (read: freedom of the rater to give an<br />
adverse rating with the security of knowing that the ratee cannot learn of it and therefore cannot have<br />
grounds for retaliation) of the letter and of its “value”[1] to the receiving institution.<br />
In editorial peer-review, the author receives the rating but not the identity of the rater. In<br />
performance-appraisal, the trainee knows the identity of the rater but, ideally, not the rating.<br />
Yet, dissenting voices resist such organizational oppression, and for good reason, in my view:
...One of our wisest and most experienced faculty members, Dr. Douglas Lindsey, offers to write<br />
letters for every medical student. He writes them honestly. He then shows the student the letter. It<br />
is up to the student to decide whether it is sent. This is an excellent policy of a great teacher.<br />
Unfortunately, it is probably unique. (320)<br />
A few students have been asked to sign statements that they have not seen their reference letters.<br />
This is ridiculous and unenforceable. Don't sign...it is common practice for students to be asked to<br />
sign a waiver of their right to request to see referee letters. If you are forced into this type of<br />
situation, you may have to sign it and hope for good letters. If possible you do want to see those<br />
letters before they go out. (321)[53]<br />
The practice of “confidentiality,” under compulsion and under false color of “honesty,” in the<br />
rater, may thus spawn duplicity and dishonesty in the ratee.<br />
On the subject of so-called honesty, one naturally wonders whether the “honesty” will be<br />
even-handed or biased. A few obvious questions spring to mind:<br />
Will the rater be as “honest” about how he himself prioritized the needs of trainees lower than his<br />
own personal needs and therefore devoted insufficient time to those in need of guidance to<br />
foster their improvement as he claims to be about the shortcomings of those trainees, whom the<br />
rater thus abandoned? Will he be honest about his own failures to implement and incorporate core<br />
content of his specialty (e.g., emergency-medicine) in training and rating his trainees? Will he be<br />
honest about his own failure to provide daily feedback to trainees to keep them informed of what<br />
specific performances they needed to demonstrate the following day to show improvement? Will<br />
the rater be honest about his own failure to document daily or weekly improvement or otherwise<br />
and reasons therefor in his rating-comments? Will the rater be honest about his own failure to<br />
define behavioral educational objectives,[92] toward which the trainees might strive? Will the<br />
rater be honest about how he exchanged gossip with other faculty about various trainees and<br />
thereby formed a collective, united, homogenized opinion of trainees, instead of expressing his<br />
own opinion, based on his personal observations? Will the rater be honest about casting the<br />
evaluation in terms only of the trainee's failures, not in terms of systematic failures of the<br />
institution?<br />
Tonesk provides a twisted view of objectivity vs. subjectivity and authority-relationships in<br />
medical education.[93]<br />
In the realm of editorial peer-review, Walsh et al found referees more considerate and courteous<br />
toward authors if their names were attached to their reports.[94] What's wrong, therefore, with<br />
accountability in LORs?<br />
Flacks wrote:<br />
. . . maintaining the confidentiality of the contents of evaluations and letters of reference would<br />
[not] improve the quality of such assessments. On the contrary, . . . I've become convinced . . .<br />
that the reverse is true. New state laws and university regulations have opened the process . . --
and the results . . . have been good. Faculty members and departments now have the opportunity<br />
to respond to negative reviews . . . timely . . . and with some understanding of the arguments that<br />
may merit rebuttal. The review process is now more cumbersome, but it is . . . less Kafkaesque. . .<br />
A new law that would require full disclosure has passed the legislature but is being contested in<br />
the courts by the University. I am quite sure that the . . . motivation for the University's resistance<br />
is not so much to protect the quality of the review process as it is to protect the discretionary<br />
powers of the administration.<br />
. . . The need for open evaluations is not simply that such openness promotes due process. The<br />
due process argument applies to all institutions in their treatment of workers. . . open access helps<br />
ensure that each member can benefit from critical feedback and also ensures that criticisms are<br />
made in a way that is responsible to canons of scholarly objectivity. . .[95]<br />
Fashing wrote:<br />
. . . If we allow people to require anonymity as the price for the exercise of candor and<br />
professional responsibility, then surely we encourage a pernicious form of cowardice. Are our<br />
sensibilities so delicate that they cannot contend with the requirement to render our negative<br />
judgments openly and honestly with whatever risk that entails? And if they are, should we<br />
continue to encourage such delicacy or should we begin to require a modicum of courage to go<br />
with our “candor”? I for one believe we should. . . the requirement of anonymity raises serious<br />
questions of credibility in its own right. Why should we believe that anonymity is the price of<br />
honesty any more than that it is an opportunity for dishonesty? . . .<br />
. . . there are compelling reasons for confronting intellectual, professional, and . . . personal<br />
differences as a minimal requirement for the development of any serious sense of community. This<br />
will no doubt produce some unpleasant moments in the context of whatever conflicts surface, but<br />
what group that constitutes a serious community, or perhaps more importantly, a community to<br />
be taken seriously, especially in intellectual terms, is without conflict? That consensus about all<br />
issues is unnecessary to the maintenance of a healthy community is recognized by all but the most<br />
resolutely conservative members of the academy. To address such differences and to resolve<br />
them, or in the case of intellectual differences, to provide a climate in which debate and conflict of<br />
opposing ideas are a catalyst for intellectual growth and creativity, strikes me as the essence of<br />
academic community and a primary requirement for intellectual and academic freedom. In this<br />
sense disclosure should promote rather than retard intellectual excellence. (222)[96]<br />
6. The ultimate goal: communion of “top” talent in “top” institutions<br />
The counter-argument to the foregoing is that the most competitive programs have to select the<br />
most competitive trainees.<br />
Why? Even assuming that the selection-process be valid, a dubious proposition, what ultimate<br />
utility is there in aid of quality of patient-care in concentrating “top talent” in “top institutions?”<br />
Isn't that just elitism run amok? What about spreading the wealth, if that's what it is (a dubious<br />
proposition), around a little? Wouldn't the “non-competitive” trainees gain from exposure to “top
institutions” and wouldn't “competitive trainees,” if they offer any genuine advantage over<br />
non-competitive trainees, be able to work their magic in institutions in more humble locations?<br />
I've interacted with finished physicians from a broad range of institutions and I'm constantly<br />
impressed with how alike they are. Physicians from Harvard, Yale and other Ivy League<br />
institutions are no great shakes and some of the most impressive come from the hinterlands. What<br />
was all the fuss about during education and training, then?<br />
7. Illustrative anecdote which is more typical than it should be<br />
When I worked as a civilian in the ER of the military hospital, Fort Stewart, GA, my military<br />
supervisor, a Major in the Army Medical Corps, liked me pretty well at first but seemed to dislike<br />
me more and more as time went on, evidently because of conflicts that swirled around me.<br />
He criticized my handwriting, so I brought in a word-processor to write up my charts and make<br />
them optimally legible. He didn't stop me from doing that but, long after I'd left there, I obtained<br />
copies of my personnel-records, including documentation of his commentary on the episode.<br />
Without explaining what he intended, he put an exclamation point after the statement, “he brought in a<br />
word-processor!” I gather he disapproved of my constructive response to his criticism, yet he<br />
suggested no alternative. What did he want from me? Did he expect me suddenly to develop<br />
handwriting like his? He never explained.<br />
In perhaps the emblematic episode of my tenure there, I pissed off one of his fellow Army-officers<br />
by calling him in at night to attend a female patient of his, whom I wanted admitted for evaluation and<br />
monitoring of chest-pain that I suspected had a cardiac origin. He chewed me out for<br />
disturbing his sleep and wanted me to release her home without forcing him to come in and<br />
examine her. He claimed to know her so well that he KNEW that her chest-pain was not cardiac<br />
but, instead, was from her COPD. The rules, not of my making, required him to come in and<br />
examine a patient whom the ER-physician suspected of requiring admission. Under protest, he<br />
came in, chewed me out some more in front of nurses and other personnel and released her home.<br />
A few weeks later, her cardiac catheterization at Fort Gordon revealed severe coronary artery<br />
disease. I had committed an unpardonable sin: being right when an army-doctor was wrong.<br />
It's not as if this were a diagnostic coup. It could hardly have been more stereotypical. She had<br />
chest-pain, reminiscent of cardiac chest-pain. It was bread-and-butter medicine. She needed<br />
admission for the sake of safety. The officer fulfilled his paper-duty under protest by getting out<br />
of bed and examining the patient. He failed in his duty to admit her for monitoring.<br />
I pissed off a pediatrician by calling him in at night a few times to attend febrile infants who I<br />
thought might need admission, as a posted directive required me to do. Whether the patient's<br />
condition is serious enough to warrant admission is a matter of judgment and, if I think the patient<br />
needs admission, the pediatrician may disagree. I assumed that to be in the realm of disagreement<br />
among reasonable people. He evidently disagreed, even with that principle, probably because he<br />
was the pediatrician on call and fulfilling his duty required him to exert unwelcome effort. He<br />
impugned my “judgment,” as a tactic in his campaign. He sent all the patients I referred to him
home, possibly as a way of accumulating incompetence-points against me. Those incidents<br />
illustrate the principle, universal, in my observation, that hospital-personnel pay abundant<br />
lip-service to concern for quality of patient-care but their actions bespeak only concern for their<br />
own convenience.<br />
Thus, I accumulated “complaints” against me but the hospital never preferred any charges against<br />
me or offered me a peer-review hearing at which to rebut such charges, presumably because the<br />
notion would have been absurd, even to Army-brass.<br />
Hypothetical charge 1: diagnosing chest-pain as cardiac which later proved to be cardiac but<br />
pissing off Army-Officer in the meantime by calling him in at night to do his duty. Charge 2:<br />
complying with posted hospital-directive by calling in Army-Officers in relevant specialties<br />
“unnecessarily,” and thereby pissing them off, on nights when they're on call to attend patients,<br />
possibly appropriate for admission, and to render their opinions.<br />
Instead of taking a formal route, they chose a typical bureaucratic route: my supervisor completed<br />
consecutive evaluation-reports in secret and never discussed them with me. The personnel-records<br />
I obtained years later exhibited an unmistakable halo-effect: In all components, from “medical<br />
knowledge” and “rapport with staff” to “health” and “appearance,” the ratings descended in<br />
parallel from 9 or 10 of 10, steadily downward, to end at about 3 or 4 out of 10, under the<br />
influence of multiple complaints of pissing off Army-physicians by asking them to do their duty.<br />
That is, each evaluation-cycle, my supervisor assigned all components the same rating: all 9s, all<br />
8s, all 7s, all 6s and so forth. Yet, my “appearance” and “health” were verifiably the same<br />
throughout that time: fine and stable. He presented not a scintilla of evidence of my deteriorating<br />
health, for example, yet he “documented” its deterioration in his numerical ratings. This person<br />
had an MD-degree!<br />
Thereupon, enough poor pseudo-ratings had accumulated against me to “justify” my termination<br />
and to provide an ironclad “paper-trail,” in case I should have decided, at some point, to contest<br />
my termination legally.<br />
LORs that I requested from Fort Stewart stated only the dates of my employment there but made<br />
no mention whatever of my performance, e.g., my thoroughness and my diligence, for the benefit<br />
of patients, against the odds of dysfunctional military-bureaucratic obfuscation. Those LORs<br />
illustrate a fundamental principle of all LORs: LORs accommodate the needs of the ambient<br />
power-hierarchy, not of the subject thereof. That makes them inherently inaccurate. If the<br />
academic is honest with himself, he will concede that academic power-hierarchies exhibit similar<br />
manifestations.<br />
I could provide other anecdotes with similar import but I've gone on far too long already, so I'll<br />
stop.<br />
When will decision-makers cop themselves on to the inherent unfeasibility of rating human<br />
beings?