10.04.2013 Views

Eric Grosch, Letter to Dr. Morgenstern on LOR - Semmelweis ...


DR. ERIC N GROSCH - A new approach to LOR

Dear Dr. Morgenstern:

I read your article[1] with interest. You wrote:

In considering a new approach to the pediatric LOR, COMSEP and the APPD welcome the input of the broader pediatric community.

Thanks for expressing an interest in receiving my comments, since I'm an internist, not a pediatrician. I'm pleased to contribute to the dialogue, which I think is important. I apologize in advance for the length of my text, but I think the topic warrants it. Quoted text is indented and my own unindented.

The LOR and its close relative, the performance-appraisal, are sacred cows in medical education, training and job-placement. They purport to provide means of communicating a candidate's traits among mentors -- the performance-appraisal among mentors within an institution; the LOR from mentors in one institution to those in another. The approach for each is much the same.

That purpose seems analogous to the medical record in patient-care, which provides a means of communicating a patient's disease-traits among the patient's physicians. The analogy is false for reasons I cite, in the chart, below:

Clinical chart | LOR/performance-appraisal

appraisal and appropriate action at least daily | appraisal summative, end of rotation, presented usually after the fact, too late for improvement

goal is improvement of the patient | goal varies between promotion of the candidate to his elimination from consideration

Relies on objective evidence for decision-making | Often relies on rumor, innuendo, scuttlebutt for decision-making

Documentation is as long as is necessary | Documentation is as brief as possible to save reading time

Documentation is in terms of specific clinical events | Documentation is in terms of unsubstantiated opinions, couched in generalities, of mentors, peers, etc.

On the evidence that I've examined, I believe that there are better ways than the LOR and the performance-appraisal to accomplish the mission. I don't consider that a flippant belief. I've arrived at my opposition to the LOR/performance-appraisal through considerable thought, reading and anecdotal experience, both as an author and subject of LORs. Most of what I say here is in the public domain and obvious. I get the impression that nobody ever puts it together, so I've done that, though perhaps incompletely. If you have any objections to what I've said here, please let me know them.

I divide my reasons into generic sections:

1. Golden Rule: Do unto others as you would have others do unto you. The Golden Rule[2], alone, should persuade anyone with any insight into the treatment that he would ideally prefer for himself, that the performance-appraisal/LOR can never work.

2. Comparisons are always odious.

3. Deleterious effect: The idea of performance-appraisal is fundamentally flawed, even dysfunctional, because of the often deleterious effect it has on those trainees that appraisers rate as less than the very best, even though quality of performance is a lottery, governed in large part by random chance. Accordingly, rating people who are of the system makes no sense.

4. Improper substitute for “where do I stand”: LORs and performance-appraisals serve the organization or institution, not the individual appraised.

5. Inaccuracy:

a. misapplication of the Likert-scale principle

b. inevitability of rating-inflation

c. popularity-contest

d. mismeasure of “excellence”

e. mentor-inattention: Mentors, who are supposed to do the evaluations and LORs, don't pay enough attention to their trainees' performance to fulfill that function adequately because their contact with trainees is minimal and sporadic, so their appraisal of the performance of their trainees is most often inaccurate and may even reverse the reality on the ground.

f. self-fulfilling prophecy

g. absence of evidence-basis

h. glittering generalities: Even if mentors paid attention to trainees' performance, the rating systems that they use address only glittering generalities, such as “general medical knowledge,” require presentation of no supporting evidence and rarely to never address the only index of work-performance in medicine, namely, clinical outcomes of patients under the trainees' care.

i. distortion from “confidentiality,” under perpetual tension

6. The ultimate goal: communion of “top” talent in “top” institutions

7. Illustrative anecdote which is more typical than it should be

1. Golden Rule[2]

The performance-appraisal and the LOR are exercises in disregarding the needs of others and in attributing to others the character and nature of objects. The psychic mechanisms that prompt those in authority to impose performance-appraisal/LOR on others -- what they would not want for themselves -- are obscure but the most likely reason seems to be that the very act of imposition may, in and of itself, provide a pleasurable and ego-boosting exercise of arbitrary authority.

Whatever the psychic mechanisms, the proof of the observation appears in the contrast between the AMA's consistent endorsement of peer-review for physicians they presume, generically, to be “bad doctors” and its disparagement of Professional Standards Review Organizations (PSROs). For example, it is a matter of record that the AMA was a strong supporter of enactment of the Health Care Quality Improvement Act of 1986 (HCQIA), which codified peer-review provisions from hospital bylaws into federal law. The AMA initially supported the HCQIA because it supports peer-review of “bad doctors” but opposed one feature of that act, the National Practitioner Data Bank (NPDB), and withdrew its support altogether from the HCQIA over that issue, but it eventually obtained a quid pro quo: NPDB as well as absolute immunity from liability for hospital-level peer-reviewers, a provision that has led to the proliferation of bad-faith peer-review.[3]

At the same time, JAMA and other journals have published articles that have impugned the accuracy of the findings of PSROs, the agents of which may conduct peer-review on any practicing physician, including one whom the AMA would presume to be a “good doctor.” Neither JAMA nor any other medical journal has published even one article that has examined the accuracy or validity of hospital-based peer-review, which the AMA enthusiastically approves because statutes render such peer-review privileged/confidential.

2. Comparisons are always odious

Farrell[4] noted the emotional effect of sex-reversed beauty-contests among men. The winner was, of course, ecstatic, enthusiastic and high in self-esteem but the runners-up felt devastated at the relative rejection and they experienced an epiphany: why women (at least those of runner-up grade [or less] physical appearance) dislike customary beauty-contests. Comparisons are always odious because the comparison-game is a zero-sum proposition. Performance-appraisal is an appraisal of something that the evaluee can control to some extent by his conscious will, as opposed to appearance, which he can't, so it's marginally less pernicious than a beauty-contest but not much. The fact remains that the better one individual rates, the worse others do. It's inescapable.

The performance-appraisal/LOR in most fields, including medical education, always hinges on comparing one trainee, by various criteria, with others. The comparison may appear as a class-rank or as a comparison of the subject's rating with an ideal rating, e.g., 6 of a possible 10 points, 4 of a possible 5 points, etc. The message is always: “You don't measure up.”

That message is especially demoralizing to the usual medical trainee, since the very fact that he's survived to the stage of medical training means that he has already survived very stringent selection/exclusion filters and thus become accustomed to superlative accolades in his early education, through schooling and his undergraduate years. Once he is thrown into the midst of similar high-achievers, the normative performance is likely to be uniformly high and he may rate merely “average.”

3. Deleterious effect:

Deming lists the performance-appraisal (and, by implication, also the customary LOR) among the deadly diseases of business-organizations. The same ideas apply, in spades, to medical organizations and to LORs, which are retrospective, summative performance-appraisals, frozen and immutable, in perpetuity:

...the deadly diseases...

3. Evaluation of performance, merit rating...Many companies in America have systems by which everyone...receives from his superior...a rating...(101) Management by objective leads to the same evil...Management by fear would be a better name...(Deming 1986, 107)

Fair rating is impossible. A common fallacy is the supposition that it is possible to rate people; to put them in rank order of performance for next year, based on performance last year.

The performance of anybody is the result of a combination of many forces -- the person himself, the people that he works with, the job, the material that he works on, his equipment, his customer, his management, his supervision, environmental conditions (noise, confusion, poor food in the company's cafeteria). (109) These forces...[which]...arise almost entirely from action of the system...will produce...large differences between people....A man not promoted is unable to understand why his performance is lower than someone else's. No wonder; his rating was the result of a lottery. Unfortunately, he takes his rating seriously...(Deming 1986, 110)

The effect is devastating:

It nourishes short-term performance, annihilates long-term planning, builds fear, demolishes teamwork, nourishes rivalry and politics.

It leaves people bitter, crushed, bruised, battered, desolate, despondent, dejected, feeling inferior, some even depressed, unfit for work for weeks after receipt of rating, unable to comprehend why they were inferior. It is unfair, as it ascribes to the people in a group differences that may be caused totally by the system that they work in.

...what is wrong is that the performance appraisal or merit rating focuses on the end product, at the end of the stream, not on leadership to help people. This is a way to avoid the problems of people. A manager becomes, in effect, manager of defects.

The idea of merit rating is alluring. The sound of the words captivates the imagination: pay for what you get; get what you pay for; motivate people to do their best, for their own good.

The effect is exactly the opposite of what the words promise. Everyone propels himself forward, or tries to, for his own good, on his own life preserver. The organization is the loser.

Merit rating rewards people that do well in the system. It does not reward attempts to improve the system. Don't rock the boat.

...a merit rating is meaningless as a predictor of performance, except for someone that falls outside the limits of dif- (102) ferences attributable to the system that the people work in...

Traditional appraisal systems increase the variability of performance of people. The trouble lies in the implied preciseness of rating schemes...Somebody is rated below average, takes a look at people that are rated above average; naturally wonders why the difference exists. He tries to emulate people above average. The result is impairment of performance. (103)

...The problem lies in the difficulty to define a meaningful measure of performance. The only verifiable measure is a short-term count of some kind...(Deming 1986, 103)

Degeneration to counting. One of the main effects of evaluation of performance is nourishment of short-term thinking and short-time performance...(103) A man must have something to show. His superior is forced into numerics. It is easy to count. Counts relieve management of the necessity to contrive a measure with meaning.

...people that are measured by counting are deprived of pride of workmanship. Number of designs that an engineer turns out in a period of time would be an example of an index that provides no chance for pride of workmanship. He dare not take time to study and amend the design just completed. To do so would decrease his output. (105)

A good rating for work on new product and new service that may generate new business five or eight years hence, and provide better material living, requires enlightened management. He that engages in such work would study changes in education, changes in style of living, migration in and out of urban areas. He would attend meetings of the American Sociological Society, the Business Section of the American Statistical Association, the American Marketing Association. He would write professional papers to deliver at such meetings, all of which are necessary for the planning of product and service of the future. He would not for years have anything to show for his labors. Meanwhile, in the absence of enlightened management, other people getting good ratings on short-run projects would leave him behind. (Deming 1986, 106)

Stifling teamwork. Evaluation of performance explains...why it is difficult for staff areas to work together for the good of the company. They work instead as prima donnas, to the defeat of the company. Good performance on a team helps the company but leads to less tangible results to count for the individual. The problem on a team is: who did what?

How could the people in the purchasing department, under the present system of evaluation, take an interest in improvement of quality of materials for production, service, tools, and other materials for nonproductive purposes? This would require cooperation with manufacturing. It would impede productivity in the purchasing department, which is often measured by the number of contracts negotiated per man-year, without regard to performance of materials or services purchased. If there be an accomplishment to boast about, the people in manufacturing might get the credit, not the people in purchasing. Or, it could be the other way around. Thus...teamwork, so highly desirable, can not thrive under the annual rating. Fear grips everyone. Be careful; don't take a risk; go along.

Heard in a seminar. One gets a good rating for fighting a fire. The result is visible: can be quantified. If you do it right the first time, you are invisible. You satisfied the requirements. That is your job. Mess it up, and correct it later, you become a hero.

Two chemists work together on a project, and write up their work as a scientific paper. The paper is accepted for a meeting in Hamburg...only one of the pair may go to Hamburg to deliver the paper -- viz., the one with the higher rating. The one with the lower rating vows never again to work close with anyone else.

Result: every man for himself.

Evaluation of performance nourishes fear. People are afraid to ask questions that might indicate any possible doubt about the boss's ideas and decisions, or about his logic. The game becomes one of politics. Keep on the good side of the boss. Anyone that presents another point of view or asks questions runs the risk of being called disloyal, not a team player, trying to push himself ahead. Be a yes man.

Top levels of salaries and bonuses are in many American companies sky-high. It is human nature for a young man to aspire...to...one of these positions. The only chance to reach a high level is by consistent, unfailing promotion, year after year. The aspiring man's quest is not how to serve the company with whatever knowledge he has, but how to get a good rating. Miss one raise, you won't make it: Someone else will. (108)

A man dare not take a risk. Don't change a procedure. Change might not work well. What would happen to him that changed it? He must guard his own security. It is safer to stay in line.

The manager, under the review system, like the people that he manages, works as an individual for his own advancement, not for the company. He must make a good showing for himself.


Another Irving Langmuir? Can American history, under handicap of the annual rating, produce another Irving Langmuir, a Nobel Prize winner, or another W. D. Coolidge? Both these men were with the General Electric Company. Could the Siemens company produce another Ernst Werner von Siemens?

...It is worthy of note that the 80 American Nobel prize winners all had tenure, security. They were answerable only to themselves. (Deming 1986, 109)

“It can't be all bad.”...top management delay[s abolition]...of the annual rating of performance...by refuge in the...corollary that “It can't be all bad. It put me into this position.”...He reached this position by coming out on top in every annual rating, at the ruination of the lives of a score of other men. There is a better way.

Modern Principles of Leadership...will replace the annual performance review. The first step...will be to provide education in leadership. The annual perfor- (116) mance review may then be abolished. Leadership will take its place...

The annual performance review sneaked in and became popular because it does not require anyone to face the problems of people. It is easier to rate them; focus on the outcome...Western industry needs...methods that will improve the outcome. Suggestions follow.

1. Institute education in leadership; obligations, principles, and methods.

2. More careful selection of the people in the first place.[5]

It seems difficult to imagine selecting medical trainees by methods any more careful than current ones.

3. Better training and education after selection.

4. A leader, instead of being a judge, will be a colleague, counseling and leading his people on a day-to-day basis, learning from them and with them. Everybody must be on a team to work for improvement of quality in the four steps of the Shewhart cycle:...

In the absence of numerical data, a leader must make subjective judgment. A leader will spend hours with every one of his people. They will know what kind of help they need. There will sometimes be incontrovertible evidence of excellent performance, such as patents, publication of papers, invitations to give lectures.

People that are on the poor side of the system will require individual help...(Deming 1986, 117)

a. What could be the most important accomplishments of this team? What changes might be desirable? What data are available? Are new observations needed? If yes, plan a change or test. Decide how to use the observations.

b. Carry out the change or test decided upon, preferably on a small scale.

c. Observe the effects of the change or test.

d. Study the results. What did we learn? What can we predict?...(Deming 1986, 88)

5. A leader will discover who if any of his people is (a) outside the system on the good side, (b) outside on the poor side, (c) belonging to the system. The calculations required...are...simple if numbers are used for measures of performance. Ranking of people...that belong to the system violates scientific logic and is ruinous as a policy,...

In the absence of numerical data, a leader must make subjective judgment. A leader will spend hours with every one of his people. They will know what kind of help they need...

People...on the poor side of the system will require individual help....(Deming 1986, 117)...

7. Hold a long interview...three or four hours, at least...not for criticism, but for help and...everybody[’s]...better understanding...

8. Figures on performance should be used not to rank the people...that fall within the system, but to assist the leader to accomplish improvement of the system...(118)

...Running a company on visible figures alone (counting the money). One can not be successful on visible figures alone...he that would run his company on visible figures alone will in time have neither company nor figures.

...the most important figures...are unknown and unknowable..., but successful management must nevertheless take account of them. Examples.

Fallacies of reward for winning in a lottery. A man in the personnel department of a large company came forth with an idea, held as brilliant...to reward the top (274) man of the month on a certain production line (the man that made the lowest proportion defective over the month) with a citation. There would be a small party on the job in his honor, and he would get half a day off. This might be a great idea if he were indeed an unusual performer for the month. There were 50 men on the production line.

Do the results of inspection of their work form a statistical system...? If the work of the group forms a statistical system, then the prize would be merely a lottery...if the top man is a special cause on the side of low proportion defective, then he is indeed outstanding. He would deserve recognition, and he could be a focal point for teaching men how to do the job.

There is no harm in a lottery...provided it is called a lottery. To call it an award of merit when the selection is merely a lottery...is to demoralize the whole force, prize winners included. Everybody will suppose that there are good reasons for the selection and will be trying to explain and reduce differences between men. This would be a futile exercise when the only differences are random deviations, as is the case when the performance of the 50 men form[s] a statistical system.

(Deming 1986 275) [5]
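Deming's question, whether the top man is a special cause or merely a lottery winner, can be made concrete with a short calculation. The following Python sketch is an illustration only, not Deming's own procedure: it assumes the conventional 3-sigma limits of a p-chart, and the function name `classify_workers`, the inspection counts and the defect counts are all hypothetical.

```python
import math


def classify_workers(defectives, inspected):
    """Sort workers into Deming's three groups using 3-sigma p-chart limits."""
    # Pooled proportion defective for the whole line (the "system").
    p_bar = sum(defectives) / sum(inspected)
    groups = {"good_side": [], "poor_side": [], "within_system": []}
    for worker, (d, n) in enumerate(zip(defectives, inspected)):
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        lower = max(0.0, p_bar - 3 * sigma)
        upper = min(1.0, p_bar + 3 * sigma)
        p = d / n
        if p < lower:
            groups["good_side"].append(worker)      # (a) outside, good side
        elif p > upper:
            groups["poor_side"].append(worker)      # (b) outside, poor side
        else:
            groups["within_system"].append(worker)  # (c) belongs to the system
    return groups, p_bar


# Hypothetical month: 50 men, 1000 items inspected per man.
inspected = [1000] * 50
# Workers 0..47 produce between 45 and 55 defectives (invented pattern);
# worker 48 is the "top man" with 38, worker 49 has 90.
defectives = [45 + (i * 7) % 11 for i in range(48)] + [38, 90]

groups, p_bar = classify_workers(defectives, inspected)
print("pooled proportion defective:", p_bar)
print("outside on the good side:", groups["good_side"])
print("outside on the poor side:", groups["poor_side"])
print("top man within the system:", 48 in groups["within_system"])
```

With these invented counts, the man with the lowest proportion defective (worker 48) still falls within the limits, so a prize for him would be a lottery, whereas worker 49 falls outside on the poor side and would need individual help.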

In a similar vein, Ierodiakonou and Vandenbroucke term medicine a stochastic art:

Ancient Greek philosophers thought that medicine was an art with peculiar characteristics, and they called medicine a stochastic art. A doctor might treat a patient conscientiously according to all learned precepts; yet the patient's condition might deteriorate. Another patient might be treated rather carelessly by another doctor; yet the patient might regain full health. Thus, in medicine there exists unpredictability between means and ends. By contrast with other arts a diligent execution of the tasks does not guarantee a good outcome, and vice versa...

...we have long witnessed a debate on the right way to measure the quality of medical care: should we use outcome or process criteria?...For instance, a few years ago, (542) a series of outcome investigations was started in the USA. Presumably, some administrators had been convinced that even in health care, quality of performance should be measured according to strict outcome criteria, as is practised in the Japanese car industry, for example. The simplest outcome measure was mortality in hospital. Third party payers such as the Health Care Financing Administration, which administrates Medicare, started to rank hospitals according to mortality rates for specific procedures. The mere idea sent ripples of alarm through the American Medical Association (AMA). Do not we intuitively know that medical centres with the highest reputation attract patients whose illnesses are close to being beyond rescue? Advanced epidemiological techniques have proved that differences in hospital mortality can be explained away by adjustment for differences in patient mix. To use outcome as a means to monitor quality would necessitate continuous evaluation of all individual patient characteristics. This process would be a gigantic research effort, close to the examination of treatments in randomised controlled trials, and would defy all realistic efforts at quality assurance. The whole armamentarium of epidemiology and statistics, such as randomisation, matching, blinding, placebo-procedures, strict selection criteria, and modelling, aims at mastering the stochastic elements that confound our judgment...[6]
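The quoted point about patient mix can be illustrated with invented numbers. The sketch below assumes two hypothetical hospitals and two risk strata; the counts, the names `crude_rate` and `standardized_rate`, and the choice of direct standardization to the pooled patient mix are assumptions made for illustration, not taken from the cited article.

```python
# Hypothetical data: stratum -> (patients, deaths) for each hospital.
hospital_a = {"low_risk": (100, 2), "high_risk": (400, 80)}   # referral centre
hospital_b = {"low_risk": (400, 12), "high_risk": (100, 25)}  # community hospital


def crude_rate(hospital):
    """Overall mortality, ignoring patient mix."""
    patients = sum(n for n, _ in hospital.values())
    deaths = sum(d for _, d in hospital.values())
    return deaths / patients


def standardized_rate(hospital, reference_mix):
    """Mortality the hospital would show if it treated the reference mix."""
    total = sum(reference_mix.values())
    expected_deaths = sum(
        (deaths / patients) * reference_mix[stratum]
        for stratum, (patients, deaths) in hospital.items()
    )
    return expected_deaths / total


# Reference mix: the pooled patients of both hospitals, stratum by stratum.
reference_mix = {
    stratum: hospital_a[stratum][0] + hospital_b[stratum][0]
    for stratum in hospital_a
}

crude_a, crude_b = crude_rate(hospital_a), crude_rate(hospital_b)
adj_a = standardized_rate(hospital_a, reference_mix)
adj_b = standardized_rate(hospital_b, reference_mix)

print("crude mortality:    A", crude_a, " B", crude_b)
print("adjusted mortality: A", adj_a, " B", adj_b)
```

In this invented case the referral centre has the higher crude mortality but the lower mortality within each stratum, so a ranking on crude mortality reverses once patient mix is taken into account.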

Why should anyone treat the performance of trainees as deterministic rather than stochastic, when the stochastic vagaries of patient-characteristics (case-mix) must influence it in any individual instance? The difference is that hospital-administrators have an influential voice in such decision-making. The individual trainee does not.

Under pervasive fear of rating, the trainee dares not ask a question that he thinks the rater might think foolish. He has to confine his questions to those he considers “intelligent,” posed with the purpose of impressing the rater with his insight and wisdom, beyond his years. The good regard of the rater is the trainee's life-preserver on a stormy sea of insecurity. The trainee walks on eggshells, fearing that his every move, his every word is an element in a cumulative body of chit-marks that may eventually torpedo his reputation. If he should get a black mark against him, for any reason, and the rater learns of it, the trainee shall thereby have lost the rater's support and enter career free-fall. Accordingly, he pre-edits all questions for acceptability before letting them out of his mouth. If he can't think of a zinger of a question, he'll most likely stay mum and live in ignorance about a broad variety of subjects, out of fear of asking a “stupid question.”

MIT, Caltech and other high-prestige institutions have experimented with a pass-fail grading system because they understood the inherent absurdity of grading-systems, with their arbitrary cut-off points for each letter-designation. That represented a rejection of the very notion of grading. I'm not certain of the status at those institutions at the moment. Maybe their graduates have had difficulty translating their academic performance into terms that other institutions, which recognize grading, understand, so maybe they've gone back to grading.

4. Improper substitute for “where do I stand”

Coens and Jenkins deplore performance-appraisals and, by implication, also LORs:

At a recent quality conference, a CEO was questioned as to why his organization continued to use appraisals after shifting to a quality management culture of system and process improvement...“We think we owe it to people to let them know where they stand.”...(27)

What people really want is access to the knowledge and information that influences the organization's pay, promotion, and status systems and how these affect or apply to them...People are insatiably curious about Where do I stand? because, in most organizations, this query is decided with a maze of unspoken rules, inscrutable political influences and other dynamics of organizational life. Appraisal is not the system that drives pay, careers, and status; it is an incidental effect of those dynamic systems. Appraisal is...the paper-shuffling that sanctifies decisions already made.(28)[7]

In the medical-educational setting, the cognate of corporate pay and promotion is acceptance into a “top” (whatever that might mean) training program.

Too often, a trainee finds out, to his surprise or shock, where he stands only when he reads his retrospective performance-appraisal/LOR and, by then, it's too late to do anything about it. LORs and performance-appraisals have an especially pernicious effect on medical students, at that vulnerable stage in their development, but they're bad for any trainee and for any person.

5. Inaccuracy:

a. misapplication of the Likert-scale

“Likert” seems an unlikely choice for naming the method, since, in his 1932 article[8], the presumed basis of the eponym, Likert used the scale to canvass members of population-samples for aggregate ratings of their attitudes. Likert, himself, had the good sense not to apply such rating scales to important matters that could affect people's livelihoods, though his predecessors already had and his successors still do. Likert[9] credited prior authors, Fechner and Galton, without citing a reference, for the origination of such questionnaires, circa 1888. Scott introduced the system to the United States Army in the early part of the last (20th) century[10]. Paterson, an employee of the Scott-Company, described a later adaptation of Scott's method[11,12] for “objective” evaluation of job-performance, the purpose of interest here.

With no apparent insight into the inherent vagueness of the method, Paterson claimed to distinguish objective from subjective qualities without doing so:

objective qualities . . . “efficiency,” “originality,” “perseverance,” and “quickness” . . . subjective qualities . . . “courage,” “cheerfulness” and “kindliness.” . . .[12]

The criteria he cited for rating workers were similarly vague:

Ability to Learn, Quantity of Work, Quality of Work, Industry, Initiative, Co-operativeness, Knowledge of Work[13]

Strangely, Paterson disregarded the opportunity for evidence-based assessment of the criterion most amenable to objective evaluation, namely quantity of work, in terms, say, of the number of units the worker produces per unit-time. Instead, the instructions bade the rater give the worker a rating-score on that criterion, presumably to foster “uniformity”[14] with ratings of other criteria not amenable to objective assessment.

Other authors described errors and pitfalls inherent in the method. Thorndike first described the halo effect as a

. . . constant error toward suffusing ratings of special features with a halo belonging to the individual as a whole[15]

He found even the most capable rater

unable to treat an individual as a compound of separate qualities and to assign a magnitude of each . . . in independence of the others.[16]

As a countermeasure to minimize the halo-effect, he exhorted:

. . . the observer should report the evidence, not a rating, and the rating should be given on the evidence to each quality separately without knowledge of the evidence concerning any other quality in the same individual.[16]

Thorndike did not explain how a rater could avoid having knowledge of ratings he had given on other criteria listed on the same form. The “evidence” Thorndike had in mind consisted of vague descriptive adjectives, similar to those that Paterson cited,[17] that the rater had a general impression might apply to the ratee.

Kingsbury addressed accuracy:

. . . ratings as ordinarily made are . . . unreliable, and . . . only under what may be called ideally favorable conditions will they approximate accuracy, even on a scale so gross as one of five divisions.[18]

Kingsbury enumerated those allegedly ideal conditions:

Ratings, to be reliable, necessitate (1) averaging three independent ratings, each made on an objective scale; (2) these scales must be comparable and equivalent, made in conference under expert supervision; (3) the three raters must be competent to rate.[18]

Paterson joined, with slightly different wording, in affirming Kingsbury's ideal conditions (2) and (3) and in thus implicitly alluding to pitfalls of the method:

. . . Ratings should be accepted and filed for use only from those who have proved themselves capable of accurately judging human qualities. . . a rating scheme will not work automatically. It must be closely supervised preferably by trained personnel research workers who must continually subject the ratings to critical analysis and assist in training executives in proper use of the method. There is no escape from this requirement.[19]

Paterson and Kingsbury did not specify what the training they proposed for the personnel-research workers should comprise and accomplish. Presumably they intended, among other things, that the trained supervisors should somehow ensure separate evaluation of labeled traits, to exclude Thorndike's halo-effect, and then, by averaging, fine-tuning, adjustment and manipulation of the scores from at least three raters, all of whom knew how to provide accurate ratings (presumably assessed by the raters' mutual agreement on each candidate's score on each criterion), obtain a set of ratings consistent with the aggregate global impression each candidate made on the raters (the candidate's halo). The circularity of the rationale seems inescapable.

Prior to receiving requests to fill out forms consisting of Likert-scale ratings on others' performance, I had never received any of the extensive training or testing to prove myself “capable of accurately judging human qualities,” nor, I daresay, has any appraiser of my performance received such training and testing, to my knowledge. The originators of such forms seemed to assume that the rating schemes would “work automatically,” contrary to Paterson's admonition.

Rugg may have had more insight:

. . . The unordered -- yes, the chaotic -- character of the judgments appears, irrespective of what traits are considered or of what kinds of scales are compared. I now believe that the evidence establishes the futility of obtaining single “ratings” on point scales of such dynamic qualities as “intelligence,” “personal qualities,” “general work,” and the like.[20]

Paterson cautioned and predicted:

These rating methods should not be looked upon as perfect or final. Further research is necessary, and industry will profit . . . as progressive, experimentally minded executives realize the scope of the problem and engage in the necessary research . . . to develop newer and more reliable methods than we now possess.[21]

The progress Paterson envisioned has been slow in developing, as the medical-education evaluation-literature amply shows[22-28]. The Likert-scale remains alive, well and unimproved since Paterson, Kingsbury and Thorndike fretted over it and tortured it and since Rugg dismissed it as inherently invalid over eighty years ago.

Rating-criteria in medical education continue to be as vague as Paterson's, e.g., “general medical knowledge (1-5),” “procedural skill (1-5),” “rapport with patients (1-5),” “rapport with nurses (1-5),” “overall general impression (1-5)” (the most global “halo”-criterion of all) and the like. Many,[22-28] though not all,[29,30] current users and discussants of the Likert-scale treat it as an axiomatically good and self-explanatory scheme.

In current medical-education usage, mentors rate trainees and trainees rate mentors without any expert supervision -- in disregard of Kingsbury's ideal conditions[18], Paterson's precautions[19] and Rugg's skepticism[20] -- perhaps in imitation of Likert[8], who may have felt justified in ignoring the precautions, conditions and invalidity of the method for objective evaluation because he pursued only subjective attitudes rather than purportedly objective traits. Yet those who publish studies based on Likert-scale “data” apply the numerical scores derived as if they were facts and manipulate them with parametric statistics as if they were not ordinal[22-28].
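The objection to treating ordinal scores as interval data can be shown with a small, invented example. In the sketch below, the two sets of ratings and the alternative coding `recode` are hypothetical; the only assumption is that any order-preserving relabeling of the five categories is as defensible as the labels 1 to 5.

```python
from statistics import mean, median

# Hypothetical ratings of two trainees by five raters on a 1-5 scale.
ratings_a = [3, 3, 3, 3, 3]
ratings_b = [1, 1, 5, 5, 5]

# An alternative numeric coding that keeps the categories in the same order.
recode = {1: 1, 2: 5, 3: 8, 4: 9, 5: 10}


def higher(stat, a, b):
    """Return which trainee the given summary statistic ranks higher."""
    score_a, score_b = stat(a), stat(b)
    if score_a == score_b:
        return "tie"
    return "A" if score_a > score_b else "B"


recoded_a = [recode[r] for r in ratings_a]
recoded_b = [recode[r] for r in ratings_b]

mean_original = higher(mean, ratings_a, ratings_b)
mean_recoded = higher(mean, recoded_a, recoded_b)
median_original = higher(median, ratings_a, ratings_b)
median_recoded = higher(median, recoded_a, recoded_b)

print("mean,   coding 1-5:      ", mean_original)
print("mean,   alternative code:", mean_recoded)
print("median, coding 1-5:      ", median_original)
print("median, alternative code:", median_recoded)
```

Under these assumptions the mean ranks trainee B higher with one coding and trainee A higher with the other, while the median, which uses only the order of the categories, ranks them the same way both times.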

Guilford[30] appears to equate Likert-scales with formal psychometric tests by including them in his book, entitled “Psychometric Methods.” Worse, many purveyors of Likert-ratings seem to follow Thorndike[9] in attributing traits to ratings, a tendency Tryon deplores even for psychological tests, which are more formal than Likert-ratings:

The test-trait fallacy [consists in presuming] that test scores provide measures of enduring and generalized characteristics of the person, called traits. . .

The test-trait fallacy begins with the assumption that test scores are trait measures. The second assumption is that trait measures are basic properties of the person. It easily follows that test scores reflect basic properties of the person. . . hence a measurement is reified into a causal force. . . . the unsound logic of drawing inferences about ability on the basis of observed performance is integral to the test-trait fallacy. . .[32]

Traits are alluring because they are . . . compatible with the stimulus-organism-response paradigm to which virtually all psychologists subscribe. . . To presume that psychological tests . . . measure organismic traits and to further presume that such traits are the basic properties that cause behavior is to place the psychologist in an attractively powerful theoretical and clinical position. The volume of psychological tests . . . is evidence of their allure for clinicians and researchers alike.[33]

Authors even apply statistical methods to aggregate number-scores from a group of raters, compute inter-observer correlations and the like. Literature-approval of Likert-scale “data” encourages decision-makers to attach unwarranted worth to Likert-scale merit-ratings and serenely to apply them in life-altering decisions touching subordinate trainees,[22] such as recommendation for certifying examinations and employment, and even in promoting faculty-members[25].

Albanes[34] suggests that “real life” ratings, presumably of qualified physicians, are objective and based on outcomes, yet Carey[35] asserts that evaluations of physician-faculty must be subjective. Codman[36] and his spiritual successors[37-41] have called for outcome-based rating of performance and, by extension, of competence, but physicians and hospitals have pointed out the deficits of that method and prevented its spread, to date, by citing the multiplicity of factors, unrelated to institutional or physician-competence, that determine outcome.[42]

The champions of rating attribute two roles to it, evaluative or summative (entailing punitive and deterrent purposes) and formative.[42] Paterson touted the formative purpose:

I. Rating methods have been developed because of a recognition of the educational value of ratings . . .

a. . . . on those who make the ratings. . . insures the analysis of subordinates in terms of the traits essential for success in the work.

b. . . . on the employee. . . encourages self-analysis and provides an incentive for self-improvement in . . . traits in which he is weakest.[35]

As educational feedback, rating fails to fulfill Ziegenfuss' proposed criteria for adequacy and efficacy:

. . . the art of feeding back quality-related data is a critical point of quality improvement work. . . Feedback is effective when the following conditions are met:

1. Clarity of Purpose. Data can be used for development or for rendering judgment (formative versus summative . . .). . . for . . . organizational development, . . . the purpose is . . . formative . . . Learning and change to improve processes is the goal. A judgmental purpose (summative) offers a . . . grade of pass or fail and is designed for accountability. . .[45]

Since “accountability” entails punishment[46], it does not belong in any workplace.[4] In education, by definition, the only appropriate purpose of feedback is the formative one. The Likert-rating, in its customary application, succeeds in the summative, punitive goal of criterion 1 but fails in its formative goal.

2. Clear and Specific Data. Data . . . must be . . . relevant to the . . . recipient.[45]

The vague expression, “general medical knowledge, 3” (or any other number) is unclear and non-specific, so rating fails criterion 2 and is not relevant to the recipient (see criterion 5).


3. Descriptive, Not Evaluative. Useful feedback describes what is happening but does not offer an evaluative judgment (unless that is the intended purpose). The presenters must not rush to judgment without some interactive discussion with the audience.[45]

The Likert-scale rating substitutes a judgment for relevant evidence and thus fails criterion 3.

4. Timely. How close to the action . . . reviewed are the data describing the events? The golden rule is quick feedback . . . Old data is useful for historical and longitudinal purposes but is not supportive of behavior change in the near term.[45]

The team, employed long-term, may tolerate a monthly, quarterly or semi-annual feedback-cycle. The medical student or other trainee, who often has monthly rotations in clinical departments, needs a shorter feedback-cycle. Feedback should be continuous; its formal aspect should occur at least weekly and, preferably, daily. The commonest Likert-rating comes to the ratee's attention as a summative, end-of-rotation event, delivered too late for him to implement improvement, so it fails criterion 4.

5. Limited. How great is the scope of the data? . . . tailor and focus the data to fit the specific, targeted needs of users. . .[45]

A Likert-rating, e.g., “general medical knowledge, 3,” is too vague and global to serve a ratee's needs. It invites the so-called halo-effect and fails criterion 5.

6. Comparative. . . To leave out comparative information is to deprive the recipients of knowledge about their progress or lack thereof. . .[45]

Albanes[34] deplored the rater's “failure to discriminate” among trainees in awarding them equal<br />

marks. He thereby pursued a similar goal of making distinctions for distinctions' sake alone and<br />

disregarded the “lottery”-nature of rating people who operate “within the system.”[5]<br />

Kingsbury likewise suggested:<br />

. . . we do have to make distinctions between people . . .<br />

. . . and the rater should realize that it is not so disastrous to make some employees 2 who are not<br />

much worse than some he marks 3, as it is to mark them all alike to avoid seeming to magnify the<br />

difference. . .[47]<br />

As Deming eloquently explains,[5] it's disastrous for an individual to suffer a low rating. A low<br />

rating may be especially crushing to a medical student, accustomed, as such a tender soul often is<br />

from the experience of a lifetime, to high academic ratings.<br />

If two or more employees or trainees perform equally well and very well, say, 5 of 5, they would<br />

deserve equal marks because equality of their performance reflects truth. The company, to which<br />

marking two or more employees alike, e.g. 5 of 5, may seem disastrous, can stand the gaff more<br />

easily than an individual arbitrarily marked down, despite his best effort, merely to “make<br />

distinctions between people.” Neither Albanes[34] nor Kingsbury[47] justified the need to make<br />

such distinctions. Each presumably considered the principle axiomatic and self-evident.<br />

Since the end-of-rotation Likert-scale rating arrives only once per rotation and since<br />

mentors might balk at the administrative burden of completing Likert-scale ratings more often<br />

than once monthly, it provides no sequential comparison and fails criterion 6.<br />

7. Participative Interpretation. . . . final . . . analysis can[not] be conducted without audience<br />

involvement. Joint interpretation is consistent with the developmental/formative purpose, as<br />

together we discuss meaning and follow-up action . . .[45]<br />

In the medical-education context, the Likert-scale rater rarely discusses his rating with his ratee(s)<br />

prior to entering it. It comes most often to the recipient's attention as a fait accompli, too late for<br />

him to improve it. The Likert-scale rating fails criterion 7.<br />

8. Safety and Security. Receiving performance feedback is . . . technical and . . . psychological . . .<br />

We need first to have the data correct (technical). . . Presenters must be sensitive to the<br />

psychology of the process and offer language and behavior that protect the recipients.[45]<br />

The Likert-scale rating inherently fails the technical criterion, since it consists of a set of numerical<br />

scores which obscures the evidence that purports to form its basis. Various errors, to wit, the<br />

halo-effect (supra) and tendency toward the mean[48,49] inhere in the Likert-scale.<br />

As applied, it most often fails the psychological criterion since the social-control function, which<br />

Albanes[34] advocated, is crucial to its deterrent/punitive function. To pull a punch at the moment<br />

of delivery would diminish or annihilate the crushing impact the rater can otherwise accomplish.<br />

9. Practical and Action Oriented. To be useful, the data should suggest some followup action and<br />

should be practical enough to be used by professionals in the field. . . [32,50]<br />

Having received a rating of, e.g., “general medical knowledge, 3,” the recipient can discern from<br />

the rating no idea of how to improve. The Likert-rating fails criterion 9.<br />

The evidence seems clear that ratings fail all of Ziegenfuss's rational criteria for effective feedback.<br />

b. inevitability of rating-inflation<br />

A universal human conceit holds that everybody's a fool and a moral pervert except for thee and<br />

me and I'm not so sure about thee. The individual expects others to rate him in a manner<br />

consonant with the intrinsic, superlative characteristics that he attributes to himself. When<br />

mentors, in a medical-education setting, rate him harshly, he feels helpless and often nonplussed<br />

and feels an urge to press his raters to improve his rating.<br />

Some years ago, Sissela Bok, philosopher and wife of Derek Bok, former President of Harvard<br />

University, addressed merit-ratings on “fitness-reports” in the US Army. Her context was “lying”<br />

and her example of a liar was the supervisor who rated his subordinates too highly on traits, such<br />

as “leadership,” “appearance,” etc., which are at least as nebulous as entities that raters in<br />

medicine attempt to address, e.g., “general medical knowledge,” “rapport with staff,” etc. They're<br />

all manifestations of the great tendency to generalize from skillful execution of a narrow scope of<br />

activities, such as getting high scores on tests, to global “excellence,” “outstandingness” or<br />

“bestness,” in general.<br />

Bok's description shows that your observation that “excellent” is a third-tier rating has a history:<br />

...Those who rate officers are asked to give them scores of “outstanding,” “superior,” “excellent,”<br />

“effective,” “marginal,” and “inadequate.” Raters know...that those who are ranked anything less<br />

than “outstanding” (say “superior” or “excellent”) are then at a great disadvantage, and become<br />

likely candidates for discharge...superficial verbal harmlessness combines with the harsh realities<br />

of the competition for advancement and job retention to produce an inflated set of standards to<br />

which most feel bound to conform. (Bok 73)<br />

...The US Army tried to scale down evaluations by publishing the evaluation report...cited. It<br />

suggested mean scores for the different ranks, but few felt free to follow these means in individual<br />

cases, for fear of hurting the persons being rated. As a result, the suggested mean scores once<br />

again lost all value. (Bok 74) [51]<br />

Professor Bok writes as a member of the establishment. LORs and performance-evaluations<br />

cause little to no worry to her and her husband, who have made it to the top of the academic<br />

heap, from which pinnacle they may comment on us, here below:<br />

In elite . . . organizations, the evaluation model tends to be elitism. Two lines of argument are<br />

involved. First, since the organizations have selected the best people, evaluation of performance is<br />

irrelevant. After all, if the best people could not succeed, who could do better? Second, since the<br />

quality of the organizations and their output is determined primarily by the quality of their<br />

people, attention to system, methods, or management is inconsequential. It follows that, if the<br />

organizations already have the best people, “the opportunities for increased productivity in them<br />

are small and come slowly.” Finally, . . . elitism tends to create self-perpetuating closed circles<br />

whose members are exempt from review except by peers within.<br />

Converting work problems into people problems is a process of denying organizational<br />

accountability. It is a process of establishing a hierarchy of special privilege and immunity to rank<br />

with the hierarchy of authority. It is a process of maintaining the status quo; it denies both the<br />

need for change and the possibility. (27)[52]<br />

Accordingly, Professor Bok focused on the “lies” perpetrated to help the plebeian but omitted any<br />

mention of organizational dishonesty: the rumor-grapevines, chiefly by telephone, which leave no<br />

paper-trail, and which circumvent and subvert the normal channels of committed, transparent,<br />

written communication, to which subjects of ratings on LORs might obtain access.[53]<br />

Personnel-managers use such underhanded means, which evade legal liability for defamation of<br />

character, to find out from former employers “what applicants are really like.”<br />

You wrote in your article[1] in nearly identical terms of the inevitable tendency toward rating<br />

inflation, your “hierarchy of superlatives,” the Lake Wobegon effect, in which everybody is<br />

“above average,” and the tendency toward rating-fragmentation to permit raters to distinguish,<br />

with phrases such as "is one of the finest medical students of the year," "...one of the best<br />

medical students I have ever worked with," "richly deserves the honors awarded in the rotation,"<br />

or "receives my highest recommendation",[54] among the best, the very best in the past year, the<br />

best ever, etc., etc. Speer et al cited grade-inflation in internal medicine as well:<br />

. . . a significant number of clerkship directors (43%) felt that we are unable to appropriately<br />

identify students with failing performances. The implication for our ability to certify students as<br />

clinically competent is concerning. . . (116)[55]<br />

That's evidently not their concern. They express more concern with labeling trainees clinically<br />

incompetent.<br />

. . . faculty were the key to both the cause and solution. (116)[55]<br />

That is a truer statement than Speer et al perhaps realized, though faculty would probably prefer<br />

to blame the trainee-victims.<br />

Yet, clinical medicine simply doesn't contain tasks of sufficient sophistication that trainees could<br />

perform that would enable a trainee to distinguish himself from his fellows to the extent<br />

depicted in all the finely nuanced and ever mounting expressions of enthusiasm. The difficulty<br />

would be quite similar to the difficulty of rating a patient in similar terms, according to his<br />

response to treatment. Objectively, he either gets better, stays the same or gets worse. It's difficult<br />

to imagine that an evaluator of patients could find rational criteria for appraising a patient's<br />

recovery as “excellent,” “outstanding,” “one of the best on the ward,” “one of the best in the past<br />

year,” “the best ever,” etc. If a rater can't do it for a patient, how can he do it for a trainee?<br />

Gould attributed the fallacy of confusing objects with labels to John Stuart Mill:<br />

The tendency has always been strong to believe that whatever received a name must be an entity<br />

or being, having an independent existence of its own. And if no real entity answering to the name<br />

could be found, men did not for that reason suppose that none existed, but imagined that it was<br />

something peculiarly abstruse and mysterious.[56]<br />

Gould cited the fallacy in noting that Binet, originator of IQ, intended none of the social elitism to<br />

which it has given rise.[56] Such reification of jargon is a prominent feature also of rating<br />

practice.<br />

c. popularity-contest<br />

What feats of clinical derring-do can a trainee, at any level, perform that would make him so much<br />

better than any of his contemporaries that he would qualify for such sterling and distinctive<br />

accolades as "is one of the finest medical students of the year," "is one of the best medical<br />

students I have ever worked with," "richly deserves the honors awarded in the rotation," or<br />

"receives my highest recommendation",[54] in contradistinction to his fellows, whose<br />

performance might rate a mere “excellent”?<br />

Did pediatric resident A miraculously heal a girl with Friedreich's ataxia so she never progressed<br />

and even achieved a normal gait? If so, how did he do it? By Divine Intervention? By Black<br />

Magic? By weird science? Miracle-healing, if accomplished, would obviously exceed customary<br />

expectations and be well above the upper control-limit of performance that Deming defines as “within<br />

the system.” Miracle-healing may thus warrant the highest accolades but even outstanding<br />

residents rarely to never perform it.<br />

The only realistic answer that comes to my mind is that the highly regarded trainee manufactures<br />

his high regard by ingratiating himself, through force of intrinsic personality or insidious, political<br />

means, into the rater's favor. The rater then comes to like the trainee personally so much that he's<br />

willing to go out on a limb for him with various superlative terms of enthusiasm, presumably<br />

assuming his performance be at least adequate. In other words, rating of trainees for<br />

personnel-records and the LOR are popularity-contests. Who has the bubbliest personality? Who<br />

is the most “well liked?”[57]<br />

Such a system could select for those who go along to get along and who may rate pleasing their<br />

administrative superiors to enhance the chance of their own advancement as more important than<br />

performing what's right for a patient, perhaps contrary to the will of their superiors. Such disregard<br />

of objectively correct performance may lead to deterioration of quality of patient-care, ostensibly<br />

the opposite of rational goals for a health-care system.<br />

d. mismeasure of “excellence”<br />

Mere prattle without practice doesn't necessarily transfer well to good real-world outcomes.<br />

Howard Zinn spoke of “the best and the brightest”:<br />

The New York Times did a survey of high-school students to see how much history they knew.<br />

They do this every few years. They do a survey of young people to prove how dumb they are and<br />

to prove how smart are the givers of the tests and so they gave this test to high-school seniors and<br />

corroborated what they thought. Young people don't know anything about history. They asked<br />

questions like, “Who was the President during the War of 1812?” “Who was the President during<br />

the Mexican War?”...We're in a great quiz-culture...“What came first the Homestead Act or the<br />

Civil Service Act?” You recognize questions like that because those are the questions that appear<br />

on tests which enable you to get into graduate-school. You can go very far if you know enough of<br />

those answers. You'll be Phi Beta Kappa. You'll become an advisor to the President of the United<br />

States. You remember the book, The Best and the Brightest, which was precisely about that<br />

point, that the people surrounding the President were...the people who got the highest scores.<br />

They were Phi Beta Kappa and they were the architects of the War in Vietnam.[58]<br />


Holman cited an analogous problem related to inflated self-esteem, the ‘excellence' deception in<br />

medicine[59].<br />

Simpson addressed the examination-system but his remarks apply at least as well to any rating<br />

system:<br />

...the traditional examination system...achieves...pseudo-precision, for it has chosen the accurate<br />

measurement of the barely relevant in preference to the less precise measurement of the most<br />

highly relevant...our cultural bias towards believing that anything expressed in numbers must be<br />

significantly more true than the same thing expressed in words...allows the student to accumulate<br />

a sequence of numerical ascriptions and grades, often of very dubious reliability and<br />

validity...added together and averaged to help us guess at whether he is fit to leave medical<br />

school. This is as logical as making a pre-operative surgical assessment by adding and averaging<br />

your patient's haemoglobin, potassium, urea and blood sugar levels. It produces results...of little<br />

or no predictive validity and...neither tell the student who has passed the exam why he has done<br />

well (so that we can be reasonably sure he can do it again) nor tell the student who has failed<br />

anything of much use to him in avoiding further failure...[60]<br />

e. mentor-inattention<br />

The descriptions of how recipients of LORs perpetrate Mill's reification-fallacy in an attempt to<br />

attach specific meanings to various phrases that the phrases themselves don't necessarily<br />

denote[61] seem especially anomalous in a context in which the author may be a<br />

department-chairman who may even concede that he has never had any contact with the trainees<br />

about whom he has a duty to write LORs, nor had even what Albanes called<br />

The episodic, fragmented, and...small amount of contact that clinical faculty have with<br />

students...(Albanes 653)[34]<br />

Albanes claimed that that circumstance<br />

...leaves them [raters] reluctant to make ratings that would call attention to students' performance<br />

deficits...(Albanes 653)[34]<br />

In those circumstances, faculty-members' reluctance to make ratings of any sort, at all, would<br />

bespeak their simple honesty. Yet, somehow, most faculty-members, whether in good conscience<br />

or not, rate their students and other trainees after clinical rotations and later when they write<br />

LORs for them.<br />

Kefalides affirms faculty-expectations and complains that insurance-rules newly require<br />

faculty-members to take care of patients and thus provide golden opportunities for clinical<br />

teaching, which he seems to disparage.[62]<br />

Cydulka et al present time-consuming, close observation of trainees as a startling new<br />

departure.[63]<br />


In the industrial setting, in which TQM arose, nobody could ever confuse the manufactured<br />

product with the worker whose efforts produce it. In another article,[64] Albanes did just that. He<br />

attempted to apply TQM to medical education but, in the process, he conflated students as human<br />

beings with students as objects, products of the education-process and got his ideas twisted. As a<br />

result, in one section of his article, grading is good, while in another, it's bad. The very fact that<br />

Academic Medicine published his article indicates the likelihood that the thinking, among many<br />

academics, about rating and evaluating the performance of trainees and others is confused.<br />

f. self-fulfilling prophecy<br />

Bosk noted:<br />

One striking feature of the clinical judgment of residents is how easily the whole process may turn<br />

into self-fulfilling prophesy (sic)....good reputations exercise a protective or deviance-reducing<br />

effect while bad ones generate a destructive or deviance-amplifying one. If a resident is considered<br />

trustworthy, monitoring by attendings is decreased. Therefore, deficiencies are less likely to be<br />

discovered. Conversely, if a resident is suspect, monitoring increases. Convinced that they are<br />

there for the finding, an attending is more likely to find evidence of sloppy work. When found,<br />

these only increase surveillance, which again increases the probability of mistakes. Clearly<br />

suspicion does not create residents who are unfit-- after all, something creates the suspicion.<br />

Nonetheless, being suspect is for a resident a very vulnerable and demoralizing position. Not only<br />

that, being above suspicion gives a fair amount of protection, especially when mistakes need not<br />

be seen as innocent error. Given these dynamics, it is not surprising that those who fall on the<br />

short end of evaluation (or their attorneys) often characterize it as arbitrary and capricious.[65]<br />

Strangely, when a physician so abuses ancillary personnel that they lose their self-confidence in an<br />

analogous manner, he becomes a “disruptive physician,”[66] fit only for expulsion. Yet, in the<br />

setting of medical education, such abuse is tolerable, even customary.<br />

g. absence of evidence-basis<br />

Rating/evaluation is particularly vulnerable to charges of resting on an inadequate evidence-basis:<br />

...In perusing the folders of the residents in the training program that I studied, I found only one<br />

evaluation that mentioned a specific incident. This leads me to suspect that residents who are<br />

dismissed from programs could easily argue that their “due-process rights” were violated, which<br />

raises a very thorny issue. Surgery to a large degree rests on peer trust, and it is unclear what<br />

degree of formal, concrete evaluation is consistent with that trust. (12)[67]<br />

Two pediatric core-curricula have come out for emergency-pediatrics,[68,69] one for pediatric<br />

interventional cardiology,[70] one core-content inventory for adult emergency-medicine[71] and a<br />

retrospective inventory of diagnoses encountered in internal-medicine residency.[72]<br />

Other core-curricula may exist in other specialties, yet, in no specialty, do recommendations for<br />

the LOR relate in any manner to specific elements of any defined core-curriculum. If the LOR is<br />

supposed to reflect job-performance, what justification is there for omitting any mention of<br />

job-performance criteria, delineated in national core-curricula, core-content statements or<br />

otherwise?<br />

In all the literature on LORs and evaluations, none that I've seen suggests including the cumulative<br />

statistics on clinical outcomes of patients under the care of the subject of the LOR. Yet, without<br />

such evidence of actual job-performance, in terms of numbers and proportions of patients saved,<br />

lost and improved, the rest is nothing.<br />

The medical literature is replete with accounts of physicians' inaccurate performance-appraisals of<br />

their colleagues and of trainees.[73-86] Those accounts render the idea of entrusting<br />

performance-appraisal of anyone to physicians patently absurd.<br />

Perhaps the most concrete, objectively verifiable category is “procedural skills.” The trainee either<br />

succeeds at the lumbar puncture by obtaining CSF or not, succeeds in intubating a patient or not.<br />

No performance-evaluation I've ever seen has any space devoted to citing the specific number of<br />

procedures that the mentor observed the subject performing, far less a score-card that documents<br />

how many he performed successfully and in how many he failed. What would be the distinction in<br />

a rating of 3/10 vs. a rating of 7/10 in the category, “procedural skills?” One might imagine that<br />

the evaluee succeeded in 30% or 70%, respectively, of the procedures he performed during a<br />

clinical rotation. Did a month-long rotation provide even ten opportunities for each of, say, three<br />

trainees, to perform lumbar punctures or intubations? It seems unlikely.<br />
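
The arithmetic behind that doubt can be sketched in a few lines. This is only an illustration, under the<br />
assumption (mine, not that of any evaluation-form) that a rating of 3/10 or 7/10 stood for 3 or 7 successes<br />
in 10 observed attempts. The function name and the counts are hypothetical. The sketch computes an<br />
approximate 95% Wilson score interval for the underlying success-rate.<br />

```python
import math


def wilson_interval(successes, attempts, z=1.96):
    """Return the (low, high) Wilson score interval for a success proportion.

    z=1.96 corresponds to an approximate 95% interval.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    denom = 1 + z**2 / attempts
    centre = (p + z**2 / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts + z**2 / (4 * attempts**2)) / denom
    return centre - half, centre + half


# Hypothetical counts: 3 of 10, 7 of 10, and a single successful attempt.
for successes, attempts in [(3, 10), (7, 10), (1, 1)]:
    low, high = wilson_interval(successes, attempts)
    print(f"{successes}/{attempts}: plausible success-rate {low:.2f} to {high:.2f}")
```

Under these assumptions the intervals for 3/10 and 7/10 overlap, and a single attempt says almost nothing,<br />
so even a faithfully counted score from one rotation would distinguish trainees only weakly.<br />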

If the trainee's score was low, where is the documentation of the help that the mentor provided to<br />

the trainee to improve his performance? I've never seen it and the proposed standard Letter of<br />

Recommendation (SLOR), in emergency-medicine, omits mention of anything like it.[54,63]<br />

Where is the documentation of the progress in the trainee's score during the month? Was his score<br />

2 of 10 at the beginning of his rotation and 8 of 10 at the end? I've never seen anything like that,<br />

either, possibly because a trainee may have one opportunity to perform one clinical procedure in a<br />

month, if he's lucky. The customary evaluation is post hoc, delivered as a summative accolade or<br />

condemnation, long after the trainee can do anything about his scores.<br />

What was the quality of his performance? What did he do to succeed in the procedure, if he<br />

succeeded? Did he fracture teeth of patients he intubated? If so, how many each? How many of<br />

the patients on whom he performed a lumbar puncture required a blood-patch afterwards to stem<br />

post-procedure CSF-leakage? I've never seen any such evaluation in writing.<br />

How many of the procedures that the evaluee performed did the mentor personally observe?<br />

SLOR has no space for any such entry[54,63]. Is the rating based on a “general impression” of the<br />

evaluee's procedural skill, as an intrinsic trait, derived from rumor? If so, upon what specific<br />

evidence or criteria did the evaluator base the score that he assigned?<br />

Is one of the mentor's considerations his own anxiety over giving the evaluee a big head? Did the<br />

evaluee, rather, need a low score to give him a harsh dose of “reality?” If so, on what evidentiary<br />

criteria did the evaluator base his concept of “reality,” such that a low rating would give the<br />

evaluee a dose thereof and, in some sense (what sense?) improve the evaluee's outlook? Did the<br />

evaluator apply his dose of reality to all evaluees consistently? If not, why not? Did he condemn<br />

those whom he personally disliked (perhaps because they asked him tough questions to which the<br />

mentor felt embarrassed at not knowing the answers) and favor those whom he personally liked<br />

(perhaps because they never asked him any tough questions)? If he applied his dose of reality<br />

consistently, without regard to the evaluee's actual performance (which the evaluator may never<br />

have observed -- my consistent experience, throughout “training”), isn't that practice arbitrary,<br />

unreasonable and capricious, i.e., a manifestation of chaos and irrationality, in a setting where<br />

rational thought is supposed to prevail?<br />

Most important, what does the rating score tell the relevant candidate about what he should do to<br />

improve his performance?<br />

One might argue that success in procedures, like medicine, itself, is a stochastic matter[5,6], i.e.,<br />

that some procedures fail even in the best of hands and some succeed even in the worst of hands,<br />

in whatever sense of “best” and “worst” one might choose to apply. I would reply that that's<br />

correct. Success in procedures is, at least to some extent, what Deming terms a lottery,[5] no<br />

question. Given that truism, what's the point of making “procedural skills” a ratable category, in<br />

the first place?<br />

h. glittering generalities<br />

Greenburg et al wrote, in relation to LORs:<br />

Brevity and generality...come across as distinctly negative features, causing the reader to<br />

wonder...whether the writer actually knows the applicant. (197)[87]<br />

Yet, the evaluation-criteria, upon which LORs are most often based, rely upon brevity and<br />

generality, presumably in the assumption that evaluators' general opinions of candidates reflect the<br />

truth. No evidence supports that proposition and my personal observation is that it is false. If<br />

brevity and generality be negative features of a LOR, how can the same features be acceptable in<br />

the underlying evaluation-criteria?<br />

Bosk terms vague indices of “quality” of the candidate, such as “general medical knowledge,”<br />

“rapport with staff” and the like “essentially-contested concepts.”[67] They are summative,<br />

glittering generalities, intended to make the evaluation-form brief, that have no necessary<br />

evidentiary relation, either to the subject-physician's actual performance, clinical acumen or to<br />

clinical outcomes of his patients.<br />

In the field of academic emergency-medicine, Harwood et al referred to various elements of<br />

evaluative jargon:<br />

Of the applicants submitting SLORs to our EM residency program, 49% or more received the<br />

superlative response in the categories of "commitment," "work ethic," and "personality." In<br />

contrast, only 35% of the applicants received the superlative response regarding their "differential<br />

diagnosis ability." The "global assessment" operated similarly, with 37% of the applicants<br />

receiving the superlative response. The least common superlative response was the "match<br />

rating," with only 23% of the applicants receiving a "guaranteed match."<br />

These data can serve as a reference for both interpreting and writing SLORs. The data show that<br />

EM applicants least commonly receive the superlative response in the categories of "differential<br />

diagnosis ability," "global assessment," and "match rating," making these key categories for<br />

residency selection committees. These results suggest that authors can justifiably evaluate most<br />

applicants in the highest categories of personal traits, but that they should be more discerning with<br />

assessing "differential diagnosis ability," "global assessment," and "match rating."[88]<br />

Harwood et al seem to pretend as if ratings were objective facts, rather than what they are,<br />

subjective appraisals based on the author's claimed but unverifiable (and probably negligible)<br />

familiarity with the trainee.<br />

In the foregoing passage, Harwood et al urged institutional authors of SLORs (standardized<br />

letters of reference) to manipulate their performance-appraisals in various sections of the SLOR,<br />

on the premise that the evidentiary basis of such appraisals doesn't matter, with a view to pandering<br />

to the selection-committees for emergency-medicine residencies and manipulating the outcomes<br />

of their deliberations over trainee-selection. Harwood et al seem to ignore the possibility that<br />

fewer authors rate trainees highly in the glittering-generality categories of "differential diagnosis<br />

ability," "global assessment," and "match rating" than in the glittering-generality categories,<br />

"commitment," "work ethic," and "personality" because the authors could be inappropriately<br />

ungenerous with their ratings in the first three categories, most likely because those categories are<br />

the clinically oriented ones and authors would very likely believe that they weren't performing<br />

their watchdog/gatekeeper function properly (to keep bad doctors from practicing<br />

emergency-medicine) unless they had condemned a certain quota of trainee-candidates with each<br />

batch that left their respective institutions. Heaton depicts that practice in terms of what he calls<br />

the “basic process”:<br />

The Basic Process<br />

The basic process of an individual in a hierarchy is to avoid mistakes. . . individuals are rated by<br />

their errors, for their tasks are predetermined. There is no premium for achievement outside<br />

assigned hierarchical tasks but there are penalties for every shortfall from perfection.<br />

The normal distribution in a hierarchy includes a percentage of failures, so grading on a curve<br />

means that students making the most mistakes are given failing grades. . . When failing students<br />

are eliminated, those next above them succeed to the failing category. The rule of thumb is for<br />

one-third to leave between the fifth and twelfth grades, . . . The next third become<br />

failure-threatened, declining in rank regardless of effort or improvement. Apprehension then<br />

blocks learning so there can only be unskilled repetition. Thus this middle third is taught<br />

submission and place within the hierarchy. . . (32)<br />


Is it true or false that for every winner there has to be a loser? False – there has to be a continuing<br />

supply of losers if a winner is to keep on winning. In schools, grading on a curve . . . means that<br />

the A student needs an F student at the other end of the normal distribution; then annually or<br />

more often, when the F student is eliminated or drops out, another student must be pushed into<br />

the failing position. . . companies seem to survive only by establishing a large pool of marginal<br />

workers who can be picked up when needed and dropped when business is slow. . .<br />

. . . Schools in exclusive suburbs do not produce so many failures . . . Instead they assume their<br />

students are mostly in the upper half of a normal (59) distribution. . . there are schools which<br />

assume their students are mostly in the lower half of a normal distribution. In one vocational high<br />

school in New York, no teacher could give a grade above C without special approval by the<br />

principal. In a ghetto high school a department head told me that only one student in a . . . class of<br />

twenty was capable of learning. I knew the students were capable and interested, but sure enough,<br />

nineteen dropped out and failed. . . grading in schools is a process that produces failures and<br />

accomplishes rejecting.<br />

Winners are custom-made, but losers are mass-produced. . . (62)[52]<br />

Pursuing a similar line of “reasoning,” raters in medical education may believe that they can<br />

enhance the reputation and credibility of their respective institutions by making a big show of<br />

being “tough graders” and the clinically oriented rating criteria are the most attractive targets for<br />

that sort of behavior.<br />

i. distortion from “confidentiality,” under perpetual tension.

In both editorial peer-review and performance-appraisal/LOR, the thesis is that the rater cannot<br />

deliver an “honest and accurate”[89] rating unless he labors under the protection of<br />

“confidentiality,”[61,89,90] meaning that everybody except for the subject, gets to see the rating.<br />

Decades of organizational oppression, in which ratees had to tolerate the intolerable, finally<br />

prompted the US Congress to enact the enlightened Buckley Amendment, a federal law that<br />

requires schools that receive federal funding to make student records available for viewing by<br />

parents and the students themselves if they are 18 or older.[89,91] Accordingly, even though<br />

federal law mandates that the trainee should be able to see his rating, those in medical education<br />

prefer the old oppression. They recommend that the organization should compel the trainee to<br />

“waive” his legal right, under the Buckley-Amendment, to see his rating, in the interest of<br />

“honesty”,[1,90] “authenticity,”[89] and “objectivity” (read freedom of the rater to give an<br />

adverse rating with the security of knowing that the ratee cannot learn of it and therefore not have<br />

grounds for retaliation) of the letter and of its “value”[1] to the receiving institution.<br />

In editorial peer-review, the author receives the rating but not the identity of the rater. In<br />

performance-appraisal, the trainee knows the identity of the rater but, ideally, not the rating.<br />

Yet, dissenting voices resist such organizational oppression, and for good reason, in my view:<br />


...One of our wisest and most experienced faculty members, Dr. Douglas Lindsey, offers to write<br />

letters for every medical student. He writes them honestly. He then shows the student the letter. It<br />

is up to the student to decide whether it is sent. This is an excellent policy of a great teacher.<br />

Unfortunately, it is probably unique. (320)<br />

A few students have been asked to sign statements that they have not seen their reference letters.<br />

This is ridiculous and unenforceable. Don't sign...it is common practice for students to be asked to<br />

sign a waiver of their right to request to see referee letters. If you are forced into this type of<br />

situation, you may have to sign it and hope for good letters. If possible you do want to see those<br />

letters before they go out. (321)[53]<br />

The practice of “confidentiality,” under compulsion and under false color of “honesty,” in the<br />

rater, may thus spawn duplicity and dishonesty in the ratee.<br />

On the subject of so-called honesty, one naturally wonders whether the “honesty” will be<br />

even-handed or biased. A few obvious questions spring to mind:<br />

Will the rater be as “honest” about how he himself prioritized the needs of trainees lower than his<br />

own personal needs and therefore devoted insufficient time to those in need of guidance to<br />

foster their improvement as he claims to be about the shortcomings of those trainees, whom the<br />

rater thus abandoned? Will he be honest about his own failures to implement and incorporate core<br />

content of his specialty (e.g., emergency-medicine) in training and rating his trainees? Will he be<br />

honest about his own failure to provide daily feedback to trainees to keep them informed of what<br />

specific performances they needed to demonstrate the following day to show improvement? Will<br />

the rater be honest about his own failure to document daily or weekly improvement or otherwise<br />

and reasons therefor in his rating-comments? Will the rater be honest about his own failure to<br />

define behavioral educational objectives,[92] toward which the trainees might strive? Will the<br />

rater be honest about how he exchanged gossip with other faculty about various trainees and<br />

thereby formed a collective, united, homogenized opinion of trainees, instead of expressing his<br />

own opinion, based on his personal observations? Will the rater be honest about casting the<br />

evaluation in terms only of the trainee's failures, not in terms of systematic failures of the<br />

institution?<br />

Tonesk provides a twisted view of objectivity vs. subjectivity and authority-relationships in<br />

medical education.[93]<br />

In the realm of editorial peer-review, Walsh et al found referees more considerate and courteous<br />

toward authors if their names were attached to their reports[94]. What's wrong, therefore, with<br />

accountability in LORs?<br />

Flacks wrote:<br />

. . . maintaining the confidentiality of the contents of evaluations and letters of reference would<br />

[not] improve the quality of such assessments. On the contrary, . . . I've become convinced . . .<br />

that the reverse is true. New state laws and university regulations have opened the process . . --<br />

and the results . . . have been good. Faculty members and departments now have the opportunity<br />

to respond to negative reviews . . . timely . . . and with some understanding of the arguments that<br />

may merit rebuttal. The review process is now more cumbersome, but it is . . . less Kafkaesque. . .<br />

A new law that would require full disclosure has passed the legislature but is being contested in<br />

the courts by the University. I am quite sure that the . . . motivation for the University's resistance<br />

is not so much to protect the quality of the review process as it is to protect the discretionary<br />

powers of the administration.<br />

. . . The need for open evaluations is not simply that such openness promotes due process. The<br />

due process argument applies to all institutions in their treatment of workers. . . open access helps<br />

ensure that each member can benefit from critical feedback and also ensures that criticisms are<br />

made in a way that is responsible to canons of scholarly objectivity. . .[95]<br />

Fashing wrote:<br />

. . . If we allow people to require anonymity as the price for the exercise of candor and<br />

professional responsibility, then surely we encourage a pernicious form of cowardice. Are our<br />

sensibilities so delicate that they cannot contend with the requirement to render our negative<br />

judgments openly and honestly with whatever risk that entails? And if they are, should we<br />

continue to encourage such delicacy or should we begin to require a modicum of courage to go<br />

with our “candor”? I for one believe we should. . . the requirement of anonymity raises serious<br />

questions of credibility in its own right. Why should we believe that anonymity is the price of<br />

honesty any more than that it is an opportunity for dishonesty? . . .<br />

. . . there are compelling reasons for confronting intellectual, professional, and . . . personal<br />

differences as a minimal requirement for the development of any serious sense of community. This<br />

will no doubt produce some unpleasant moments in the context of whatever conflicts surface, but<br />

what group that constitutes a serious community, or perhaps more importantly, a community to<br />

be taken seriously, especially in intellectual terms, is without conflict? That consensus about all<br />

issues is unnecessary to the maintenance of a healthy community is recognized by all but the most<br />

resolutely conservative members of the academy. To address such differences and to resolve<br />

them, or in the case of intellectual differences, to provide a climate in which debate and conflict of<br />

opposing ideas are a catalyst for intellectual growth and creativity, strikes me as the essence of<br />

academic community and a primary requirement for intellectual and academic freedom. In this<br />

sense disclosure should promote rather than retard intellectual excellence. (222)[96]<br />

6. The ultimate goal: communion of “top” talent in “top” institutions

The counter-argument to the foregoing is that the most competitive programs have to select the<br />

most competitive trainees.<br />

Why? Even assuming that the selection-process be valid, a dubious proposition, what ultimate<br />

utility is there in aid of quality of patient-care in concentrating “top talent” in “top institutions?”<br />

Isn't that just elitism run amok? What about spreading the wealth, if that's what it is (a dubious<br />

proposition), around a little? Wouldn't the “non-competitive” trainees gain from exposure to “top<br />

institutions” and wouldn't “competitive trainees,” if they offer any genuine advantage over<br />

non-competitive trainees, be able to work their magic in institutions in more humble locations?<br />

I've interacted with finished physicians from a broad range of institutions and I'm constantly<br />

impressed with how alike they are. Physicians from Harvard, Yale and other Ivy League<br />

institutions are no great shakes and some of the most impressive come from the hinterlands. What<br />

was all the fuss about during education and training, then?<br />

7. Illustrative anecdote which is more typical than it should be<br />

When I worked as a civilian in the ER of the military hospital, Fort Stewart, GA, my military<br />

supervisor, a Major in the Army Medical Corps, liked me pretty well at first but seemed to dislike<br />

me more and more as time went on, evidently because of conflicts that swirled around me.<br />

He criticized my handwriting, so I brought in a word-processor to write up my charts and make<br />

them optimally legible. He didn't stop me from doing that but, long after I'd left there, I obtained<br />

copies of my personnel-records, including documentation of his commentary on the episode.<br />

Without explaining what he intended, he put an exclamation point after the statement, “he brought in a<br />

word-processor!” I gather he disapproved of my constructive response to his criticism, yet he<br />

suggested no alternative. What did he want from me? Did he expect me suddenly to develop<br />

handwriting like his? He never explained.<br />

In perhaps the emblematic episode of my tenure there, I pissed off one of his fellow Army-officers by calling him in at night to attend a female patient of his, for whom I sought admission for evaluation and monitoring of chest-pain that I suspected had a cardiac origin. He chewed me out for disturbing his sleep and wanted me to release her home without forcing him to come in and examine her. He claimed to know her so well that he KNEW that her chest-pain was not cardiac but, instead, was from her COPD. The rules, not of my making, required him to come in and examine a patient whom the ER-physician suspected of requiring admission. Under protest, he came in, chewed me out some more in front of nurses and other personnel and released her home. A few weeks later, her cardiac catheterization at Fort Gordon revealed severe coronary artery disease. I had committed an unpardonable sin: being right when an army-doctor was wrong.

It's not as if this were a diagnostic coup. It could hardly have been more stereotypical. She had chest-pain, reminiscent of cardiac chest-pain. It was bread-and-butter medicine. She needed admission for the sake of safety. The officer fulfilled his paper-duty under protest by getting out of bed and examining the patient. He failed in his duty to admit her for monitoring.

I pissed off a pediatrician by calling him in at night a few times to attend febrile infants who I thought might need admission, as a posted directive required me to do. Whether the patient's condition is serious enough to warrant admission is a matter of judgment and, if I think the patient needs admission, the pediatrician may disagree. I assumed that to be in the realm of disagreement among reasonable people. He evidently disagreed, even with that principle, probably because he was the pediatrician on call and fulfilling his duty required him to exert unwelcome effort. He impugned my “judgment,” as a tactic in his campaign. He sent all the patients I referred to him home, possibly as a way of accumulating incompetence-points against me. Those incidents illustrate the principle, universal in my observation, that hospital-personnel pay abundant lip-service to concern for quality of patient-care but their actions bespeak only concern for their own convenience.

Thus, I accumulated “complaints” against me but the hospital never preferred any charges against me or offered me a peer-review hearing in which to rebut such charges, presumably because the notion would have been absurd, even to Army-brass.

Hypothetical charge 1: diagnosing chest-pain as cardiac which later proved to be cardiac but pissing off Army-Officer in the meantime by calling him in at night to do his duty. Charge 2: complying with posted hospital-directive by calling in Army-Officers in relevant specialties “unnecessarily,” and thereby pissing them off, on nights when they're on call to attend patients, possibly appropriate for admission, and to render their opinions.

Instead of taking a formal route, they chose a typical bureaucratic route: my supervisor completed consecutive evaluation-reports in secret and never discussed them with me. The personnel-records I obtained years later exhibited an unmistakable halo-effect: In all components, from “medical knowledge” and “rapport with staff” to “health” and “appearance,” the ratings descended in parallel from 9 or 10 of 10, steadily downward, to end at about 3 or 4 out of 10, under the influence of multiple complaints of pissing off Army-physicians by asking them to do their duty. That is, each evaluation-cycle, my supervisor assigned all components the same rating: all 9s, all 8s, all 7s, all 6s and so forth. Yet, my “appearance” and “health” were verifiably the same throughout that time: fine and stable. He presented not a scintilla of evidence of my deteriorating health, for example, yet he “documented” its deterioration in his numerical ratings. This person had an MD-degree!

Thereupon, enough poor pseudo-ratings had accumulated against me to “justify” my termination and to provide an ironclad “paper-trail,” in case I should have decided, at some point, to contest my termination legally.

LORs that I requested from Fort Stewart stated only the dates of my employment there but made no mention whatever of my performance, e.g., my thoroughness and my diligence, for the benefit of patients, against the odds of dysfunctional military-bureaucratic obfuscation. Those LORs illustrate a fundamental principle of all LORs: LORs accommodate the needs of the ambient power-hierarchy, not of the subject thereof. That makes them inherently inaccurate. If the academic is honest with himself, he will concede that academic power-hierarchies exhibit similar manifestations.

I could provide other anecdotes with similar import but I've gone on far too long already, so I'll stop.

When will decision-makers cop themselves on to the inherent unfeasibility of rating human beings?
