12.07.2015 Views

Chapter 6


Testing a Change

Chapter Five discussed methods for answering the question, "What changes can we make that will result in improvement?" This chapter is concerned with determining whether the changes developed are really improvements. This is done by testing the changes using the PDSA Cycle.

Tim was a supervisor in a lab. He had recently become aware of a new procedure that could be used to measure the amount of a key ingredient in a product. Another lab that used this procedure reported improvements in accuracy and time to complete the test, and attributed it to the fact that the new procedure eliminated a lot of complexity. Tim decided to test the new procedure in his lab.

Tim thought that the change had some promise. He made a wise decision when he decided to test the new procedure rather than just implement it. When making improvements, it is important to distinguish between testing and implementing. Testing is used to evaluate one or more changes. Implementing means making a change part of the day-to-day operations for a process or service or incorporating it into the next version of a product. An important practical consequence of testing before implementing is that some tests are expected to fail, and we can learn from those failures. That is why testing on a small scale to build knowledge while minimizing risk is so important. Once a change is implemented, we should expect very few failures. (Implementation of a change is discussed in Chapter Seven.)

Tim is interested in learning whether the new procedure will be an improvement in his lab when used by different technicians for different batches. He needs to make a prediction whether the change will result in improvement once it is implemented. The use of good methods to test the new procedure will increase Tim's ability to make that prediction.


Change as a Prediction

A prediction is a statement of how a system, process, or product is expected to perform. Implicit in every change is a prediction that the change will result in future improvement. Will a new teaching approach be successful in all classes in upcoming years? Will a new computer system result in faster and more accurate order processing? Will a new surgical procedure, when used routinely, result in shorter rehabilitation time for a variety of patients? A prediction of the impact of a change puts one in a better position to learn as the change is tested.

A prediction is made in response to a question. Tim asked, "Will the new procedure result in improvement in the lab?" A prediction is usually stated in terms of the particular measures or outcomes that the change is attempting to affect. Tim predicted that the change would improve the accuracy of the procedure by approximately 10 percent and cut the time to perform the procedure in half. A prediction should also include the reasoning or theory on which the prediction is based. Tim based his prediction on the theory that the new procedure reduced the complexity of the test. He could articulate the details of his theory to those in the lab. Tim was confident in his prediction because there was evidence available from a similar lab to support it.

A government agency was concerned about the low rate of immunizations (shots for measles, polio, and so on) among children. The agency proposed a change that consisted of the agency buying the available vaccine and distributing it free of charge to physicians. Their prediction was that this would substantially increase the rate of immunization among children. The prediction was based on the theory that the high cost of shots is the reason people do not have their children immunized.

Just as Tim did, the government agency made a prediction about a change. In making any prediction, one has some "degree of belief" (high, medium, or low) that the prediction is a good one. The concept of degree of belief provides a way to assess the depth of one's knowledge about whether a change will result in improvement in the future. One's degree of belief in a prediction depends on two considerations: (1) the extent to which the prediction can be supported by evidence, and (2) the similarity between the conditions under which the evidence was obtained and the conditions to which the prediction applies.

Is the cause of the low immunization rate the cost of the vaccine, as the agency suggests? Perhaps it is something else, such as lack of knowledge that the shots are important, neglect, or complacency because no one gets certain diseases anymore. What is the evidence to support the theory that cost is the issue? Do poor people have lower immunization rates? What about poor people who have the shots paid for by a government program, by an employer-supplied insurance plan, or by an HMO? Data consisting of evidence related to the prediction will help answer these questions.

Degree of belief will be high, medium, or low depending in part on how much the evidence supports the prediction. If the degree of belief is low, it does not mean that the change should be abandoned or not tested. However, a low degree of belief is a reason for caution. The test should be kept small and the risks low. Implementing the change on a large scale when the degree of belief is low may not be prudent.

Suppose the government agency decides to run a test of the free vaccine to gather more evidence on the merits of the change. This test could be made more informative by including people from all parts of society (such as people who have no family doctor and those who do) and by conducting the test in different regions of the country. The conditions that should be included in the test or tests are those to which the prediction will apply. Someone with knowledge of the area under consideration must decide those conditions.

Satisfactory prediction of the results of tests conducted over a wide range of conditions increases the degree of belief that the change will result in improvement. Test results that do not agree with predictions may be cause to rethink the theory that is the basis of the prediction. This reassessment will allow for the formulation of new theories to help in the development of other changes.
The PDSA Cycle should be used as the framework to carry out such tests.

Using the PDSA Cycle to Test a Change

The four phases of the PDSA Cycle were discussed in Chapters One and Four: planning the details of the test and making predictions about the outcomes (the Plan phase), conducting the test and collecting data (the Do phase), comparing the predictions to the results of the test (the Study phase), and taking action based on the new knowledge (the Act phase). Figure 6.1 summarizes what should be considered in each phase of the cycle when testing a change. Developing a good plan for a test is critical to its success. Exhibit 6.1 contains a form that can be used to develop a plan.

The plan begins with a statement of the specific objective of the cycle. Cycles to test a change will have varying objectives depending on the current degree of belief.


Figure 6.1. Using the PDSA Cycle to Test a Change.

Plan
  • State objective of the test.
  • Make predictions.
  • Develop plan to carry out the test (who, what, where, when).
Do
  • Carry out the test.
  • Document problems and unexpected observations.
  • Begin analysis of the data.
Study
  • Complete the analysis of the data.
  • Compare test data to predictions.
  • Summarize what was learned.
Act
  • What changes are to be made?
  • What will be the objective of the next cycle?

The following are some possible objectives of cycles:

  • To increase the degree of belief that the change will result in an improvement
  • To decide which of several proposed changes will lead to the desired improvement
  • To evaluate how much improvement can be expected if the change is made
  • To decide whether the proposed change will work in the actual environment of interest
  • To decide which combinations of changes will have the desired effects on the important measures of quality
  • To evaluate costs, social impact, and side effects from a proposed change


Exhibit 6.1. Form for Planning a Test.

1. Objective of the test:
2. Change being tested:
3. Questions and predictions:
4. Key background information:
5. Measure(s):
6. Design of the test:
   Scale of the test and the risks involved:
   Type of study:
   Method of data analysis:
   How a range of conditions will be included:
   Randomization:
   Who, when, and where:

Mike was the manager of production planning. He spent four months redesigning all [...] scheduling process. When he explained it to the planners, they [...] change to their normal procedures. When the planners tried to use the new process to schedule production the next month, there were numerous problems. Mike quickly switched back to the old scheduling process.

Mike made a common but often disastrous mistake in his attempt to improve scheduling: he developed the "perfect" change and then implemented it. What seems exactly right after many hours of planning and analysis around the conference table may collapse under the stress of everyday reality. Remember, change is a prediction. Rough prototypes of the change or some of its components should be tested in PDSA cycles as soon as possible. Instead of redesigning the entire scheduling process over four months, Mike could have begun with a rough concept of the new system. He then could have used multiple cycles over the four months to test and improve the components. In this case and in most others, using multiple cycles to increase one's knowledge of a change will accelerate the rate of improvement.

The use of multiple cycles allows knowledge to be increased as a change progresses from testing to implementation. It allows risk to be minimized. As degree of belief that the change will be successful is increased, the scale of the test can be increased.

Suppose a change is developed in a manufacturing process. In the first cycle, people with knowledge of the subject might review it. Then, in the second cycle, it could be tried in a pilot plant. In the next cycle, the change might be tested on one line in the production area. The change might then be revised and tested in a fourth cycle. If the learning from the first four cycles raises to a high level the degree of belief that the change will result in improvement, all or part of the change could then be implemented full-scale in production.
The collection and analysis of data in each of these cycles is essential to the learning process.

Based on the results of a test, a change or some part of a change could be implemented as is, or it could be modified and retested, or it could be abandoned. Figure 6.2 illustrates changes in degree of belief as a team or individual uses cycles to go from the development of a change to testing and implementing it.

[Figure 6.2. Moving from Developing to Testing to Implementing a Change. Vertical axis: degree of belief that the change will result in improvement. Stages along the horizontal axis: developing a change; testing a change (cycle 1, cycle 2, ...); implementing. An unsuccessful proposed change is also marked.]


Types of Studies

Several different types of studies can be used within a PDSA cycle to test a change. They range from simple, informal tests to comprehensive, complex experiments. The individual or team involved in testing a change should select a level of formality and complexity relevant to their situation. In this section, two types of studies are introduced: the before-and-after test, and the simultaneous comparison of two or more alternatives. These studies are widely applicable and occupy the midrange of formality and complexity. The examples in this section show that the methods for the analysis of data are almost exclusively graphical. The aim of graphical methods is to visually display the impact of a change on the data being collected. Run charts and histograms are commonly used because of their simplicity. Using graphical methods also has the side benefit of allowing everyone involved with planning the test to be involved in the analysis.

Before-and-After Test

A common and very useful way to test a change is to make the change and compare the circumstances after the change to the circumstances before the change. The collection of data before the change provides the historical experience that is the basis of the comparison. A before-and-after test was used in the example of the diner in Chapter One.

Figure 6.3 presents a run chart that was used to analyze the results of a test conducted by a group of physicians. The test was designed to determine whether a new instrument to obtain cell samples from patients would be an improvement over the existing instrument. One of the measures of quality that was used in the evaluation was the percentage of inadequate samples. An inadequate sample was one that did not contain enough cells to proceed with the diagnostic test. Data were obtained on the percentage of inadequate samples before and after the use of the new instrument. Collecting data over time makes it possible to see whether patterns indicating improvement coincide with the time of the change. In the run chart in Figure 6.3, the reduction in inadequate samples coincides with the use of the new instrument.

[Figure 6.3. Data from a Before-and-After Test. Run chart; horizontal axis: patient.]

Since the basis of comparison in a before-and-after test is historical experience, the test is vulnerable to misinterpretation if a special cause unrelated to the change occurs at or about the same time that the change is made. Perhaps a seminar was given on a new way to use the old instrument while the new instrument was being introduced. It is up to those conducting the test to make the judgement that the effect seen is due to the change being tested. There is a rational basis for this judgement if there are no obvious external events and the system has been stable in the past. In the example just presented, the physicians believed that the improvement in the percentage of inadequate samples was the result of the new instrument. They decided to plan another cycle to test the use of the instrument under a wider range of conditions.

The data collected during a before-and-after test are susceptible to another special cause, one that is difficult to detect: the effect of conducting the test in the first place, sometimes known as the Hawthorne or Sentinel effect. During the test, the people involved may be more careful or diligent in their work.
This may be the cause of the improvement rather than the change being tested. This difficulty can be mitigated by testing over time so that the initial novelty of the test wears off. Also, if it is possible and ethical, people might not be informed that they are part of a test. Open communication, however, is often the better alternative. If the Hawthorne effect is suspected, the data collected during the test can be used to provide feedback to people. This feedback allows them to change what they are doing so that the improvement seen during the test can be sustained.

In the example illustrated in Figure 6.3, it was possible to collect a sequence of data during the test. In some situations, it might be practical to collect data only once before and once after the change. For example, scores on a pretest and posttest might be used to evaluate the effect of new audiovisual materials in a history class, or a group of patients undergoing rehabilitation might be asked to evaluate the amount of pain they are feeling before and after a new exercise program. In this case, displaying on a histogram the data collected prior to and after a test is a useful way to do the analysis.
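
A before-and-after comparison of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numbers are hypothetical (they are not the physicians' data), the helper name `summarize_before_after` is invented here, and the measure is assumed to be one where lower is better. The sketch compares the median before and after the change and counts how many post-change points fall below the pre-change median, a rough numeric companion to reading a run chart rather than a substitute for plotting the data over time.

```python
from statistics import median


def summarize_before_after(before, after):
    """Compare a measure before and after a change (lower is assumed better)."""
    baseline = median(before)
    return {
        "median_before": baseline,
        "median_after": median(after),
        # Count post-change points that fall below the pre-change median.
        "after_below_baseline": sum(1 for x in after if x < baseline),
        "n_after": len(after),
    }


# Hypothetical percentages of inadequate samples, listed in time order.
before = [32, 28, 35, 30, 27, 33, 31, 29]
after = [18, 15, 20, 12, 16, 14, 17, 13]

summary = summarize_before_after(before, after)
print(summary)
```

A summary like this says nothing about special causes that coincide with the change, so the judgement described above still rests with the people conducting the test.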


A landscape maintenance organization purchased mulch in bulk quantities. The mulch was delivered to a central location. It was used around trees, shrubs, and flower beds at various sites. Bulk purchasing was the cheapest way to buy the mulch, but it resulted in waste and cleanup problems. A team was formed to consider alternative ways to purchase and use mulch. After some research, they decided to test the use of mulch supplied in bags.

Customers at the different sites were asked to rate the appearance of the mulch on a scale ranging from poor to excellent, both before the test of bagged mulch and after the bagged mulch was used. The responses before and after the change were compared on a histogram (Figure 6.4). Data on the cost of mulch during the test were kept by the purchasing department and compared to past expenditures.

[Figure 6.4. Data on the Use of Mulch. Histograms of appearance ratings (categories include Good, Very Good, and Excellent); the surviving panel title is "After (Bagged Mulch)".]

After reviewing the data, the team concluded that appearance was improved with the use of bagged mulch. The members of the crews did not feel that any change other than the use of bagged mulch delivered to the site resulted in the improvement shown. Purchasing reported that the cost for mulch was only slightly higher when the bagged mulch was used. Considering ease of application and delivery, the total cost to the system was considered to be less. The degree of belief of the team was high that the use of bagged mulch was an improvement. Purchasing was asked to arrange with the supplier to provide the mulch required. The team decided to track any problems encountered during the implementation of this change.

The rating question used in this example could have been "Regarding appearance, how would you rate the bagged mulch versus the bulk mulch?" and a "much worse" to "much better" scale (such as that introduced in Figure 4.4) could have been used. Data would then need to be collected only after the change.
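
The histogram comparison used by the mulch team can be approximated with a simple tally. The sketch below is illustrative only: the responses are hypothetical, the five category labels are an assumption (the text states only that the scale ran from poor to excellent), and `text_histogram` is a name made up for this example.

```python
from collections import Counter

# Assumed five-point scale; the source only says "poor to excellent".
SCALE = ["Poor", "Fair", "Good", "Very Good", "Excellent"]


def text_histogram(ratings):
    """Return one line per scale category with a bar of '#' marks."""
    counts = Counter(ratings)
    return [f"{label:<10}{'#' * counts[label]} ({counts[label]})" for label in SCALE]


# Hypothetical responses, not the team's actual data.
before = ["Fair", "Good", "Poor", "Good", "Fair", "Good", "Very Good"]
after = ["Good", "Very Good", "Excellent", "Very Good", "Good", "Excellent", "Very Good"]

for title, ratings in (("Before (bulk mulch)", before), ("After (bagged mulch)", after)):
    print(title)
    print("\n".join(text_histogram(ratings)))
```

Printing the two tallies one above the other makes a shift toward the higher categories visible at a glance, which is the same reading the team made from its histogram.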


[Figure 6.5. Data Collected on Rehabilitation Time. Run chart; horizontal axis: patient; series: Current Procedure and New Procedure.]

Simultaneous Comparisons

In a simultaneous comparison test (commonly called a paired comparison study), two or more alternatives are compared at the same time, in the same space, or under other similar conditions. When one of the alternatives is the current system, the test is often called a simultaneous comparison with a control. By comparing alternatives in such a way, the effect of external events on the different alternatives can be studied during the test. A simultaneous comparison test can therefore help to rule out alternate explanations for the improvement.

Figure 6.5 shows the data a physician, Dr. Smith, collected on rehabilitation time. The data were collected during a simultaneous comparison test of a new rehabilitation procedure. Based on previous research, Dr. Smith's degree of belief was high that the new procedure would result in improvement. It was also her belief that if it turned out that the new procedure was not an improvement, the risks involved were not high. She advised the patients and received their consent to include them in the test of the new procedure.

The run chart juxtaposes the data that resulted when the current rehabilitation method was used on a group of patients with the data obtained from patients using the new method. Both sets of data were collected over the same four-month period. The run chart reveals the impact of the change and the effect of an external event on both the current and the new method.
The external event that caused a higher rehabilitation time for both procedures (although the new procedure was still lower) was identified as the presence of a new physical therapist. This therapist conducted the rehabilitation for a short time when the regular therapist was on vacation. The results of this test increased Dr. Smith's degree of belief that the new method was an improvement over the old.
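
The reason a simultaneous comparison resists external events can be shown with a small sketch. It rests on assumptions not in the source: the values are hypothetical rehabilitation times summarized per period for each procedure, and `paired_differences` is an illustrative helper. Because both series are collected over the same periods, an event that raises both of them largely cancels out of the difference.

```python
def paired_differences(current, new):
    """Per-period difference (current minus new); positive means new is lower."""
    if len(current) != len(new):
        raise ValueError("both series must cover the same periods")
    return [c - n for c, n in zip(current, new)]


# Hypothetical rehabilitation times (days) per period. Periods 4 and 5
# mimic an external event (a substitute therapist) that raises both series.
current = [30, 31, 29, 38, 39, 30, 31]
new = [24, 25, 23, 32, 33, 24, 25]

diffs = paired_differences(current, new)
print(diffs)
print(all(d > 0 for d in diffs))
```

In a before-and-after test the same event would have shifted only the "after" data and could have been mistaken for an effect of the change.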


Randomization

After reviewing the run charts on rehabilitation time, one might question whether the two groups differed not because of the procedure used but because of some other factor (such as the age of the patients). Dr. Smith overcame this possible alternate explanation for the results of the test by using random assignment. Random assignment is the use of a device such as a table of random numbers to assign the change being tested to the people or things selected for the test. In this example, an equal number of people were assigned to each procedure using a table of random numbers matched to the list of patients participating in the study: odd numbers for the old procedure and even for the new procedure. If random assignment is used, it is assumed that the groups do not differ systematically before the test.

In some situations, a check can be done on the random assignment by collecting data on the groups before the change is tested. The differences in the groups can then be evaluated before the change.

Planned Grouping

Based on the results of her test, Dr. Smith believed that the new rehabilitation procedure would have a significant impact on rehabilitation time. She was interested in knowing whether the new procedure would work when used by different therapists, in different hospitals, and with young and old patients. She decided to run another cycle to consider a wide range of conditions. In her test she planned to set up two groups with extreme conditions to determine whether the new procedure would result in improvement in both [...] therapists, one with two years of experience and one [...], and one sixty years of age and older.

Dr. Smith was using planned grouping, in which important conditions are held constant within each of two or more groups but varied between the groups. The use of planned grouping allows for a wide range of conditions to be included in a test. The two groups Dr. Smith planned were:

Group 1                                    Group 2
Small hospital                             Large hospital
Therapist with two years of experience     Therapist with ten years of experience
Young patients                             Old patients
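
One way to combine random assignment with planned grouping is sketched below. This is not Dr. Smith's procedure: it swaps the table of random numbers for Python's pseudo-random generator, the patient identifiers are hypothetical, and `assign_randomly` is an invented helper. The point it illustrates is that randomization is carried out separately inside each planned group, so each group gets its own equal split between the current and the new procedure.

```python
import random


def assign_randomly(patients, seed=None):
    """Randomly split patients into equal-sized 'current' and 'new' arms."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(patients)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"current": sorted(shuffled[:half]), "new": sorted(shuffled[half:])}


# Hypothetical patient identifiers for the two planned groups.
planned_groups = {
    "Group 1": [f"G1-{i:02d}" for i in range(1, 11)],
    "Group 2": [f"G2-{i:02d}" for i in range(1, 11)],
}

# Randomize separately inside each planned group.
assignments = {
    name: assign_randomly(ids, seed=index)
    for index, (name, ids) in enumerate(planned_groups.items())
}
for name, arms in assignments.items():
    print(name, arms)
```

With an odd number of patients the extra patient lands in the "new" arm; a real study would decide that rule in the plan for the test.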


In her test, Dr. Smith purposely set up two groups that had very different conditions. If the new procedure resulted in improvement in both groups, her degree of belief would be high that the change would result in improvement in the future. To conduct the test, she ran a simultaneous comparison study in both groups. Once the patients were selected for a particular group, they were randomly assigned to receive the new or old procedure. The results of the test are shown in Figure 6.6.

[Figure 6.6. Simultaneous Comparison Test Using Planned Grouping. Panels: Group 1 and Group 2; horizontal axis: patient; series: Current Procedure and New Procedure.]

The run chart shows that the new procedure resulted in improvement in both groups. Although the rehabilitation time was generally higher in group 1, the new procedure still resulted in shorter times. Dr. Smith's degree of belief was increased that the new procedure would result in improvement under the different conditions to which it would be applied.

The possibility exists that the change will show improvement in some planned groups but not in others. If this results, the next cycles should be used to study the relationship between the change being tested and the different conditions in the groups.

Choosing a Study

The before-and-after test and the simultaneous comparison can be used in many situations. Following are some things to consider when deciding which type of study to use:


Consider a before-and-after test when:

1. The data that are available or that can be collected before the change are sufficient to form the basis of comparison.
2. There is a minimal threat of misinterpretation of the results because some external event is present on or about the same time the change is made.
3. Data will continue to be collected over a long period of time after the change has been made.
4. Large improvements are expected.
5. Groups needed for a simultaneous comparison test cannot be isolated (for example, when a group of mechanics assigned to use an old maintenance procedure prefer the new procedure and begin to use it).

Consider a simultaneous comparison when:

1. Two or more alternatives to the current system (for example, two new suppliers) are being tested.
2. Only one alternative is being tested but external events pose a threat to the interpretation of the results. A control group (a group using the current system) should be used along with the alternative group being tested.
3. There is a desire to include a wide range of conditions in a test during one cycle. This is possible by using planned grouping.

Tests in Which More Than One Change Is Made

It is possible to test more than one change at the same time by using a type of test called a factorial experiment. A factorial experiment provides an alternative to testing one change at a time. It allows for the study of the impact of different combinations of changes. For example, a factorial experiment could detect that some combination of temperature and pressure has the optimal effect on an important characteristic of a product. Also, a teacher might want to test having students working in teams versus a standard lecture format and at the same time test the impact of class size. Perhaps the team approach will result in improvement only when the class size is small.
Although factorial experiments are beyond the scope of this chapter, many books are available for further study.

Evaluation of the Evidence

The prediction that a change will result in improvement is based on the knowledge of the people making the change and the evidence or data that are collected during the test. In some cases, the evidence is very strong. In the examples in the last section, the impacts of the changes were obvious. The data indicated that


in the next cycles while the change is being refined. In the final cycles, the change should be critically evaluated, and strong evidence of improvement using graphical methods should be acquired.

In a few cases, the prediction might be that the impact of the change will be small compared to the large variation in the system, but even a small improvement will be beneficial. This might be true with yield improvements for expensive raw materials or with the reduction of serious side effects of a drug. A decision might be made to continue the change to determine whether the improvements seen can be sustained over time. In such cases, a statistician might be consulted to help plan the cycle and evaluate the evidence.

Principles for Testing a Change

The ideas discussed in the previous section, on maximizing learning and minimizing the negative side effects during a test, can be summarized in two basic principles that should be considered in the plan phase of a cycle to test a change:

1. Build knowledge sequentially.
   • Test on a small scale.
   • Use multiple cycles.
2. Increase the ability to predict from the results of the test.
   • Collect data over time during the test.
   • Test over a wide range of conditions.

Principle 1: Build Knowledge Sequentially

Because not all changes result in improvement, those responsible for developing the change should continually be looking for ways of reducing the risks of the test while maximizing the learning.

Test on a Small Scale. The scale of the test should be decided according to (1) the degree of belief that the change will result in improvement, and (2) the risks from a failed test. As shown in Table 6.1, very small-scale tests are needed when the consequences of failure will be major and the degree of belief in success is low. Consequences might include severe negative impact on customers, financial loss, or injuries.
The use of expensive new technology, introduction of a new service, or the test of a new medical procedure would fall into this category. When risks are high, it is always wise to have a contingency plan developed that describes the actions needed in case the test fails.

If the consequences of a failed test are major but one's degree of belief in success is high, then small- to medium-scale tests should be considered. The test of a medical procedure or a drug shown elsewhere to be an improvement would fall into this category. The test is directed at learning about the use of the procedure or drug in a new environment.

Table 6.1. Deciding the Scale of a Test.

Consequences of       Degree of belief in success
a failed test         Low                       High
Minor                 Medium-scale tests        One cycle to implement the change
Major                 Very small-scale tests    Small- to medium-scale tests

Consequences are often minor when tests are run that affect processes or systems that are internal to the organization, that is, when the tests do not have a direct impact on the organization's customers. Examples might be mail delivery or the storage of inventory. In these cases, tests can be done on a larger scale.

If small-scale tests are appropriate, one way to design such a test is to simulate the change in some way.

Consideration was being given to installing a pneumatic tube system to carry sample parts between two areas in a large organization. Before the installation, the supervisor in one of the areas decided to run a test to determine the utilization of such a system. For a week she wore a beeper. If anyone wanted to send parts, they would beep her and she would hand carry them to the other area.

By simulating the pneumatic tube system and measuring utilization, the supervisor was in a better position to make a decision about whether to implement this change. Modeling a change on a computer or role playing are two other ways to simulate a change. Often, some imagination is all that is needed.

Besides simulating the change, some other ways to design a small-scale test are:

  • Have others who have some knowledge about the change review and comment on its feasibility.
  • Test the new product or the new process on the members of the team that developed the change before introducing it to others.
  • Incorporate redundancy in the test by making the change side-by-side with the existing process or product (a simultaneous comparison test).
  • Conduct the test in only one facility or office in the organization, or with only one customer.
  • Conduct the test over a short period.
  • Test the change on a small group of volunteers.

Testing a change on a small scale is an important way of reducing people's fear of making a change. When small-scale tests are not considered, people probably will try to develop the perfect change because of the potential consequences of a failed test. This approach might be prevalent in some big corporations or government agencies where any change to programs or policies is usually scrutinized. When planning a cycle to test a change, much thought should be given to developing and building knowledge through small-scale tests.

Use Multiple Cycles. Testing on a small scale leads to the use of multiple cycles to build knowledge sequentially. Attempting to get all the answers concerning a change in one large cycle should be avoided. Besides the negative effect such a large change might have on normal operations, the people the change affects will have more difficulty committing to the change. Some initial cycles might be used to decide whether the change is workable under the best conditions. This will also help build interest and willingness to buy into the change.

Based on what is learned from any cycle, a change might be:

  • Implemented as is
  • Abandoned
  • Increased in scope
  • Modified
  • Tested under different conditions

In the last three situations listed, additional cycles for testing the change are needed.
As the degree of belief in the success of the change is increased, the scale of the test can be increased with less risk. (See the accompanying figure for an illustration of how the repeated use of the cycle can be used to build knowledge.)

A company that manufactures tiles for floors and ceilings was experiencing some difficulties with the accuracy of the inventory records. A computerized material control system (MCS) was used to keep track of the amount of inventory on hand. Actual physical counts to verify the computerized records were made frequently by a variety of people. Discrepancies were often found between the MCS and the physical counts. Adjustments made to reconcile the discrepancies seemed to make the problem worse. The situation was having an effect on the schedule for manufacturing. A team was therefore formed to make improvements to the process of monitoring inventory. The team had great success in increasing inventory accuracy by using numerous cycles to test and implement changes. The objectives of some initial cycles were as follows:

Cycle 1  Test changes to standardize procedures for the physical counts for residential tile.
Cycle 2  Test the revised counting procedures (based on the learning from cycle 1) for physical counts for residential tile.
Cycle 3  Implement the new counting procedures for the physical counts for residential tile.
Cycle 4  Test the new counting procedures for other products.
Cycle 5  Implement the new counting procedures for other products.
Cycle 6  Collect data to determine the impact of adjustments to the MCS, based on physical counts of the inventory of residential tile.
Cycle 7  Test the use of a control chart to determine when adjustments to the MCS for residential tile should be made.
Cycle 8  Implement the use of control charts to determine when adjustments to the MCS should be made for all products.

Principle 2: Increase the Ability to Predict from the Results of the Test

Collect Data Over Time. The measures of a product or process will always be affected by common causes of variation that are unrelated to the change being tested. Therefore, viewing the patterns in the data over time is critical to determining the effect of the change. After a change, data must be analyzed to decide whether the system is stable; a prediction about future performance can then be made. Figure 6.3 contained a run chart of the data collected on the percentage of inadequate samples obtained during the testing of a new instrument used in a medical procedure. Variation is evident before the change, but also evident is the impact of the change on the data. It is possible to see the impact because a sequence of data was collected and plotted on the run chart both before and after the change. After the change to the medical procedure, the percent of inadequate samples stabilized at an average of approximately 15 percent. Unless some other change in the system occurs, one would expect this same level of performance in the future. When collecting a sequence of data over time is not possible, histograms should be used to analyze data from a test (see Figure 6.4 for an example of a histogram).
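To make the idea of comparing data before and after a change concrete, the following minimal Python sketch summarizes a time-ordered series split at the point of the change. The numbers are invented for illustration and are not the data behind Figure 6.3; the function name and the choice of summaries (phase means, baseline median, longest run of post-change points below the baseline median) are assumptions of this sketch, not a procedure prescribed by the chapter.

```python
from statistics import mean, median


def summarize_phases(before, after):
    """Summarize a time-ordered measure before and after a change."""
    baseline_median = median(before)
    # Longest streak of consecutive post-change points below the baseline median.
    longest = current = 0
    for value in after:
        current = current + 1 if value < baseline_median else 0
        longest = max(longest, current)
    return {
        "baseline_median": baseline_median,
        "mean_before": mean(before),
        "mean_after": mean(after),
        "longest_run_below_baseline": longest,
    }


# Hypothetical percentages of inadequate samples, in time order.
before = [32, 28, 35, 30, 27, 33, 29, 31]
after = [18, 14, 16, 13, 15, 17, 12, 15]

summary = summarize_phases(before, after)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A long unbroken run of post-change points on one side of the baseline median suggests the shift is more than common-cause variation, but plotting the points in time order remains the primary tool, and judging stability is still a matter for the people who know the process.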


In most tests of change, conditions during the test are a more important consideration than sample size (the number of items tested). A change may have the predicted effect in the short term, but most tests also have the aim of determining whether the effect will persist in the future. Since conditions will naturally change over time, more information is usually obtained from a sample selected over a long period than from a larger sample collected over a short period. (One cycle or several cycles might be used to provide an appropriate period.) The time required to increase degree of belief that the change will result in improvement and that the improvement will persist is a matter of judgement.

The importance of time in determining if a change is an improvement sets a context for determining sample size for the test. The sample size needs to be adequate to detect patterns that indicate improvement. Some rules of thumb for determining the number of plotted points necessary to detect patterns on a run chart are:

Number of Points    Situation
Fewer than 10       Expensive tests, expensive prototypes, or long periods between available data points; large effects anticipated.
15 to 50            Usually sufficient to discern patterns indicating improvements that are moderate or large.
50 to 100           The effect of the change is expected to be small relative to the variation in the system.
More than 100       The change is intended to affect a rare event.

Usually, fifteen to twenty-five plotted points will be sufficient to recognize patterns indicating improvement. Sometimes as many as fifty points might be necessary if no historical data are available to establish a baseline before the change is made. From fifty to one hundred points would be necessary only when the effect of the change is anticipated to be small relative to the variation in the system. Examples are situations when the variation in the measurements themselves is large or when the variations among people (students, patients, or customers) can mask the effect of the change. More than one hundred points might be needed if the change was intended to affect a rare event, such as the side effect of a new drug or a serious but rarely occurring defect. In cases such as these, it might be wise to consult a statistician to help in the design of the test.

In certain situations it is not practical to collect fifteen or more points during the testing of a change. Prototypes might be very expensive, such as prototypes of a new automobile engine, or the cost of conducting the test might be high. Possibly the data is financial in nature and only available on a monthly basis. In cases such as these, it may be that some data is better than none. However, if only small amounts of data are available, their worth can be enhanced by including a wide variety of conditions in the test.
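The rules of thumb for the number of plotted points can be captured as a simple lookup. The sketch below is only a restatement of the table in code form; the situation labels used as dictionary keys and the function name are hypothetical, chosen for this example.

```python
# Rules of thumb from the text: situation -> suggested number of plotted points.
# The dictionary keys are hypothetical labels chosen for this sketch.
POINTS_RULES_OF_THUMB = {
    "expensive_or_infrequent_data": "fewer than 10",
    "moderate_or_large_effect": "15 to 50",
    "small_effect_relative_to_variation": "50 to 100",
    "rare_event": "more than 100",
}


def suggested_points(situation: str) -> str:
    """Look up the suggested number of run-chart points for a situation."""
    try:
        return POINTS_RULES_OF_THUMB[situation]
    except KeyError:
        raise ValueError(f"unknown situation: {situation!r}") from None


print(suggested_points("moderate_or_large_effect"))
print(suggested_points("rare_event"))
```

The lookup deliberately returns a range rather than a single number, since the text treats these as guides for judgement and not as fixed sample-size requirements.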


Test Under a Wide Range of Conditions. Making a change in order to improve a process, system, or product involves making a prediction. The prediction is that the change will be beneficial in the long run. It is important to recognize that conditions in the future will be different from the conditions of the test. Circumstances will arise that were unforeseen or not present at the time of the test. Is the change still an improvement under these new conditions? The degree of belief in the results is increased as the same conclusions are drawn for a variety of test conditions (different times, materials, environmental conditions, types of people, and so on). Dr. Smith increased her degree of belief that the new rehabilitation procedure was an improvement by testing the procedure under different conditions. She tested the procedure in different hospitals, with patients of different ages, and with physicians who had different levels of experience.

Too often, tests of changes are not conducted over a broad range of conditions. Some reasons given for limiting the conditions include limited resources, time constraints, difficulty in analysis of the data, lack of knowledge of how to efficiently include different conditions, and the existence of too many possible conditions to consider. Following are some simple ways of dealing with these reasons:

1. Collecting data over time. Many conditions change over time. So, one way to include a range of conditions is to incorporate into the study a period during which conditions are expected to change significantly. If a sequence of data collected after a change exhibits a predictable pattern that shows improvement, one's degree of belief that the improvement will be sustained in the future will be increased. Dr. Smith used the new rehabilitation procedure on twenty patients over a four-month period. Except for the one special cause that was identified, the data collected on the new procedure exhibited a predictable pattern with an average of about nine days (see Figure 6.5).

2. Using planned grouping. Another way to include a range of conditions in a test is to use planned grouping. Planned grouping allows for different conditions to be brought into the test in a systematic way. Dr. Smith was able to conduct the test of the new procedure under different conditions but still do a simple analysis of the results, because the analysis was done within two planned groups in which the conditions were uniform.

3. Using judgement samples. The selection of people or things to be included in the test provides an opportunity to consider a wide range of conditions. A random selection of units is rarely preferred to a selection made by a subject matter expert. This selection is called a judgement sample. Dr. Smith selected the patients for her test from those who were either under thirty years of age or over sixty. She judged that age could have an impact on the results of the test. This judgement sample assured her that both young and old people would be included.
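Analysis within planned groups can stay very simple. The Python sketch below groups test results by a planned condition and summarizes each group separately. The records are invented for illustration (Dr. Smith's data is not reproduced here), and the group labels and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (planned group, days of rehabilitation).
results = [
    ("under 30", 7), ("under 30", 8), ("under 30", 9), ("under 30", 8),
    ("over 60", 10), ("over 60", 9), ("over 60", 11), ("over 60", 10),
]


def summarize_by_group(records):
    """Return {group: (count, mean)} with the analysis done within each group."""
    grouped = defaultdict(list)
    for group, value in records:
        grouped[group].append(value)
    return {group: (len(values), mean(values)) for group, values in grouped.items()}


summary = summarize_by_group(results)
for group, (count, average) in summary.items():
    print(f"{group}: n={count}, mean days={average}")
```

Because conditions are uniform within each planned group, a per-group summary such as this avoids mixing the effect of the change with the effect of the grouping condition.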


Conclusion

Running cycles to test proposed changes logically follows from developing changes. What happens to the changes that the testing indicates will result in improvement? The next chapter discusses the implementation of these changes.
