The almanac has long been one of the key factors in the success of farmers, ranchers, hunters, and fishermen. Historical data on past weather patterns, phases of the moon, rainfall, and drought measurements were all critical inputs that the authors used to give their readers strong guidance about the best times to plant, harvest, and hunt in the coming year.

Fast-forward to modern times. One of the best examples of the power, practicality, and tremendous cost savings of machine learning is the U.S. Postal Service, specifically the ability of machines to perform optical character recognition (OCR) accurately enough to interpret the postal addresses on the millions of mail pieces that are processed every hour. In 2013 alone, the U.S. Postal Service handled more than 158.4 billion pieces of mail. That means that every day, the Postal Service correctly interprets addresses and ZIP codes for hundreds of millions of pieces of mail. As you can imagine, this amount of mail is far too much for humans to process manually.
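To put the annual figure in perspective, the short calculation below converts it into average daily and hourly volumes. It is only a rough sketch: the 158.4 billion figure comes from the text, while spreading it evenly over 365 calendar days and 24 hours is an assumption made here for illustration (mail is not actually processed at a uniform rate).

```python
# Annual volume quoted in the text for 2013.
ANNUAL_PIECES_2013 = 158.4e9

# Assumption for illustration: volume spread evenly over calendar days and hours.
DAYS_PER_YEAR = 365
HOURS_PER_DAY = 24

per_day = ANNUAL_PIECES_2013 / DAYS_PER_YEAR
per_hour = per_day / HOURS_PER_DAY

print(f"{per_day / 1e6:.0f} million pieces per day")    # 434 million pieces per day
print(f"{per_hour / 1e6:.1f} million pieces per hour")  # 18.1 million pieces per hour
```

Dividing by the number of actual delivery days instead of calendar days would give a higher daily average.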

Back in the early days, the postal sorting process was performed entirely by hand by thousands of postal workers nationwide. In the late 1980s and early 1990s, the Postal Service started to introduce early handwriting recognition algorithms and patterns, along with rules-based processing techniques to help “prefilter” the steady streams of mail.

Character recognition is a very difficult problem for the Postal Service when you consider the many different letter formats, shapes, and sizes. Add to that complexity all the different handwriting styles and writing instruments that could be used to address an envelope (from pens to crayons) and you have a real appreciation for the magnitude of the problem the Postal Service faced. Despite all the technological advances, by 1997 only 10 percent of the nation’s mail was being sorted automatically. Pieces that could not be read automatically were routed to manual processing centers for humans to interpret.

In the late 1990s, the U.S. Postal Service started to treat this automation problem as a machine learning problem, using character recognition examples as input data sets, along with the known results of the human interpretations performed on the same data. Over time, this method provided a wealth of training data that helped create the first highly accurate OCR prediction models. The Postal Service fine-tuned the models by adding character noise reduction algorithms along with random rotations to increase effectiveness.
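The idea of pairing noise reduction with random rotations can be sketched in a few lines. The example below is purely illustrative and is not the Postal Service's actual pipeline: the function names (`median_denoise`, `rotate`, `augment`), the 3x3 median filter, the nearest-neighbor rotation, and the 15-degree limit are all assumptions chosen to keep the sketch small. It takes one labeled character image, removes isolated speckles, and then produces several slightly rotated copies that keep the same human-provided label.

```python
import math
import random


def median_denoise(image):
    """Apply a 3x3 median filter to a 2D grid to remove isolated speckles."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the neighborhood, clipped at the image borders.
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sorted(window)[len(window) // 2]
    return out


def rotate(image, degrees):
    """Rotate a 2D grid about its center using nearest-neighbor sampling."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Map each output pixel back to the source pixel it came from.
            dx, dy = x - cx, y - cy
            sx = round(cx + cos_t * dx + sin_t * dy)
            sy = round(cy - sin_t * dx + cos_t * dy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def augment(image, label, copies=4, max_degrees=15, seed=0):
    """Return the denoised image plus randomly rotated copies, all with the same label."""
    rng = random.Random(seed)
    clean = median_denoise(image)
    samples = [(clean, label)]
    for _ in range(copies):
        angle = rng.uniform(-max_degrees, max_degrees)
        samples.append((rotate(clean, angle), label))
    return samples


# A 7x7 image of a thick vertical stroke (a crude "1") with one speckle of noise.
stroke = [[1 if 2 <= x <= 4 else 0 for x in range(7)] for _ in range(7)]
noisy = [row[:] for row in stroke]
noisy[3][6] = 1

training_samples = augment(noisy, label="1")
print(len(training_samples), "labeled samples from one scanned character")
```

Each rotated copy gives a model another plausible view of the same character, which is one common way to make a recognizer more tolerant of tilted or sloppy handwriting.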

Today, the U.S. Postal Service is the world leader in OCR technology, with machines reading nearly 98 percent of all hand-addressed letter mail and 99.5 percent of all machine-printed mail. This is an amazing achievement, especially when you consider that only 10 percent of the volume was processed automatically in 1997. The author is happy to note that all letters addressed to “Santa Claus” are still carefully routed to a processing center in Alaska, where they are manually answered by volunteers.

Here are a few more interesting factoids on just how much impact machine learning has had on driving efficiency at one of the oldest and largest U.S. government agencies:

523 million: Number of mail pieces processed and delivered each day.
