FREE online courses on the Basics of a Computer - HARDWARE AND SOFTWARE -
Mark and Character Recognition
This method involves the
recognition of marks or characters, e.g. from work dockets, checks, till rolls,
and cards. There are three types of recognition:
- Mark Sense Reading
- Magnetic Ink Character Recognition (MICR)
- Optical Character Recognition (OCR)
On the whole, to achieve the
required standard of accuracy which the computing process demands, the reading
devices associated with mark and character recognition operate at slower rates
than the punched card reader.
Mark sense reading is literally what it says.
The card or form is divided up into boxes, in which a mark is made by pencil or
pen. A character is represented by marking the correct combination of boxes in
any one column, rather than by punching holes as on a punched card. Forms and
cards are pre-printed for special purposes so that a mark can be made in a
certain position to represent a YES or NO, to answer a market survey question
for example, or to signify a number, as on insurance forms, gas and electricity
recording cards.
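The idea of reading a number from marked boxes can be sketched in a few lines. This is only an illustration under a stated assumption: each column is taken to have ten boxes labelled 0 to 9, with a single mark standing for that digit. Real forms may use other layouts and combinations of boxes, and the names `decode_column`, `decode_form`, and `column_for` are hypothetical.

```python
from typing import List, Optional


def decode_column(marks: List[bool]) -> Optional[int]:
    """Return the digit marked in a ten-box column, or None if it is unreadable.

    Assumed layout (not from the source): box 0 at the top, box 9 at the bottom.
    """
    if len(marks) != 10:
        raise ValueError("expected ten boxes per column")
    marked_rows = [row for row, is_marked in enumerate(marks) if is_marked]
    # Exactly one mark is a valid digit; no mark or several marks is rejected.
    return marked_rows[0] if len(marked_rows) == 1 else None


def decode_form(columns: List[List[bool]]) -> List[Optional[int]]:
    """Decode every column of a form, left to right."""
    return [decode_column(column) for column in columns]


def column_for(digit: int) -> List[bool]:
    """Build a column with only the box for `digit` marked (helper for the example)."""
    return [row == digit for row in range(10)]


reading = decode_form([column_for(4), column_for(0), column_for(7)])
print(reading)
```

A column with no mark, or with more than one, comes back as `None`, so the form can be set aside for manual checking.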
In one form of detection, the
conductivity of graphite marks is sensed. The method necessitates the use of a
soft pencil, and non-graphite pen or printed marks are not acceptable. Another
method uses equipment which reads marks optically. Quite simply, the presence of
a mark is sensed using a light source. In this case, special pencils are not required to
mark the cards or documents. A mark reader may be designed to be insensitive to
certain colors. These colors can then be safely used in the pre-printing of the
cards or documents without risk of being read when marks made later are sensed.
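A rough sketch of how a reader might ignore a pre-printing color follows. It assumes, purely for illustration, that the sensor responds only to red light, so red ink looks as bright as blank paper while a dark pencil mark does not. The threshold value and the name `is_marked` are hypothetical, not taken from any particular device.

```python
from typing import Tuple

# Reflectance of one box as (red, green, blue), each 0-255. Assumed representation.
RGB = Tuple[int, int, int]

DARK_THRESHOLD = 100  # hypothetical cut-off between "mark" and "no mark"


def is_marked(sample: RGB, threshold: int = DARK_THRESHOLD) -> bool:
    """Report a mark only when the box looks dark in the red channel."""
    red, _green, _blue = sample
    # Red pre-printing reflects red light much as white paper does,
    # so it stays above the threshold and is not read as a mark.
    return red < threshold


print(is_marked((30, 30, 30)))     # dark pencil or pen mark
print(is_marked((230, 40, 40)))    # red pre-printed box outline
print(is_marked((250, 250, 250)))  # blank paper
```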
Due to the success of mark
recognition, investigation turned to the possibility of reading characters. The
first successful form of character recognition read character shapes printed in
an ink containing magnetizable particles. Early in 1966, two standard MICR fonts
(typographical styles) were accepted by the International Organization for
Standardization (ISO). One, known as
E13B, consists of the numerals 0-9 and four special characters. This is used
principally for bank checks. The code number of the bank, the customer's account
number, and the check sequence
number are all pre-printed in magnetic ink. When a check is submitted to a bank
the amount of the transaction is inscribed on it before the check is presented
for computer processing.
The magnetized ink induces a
current in a reading circuit. The current induced will be proportional to the
area of ink being scanned. The patterns of the varying currents can then be
compared with, and identified as, the bit patterns of the selected character.
E13B is used in the USA, where it originated, and in the UK.
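The comparison step described above can be sketched as a nearest-pattern search. The reference profiles below are invented for illustration and are not the real E13B shapes. Each value stands for the amount of ink under the read head in one scanned slice, and the names `REFERENCE_PROFILES` and `identify` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical ink-area profiles, one value per scanned slice.
# These are made-up shapes, NOT the actual E13B characters.
REFERENCE_PROFILES: Dict[str, List[int]] = {
    "A": [0, 6, 2, 2, 6, 0],
    "B": [0, 2, 6, 6, 2, 0],
}


def identify(signal: List[int], max_distance: int = 4) -> Optional[str]:
    """Return the symbol whose profile is closest to the signal, or None to reject."""
    best_symbol: Optional[str] = None
    best_distance: Optional[int] = None
    for symbol, profile in REFERENCE_PROFILES.items():
        if len(profile) != len(signal):
            continue  # cannot compare profiles of different length
        # Sum of absolute differences between the read signal and the stored pattern.
        distance = sum(abs(s - p) for s, p in zip(signal, profile))
        if best_distance is None or distance < best_distance:
            best_symbol, best_distance = symbol, distance
    if best_distance is None or best_distance > max_distance:
        return None  # too far from every stored pattern
    return best_symbol


print(identify([0, 6, 2, 3, 6, 0]))  # slightly noisy reading of "A"
print(identify([9, 9, 9, 9, 9, 9]))  # matches nothing closely, so rejected
```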
Another MICR font, which originated in France
and is used in Europe, is CMC7. This includes the digits
0-9, the letters of the alphabet, and five special characters. The symbols are
made up of seven magnetizable lines with six spaces of varying width between
them. A wide space generates a binary 1, a narrow space a binary 0. The speed of
reading MICR is around 1200 documents a minute.
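The wide-or-narrow rule for CMC7 can be sketched as follows. The nominal gap widths are hypothetical values in arbitrary units, and `gaps_to_bits` is an illustrative name. The table that maps each six-bit pattern to a character is not given in the text, so the sketch stops at the bit pattern.

```python
from typing import List


def gaps_to_bits(gap_widths: List[float],
                 narrow: float = 0.3,
                 wide: float = 0.5) -> str:
    """Turn the six gaps between seven lines into a six-bit pattern.

    `narrow` and `wide` are assumed nominal widths (arbitrary units);
    a gap wider than their midpoint counts as 1, otherwise 0.
    """
    if len(gap_widths) != 6:
        raise ValueError("seven lines give exactly six gaps")
    threshold = (narrow + wide) / 2
    return "".join("1" if width > threshold else "0" for width in gap_widths)


print(gaps_to_bits([0.3, 0.5, 0.3, 0.3, 0.5, 0.3]))
```

Using a midpoint threshold lets the reader tolerate small variations in printed gap width.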
MICR systems employ character
styles designed expressly for machine recognition and, therefore, the characters
have to be accurately formed. They also require magnetic ink. These factors make
for expensive printing, but one useful advantage is that characters printed with
ink containing magnetizable particles can still be read even when over-stamped,
as may be the case with bank checks. MICR readers cannot verify; they can only
identify. With a check, someone still has to verify the amount to be paid, to
whom it is to be paid and, most importantly, that the signature authorizing the
payment is correct.
It is not only handwriting which
varies. Different typewriters and different typesetters produce the letters of
the alphabet in a variety of forms, shapes and sizes. Nevertheless, there are
certain characteristics which are peculiar to, and common to, each letter,
however it is produced.
OCR readers examine each character
as if it were made up of a collection of minute spots. Once the whole character
has been scanned, the pattern detected is matched against a set of patterns
stored in the computer. Whichever pattern it matches, or nearly matches, is
considered to be the character read. Patterns which cannot be identified are
rejected. OCR readers can read at a rate of up to 2400 characters per second.
They are generally designed to operate at slower speeds, typically 300-800
characters per second, at which they are more accurate and can handle characters
which are not quite so perfectly formed. OCR readers are expensive devices,
justified only where there is a large volume of data to process.
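The matching of a scanned spot pattern against stored patterns can be sketched as below. The 5-by-3 templates are toy shapes invented for illustration, not the OCR-A or OCR-B fonts, and the names `TEMPLATES`, `mismatches`, and `recognize` are hypothetical. A scan is accepted as the nearest template if only a few spots differ, and is otherwise rejected.

```python
from typing import Dict, List, Optional

# Toy 5x3 spot patterns ("1" = dark spot). Made-up shapes, not a real OCR font.
TEMPLATES: Dict[str, List[str]] = {
    "I": ["010", "010", "010", "010", "010"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def mismatches(scan: List[str], template: List[str]) -> int:
    """Count the spots that differ between a scan and a template of the same size."""
    return sum(spot_a != spot_b
               for row_a, row_b in zip(scan, template)
               for spot_a, spot_b in zip(row_a, row_b))


def recognize(scan: List[str], max_mismatches: int = 2) -> Optional[str]:
    """Return the best-matching character, or None if nothing matches nearly enough."""
    best = min(TEMPLATES, key=lambda name: mismatches(scan, TEMPLATES[name]))
    if mismatches(scan, TEMPLATES[best]) > max_mismatches:
        return None  # rejected: pattern cannot be identified
    return best


print(recognize(["111", "010", "010", "010", "000"]))  # a "T" with one spot missing
print(recognize(["101", "101", "101", "101", "101"]))  # unlike any stored pattern
```

Raising `max_mismatches` accepts less perfectly formed characters at the cost of more misreadings, which mirrors the trade-off between speed, tolerance, and accuracy described above.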
A wide range of fonts, using
ordinary inks, can now be accepted by OCR readers, including many common
typewriter fonts. The standard fonts used are OCR-A (American Standard) and
OCR-B (European Standard). Some OCR readers can accept computer print-out and
complete pages of typed text. It is possible that a computer could be programmed to
accept some signatures, but it is unlikely that it could ever be programmed to
accept every type of signature. Even so, devices have been developed which can
read neat hand printing (capital letters rather than lower case) in black ink,
and with sufficient accuracy for this to become a viable form of input. Refer
to figure 3.