2.4.2. The Digitizer


    The digitizer converts the input image source to an electrical signal and then samples the electrical signal, using an A/D converter. The specific functions performed by a digitizer depend on the input image source. When the input is already in the form of an electrical signal, as in the case of a VCR tape, the digitizer is interfaced to the input source and samples the electrical signal following the format in which the input source was converted to an electrical signal.

    When the input source is in the form of an image, an electronic camera converts the image to an electrical signal, and the result is digitized by using an A/D converter. In some camera systems, parallel paths allow light intensities at many spatial points to be measured simultaneously. In typical systems, however, there is only one path, and the light intensity at only one spatial point can be measured at a given time. In this case, the entire image is covered by scanning. In most scanning systems, a small aperture searches the image following a certain pattern called a raster. The light intensity integrated over the small aperture is measured, converted to an electrical signal, and considered to be the image intensity at that spatial point. This process can be viewed as convolving the input image intensity I(x,y) with the aperture and then sampling the result of the convolution. The effect of the aperture is, therefore, lowpass filtering of I(x,y). This limits the spatial resolution of I(x,y) and can serve as the antialiasing necessary before an A/D converter. A still image is generally scanned once, but may be scanned more times for noise reduction through frame averaging. For a moving scene, the image is scanned at periodic time intervals.
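The convolve-then-sample view of scanning can be illustrated with a small numerical sketch. The 1-D intensity profile, the aperture weights, and the sampling step below are invented for the example; a real aperture is 2-D and determined by the scanner optics.

```python
# Toy model of scanning: the finite aperture averages (lowpass filters) the
# image intensity before it is sampled, which limits resolution but also
# provides the antialiasing needed before A/D conversion.

def scan_line(intensity, aperture, step):
    """Convolve a 1-D intensity profile with an aperture, then sample."""
    half = len(aperture) // 2
    total = sum(aperture)
    filtered = []
    for i in range(len(intensity)):
        acc = 0.0
        for j, weight in enumerate(aperture):
            k = i + j - half
            if 0 <= k < len(intensity):
                acc += weight * intensity[k]
        filtered.append(acc / total)
    return filtered[::step]          # sample the lowpass-filtered signal

# A rapidly alternating profile: the aperture smooths it toward its mean,
# so the samples no longer alias the fine detail.
profile = [0.0, 1.0] * 8
samples = scan_line(profile, aperture=[1.0, 1.0, 1.0], step=2)
```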

When the input is a film or a photograph, a common device used to convert the image intensity to an electrical signal is a flying spot scanner. In this arrangement, a small spot of light scans the input source, and the light that is reflected by the photograph or transmitted through the film is collected by wide-area photodetectors. The source of the small light spot is a CRT screen.

When the input image source is an object or a natural scene, the most common device for converting light intensity to an electrical signal has been the vidicon and its relatives, such as the Saticon and Newvicon. The vidicon and its relatives were employed until the early 1980s in practically all TV applications, including broadcasting, small portable video cameras, and surveillance cameras. The construction of a vidicon camera tube is shown in Figure 2.26. At one end (the left end in the figure) of the tube, inside the glass envelope, is an image plate. The plate has two layers. Facing the light from the input source is a thin layer of tin oxide coating that is transparent to light but electrically conductive. Facing the electron gun is the second layer, which has a coating of photosensitive material, antimony trisulfide for a basic vidicon. The light from the input source passes through an optical lens that is the focusing mechanism, through an optically flat glass plate, and through the first layer of the image plate. The light is then focused on the second layer. The photosensitive image plate (the second layer) is scanned by an electron gun, and the resulting electrical current is the camera signal that is digitized by an A/D converter. The scanning pattern used is from right to left, bottom to top. Since the input source is inverted by the lens, this scanning pattern is equivalent to scanning left to right, top to bottom, in the input image plane.
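The equivalence between the reversed scan on the image plate and the usual raster order in the input plane can be checked with a toy array. The scene values below are arbitrary; lens inversion is modeled as a 180-degree rotation.

```python
# The optical lens inverts the image on the photosensitive plate, so scanning
# the plate right-to-left, bottom-to-top reads out the scene in the usual
# left-to-right, top-to-bottom raster order.

scene = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

# Lens inversion: 180-degree rotation of the image onto the plate.
plate = [row[::-1] for row in scene[::-1]]

# Scan the plate bottom-to-top, right-to-left.
readout = [value for row in plate[::-1] for value in row[::-1]]

# Standard raster order of the original scene: the two orders agree.
raster = [value for row in scene for value in row]
```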

The photosensitive layer is a semiconductor, which acts as an insulator when no light is present. As light hits this layer, electrons move to the electrically conductive tin oxide layer, creating positive charges on the image-plate surface facing the electron gun. The number of electrons that move, or alternatively the number of positive charges facing the electron gun, represents the image intensity at that spatial point. As the low-energy electron beam from the electron gun scans the image plate, it drops enough electrons to neutralize the positive charges. This discharge current is collected at a metal target ring that is electrically attached to the tin oxide layer. The current at the metal target ring is the camera signal. The electrons originate at the cathode at the other end of the vidicon camera tube and are converged into a narrow beam by means of the electrostatic lens and magnetic focusing.

The spectral response of a basic vidicon for a black-and-white image is very similar to the C.I.E. relative luminous efficiency function discussed in Section 2.1.2. For color images, a color camera optically separates the incoming light into red, green, and blue components. Each component is the input to a vidicon camera tube. A color camera, therefore, houses three separate tubes.

The camera signal, which represents the intensity of the input image source, is sampled by an A/D converter to obtain digital images. Common digital image sizes are 128 x 128, 256 x 256, 512 x 512, and 1024 x 1024 pixels. As we reduce the number of pixels, the spatial resolution, which is also referred to as definition, decreases and the details in the image begin to disappear. Examples of images of different sizes can be found in Figure 2.11. The amplitude of each pixel is typically quantized to 256 levels (represented by 8 bits). Often, each level is denoted by an integer, with 0 corresponding to the darkest level and 255 corresponding to the brightest. As we decrease the number of amplitude quantization levels, the signal-dependent quantization noise begins to appear, first as random noise and then as false contours. Examples of images quantized at different numbers of quantization levels can be found in Figure 2.10. For a color image, each of the red, green, and blue components is typically quantized to 8 bits/pixel, for a total of 24 bits/pixel.
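Uniform amplitude quantization of this kind can be sketched in a few lines. The function below is an illustrative model, not a particular digitizer's circuit; it assumes intensities normalized to [0, 1].

```python
# Uniform quantization of a pixel amplitude to 2**bits levels, with 0 the
# darkest level and 2**bits - 1 the brightest (8 bits gives the usual 0..255).

def quantize(value, bits):
    """Quantize a normalized intensity in [0, 1] to an integer level."""
    levels = 2 ** bits
    level = int(value * levels)
    return min(level, levels - 1)    # clamp value == 1.0 to the top level
```

With few levels, nearby intensities collapse to the same integer, which is why smooth regions of an image develop visible false contours: for example, `quantize(0.5, 2)` and `quantize(0.7, 2)` both map to level 2, while at 8 bits they remain distinct.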

    The vidicon and its relatives are called photo-conductive sensors or tube sensors. Since the mid-1980s, however, there has been rapid growth in solid-state sensors. In a typical solid-state sensor, a 2-D array of sensor elements is integrated on a chip. One sensor element is located spatially at each pixel location and senses the light intensity at that pixel, the value of which is then read by a scanning mechanism.

    The charge-coupled device (CCD) is an example of a solid-state sensor element. When a CCD array is exposed to light, charge packets proportional to the light intensity develop. The stored charge packets are shifted to a storage CCD array, which is not exposed to light. The light intensity values are then read from the storage array. Depending on how the imaging and storage CCD arrays are configured, different methods have been developed to read the light intensity values from the storage array.
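One such configuration, frame transfer, can be sketched as follows. This is a toy model under stated assumptions: charge is taken as proportional to intensity times exposure time, the array contents are invented, and the row-by-row shift stands in for the actual clocked charge transfer.

```python
# Toy model of a frame-transfer CCD readout: charge packets accumulate in the
# imaging array during exposure, then are shifted row by row into a
# light-shielded storage array, from which the values are read out.

def expose(light, time):
    """Charge packet at each element is proportional to intensity x time."""
    return [[intensity * time for intensity in row] for row in light]

def frame_transfer(imaging):
    """Shift every row of the imaging array into the storage array."""
    storage = []
    for _ in range(len(imaging)):
        storage.insert(0, imaging.pop())  # one row crosses per shift cycle
    return storage

light = [[1.0, 0.5],
         [0.2, 0.8]]
charges = expose(light, time=2.0)
stored = frame_transfer(charges)          # imaging array is now empty
```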

    Solid-state sensors have many advantages over photo-conductive sensors. They are inherently more stable and compact. A solid-state sensor also has a well-defined structure, and the location of every pixel is known accurately in both space and time. As a result, color extraction requires simpler signal processing, and better color uniformity can be obtained. Solid-state sensors also have the potential for much higher sensitivity of light detection. In a typical photo-conductive sensor, each pixel is examined one at a time by a single light sensor. The time for light detection is measured in microseconds in a typical TV application, and the sensitivity of light detection is low. In a typical solid-state sensor, an array of sensor elements, one for each pixel, is used. Each sensor element, therefore, needs to be read only once per picture. The light energy can be integrated over the time of one frame rather than one pixel, increasing the potential light sensitivity by orders of magnitude. A solid-state sensor also has a lower lag factor than a photo-conductive sensor. The lag is the residual output of a sensor after the light intensity is changed or removed.
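The orders-of-magnitude claim can be made concrete with a back-of-the-envelope calculation. The frame rate and image size below are typical values chosen for illustration, not measurements from a particular camera.

```python
# Light-integration time: single scanning sensor vs. one element per pixel.

frame_time = 1 / 30                    # seconds per frame at TV rate
n_pixels = 512 * 512                   # pixels per frame

# Single scanning sensor: each pixel gets only its dwell time.
pixel_dwell = frame_time / n_pixels    # about 0.13 microseconds

# Solid-state array: each element integrates over the whole frame time,
# a gain equal to the number of pixels -- over five orders of magnitude.
gain = frame_time / pixel_dwell
```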

    Solid-state sensors have some disadvantages in comparison with photo-conductive sensors. One is the spatial resolution. A higher spatial resolution requires a larger number of pixels, meaning a larger number of sensor elements that need to be integrated on a chip. Another is the relatively low signal-to-noise ratio. Despite these disadvantages, the solid-state technology is advancing rapidly, and it is expected that solid-state sensors will replace photo-conductive sensors in almost all TV applications in the near future.

    In NTSC television broadcasting, 30 frames are transmitted every second. Each frame consists of 525 horizontal scan lines, which are divided into two fields, the odd field and the even field. Each of the two fields is scanned from left to right and from top to bottom, and covers 262½ horizontal lines. The odd field consists of odd-numbered lines, and the even field, of even-numbered lines. The horizontal lines in the odd and even fields are interlaced to form a frame. This is shown in Figure 2.27 and is called a 2:1 interlace. The 2:1 interlace is used so that the vertical resolution will be 525 lines per frame at the rate of 30 frames/sec, but the flickering frequency will be 60 cycles/sec, reducing the perception of flicker at the receiver monitor. Without the interlace, a frame would have to be displayed twice to achieve a flicker frequency of 60 cycles/sec, and this requires frame storage or more transmission bandwidth. The spatial resolution of a television frame displayed as a still image is similar to that of a 512 x 512-pixel digital image. In television broadcast, the signal remains in the same analog form at both the transmitter and the receiver. Digital processing of these signals involves sampling using an A/D converter.
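The field split and reassembly of 2:1 interlace can be sketched directly. A 10-line toy frame is used here in place of the 525 lines of NTSC, and the half-line offset between fields is ignored for simplicity.

```python
# 2:1 interlace sketch: the scan lines of a frame are split into an odd field
# (lines 1, 3, 5, ...) and an even field (lines 2, 4, 6, ...); interleaving
# the two fields reconstructs the frame.

def split_fields(frame):
    """Return (odd_field, even_field) for a frame given as a list of lines."""
    return frame[0::2], frame[1::2]

def interlace(odd_field, even_field):
    """Rebuild the frame by interleaving the two fields line by line."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    return frame

lines = list(range(1, 11))             # a 10-line toy frame
odd, even = split_fields(lines)
```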

The output of the digitizer is a sequence of numbers. Although the output is represented by a 2-D sequence f(n1,n2) in Figure 2.25, the output may be three sequences, fR(n1,n2), fG(n1,n2), and fB(n1,n2), corresponding to the red, green, and blue components of a color image. The output may also be a 3-D sequence f(n1,n2,nt), which is a function of two spatial variables and a time variable for a sequence of frames. These signals are then processed by digital image processing algorithms, which may be implemented on a general-purpose computer, a microprocessor, or special-purpose hardware.
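The three output forms can be sketched as array shapes. The sizes below are arbitrary placeholders, and plain nested lists stand in for whatever array type an implementation would actually use.

```python
# Shapes of the digitizer output: one 2-D sequence f(n1, n2) for a monochrome
# image, three 2-D sequences for the R, G, B components of a color image, and
# a 3-D sequence f(n1, n2, nt) for a sequence of frames.

N1, N2, NT = 4, 4, 3

f_mono = [[0] * N2 for _ in range(N1)]                       # f(n1, n2)
f_color = {c: [[0] * N2 for _ in range(N1)] for c in "RGB"}  # fR, fG, fB
f_frames = [[[0] * NT for _ in range(N2)] for _ in range(N1)]  # f(n1, n2, nt)
```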