CSE 126/228F - Homework 1
1. Multimedia: Color Concepts
1. Compare three systems for color image representation. Give two techniques for reducing the number of bits in color representation.
The three most often used systems for color image representation are:
R,G,B (Red, Green, Blue)
Y,U,V (Luminance, Chrominance)
H,S,B (Hue, Saturation, Brightness)
While all of these schemes can represent color faithfully, they differ in how the color is encoded, and that can affect compression greatly. RGB uses the same number of bits for each of the primary colors, while YUV separates luminance (the Y component) from chrominance, or color (the U and V components). Since the human eye is more sensitive to small differences in luminance than to small differences in color, assigning relatively more bits to luminance and fewer to chrominance saves bits overall while still representing color faithfully. HSB is similar to YUV in that it can also assign a different number of bits to brightness than it does to hue and saturation (which together make up the color).
In order to display colors, a computer must employ a table to convert a color value into a corresponding signal. Because these tables can get quite large (over 16 million entries for 24-bit color), two schemes are commonly used to reduce the size of the table. The first simply keeps the highest-order bits of each color component to form the new color. This limits the color set to a fixed palette, however, so a second method, a Color Look Up Table (CLUT), lets each application define its own table of colors. Each color to be displayed is then mapped to the closest color in the CLUT.
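As a rough illustration of the two table-reduction schemes, the following Python sketch truncates each channel to its highest-order bits and maps a color to the nearest entry of a small CLUT. The function names, the five-entry palette, and the use of squared Euclidean distance in RGB as the measure of "closest" are assumptions made for this example, not details from the course material.

```python
def truncate_color(rgb, bits_per_channel):
    """Keep only the highest-order bits of each 8-bit channel."""
    shift = 8 - bits_per_channel
    return tuple((c >> shift) << shift for c in rgb)


def nearest_clut_index(rgb, clut):
    """Return the index of the CLUT entry closest to rgb.

    Assumption: "closest" means smallest squared Euclidean distance in RGB.
    """
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(rgb, entry))

    return min(range(len(clut)), key=lambda i: distance(clut[i]))


# Hypothetical application-defined palette.
clut = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]

full_table_entries = 2 ** 24                        # one entry per 24-bit color
truncated = truncate_color((200, 100, 50), 3)       # keep 3 bits per channel
clut_index = nearest_clut_index((250, 10, 10), clut)
```

With 3 bits per channel the table shrinks to 2^9 = 512 entries, but the palette is fixed; the CLUT keeps the table small while letting the application choose which colors it contains.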
2. Television Data Rates
1. How does television solve the problem of flicker in display?
To deal with flicker (caused by a refresh rate too low for the human eye to perceive the picture as continuous), televisions employ a system known as interlacing. Interlacing alternates scan lines so that the electron gun draws the even set of lines on one pass and the odd lines on the next. This effectively doubles the refresh rate.
2. Compute the data rates in bits per second for uncompressed video captured in the following formats: NTSC, PAL, and HDTV.
| Format | Refresh Rate | Vertical Scan Lines | Horizontal Scan Lines | Bits per pixel | Total Data Rate (bits/sec) |
| NTSC | 30 Hz | 525 | 700 | 24 | 264,600,000 |
| PAL | 25 Hz | 625 | 833 | 24 | 312,375,000 |
| HDTV | 60 Hz | 1,000 | 1,778 | 24 | 2,560,320,000 |
* Bits per second calculated as refresh rate x vert. scan lines x horiz. scan lines x bits per pixel.
Bits per pixel assumed at 24, as NTSC and PAL are both analog formats. 30 Hz also accepted for HDTV.
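The totals in the table can be reproduced by multiplying all four columns together, refresh rate included. A minimal Python sketch, using only the figures from the table (the dictionary and function names are made up for the example):

```python
# (refresh rate in Hz, vertical scan lines, horizontal scan lines, bits per pixel)
FORMATS = {
    "NTSC": (30, 525, 700, 24),
    "PAL": (25, 625, 833, 24),
    "HDTV": (60, 1000, 1778, 24),
}


def data_rate_bps(refresh_hz, vertical, horizontal, bits_per_pixel):
    """Uncompressed data rate in bits per second."""
    return refresh_hz * vertical * horizontal * bits_per_pixel


rates = {name: data_rate_bps(*params) for name, params in FORMATS.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:,} bits/sec")
```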
3. Audio Coding
1. Describe the various steps in digitization and coding of audio signals.
In order to encode audio, the sound must first be captured from its source. This is done with some sort of microphone, which sends an analog signal to an encoder or some sort of recording device. If the audio is to be stored digitally (as it must be to be used with computers), the encoder must then sample the signal, normally at a rate at least twice the maximum frequency of the signal, as governed by Nyquist's Law. The samples must then be quantized, or rounded, so that they can be encoded using a fixed number of bits. Digital audio is normally quantized using a form of scalar quantization. The audio is then encoded in one of three primary ways: Pulse Code Modulation (PCM), which simply encodes the intensity value of each sample; Differential PCM (DPCM), which sends an initial intensity and then, for each later sample, only its difference from the previous sample; and Adaptive DPCM (ADPCM), which is like DPCM except that rather than sending a full value for each difference, only a sign (+ or -) is sent, along with a scaling factor.
2. The following sequence of real numbers has been obtained sampling an audio signal: 1.8, 2.2, ..., 0.9 Quantize this sequence by dividing the interval [-4, 4] into 32 uniformly distributed levels. Write down the quantized sequence. How many bits do you need to transmit it?
The quantization function would look as follows:
22, 24, 24, 28, 28, 28, 25, 26, 26, 26, 21, 19, 20, 20, 22, 24, 24, 24, 23, 24, 20, 16, 10, 10, 8, 11, 6, 9, 9, 12, 15, 19
With 32 levels, the maximum number of bits necessary to transmit any single datum in this set is log2(32) = 5. The sequence has 32 samples, so the total number of transmitted bits will be 32 x 5 = 160.
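The exact quantization function used for the answer is not reproduced in the text, so the following Python sketch is only one plausible reading: it assumes the 32 levels are evenly spaced with one level on each endpoint of [-4, 4] and that each sample is rounded to the nearest level. Under that assumption the three samples visible in the question (1.8, 2.2 and 0.9) map to 22, 24 and 19, matching the first two and the last entries of the quantized sequence. The function name and defaults are hypothetical.

```python
def quantize(x, lo=-4.0, hi=4.0, levels=32):
    """Index (0..levels-1) of the nearest of `levels` evenly spaced values in [lo, hi]."""
    step = (hi - lo) / (levels - 1)        # assumption: a level sits on each endpoint
    index = round((x - lo) / step)
    return max(0, min(levels - 1, index))  # clamp samples outside [lo, hi]


visible_samples = [1.8, 2.2, 0.9]          # only the samples shown in the question
indices = [quantize(x) for x in visible_samples]

bits_per_sample = (32 - 1).bit_length()    # bits needed to address 32 levels
total_bits = 32 * bits_per_sample          # 32 samples in the full sequence
```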
3. Encode the quantized sequence using DPCM and use a Huffman code on the difference samples. How many bits do you need now to encode the sequence?
Positive signs are assumed and negatives are marked as such.
22, 2, 0, 4, 0, 0, -3, 1, 0, 0, -5, -2, 1, 0, 2, 2, 0, 0, -1, 1, -4, -4, -6, 0, -2, 3, -5, 3, 0, 3, 3, 4
The numbers in this set range from -6 to 22, which means that a fixed-length encoding still needs 5 bits per number. There are 32 numbers in total, so 5 x 32 = 160 bits are necessary for encoding.
In order to produce a Huffman code for the above numbers, we must construct the following table:
| Symbol | 22 | 2 | 0 | 4 | -3 | 1 | -5 | -2 | -1 | -4 | -6 | 3 |
| Number of appearances | 1 | 3 | 10 | 2 | 1 | 3 | 2 | 2 | 1 | 2 | 1 | 4 |
| Encoding length (bits) | 5 | 3 | 2 | 4 | 5 | 3 | 4 | 4 | 5 | 4 | 5 | 3 |
Following the table, the total number of bits now necessary for all of our data is 102.
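The DPCM differences and the Huffman total can be checked with a short Python sketch. It takes the quantized sequence from the answer to question 2, forms the difference from the previous sample, and builds Huffman code lengths with a heap. The helper names are made up for this example, and like the table it counts only the coded symbols, not the cost of sending the code table itself.

```python
import heapq
from collections import Counter
from itertools import count

quantized = [22, 24, 24, 28, 28, 28, 25, 26, 26, 26, 21, 19, 20, 20, 22, 24,
             24, 24, 23, 24, 20, 16, 10, 10, 8, 11, 6, 9, 9, 12, 15, 19]


def dpcm_encode(samples):
    """Keep the first sample, then store each sample's difference from the previous one."""
    return [samples[0]] + [cur - prev for prev, cur in zip(samples, samples[1:])]


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one symbol still needs 1 bit
        return {next(iter(freq)): 1}
    tie = count()                            # tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {sym: 0}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, depths1 = heapq.heappop(heap)
        n2, _, depths2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


diffs = dpcm_encode(quantized)
lengths = huffman_code_lengths(diffs)
total_bits = sum(lengths[d] for d in diffs)
print(total_bits)
```

Individual code lengths can differ from the table when symbols with equal counts are merged in a different order, but every Huffman code for these counts gives the same total.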
4. What is the compression ratio you have achieved?
By using DPCM and Huffman coding we have reduced the number of bits necessary from 160 for the quantized sequence to 102, a total savings of about 36%. Equivalently, the compression ratio is 160/102 = 1.57.
4. Video Coding
1. Differentiate between Subband coding and Subsampling.
Subband coding gives different resolutions to different bands. Subsampling groups pixels together into a meta-region and encodes a single value for the entire region. Thus, if a pixel uses YUV encoding, we can decide to give the Y component more bits and reduce the bits in U and V, since we know the human eye is more sensitive to changes in luminance than to changes in color. This is called subband coding. Subsampling would then take a group of the subband-coded pixels and group them together into a meta-region, assigning all of them one value.
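To make the "meta-region" idea concrete, here is a minimal Python sketch of subsampling a single plane (for example a U or V plane) by replacing each 2x2 region with one value. Using the average of the region is an assumption of this example; the answer above only says that one value is assigned to the region. The function name and the sample plane are hypothetical.

```python
def subsample(plane, factor=2):
    """Replace each factor x factor region of a 2-D plane with a single value.

    Assumption: the single value is the region's average.
    """
    rows, cols = len(plane), len(plane[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            region = [plane[i][j]
                      for i in range(r, min(r + factor, rows))
                      for j in range(c, min(c + factor, cols))]
            out_row.append(sum(region) / len(region))
        out.append(out_row)
    return out


chroma = [[1, 3, 10, 10],
          [5, 7, 10, 10]]
small = subsample(chroma)   # one value per 2x2 region
```

A 2x2 factor stores a quarter as many values for that plane, at the cost of spatial detail.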
2. Will Transform coding always result in compression? Explain with an example.
No. Transform coding only works well if the data are closely correlated. Since the transform amounts to rotating one set of reference axes onto another in order to save bits, it assumes that the color points aren't scattered all over the graph. If they are, the transform simply changes the values without providing any savings, since the transformed points would require about the same number of bits to represent. For example, most natural images have relatively little color change from pixel to pixel, and when changes occur they tend to be gradual, so a transform helps. If, however, you were working with an image in which the color changed at almost every pixel, then a transform would save you little, if anything.
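The difference between correlated and uncorrelated data can be seen numerically. The sketch below uses a 1-D DCT as one example of a transform (the answer above does not name a specific one) and measures how much of the signal's energy lands in the first two coefficients. The two 8-sample signals and the function names are made up for illustration.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i, v in enumerate(x)))
    return out


def energy_in_first(coeffs, how_many):
    """Fraction of total energy carried by the first `how_many` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:how_many]) / total


smooth = [10, 11, 12, 13, 14, 15, 16, 17]   # gradual change, highly correlated
busy = [0, 255, 0, 255, 0, 255, 0, 255]     # value changes at every sample

smooth_share = energy_in_first(dct_1d(smooth), 2)
busy_share = energy_in_first(dct_1d(busy), 2)
print(f"smooth: {smooth_share:.4f}, busy: {busy_share:.4f}")
```

For the smooth ramp almost all of the energy sits in the first two coefficients, so the rest can be coded with very few bits. For the alternating signal only about half does, so the transform by itself buys little.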
5. JPEG
1. Are the 8x8 blocks within an image coded independently of each other? In other words, if, for example, a decoder receives all blocks except the first one, can it go ahead and construct most of the image (except of course the region covered by the first block)? Explain.
Although JPEGs are interleaved, which means that each of the 8x8 blocks passes through the various stages of JPEG encoding one at a time, the 8x8 pixel blocks cannot be decoded independently. This is because the DC value of each block except the first is encoded as a difference from the previous block's DC value.
2. What is the reason for zig-zag coding of AC Coefficients?
Zig-zag coding is based on the observation that color is not likely to change much from pixel to pixel within an 8x8 region. For this reason, we can place greater importance on the single DC coefficient and the low-frequency AC coefficients. Since the high-frequency AC coefficients will usually be zero or close to it, the end of the zig-zag sequence is of less importance.
3. Give 3 methods for encoding multiple resolutions in JPEG.
Multiple-resolution JPEGs can be encoded in the following 3 ways:
Progressive JPEG: Progressive JPEG subsamples the original image by 4x4, and encodes the result, then subsamples the original image by 2x2 and encodes the pixel differences from the previous image, and finally encodes pixel differences between the original image and the result of the 2x2 subsample.
Alternative 1: It is possible to pass the image through the JPEG stages using only the four most significant bits of the pixel values first, then pass the four least significant bits second.
This is based on the observation discussed earlier about quantization: since colors tend to be pretty close to each other, and human perception pays more attention to significant change than to gradual change, the most significant bits are the most important in our encoding.
Alternative 2: It is also possible to pass the DC and 31 AC coefficients through the stages first, then pass the remaining 32 AC coefficients through afterward. Again, based on our discussion of zigzagging, the DC coefficient followed by the first few AC values in the zigzag represent the bulk of the important data about the 8x8 block. A pretty good image can be constructed from just that information with the detail to follow later.
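The zig-zag ordering, and the split described in Alternative 2, can be sketched in Python as follows. The scan walks the anti-diagonals of the 8x8 block in alternating directions, so the DC coefficient comes first and the highest-frequency AC coefficient last. The function names and the sample block are hypothetical, and real JPEG encoders normally use a precomputed table rather than generating the order each time.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):               # s = row + col identifies an anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked from bottom-left to top-right,
        # odd ones from top-right to bottom-left.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block of coefficients into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Hypothetical quantized block: only the DC and two low-frequency AC values are non-zero.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 120, -15, 9

scanned = zigzag_scan(block)
first_pass = scanned[:32]     # DC plus the first 31 AC coefficients
second_pass = scanned[32:]    # the remaining 32 AC coefficients
```

In this example every non-zero value falls at the very start of the scan, so the second pass consists entirely of zeros.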