Jeff Prosise
December 3, 1996
Ever since I brought home my first PC more than 14 years ago, I've been fascinated by the computer's ability to store meaningful information in streams of ones and zeros. The first to catch my eye were those long BASIC program listings, once prevalent in computer magazines, with lots of DATA statements that you could type in and run to create other programs. From them I learned that any program, no matter how complex, is really just a series of numbers representing instructions to the computer's microprocessor. Next, I became interested in ASCII, and then word processor file formats, and after that--well, you get the picture.
One of the technologies that still fascinates me today is bitmap file storage. A bitmap file stores the information a computer needs to re-create a picture. You and I may see an image on the screen as a beautiful sunset, but the computer sees it as ones and zeros. It's what the computer does with those ones and zeros that enables it to reproduce the original image. Ultimately, the bits and bytes in the bitmap tell the computer what color to paint each pixel in the image. The computer then translates the colors in the bitmap into a format compatible with its display adapter and outputs them to the video hardware.
The interesting part of the process is how the computer interprets the data in the bitmap. Bitmap files come in many formats, and each format has its own way of encoding pixel data and other information indigenous to computerized images. That's why the Paint program that comes with Windows 95 will read .BMP files but not .GIF files. Paint's creators endowed the program with the ability to decode image data stored in BMP format, but the popular GIF is as foreign to Paint as Swahili is to your average Texan.
So what's inside a bitmap file, and how does one file format differ from another? To answer these questions, let's look briefly at six popular graphics file formats used on PCs. There are other bitmap file formats, of course, as well as file formats--for vector graphics, for instance--that store instructions for re-creating an image rather than color data for every individual pixel. But the bitmap file formats discussed here are the ones you're most likely to run across in the course of a day's work.
The BMP (short for BitMaP) file format is the native bitmap file format for Windows--native because it closely matches the format in which Windows stores bitmaps internally. A BMP-format file typically has the filename extension .BMP, although some carry the extension .RLE, which stands for run-length encoding. An .RLE filename extension generally indicates that the bitmap data in the file is compressed using one of the two RLE compression methods that BMP-format files support.
.BMP files encode color information using 1, 4, 8, 16, 24, or 32 bits per pixel (bpp). The bits-per-pixel count, also known as the image's color depth, determines the maximum number of colors the image may have. A 1-bpp image is limited to 2 colors, while a 24-bpp image can contain more than 16 million different colors.
The illustration on the next page shows the structure of a typical .BMP file containing a 256-color (8-bpp) image. The file is divided into four main sections: a bitmap file header, a bitmap info header, a table of color values, and the bitmap data itself. The bitmap file header contains information about the file, including the location in the file where the bitmap data begins. The bitmap info header contains information about the image contained in the file--for example, its height and width, in pixels. The color table contains RGB color values for the colors used in the image. On video adapters that can't display more than 256 colors at a time, programs that read and display .BMP files can program these RGB values into the adapters' color palettes for accurate color reproduction.
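The layout described above can be sketched in a few lines of Python. This is a minimal sketch, not a full reader: it assumes the common Windows layout with a 14-byte file header and a 40-byte info header, and the function name read_bmp_headers and the dictionary keys are made up for illustration.

```python
import struct

def read_bmp_headers(data):
    """Parse the headers and color table of a Windows-style .BMP file."""
    # Bitmap file header (14 bytes): signature, file size, two reserved
    # words, and the offset at which the bitmap data begins.
    sig, file_size, _, _, data_offset = struct.unpack_from("<2sIHHI", data, 0)
    if sig != b"BM":
        raise ValueError("not a .BMP file")
    # Bitmap info header (40 bytes in the layout assumed here).
    (hdr_size, width, height, planes, bpp, compression, image_size,
     x_ppm, y_ppm, colors_used, colors_important) = struct.unpack_from(
        "<IiiHHIIiiII", data, 14)
    if hdr_size != 40:
        raise ValueError("only the 40-byte info header is handled here")
    # Color table: 4 bytes per entry, stored as blue, green, red, reserved.
    palette = []
    if bpp <= 8:
        entries = colors_used or 2 ** bpp
        for n in range(entries):
            b, g, r, _ = struct.unpack_from("<4B", data, 14 + hdr_size + 4 * n)
            palette.append((r, g, b))
    return {"width": width, "height": height, "bpp": bpp,
            "compression": compression, "data_offset": data_offset,
            "palette": palette}
```

Given the raw bytes of a file, the returned data_offset tells a reader where to seek for the bitmap data, and the palette list maps each pixel value to an RGB triple.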
The format of the bitmap data in a .BMP file depends upon the number of bits per pixel used to encode color data. For a 256-color image, each pixel is represented by 1 byte (8 bits) in the bitmap data portion of the file. The pixel's value isn't an RGB color value; it's an index into the file's color table. So if the first RGB color value in a .BMP file's color table is R/G/B=255/0/0, then a pixel value of 0 in the bitmap translates to bright red. Pixel values are stored in left-to-right order, beginning (usually) with the bottom line in the image. Thus, in a 256-color .BMP file, the first byte of bitmap data is the color index of the pixel in the image's lower-left corner, the second byte is the color index of the next pixel to the right, and so on. If the number of bytes per line of bitmap data isn't a multiple of 4, each line is padded with up to 3 extra bytes to align bitmap data on 32-bit boundaries.
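To make the addressing concrete, here is a small sketch that looks up one pixel in uncompressed, bottom-up 8-bpp bitmap data. It relies on the BMP rule that every stored line is padded to a multiple of 4 bytes. The helper names row_stride and pixel_index are illustrative assumptions, and y=0 is taken to mean the top line of the image.

```python
def row_stride(width, bpp):
    """Bytes occupied by one stored line, rounded up to a multiple of 4."""
    return ((width * bpp + 31) // 32) * 4

def pixel_index(pixels, width, height, x, y):
    """Return the color-table index of pixel (x, y) in 8-bpp bitmap data.

    Assumes uncompressed data stored bottom line first; y=0 is the top line.
    """
    stride = row_stride(width, 8)
    stored_row = height - 1 - y      # the bottom line is stored first
    return pixels[stored_row * stride + x]
```

For a 3-pixel-wide, 8-bpp image, row_stride returns 4, so each line carries one padding byte that a reader must skip.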
Not all .BMP files are structured like the one shown in the diagram on the next page. For example, 16- and 24-bpp .BMP files don't have color tables; instead, pixel values in the bitmap specify RGB color values directly. The internal storage format of the file's individual sections can vary, too. For instance, bitmap data in some 16- and 256-color .BMP files is compressed using an RLE algorithm that replaces runs of identical pixels in the image with tokens specifying the number of pixels in the run and their color. And Windows still supports OS/2-style .BMP files, which use different bitmap info header and color table formats.
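As an illustration of the run-length idea, the sketch below decodes the 8-bit variant (the one used for 256-color images) as I understand Microsoft's documented scheme: each token is a count byte followed by a color byte, and a count of 0 introduces an escape code. The function name decode_rle8 is an assumption, and a production decoder would need stricter bounds checking.

```python
def decode_rle8(data, width, height):
    """Decode 8-bit RLE bitmap data into rows of color indexes.

    rows[0] is the first stored line (normally the bottom of the image).
    """
    rows = [bytearray(width) for _ in range(height)]
    x = y = i = 0
    while i + 1 < len(data):
        count, value = data[i], data[i + 1]
        i += 2
        if count > 0:                       # run: `count` pixels of `value`
            for _ in range(count):
                if x < width and y < height:
                    rows[y][x] = value
                x += 1
        elif value == 0:                    # escape 0: end of line
            x, y = 0, y + 1
        elif value == 1:                    # escape 1: end of bitmap
            break
        elif value == 2:                    # escape 2: skip right and up
            x += data[i]
            y += data[i + 1]
            i += 2
        else:                               # literal run of `value` pixels
            for b in data[i:i + value]:
                if x < width and y < height:
                    rows[y][x] = b
                x += 1
            i += value + (value & 1)        # literals are padded to even length
    return rows
```

The token pair 3, 7 expands to three pixels of color 7, which is where the savings come from in images with large flat areas.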
PCX was the first graphics file format to become a standard for bitmap file storage on IBM PCs. It was the format used by ZSoft's PC Paintbrush program, which was licensed to Microsoft in the early 1980s, distributed with Microsoft products, and later converted into Windows Paintbrush and distributed with Windows. Although use of this popular format is diminishing, PCX-format files, which are easily identifiable by their .PCX extension, are still commonplace today.
.PCX files are divided into three sections, in the following order: a PCX header, the bitmap data, and an optional color table. The 128-byte PCX header contains several fields, including the image size and the number of bits per pixel used to encode color information. Bitmap data is compressed using a simple RLE compression method, and the optional color table at the end of the file contains 256 RGB color values specifying image colors. The PCX format was originally developed for CGA and EGA display adapters and was later modified to support VGA and true-color adapters. Color data in modern-day PCX images can be encoded with 1, 4, 8, or 24 bits per pixel.
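The PCX run-length scheme is simple enough to sketch in a dozen lines. In the usual description of the format, a byte whose top two bits are set is a run marker: its low six bits give a repeat count and the next byte gives the value to repeat. Any other byte stands for itself. The function name decode_pcx_rle and its expected parameter are assumptions made for this example.

```python
def decode_pcx_rle(data, expected):
    """Expand PCX run-length data until `expected` bytes have been produced."""
    out = bytearray()
    i = 0
    while len(out) < expected and i < len(data):
        b = data[i]
        i += 1
        if (b & 0xC0) == 0xC0:          # top two bits set: run marker
            count = b & 0x3F            # low six bits hold the repeat count
            out.extend(data[i:i + 1] * count)
            i += 1
        else:                           # any other byte is a literal value
            out.append(b)
    return bytes(out[:expected])
```

One consequence of this design is that a literal value of 0xC0 or higher can't be stored directly; the encoder has to write it as a run of length 1, which costs 2 bytes.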
While PCX is among the simplest of all the bitmap file formats to decode, TIFF (Tagged Image File Format) is one of the toughest. TIFF files have an extension of .TIF. Each begins with an 8-byte image file header (IFH), whose most important member is a pointer to a data structure known as an image file directory (IFD). An IFD is a table that identifies one or more variable-length chunks of data called tags, which hold information about the image. The TIFF file format specification defines more than 70 different types of tags. One type, for example, stores the image's width in pixels; another stores its height. Another tag type stores a color table (if required), and yet another holds the bitmap data itself. An image encoded in a TIFF file is wholly defined by its tags, and the file format is highly extensible because additional features can be added simply by defining additional tag types.
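Reading the header and the first directory is the easy part of TIFF, and a rough sketch shows how the pieces connect. The code assumes the layout given in the TIFF 6.0 specification: a byte-order mark ("II" or "MM"), the number 42, an offset to the first IFD, and then 12-byte directory entries. The function name read_first_ifd is made up, and only values small enough to sit inside the entry itself are decoded.

```python
import struct

def read_first_ifd(data):
    """Return {tag number: value} for the first IFD of a TIFF file.

    Only single SHORT (type 3) values are decoded specially; every other
    entry yields its raw 4-byte field, which may be a value or an offset.
    """
    order = data[:2]
    if order == b"II":
        e = "<"                          # little-endian ("Intel") byte order
    elif order == b"MM":
        e = ">"                          # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack_from(e + "HI", data, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    (count,) = struct.unpack_from(e + "H", data, ifd_offset)
    tags = {}
    for n in range(count):
        pos = ifd_offset + 2 + 12 * n    # each directory entry is 12 bytes
        tag, ftype, n_values = struct.unpack_from(e + "HHI", data, pos)
        if ftype == 3 and n_values == 1:
            (value,) = struct.unpack_from(e + "H", data, pos + 8)
        else:
            (value,) = struct.unpack_from(e + "I", data, pos + 8)
        tags[tag] = value
    return tags
```

In the TIFF specification, tag 256 holds the image width and tag 257 the image height, so tags[256] and tags[257] give a reader the image dimensions.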
So what is it that makes TIFF so complex? For one thing, writing a piece of software that understands all the different tag types is no easy undertaking. Most TIFF readers implement only a subset of the tags; that's the main reason a TIFF file created by one application sometimes can't be read by another. Programs that create TIFF files can also define private tag types that are meaningful only to them. TIFF readers can skip tags they don't understand, but there's always a possibility that doing so will affect the appearance of the image.
Another complication is that a TIFF file can contain multiple images, each accompanied by its own IFD and set of tags. The bitmap data in a TIFF file may be compressed using any of a number of methods, so a robust TIFF reader has to include an RLE decompressor, an LZW (Lempel-Ziv-Welch) decompressor, and several others. To make matters worse, developers of LZW decoders must enter into a license agreement with--and often pay royalties to--Unisys Corp. in order to use the LZW algorithm. As a result, even some of the best TIFF readers throw up their hands when they encounter an LZW-compressed image.
Despite its complexities, the TIFF file format is one of the best for transferring bitmaps across platforms, because it is flexible enough to allow virtually any image to be encoded in binary form without losing any of its attributes, visual or otherwise.
When most graphics gurus think of LZW, they also think of GIF (Graphics Interchange Format; pronounced "jiff"), the popular interplatform bitmap file format created by CompuServe. GIF files normally have the filename extension .GIF, and thousands of them are available on CompuServe.
The organization of a .GIF file depends on which version of the GIF specification the file conforms to. Currently there are two versions, GIF87a and GIF89a. The former is the simpler of the two. Regardless of the version number, a .GIF file begins with a 13-byte header that contains a signature identifying the file as a .GIF file, the GIF version number, and other information. If the file contains just one image, the header is usually followed by a global color table defining the colors in the image. If the file contains multiple images (GIF, like TIFF, allows two or more images to be encoded in a single file), then the global color table is often omitted and each image may be accompanied by a local color table instead.
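The 13-byte header is easy to pick apart. The sketch below assumes the layout in the GIF specifications: a 3-byte signature, a 3-byte version, and then 7 bytes giving the screen width and height, a packed flag byte, a background color index, and a pixel aspect ratio. The function name read_gif_header and the dictionary keys are illustrative only.

```python
import struct

def read_gif_header(data):
    """Parse the 13 bytes at the start of a .GIF file."""
    signature, version = data[:3], data[3:6]
    if signature != b"GIF":
        raise ValueError("not a .GIF file")
    width, height, packed, bg_index, aspect = struct.unpack_from(
        "<HHBBB", data, 6)
    has_global_table = bool(packed & 0x80)      # top bit: table present
    # The low three bits encode the table size as 2 ** (n + 1) entries.
    table_entries = 2 ** ((packed & 0x07) + 1) if has_global_table else 0
    return {"version": version.decode("ascii"), "width": width,
            "height": height, "has_global_table": has_global_table,
            "table_entries": table_entries, "background": bg_index}
```

When a global color table is present it follows the header directly, at 3 bytes (red, green, blue) per entry.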
In a GIF87a file, the header and global color table are followed by the image, which may be the first of several images strung end to end. Each image consists of a 10-byte image descriptor followed by an optional local color table and then the bitmap bits. For space efficiency, bitmap data is compressed using the LZW compression algorithm.
GIF89a files are structured similarly, but they can also include optional extension blocks containing additional information about each image. The GIF89a specification defines four types of extension blocks: graphics control extension blocks, which describe how an image should be displayed (for example, whether it should overlay a previously displayed image like a transparency or simply replace it); plain-text extension blocks, which contain text to be displayed with the image; comment extension blocks, which store comments in the form of ASCII text; and application extension blocks, which store data private to the application that created the file. Extension blocks can appear virtually anywhere in the file after the global color table.
GIF's primary strengths are its widespread popularity and its compactness. But it has two rather severe weaknesses. One is that images stored in .GIF files are limited to a maximum of 256 colors. The other, perhaps more significant, is that software developers who use GIF in their applications must strike a license agreement with CompuServe and pay royalties on every copy sold--a policy CompuServe adopted after Unisys announced it would start enforcing its patent rights and require users of LZW compression to pay royalties. The resulting legal morass is a disincentive for programmers to implement support for .GIF files in their graphics applications.
The PNG (Portable Network Graphics; pronounced "ping") file format was developed as a replacement for GIF in order to circumvent the legal issues surrounding the use of .GIF files. PNG inherits many features from GIF and also supports true-color images. More important, it compresses bitmap data using a variation of the highly regarded LZ77 compression algorithm, a precursor of LZW that anyone is free to use. Because space is growing short, I won't take the time to discuss PNG internals. If you'd like to know more about PNG, check out the references at the end of this column.
The JPEG (Joint Photographic Experts Group; pronounced "jay-peg") file format takes its name from the committee that standardized its compression method; the file layout most programs use to store JPEG-compressed data, known as JFIF, was developed by C-Cube Microsystems. JPEG provides an efficient method of storing deep-pixel images, such as scanned photographs, which are characterized by numerous subtle (and sometimes not so subtle) variations in color. The greatest difference between JPEG and the other file formats discussed here is that JPEG uses a lossy, not a lossless, compression algorithm. Lossless compression preserves image data, so that a decompressed image matches the original image exactly. Lossy compression sacrifices some image data in order to achieve greater compression ratios. A decompressed JPEG image rarely matches the original exactly, but very often the differences are so minor that they are barely detectable, if at all.
JPEG image compression is a complex process that frequently requires a hardware assist to achieve acceptable performance. First, the image is tiled into blocks that measure 8 pixels to a side. Each block is then compressed separately, in three stages. The first stage involves using a discrete cosine transform (DCT) formula to convert the 8-by-8 block of pixel data into an 8-by-8 matrix of amplitude values representing different frequencies (or rates of color change) in the image. In stage two, the values in the amplitude matrix are divided by the values in a quantization matrix that's biased to filter out amplitudes that are less important to the overall appearance of the image. In the third and final stage, the quantized amplitude matrix is compressed using a lossless compression algorithm.
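The first two stages can be sketched directly from the textbook formulas. This is an unoptimized illustration, not how a real codec does it: production encoders use fast DCT algorithms and subtract 128 from each sample before transforming. The function names dct_8x8 and quantize are assumptions, and blocks are plain lists of 8 rows of 8 numbers.

```python
import math

def dct_8x8(block):
    """Stage one: 2-D discrete cosine transform of an 8-by-8 block."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

def quantize(coeffs, qtable):
    """Stage two: divide each amplitude by its quantization value and round."""
    return [[round(coeffs[u][v] / qtable[u][v]) for v in range(8)]
            for u in range(8)]
```

For a block in which every sample is the same, only the top-left amplitude (the average, or "DC" term) comes out nonzero, which is why smooth regions of an image compress so well in stage three.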
Because the quantized matrix lacks much of the high-frequency information of its predecessor, it frequently compresses to half its original size or less. Lossless compression methods are often unable to compress real-life photographic images at all, so a 50-percent reduction is quite good. On the other hand, lossless compression methods can reduce some images by 90 percent. Such images are poor candidates for JPEG compression.
The lossy part of the JPEG compression is stage two. The higher the values in the quantization matrix, the greater the amount of information discarded from the image, and the more tightly the image is compressed. The trade-off is that higher quantization values result in poorer image quality. When a JPEG image is generated, its creator chooses a quality factor, whose value drives the values in the quantization matrix. The optimal quality factor--the one that exhibits the best balance between compression ratio and image quality--is different for every image and is usually found only through trial and error.
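How a quality factor turns into a quantization matrix is up to the encoder. As one concrete example, the sketch below follows the convention used by the Independent JPEG Group's widely used library, which scales the sample luminance table published with the JPEG standard. Neither the table nor the scaling rule comes from this column, and the name scaled_qtable is made up for illustration.

```python
# Sample luminance quantization table published with the JPEG standard.
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scaled_qtable(quality):
    """Scale the base table for a quality factor from 1 (worst) to 100 (best)."""
    quality = max(1, min(100, quality))
    # Below 50 the scale grows quickly; above 50 it shrinks toward zero.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Every entry is clamped to the range 1..255.
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in BASE_LUMA]
```

Under this rule a quality factor of 50 reproduces the base table unchanged, while 25 roughly doubles every entry and so discards correspondingly more detail.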
Several books are available on computer graphics file formats, but one that I've found particularly helpful (and well written) is James D. Murray and William van Ryper's Encyclopedia of Graphics File Formats, Second Edition (O'Reilly & Associates, 1996). It documents the history and internals of more than 100 file formats, ranging from Adobe Illustrator to ZBR. Curiously, it's silent about one of the most common formats of all--BMP--but there's plenty of information available on .BMP files elsewhere. "Further Information" sections accompanying the descriptions of each format tell you where you can get specifications and other information and also include URLs for hundreds of helpful Web sites.
Another book that I've found to be invaluable is Steve Rimmer's Windows Bitmapped Graphics (Windcrest/McGraw-Hill, 1993). In addition to documenting the formats of .GIF, .PCX, .BMP, .WPG, TIFF, .TGA, and MacPaint files in great detail and also describing many of the quirks and variations commonly found in files of these types, Rimmer provides C source code for Windows applications that read and display images stored in each of the different formats. His book deserves a place on the shelf of anyone interested in popular computer graphics file formats.
Jeff Prosise is a contributing editor of PC Magazine.