H.264: What More It Can Do and How It Will Be Implemented

We all know H.264 compresses video more than MPEG-4 does. However, it can do much more than that, and as technology improves we are sure to see more of its capabilities put to use. If video compression keeps evolving, what will be the best carrier for video codecs: ASIC, DSP or CPU (Intel Atom)?

Not Just High Compression
The first thing that comes to mind about H.264 is its high compression rate. For a quick comparison, the compression rates of M-JPEG, MPEG-4 and H.264 stand at roughly 1:10:20. Put differently, at the same video quality MPEG-4 needs about 10 percent, and H.264 about 5 percent, of the bandwidth M-JPEG needs. The usual objective measure of video quality is the peak signal-to-noise ratio (PSNR) of the pixel values.
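
As a rough illustration of the PSNR measure mentioned above, here is a minimal sketch using the standard definition (10 times the base-10 logarithm of the squared peak value over the mean squared error). The function name, the 8-bit peak value of 255 and the tiny sample "frames" are assumptions made for illustration; they are not taken from the tests behind the figures.

```python
import math


def psnr(reference, decoded, max_value=255):
    """Return the PSNR in dB between two equal-length sequences of pixel values.

    max_value=255 assumes 8-bit samples (an assumption for this sketch).
    """
    if len(reference) != len(decoded):
        raise ValueError("frames must have the same number of pixels")
    # Mean squared error between the original and the decoded pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel "frames", for illustration only.
original = [52, 55, 61, 66]
decoded = [54, 55, 60, 65]
print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

A higher PSNR means the decoded picture is closer to the original, which is why the quality axis of a rate-quality plot is usually expressed in dB.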

However, there is more to H.264 than high compression rates. Figure 2 shows four video compression formats: JPEG-2000, MPEG-2, MPEG-4 (part 2) and H.264. The vertical axis stands for video quality measured by PSNR and the horizontal axis stands for compression rate. To give a better sense of scale, each compression rate is converted to the number of days a 500GB hard disk can store a single real-time QVGA (320 by 240) video stream under nonstop recording. Note that the algorithms an encoder adopts will certainly affect the results, so the encoder providers are marked. The JPEG-2000 and MPEG-2 encoders are open source, while the MPEG-4 and H.264 encoders are from HuperLab.
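
The conversion from compression rate to recording days can be sketched as follows. This is only an illustration of the arithmetic: the 30 fps frame rate, the 12 bits per pixel of raw video (YUV 4:2:0), the decimal gigabyte and the sample compression ratios are all assumptions, not values read from Figure 2.

```python
SECONDS_PER_DAY = 86_400


def recording_days(disk_gb, width, height, fps, compression_ratio,
                   bits_per_pixel=12):
    """Days of nonstop recording that fit on a disk of disk_gb gigabytes.

    Assumes 1 GB = 10**9 bytes and 12 bits per raw pixel (YUV 4:2:0);
    both are assumptions for this sketch.
    """
    raw_bits_per_second = width * height * bits_per_pixel * fps
    compressed_bytes_per_second = raw_bits_per_second / compression_ratio / 8
    disk_bytes = disk_gb * 10 ** 9
    return disk_bytes / compressed_bytes_per_second / SECONDS_PER_DAY


# Hypothetical compression ratios, chosen only to show the scaling.
for ratio in (10, 100, 200):
    days = recording_days(500, 320, 240, 30, ratio)
    print(f"compression {ratio}:1 -> {days:.1f} days on a 500GB disk")
```

Because the number of days is directly proportional to the compression ratio, doubling the compression doubles the recording time, which is what makes days a convenient unit for the horizontal axis.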

All encoders are tested with the same video clip of a street scene from a standard analog camera (Figure 3).

In Figure 2, each encoder is represented by a curve with multiple sampling points. These sampling points are obtained by feeding each encoder different parameter sets. Note that the end points of each curve are not exactly the extremes of that curve. It is possible to extend the curve a little further if the proper parameters are given.

We find all the curves roughly take the form of a reciprocal curve. The closer a curve is to the upper right corner, the better it performs. Moreover, we find each encoder has its own coverage, or working range, in terms of video quality and compression rate. JPEG-2000 works for low compression rates only, which is inherent to all-intra encoding. MPEG-2 is better, but its compression rate is still low. MPEG-4 (part 2) was targeted at higher compression rates than MPEG-2, and it works as designed, but it cannot reach a quality range as high as MPEG-2.

H.264 performs the best in both aspects. Among the four video encoders, H.264 is the only one that covers the full range, from low compression rate to high compression rate and from low video quality to high video quality. That is why applications with very different requirements can all use it, such as video streaming (high compression rate) and Blu-ray discs (high video quality).

From the figure, we can get a rough idea of the ratio between the compression rates of MPEG-4 and H.264. Since the point “HM Good (MPEG-4)” and the point “Huper H.264 Compact” have similar PSNR values, we can compare their compression rates directly. The ratio is roughly 7.5:15, or 1:2, just as claimed at the beginning of this article.

What Else H.264 Can Do
We have seen that H.264 is not only good at high compression, but also good at low compression. This is certainly the biggest selling point of H.264. In addition to this, however, H.264 has more to offer. Here are some of the features concerning video surveillance that may be implemented in the future.
    • Scalable Video Coding (SVC)
    SVC is most desired by IP camera vendors. To support multiple viewers at different bandwidths, the only solution an IP camera could provide before SVC was to deliver multiple streams at different bitrates. With SVC, only one stream is required. The stream contains multiple layers in order to achieve scalability, and is thus a little larger than the original stream, but still far smaller than multiple streams put together. Not only is computation saved, but the bandwidth-adaptive ability of SVC is also better. SVC is an extension to H.264 and was only finalized in 2007, so it is not yet prevalent.

    • Intra-Frame-Only Mode
    Some people only believe in all-intra encoding like M-JPEG. If that is the case, setting the group of pictures (GOP) size to one makes the H.264/MPEG-4 encoder produce an intra-frame-only bitstream. IP cameras then no longer have to provide dual codec support. According to research, the compression performance of H.264 intra-frame coding is better than that of JPEG-2000. However, if people find the format is H.264, they may not believe it is truly intra-frame-only.

    • Lossless Compression
    For demanding users, lossless compression is supported by H.264. To be sure, it comes at a price: the compression rate is very low. A good use of lossless compression is to apply it in critical regions only. However, the feature is only supported in the H.264 profile named “High 4:4:4 Predictive Profile” and is not yet implemented by any hardware codec on the market.
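
To make the SVC item above more concrete, here is a minimal sketch of how one layered stream could serve viewers at different bandwidths: each viewer receives the base layer plus as many enhancement layers as its link can carry. The layer names, the bitrates and the `layers_for_viewer` helper are hypothetical and chosen for illustration; this is not the SVC bitstream format or any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    kbps: int  # bitrate this layer adds on top of the layers below it


def layers_for_viewer(layers, viewer_kbps):
    """Return the names of the layers that fit in the viewer's bandwidth.

    Layers are ordered base first. A layer is only usable if every layer
    below it is also delivered, so selection stops at the first misfit.
    """
    selected, total = [], 0
    for layer in layers:
        if total + layer.kbps > viewer_kbps:
            break
        selected.append(layer.name)
        total += layer.kbps
    return selected


# Hypothetical layering of a single camera stream.
LAYERS = [
    Layer("base", 256),
    Layer("enhancement-1", 512),
    Layer("enhancement-2", 1024),
]

for bandwidth in (300, 800, 2000):
    print(bandwidth, "kbps ->", layers_for_viewer(LAYERS, bandwidth))
```

The camera encodes once, and the adaptation to each viewer amounts to dropping layers rather than running a separate encoder per bitrate.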

CPU as the Codec Carrier
We have seen what H.264 can do beyond high compression rates. The next thing we want to know is which hardware is best suited to carry it. The possible options are ASIC, DSP and CPU. Each of them represents a different position in the spectrum of cost versus flexibility.

In the past, there has been a clear gap between ASIC/DSP and CPU. That is why DVRs are classified into stand-alone and PC-based, and capture cards are classified into hardware compression and software compression. However, as Intel processors become more powerful and Intel Atom processors hit the market, things may change.

Let's look at high-end DVRs first. The question is: if processor performance keeps improving, will we still need hardware compression capture cards in the future? Consider the MPEG-1 and MPEG-2 decoder cards used in the past. At present, an Intel quad-core CPU, such as the Q6600, can achieve 480 fps D1 MPEG-4 recording with live display. The heavier H.264 workload can be managed by the Intel Q9550 quad-core CPU, which achieves 480 fps D1 H.264 recording with live display. As the Intel Core i7 hits the market, the era of software compression seems at hand.

As for low-end DVRs, Intel launched its Atom processors in 2008 and gained success in netbooks, showing its ambitions for the embedded market. With the next generation Atom-core SoC in development, we can expect more. For Intel, its next battle with ARM on mobile Internet devices (MIDs) may not be as easy, while things may differ on video surveillance devices.

Video surveillance devices are much more complex than MIDs. From H.264 baseline and main profile to SVC and even video analytics, devices will get even more complex. As long as the complexity keeps growing in unpredictable ways, PCs will be happy to stand in the limelight. With the help of an embedded OS, the Intel Atom has the potential to be a good compromise between cost and flexibility.

Performance of Software Codec
Figure 5 shows the CPU requirements for MPEG-4 recording with real-time display under different combinations of recording rate and video resolution.

Figure 6 shows the CPU requirements for H.264. We can see that Intel processors cover the full range from low-end DVRs to high-end DVRs. Another thing to note is the performance of Atom. Its performance for the price is especially strong, showing Intel's ambition for the embedded market.

In conclusion, let us look at a typical scenario for launching a product that uses a software codec. Once the algorithm and the assembly code for the codec are both fine-tuned, all manufacturers have to do is wait for a more powerful CPU to drop in price.

How long is the wait, then? If Moore's law holds true, the wait for the transition from MPEG-4 to H.264 at current CPU prices is 18 months, since H.264 is twice as complex as MPEG-4 and Moore's law is commonly read as a doubling every 18 months. No parallel optimization is needed for the new processor, be it quad-core or octo-core, because multichannel DVRs are intrinsically a good match for multicore processors. Should SVC, intra-frame-only mode or even lossless compression be required, all the necessary code is already waiting for use.
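
The waiting-time estimate can be written out as a small calculation. It is a sketch under stated assumptions: CPU performance at a given price doubles every 18 months, and a codec that is N times as complex needs N times the CPU performance. The function name and the second sample ratio are hypothetical.

```python
import math


def wait_months(complexity_ratio, doubling_months=18):
    """Months until CPUs at today's price can run a codec that is
    complexity_ratio times as demanding as the current one.

    Assumes performance per price doubles every doubling_months months
    (the common 18-month reading of Moore's law).
    """
    return doubling_months * math.log2(complexity_ratio)


# H.264 taken as twice as complex as MPEG-4, as stated in the article.
print(wait_months(2))  # one doubling period
# A hypothetical codec four times as complex needs two doubling periods.
print(wait_months(4))
```

The logarithm is what keeps the wait manageable: each further doubling of codec complexity adds one more doubling period to the wait rather than multiplying it.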
