Video is everywhere. From the latest HD movies on Blu-ray Discs to streaming video on mobile handsets, it seems that the moving image has found its way into all aspects of modern life. Video comprises individual frames, each displayed for an instant, to simulate smooth motion. This requires vast quantities of data. To cost-effectively store this video and then move it to a user, it must first be compressed.
Compression is the job of a codec, and the compromises and tradeoffs that must be made in that process define the application. Considerations include the bit rate (size) of the compressed video, the frame rate of the video, its quality, and how difficult it is for the client to decode. The optimum setting for each of these parameters can vary widely depending on the application, and dictates the choice of which codec to use.
Because codecs are not created equal, selecting one appropriate for an application is crucial. A codec created for a broadcast application may not provide the optimal solution for a surveillance application, and vice versa. Consider, for example, M-JPEG. This codec has been used in surveillance systems for many years, and while it is largely considered obsolete, it is still found in many mainstream installations. In M-JPEG, each frame of video is compressed independently with the same techniques used in still cameras. The result is a relatively poor compression ratio of roughly 8:1.
In many surveillance applications, however, a frame rate of just a few frames per second is acceptable and is used to compensate for the higher data rate required for each of the compressed frames. This compression ratio and frame rate would be absolutely unworkable in a broadcast application. For the surveillance community, however, benefits include the ease with which streams can be decoded. This makes the decoding of multiple streams for video walls a relatively straightforward process.
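To put these numbers in perspective, the following Python sketch estimates the bit rate of an M-JPEG stream at the roughly 8:1 ratio quoted above. The 720x480 resolution, the 12 bits per pixel, and the two frame rates are illustrative assumptions, not figures from any particular installation.

```python
def raw_bits_per_frame(width, height, bits_per_pixel=12):
    """Size of one uncompressed frame in bits (12 bpp is an assumed sampling depth)."""
    return width * height * bits_per_pixel


def stream_bit_rate(width, height, fps, compression_ratio, bits_per_pixel=12):
    """Compressed bit rate in bits per second for a codec that compresses each frame alone."""
    return raw_bits_per_frame(width, height, bits_per_pixel) * fps / compression_ratio


# Assumed scenarios: the same 8:1 codec at a low and a high frame rate.
low_frame_rate = stream_bit_rate(720, 480, fps=5, compression_ratio=8)
high_frame_rate = stream_bit_rate(720, 480, fps=30, compression_ratio=8)

print(f"5 frames/s:  {low_frame_rate / 1e6:.2f} Mbit/s")
print(f"30 frames/s: {high_frame_rate / 1e6:.2f} Mbit/s")
```

Under these assumptions, dropping from 30 to 5 frames per second cuts the bit rate by a factor of six, which is the compensation the text describes.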
Different Compression Techniques
In surveillance applications where a higher frame rate is needed, a more complex inter-frame compression technique must be used. Here, a single index frame is compressed on its own, using the same still-image technique as M-JPEG. Subsequent frames are compared to the index frame, and only the differences between them are compressed. The compression efficiency of these schemes therefore depends on the amount of movement in the observed scene (the degree of difference between the index and subsequent frames). Compression ratios of 50:1 are easily achieved using these schemes, but at the expense of complexity.
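The idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any real codec is implemented: frames are flat lists of pixel values, the index frame is stored in full, and each later frame is stored only as the list of pixels that differ from the index frame. The function names and the test scenes are hypothetical.

```python
def encode_sequence(frames):
    """Keep the first frame whole; store later frames as differences from it."""
    index_frame = list(frames[0])
    deltas = []
    for frame in frames[1:]:
        # Record only (position, new value) for pixels that changed.
        changes = [(i, pixel)
                   for i, (pixel, ref) in enumerate(zip(frame, index_frame))
                   if pixel != ref]
        deltas.append(changes)
    return index_frame, deltas


def decode_frame(index_frame, deltas, n):
    """Rebuild frame n (0 is the index frame) by applying its stored differences."""
    frame = list(index_frame)
    if n > 0:
        for i, pixel in deltas[n - 1]:
            frame[i] = pixel
    return frame


# Two assumed 16-pixel scenes: one with no movement, one that changes every frame.
static_scene = [[10] * 16 for _ in range(4)]
busy_scene = [[(10 + k * i) % 256 for i in range(16)] for k in range(4)]

for name, scene in (("static", static_scene), ("busy", busy_scene)):
    index_frame, deltas = encode_sequence(scene)
    stored = sum(len(d) for d in deltas)
    print(f"{name}: {stored} changed pixels stored for {len(deltas)} frames")
```

The static scene costs almost nothing beyond the index frame, while the busy scene stores nearly every pixel again. Note also that every decoded frame needs the index frame, which is why seeking within such a stream is harder than with M-JPEG.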
Since frames of video are referenced back to index frames, searching back and forth through video compressed with this type of codec is difficult. Nevertheless, inter-frame techniques form the basis of most modern codecs, including MPEG-2, MPEG-4 and H.264 advanced video coding (AVC). Figure 1 shows the relative sizes of the bit streams produced by some of the codecs that are commonly used in surveillance installations.
With growing demand for higher frame rates and resolutions in the surveillance community, these inter-frame schemes have been almost universally adopted. Many new installations are using H.264 AVC and are achieving 100:1 compression ratios. Once again, however, the features and compromises that have been established for one set of applications are not necessarily optimal for another.
H.264 AVC, borrowed from the broadcast and video telephony industries, produces streams with bit rates that can be controlled to facilitate the transport of video over a broadcast network. Two common modes of bit rate control are constant bit rate and variable bit rate. Both effectively define a target bit rate for the encoded video but differ in the degree of variance allowed around that target.
This degree of control makes it easy to multiplex multiple channels of video into a broadcast stream, but makes little sense in a surveillance situation. Surveillance video is characterized by long periods of inactivity punctuated by short periods of intense interest. There is no reason to force the codec to inflate its bit rate artificially to reach a target value during quiet periods, nor to cap the bit rate and so prevent the codec from capturing an event of interest at the highest possible quality.
For surveillance applications, codecs are starting to employ a third method of rate control called constant quality (CQ). In CQ mode, many of the bit rate constraints are relaxed, and the codec instead attempts to maintain a constant level of quality in the video. This is accomplished by allowing bit rates to fall to extremely low levels during periods of inactivity. During periods of high activity, high bit rates are allowed in order to capture the scene correctly. This results in a much lower average bit rate over time and saves storage space in digital recording systems. Figure 2 shows the variation in bit rate when using different rate control algorithms, and how CQ dramatically reduces bit rates in surveillance applications.
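A toy Python model makes the difference between the three modes concrete. It is a sketch under stated assumptions, not a description of any encoder's actual rate control: the "needed" bit rates, the 1,000 kbit/s target, and the idea of modeling constant and variable bit rate as a target with zero and 50 percent allowed variance are all invented for illustration.

```python
def constrained_rate(needed, target, variance):
    """Clamp each second's bit rate to target +/- variance (0.0 models constant bit rate)."""
    low = target * (1 - variance)
    high = target * (1 + variance)
    return [min(max(n, low), high) for n in needed]


def constant_quality_rate(needed):
    """Constant quality: spend exactly what the scene needs, no target and no cap."""
    return list(needed)


def average(values):
    return sum(values) / len(values)


# Assumed one-minute scene: 58 quiet seconds, then a 2-second event (kbit/s needed).
needed = [50] * 58 + [4000] * 2
target = 1000

cbr = constrained_rate(needed, target, variance=0.0)
vbr = constrained_rate(needed, target, variance=0.5)
cq = constant_quality_rate(needed)

for name, rates in (("CBR", cbr), ("VBR", vbr), ("CQ", cq)):
    print(f"{name}: average {average(rates):7.1f} kbit/s, peak {max(rates):6.1f} kbit/s")
```

In this model the constant quality stream has by far the lowest average bit rate, yet it is the only one that reaches the 4,000 kbit/s the event needs. The two target-based modes pad the quiet seconds and cap the busy ones.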
As the surveillance industry embraces digital video, it is becoming increasingly innovative. Instead of simply leveraging existing broadcast industry techniques, the surveillance industry is finding new ways to better manage video for its unique needs. One example is the adoption of scalable video coding (SVC).
For surveillance operators, video is not a means of generating revenue, as it is for broadcasters. It is, instead, a tool that is used to mitigate some perceived risk. This tool then needs to be managed over time to reflect the severity of the risk until, ultimately, it can be deleted to make room for more recent video. To manage the video, the surveillance operator needs the flexibility to change its resolution or its frame rate, reducing either its storage size or the bandwidth needed to move it across networks to remote monitoring locations. This led to the development and adoption of the SVC extension to H.264.
H.264 SVC adopts a layered approach to video encoding. Initially, a low frame rate, low-resolution version of the video stream is encoded. Additional "enhancement layers" are then encoded to provide the additional information needed to increase the frame rate or the resolution of the encoded video. The encoded layers are packaged together into a single SVC stream for storage or for transport to the monitoring location. A monitoring device decodes the first layer of video and then decodes subsequent layers until the desired frame rate and resolution are achieved. Once video with the desired characteristics is achieved, decoding can be halted, freeing processing resources to decode other streams. This way, SVC streams can readily be decoded by monitoring devices of diverse processing capabilities, without the need to transcode the stream.
SVC streams can also be "thinned" after they have been encoded. By removing layers from the stream, the size of the recorded stream can be dramatically reduced to reclaim storage resources. A thinned stream can also be moved more easily across networks of constrained or congested bandwidth.
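The layered decoding and thinning described above can be modeled with a short Python sketch. It treats a stream as an ordered list of layers and is only an illustration of the concept: the `Layer` class, the function names, and every resolution, frame rate, and bit rate figure are hypothetical and do not reflect actual H.264 SVC bitstream syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    width: int    # resolution reached once this layer is decoded
    height: int
    fps: int      # frame rate reached once this layer is decoded
    kbps: int     # bit rate this layer adds to the stream


# Base layer first, then enhancement layers. All figures are made up.
STREAM = [
    Layer(352, 240, 5, 100),    # base layer
    Layer(352, 240, 15, 150),   # enhancement: higher frame rate
    Layer(704, 480, 15, 500),   # enhancement: higher resolution
    Layer(704, 480, 30, 750),   # enhancement: full frame rate
]


def layers_to_decode(stream, min_width, min_fps):
    """Decode from the base layer upward and stop once the target is met."""
    for count, layer in enumerate(stream, start=1):
        if layer.width >= min_width and layer.fps >= min_fps:
            return stream[:count]
    return list(stream)  # target not reachable: decode everything available


def thin(stream, keep):
    """Drop the upper enhancement layers, always retaining at least the base layer."""
    return stream[:max(1, keep)]


def total_kbps(stream):
    return sum(layer.kbps for layer in stream)


small_monitor = layers_to_decode(STREAM, min_width=352, min_fps=15)
print(f"small monitor decodes {len(small_monitor)} of {len(STREAM)} layers")
print(f"full stream: {total_kbps(STREAM)} kbit/s, "
      f"thinned to base layer: {total_kbps(thin(STREAM, 1))} kbit/s")
```

Neither operation re-encodes anything. The decoder simply stops early, and thinning simply discards the upper layers, which is why no transcoding step is needed.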
By employing H.264 SVC, surveillance operators can manage the profile of captured video. Full resolution and frame rate video can be captured and stored for relatively short periods of time. When it has been determined that little of interest has occurred in the observed scene during some prescribed period, an operator can thin the video to reduce its frame rate or its resolution and, in doing so, reduce its storage requirements.
Retaining some of the lower SVC layers, however, will ensure that a valid video record is retained for long-term reference until all perceived threats have passed. Figure 3 shows how the size and storage costs of an encoded stream can be reduced by thinning the recorded stream.
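The storage effect of such a policy can be estimated with simple arithmetic, sketched here in Python. The bit rates (1,500 kbit/s for the full stream, 100 kbit/s for the retained lower layers), the 30-day retention period, and the 7-day full-quality window are assumptions chosen for illustration only.

```python
SECONDS_PER_DAY = 86_400


def gigabytes_per_day(kbps):
    """Storage consumed by one day of continuous recording at the given bit rate."""
    return kbps * 1000 * SECONDS_PER_DAY / 8 / 1e9


def retention_storage(full_kbps, thinned_kbps, total_days, full_days):
    """Storage for one camera: recent days at full quality, older days thinned."""
    thinned_days = total_days - full_days
    return (full_days * gigabytes_per_day(full_kbps)
            + thinned_days * gigabytes_per_day(thinned_kbps))


# Assumed policy: keep 30 days in total, the most recent 7 at full quality.
unthinned = retention_storage(1500, 1500, total_days=30, full_days=30)
thinned = retention_storage(1500, 100, total_days=30, full_days=7)

print(f"30 days at full quality: {unthinned:.1f} GB")
print(f"7 days full + 23 days thinned: {thinned:.1f} GB")
```

With these assumed figures, thinning after a week cuts the per-camera storage to less than a third while still retaining a record of all 30 days.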
As the surveillance community moves to embrace digital video, leveraging core technologies from industries like broadcast can provide enormous benefits. These core technologies, however, provide only half the story. Innovations specific to surveillance applications can turn a mediocre implementation into a game-changing experience for surveillance operators.
When evaluating codecs, operators must consider how the resulting stream can be configured in terms of bit rate and quality to maximize its performance. They must also consider how the stream will be monitored and stored, and how it will be managed over time. Techniques such as CQ rate control and SVC promise to provide the surveillance community with the control needed to maximize the benefits of the migration to digital video.