In the rapidly expanding video surveillance space, customers are demanding higher resolution, more channels and computationally hungry analytics, all at lower cost and power. In this article we explore how Hikvision, a provider of digital video recorders (DVR), digital video servers (DVS) and surveillance IP modules, has kept pace with these demands with its next generation products.
Choice of Hardware
In previous generation products, Hikvision used a TMS320C6000™ DSP from Texas Instruments (TI) paired with an ARM processor. This combination served those products well, but more performance was needed to meet customer demand for more channels and improved image quality.
Many factors were evaluated when choosing a platform for next generation products, such as performance, power consumption, cost and ease of development. ASIC-based solutions offered high performance with the lowest power consumption and per unit cost. However, ASICs presented several problems. One is their high development cost in time and money: recouping the design costs requires prohibitively high volumes. Another is their lack of flexibility. With their long design cycles, ASICs cannot respond effectively to shifting customer needs.
While FPGAs and massively parallel processors offer the needed programmability and additional video channels, they have a drawback of their own: they can be difficult to program. What was needed was an easily programmable solution that offered more performance while maintaining low power consumption, a straightforward programming model and low cost.
Choosing another TI processor was attractive for several reasons, the most obvious being code reuse. Hikvision had invested considerable resources in implementing highly optimized codecs on the TMS320C62x™ DSPs. Since TI's next generation DSPs are code compatible with the C62x™ devices, existing code could be ported with minimal effort. Another motivation was tools. Hikvision's engineers were familiar with TI's Code Composer Studio, which offers a tested development environment and an extensive set of features. Most importantly, TI's roadmap offered a way forward with its DaVinci™ family of digital media processors.
TI's DaVinci portfolio includes fourteen digital media processors based on TI's TMS320C64x+™ DSP core. The family's flagship devices (see Figure 1) pair a C64x+™ DSP core with an ARM9 and a Video Processing Subsystem (VPSS), a host of hardware accelerators for common video processing tasks. Four of the DaVinci members feature this basic configuration. Nine of the DaVinci parts omit the ARM core, featuring just the C64x+ core and VPSS.
The DaVinci devices were a good fit because they strike a balance between ASICs and programmable DSPs. By offering fixed function (although configurable) accelerators in the VPSS for common video processing tasks such as encoding, decoding and display, the DaVinci devices are able to offer ASIC-like cost and power for the heavy lifting involved in surveillance applications. By offering a high performance programmable DSP, DaVinci allows designers the flexibility to quickly implement new features like content analytics.
Because of the wide range of options in the DaVinci family, Hikvision was able to use DaVinci devices across all product lines (DVR, DVS and IP modules). This enabled a build-once, deploy-many strategy that lowers engineering development and system cost. Additionally, by integrating the ARM core and hardware accelerators on a single die, the DaVinci parts offered increased performance at lower cost and power consumption. This higher integration was particularly important in enabling cost- and power-sensitive products such as the surveillance IP module.
Competition in the video surveillance space is fierce. To compete against ASIC solutions that have lower power and cost, Hikvision finds its advantage in software. By leveraging the flexibility of software, we can respond more effectively than ASIC designers to customers' specific, shifting needs.
Porting legacy software to DaVinci was not without challenges. The first challenge was to re-optimize our code to take advantage of new instructions in the C64x+ instruction set. For instance, the C64x+ can perform up to eight 16-bit MAC instructions per cycle. The C62x, in contrast, can only perform two 16-bit MACs per cycle. For MAC-intensive audio/video codecs, this represents a large performance boost. The C64x+ also offers new bit-manipulation instructions and expanded add and subtract capabilities. These new instructions, combined with the higher 600 MHz clock rate of the C64x+ (the highest C62x clock rate is 300 MHz), gave DaVinci-based solutions a large performance boost.
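To put rough numbers on that boost, the short Python sketch below multiplies the per-cycle MAC counts by the clock rates quoted above. The function name `peak_mmacs` is ours, introduced only for illustration, and the result is a theoretical peak: real codecs are also limited by memory bandwidth and instruction scheduling, so measured gains will generally be smaller.

```python
# Peak 16-bit MAC throughput = MACs per cycle x clock rate (MHz).
# The per-cycle and clock figures are the ones quoted in the article;
# peak_mmacs is a hypothetical helper, not part of any TI tool.

def peak_mmacs(macs_per_cycle: int, clock_mhz: int) -> int:
    """Return peak throughput in millions of 16-bit MACs per second."""
    return macs_per_cycle * clock_mhz

c62x = peak_mmacs(macs_per_cycle=2, clock_mhz=300)       # 600 MMACs
c64x_plus = peak_mmacs(macs_per_cycle=8, clock_mhz=600)  # 4800 MMACs
speedup = c64x_plus / c62x                               # 8.0

print(f"C62x:  {c62x} MMACs peak")
print(f"C64x+: {c64x_plus} MMACs peak")
print(f"Theoretical speedup: {speedup:.0f}x")
```

Under these assumptions, four times as many MACs per cycle and twice the clock rate combine into an eightfold increase in peak MAC throughput.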
Hikvision also decided to switch from a proprietary OS (VxWorks) to Open Source Linux. There were several motivations behind the switch. One, of course, is that Linux is royalty free. Another is growing support for Linux OSs on the DaVinci platform. DaVinci processors support Open Source Linux as well as MontaVista Linux.
The transition to Linux required a significant amount of effort. However, this transition was eased by the Linux support in TI's DSPLink. DSPLink is an interprocessor communication scheme that provides an abstraction layer between the ARM core and DSP. With DSPLink, code running on the ARM uses the same high-level APIs to communicate with the DSP regardless of the OS. These APIs make it easier to switch OSs on the DaVinci platform.
System Level Challenges
With the ability to process more channels of video comes another challenge: getting more video into and out of the chip. The TMS320DM6446 processor has a single dedicated input video port and single dedicated output video port. For additional channels on the chip, an FPGA was used to implement a PCI port for streaming multiple digital channels.
For products featuring a single channel input/output, such as the surveillance IP module shown in Figure 2, Hikvision chose the TMS320DM648 DSP. This device contains a C64x+ core and the VPSS but not an ARM core. The IP module uses the DM648 to compress one channel of video using Hikvision's patented H.264 algorithm. The video is compressed at 4CIF resolution (4CIF = 4 x CIF resolution, or 704 x 576 pixels). One channel of audio compression is also supported using the Ogg Vorbis codec. This IP module is small, consumes very little power and can be embedded into an analog camera to turn it into a networked camera.
For products supporting multiple channels, Hikvision chose the DM6446, which incorporates a C64x+ and VPSS as well as the ARM9. For example, the DS 6004HCI digital video server supports simultaneous encode/decode of up to four channels: one channel at 4CIF and three others at CIF. Alternatively, the DS-6004 can be configured to handle only two channels, both at 4CIF. Up to four channels of audio encode/decode are also supported using the Ogg Vorbis codec.
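The two channel configurations can be compared by the number of pixels each one has to process per frame period. The sketch below assumes CIF is 352 x 288 pixels, consistent with the 704 x 576 figure given for 4CIF, and assumes every channel runs at the same frame rate, which the article does not state. The names `RESOLUTIONS` and `pixels_per_frame_period` are ours, for illustration only.

```python
# Pixels processed per frame period for each channel configuration.
# Assumptions (not from the article): CIF = 352 x 288, and all channels
# share the same frame rate, so only per-frame pixel counts are compared.

RESOLUTIONS = {
    "CIF": (352, 288),
    "4CIF": (704, 576),
}

def pixels_per_frame_period(channels: dict) -> int:
    """Sum width * height over all channels; channels maps format -> count."""
    total = 0
    for fmt, count in channels.items():
        width, height = RESOLUTIONS[fmt]
        total += width * height * count
    return total

four_channel = pixels_per_frame_period({"4CIF": 1, "CIF": 3})
two_channel = pixels_per_frame_period({"4CIF": 2})
cif = pixels_per_frame_period({"CIF": 1})

print(four_channel, four_channel // cif)  # 709632 pixels, 7 CIF equivalents
print(two_channel, two_channel // cif)    # 811008 pixels, 8 CIF equivalents
```

On this rough measure the two modes are close in load, at seven and eight CIF-equivalent frames respectively, which may be why they are offered as alternatives on the same device.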
To continue expanding market share, Hikvision is designing for the future that customers envision. In the short term this means higher resolution and image quality, even at the expense of lower frame rates. In the long term, high resolution and high frame rates will be the norm.
Customers are also beginning to demand video analytics. Advanced compression algorithms such as H.264 are allowing more image data to be transmitted and stored. This has led to video systems with more channels, which is good but also poses a problem: how to monitor the increased number of channels. Human monitoring of video is expensive and error prone, so algorithms are needed that can monitor video for events of interest such as suspicious activities. These events can then trigger an alarm and be sent to a human observer for further scrutiny.
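As a minimal illustration of the "detect an event, then raise an alarm" idea, the sketch below uses plain frame differencing on small grayscale frames held as lists of lists. This is a textbook technique chosen for brevity, not Hikvision's algorithm, and the function names and both thresholds are assumptions made for the example.

```python
# Frame-differencing sketch: flag a frame pair when enough pixels change.
# Generic illustration only; names and thresholds are hypothetical.

def changed_fraction(prev, curr, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than pixel_threshold."""
    total = 0
    changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0

def should_alarm(prev, curr, pixel_threshold=25, area_threshold=0.1):
    """True when the changed area reaches area_threshold of the frame."""
    return changed_fraction(prev, curr, pixel_threshold) >= area_threshold

# An 8 x 8 static background, and the same scene with a bright 4 x 4 object.
background = [[10] * 8 for _ in range(8)]
intruder = [row[:] for row in background]
for y in range(2, 6):
    for x in range(2, 6):
        intruder[y][x] = 200

print(should_alarm(background, background))  # False: nothing changed
print(should_alarm(background, intruder))    # True: 16 of 64 pixels changed
```

A production system would add background modeling, noise filtering and object tracking on top of this, which is where the computational cost discussed in this article comes from.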
A large amount of research has been done on video analytics, or Video Content Analysis (VCA), but application of this research to high volume consumer products has been slow. This is mainly due to the high computational requirements of analytics algorithms. However, advanced technologies such as DaVinci are bringing video analytics within reach. With DaVinci's hardware accelerators doing the heavy lifting associated with encode/decode and display, the high performance C64x+ DSP is able to handle the intense processing requirements of video analytics. Currently, HikVision is working on implementing its first video analytics feature, facial recognition.
Meeting future demands will necessitate even higher performance. Continuing to explore TI's roadmap, Hikvision is currently evaluating the TMS320DM6467 DaVinci processor (see Figure 4). The device offers several advantages for video surveillance. For one, the DM6467 offers a significant performance boost via a transcoding coprocessor composed of tightly coupled encoder and decoder coprocessors. In a DVS requiring simultaneous encode/decode, this coprocessor could take over both tasks, freeing up even more of the C64x+ for computationally hungry analytics. For products requiring only encode, such as a surveillance IP module, the decoder could be used to accelerate encoding, as the decoding coprocessor is essentially a subset of the encoder accelerators.
The DM6467 device also adds a PCI port. Using this port would eliminate the need for an FPGA in our multi-channel DVS and DVR products. In addition, the DM6467 supports two 8-bit BT.656 inputs and outputs that can be reconfigured into a single 16-bit BT input and output. This makes the DM6467 a good fit for surveillance products with two high resolution inputs.