Intelligent Cameras and Encoders
Submitted by Intellivision 2007/7/12

This article discusses both the technical and commercial aspects of intelligent cameras, which are also referred to as smart cameras, smart encoders or smart servers. It defines the basics of intelligent cameras and their supporting architecture, then highlights the standard video pre-processing modules and the video analytics modules that are available.


First, let us define an intelligent or smart camera. A typical camera in today's industry is a standard image sensor that outputs video data in analog or digital form. An intelligent camera, however, is a device that supports video processing and intelligence inside the camera and outputs better video, events or metadata derived from the video, or both. In simpler terms, an average camera represents the "eyes," while an intelligent camera combines the "eyes" with the "brain."

Clearly, intelligent or smart cameras are the way of the future, as evidenced by the growing interest and demand from customers and the market. Customers are increasingly looking for unique features that provide enhanced quality of video, lower bandwidth, automated monitoring, events and metadata extracted from the video, as well as a less-expensive video and security system. More information concerning the benefits and return on investment (ROI) will be highlighted later in the article.


A key part of the intelligent camera architecture is the ability to add an extra video processor in the video pipeline. In a digital or IP camera environment, this is much easier, since analog video is converted to its digital form by an A/D converter, typically to a BT.656 video bus format. This video data is then taken into the video processor, where video enhancement and analytics algorithms are applied. Finally, the output is published over IP as better video or as metadata. We will discuss two types of intelligent cameras:

 Type 1: The video is modified and its quality is enhanced.
 Type 2: Events and metadata are extracted, but the video is not modified.

For each of these types, the video paths and block architectures are slightly different.
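The difference between the two types comes down to what leaves the camera. The following is a minimal sketch of that idea, not an actual camera design: frames are modeled as plain lists of grayscale pixel rows, and the names `type1_camera`, `type2_camera`, `brighten` and `detect_bright` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a frame is a list of rows of 0-255 grayscale pixel values.
Frame = List[List[int]]


@dataclass
class Event:
    kind: str         # hypothetical event label, e.g. "bright_scene"
    frame_index: int  # index of the frame that triggered the event


def type1_camera(frames: List[Frame],
                 enhance: Callable[[Frame], Frame]) -> List[Frame]:
    """Type 1: the video itself is modified; the output is enhanced frames."""
    return [enhance(frame) for frame in frames]


def type2_camera(frames: List[Frame],
                 analyze: Callable[[Frame], List[str]]
                 ) -> Tuple[List[Frame], List[Event]]:
    """Type 2: the video passes through unmodified; events travel beside it."""
    events: List[Event] = []
    for index, frame in enumerate(frames):
        events.extend(Event(kind, index) for kind in analyze(frame))
    return frames, events


def brighten(frame: Frame) -> Frame:
    """Toy enhancement: add 10 to every pixel, clamped to 255."""
    return [[min(255, pixel + 10) for pixel in row] for row in frame]


def detect_bright(frame: Frame) -> List[str]:
    """Toy analytic: report a frame whose mean brightness exceeds 128."""
    pixels = [pixel for row in frame for pixel in row]
    return ["bright_scene"] if sum(pixels) / len(pixels) > 128 else []


frames = [[[10, 20], [30, 40]], [[200, 210], [220, 230]]]
enhanced = type1_camera(frames, brighten)
passthrough, events = type2_camera(frames, detect_bright)
```

In this sketch the Type 1 path returns new pixel data, while the Type 2 path returns the original frames together with a separate list of events.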


There are many options to process the video. We have either encountered or provided applications on many of the following processors:







Video Pre-Processing or Enhancing Applications

Video pre-processing, or video quality enhancement, is one of the primary applications used in cameras. Video content takes up bandwidth of 10 MB per second for SD streams, and more for HD streams. Hence, there is tremendous pressure to lower the bandwidth while sustaining high video quality. Video quality varies a great deal with day/night conditions, sensor quality, low-light conditions, unstabilized shaking cameras, multiple transmissions, multiple wireless hops and multiple encoding-decoding steps. All of these factors lower video quality, and the degraded video takes up more bandwidth, is harder to compress and consumes more storage space. The following applications aim to make the camera more intelligent and reliable by lowering the video bandwidth without affecting quality, enhancing video quality for better viewing, eliminating noise and shaking while enhancing color and contrast, and consuming less storage space.
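To put the bandwidth figure in storage terms, the arithmetic can be sketched as follows. The 10 MB per second rate is the SD figure quoted above; the function name `daily_storage_gb`, the 25-percent reduction and the decimal convention of 1 GB = 1000 MB are assumptions made only for this illustration.

```python
SD_RATE_MB_PER_S = 10.0          # SD stream rate quoted in the article
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def daily_storage_gb(rate_mb_per_s: float, reduction: float = 0.0) -> float:
    """Storage for one day of continuous video from a single camera.

    `reduction` is the fraction of bandwidth saved by pre-processing
    (0.0 means no saving). Assumes 1 GB = 1000 MB.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return rate_mb_per_s * (1.0 - reduction) * SECONDS_PER_DAY / 1000.0


baseline_gb = daily_storage_gb(SD_RATE_MB_PER_S)                 # 864.0
reduced_gb = daily_storage_gb(SD_RATE_MB_PER_S, reduction=0.25)  # 648.0
```

Under these assumptions a single SD camera fills 864 GB per day, and a hypothetical 25-percent bandwidth saving brings that down to 648 GB, which is why even modest reductions matter across many cameras.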

The following are the top applications used in this category for intelligent cameras:

1. Video Stabilization
Almost all mobile and outdoor cameras shake and need stabilization. This application stabilizes the video stream to allow for clearer and easier viewing. The stabilized stream also consumes less storage and bandwidth; typically there is a 30- to 40-percent reduction in bandwidth utilization.

2. Noise Reduction
This application attempts to remove noise from the video. Such noise can originate from sensors with a low signal-to-noise ratio, low-light conditions, high-gain video amplifiers, long or multiple transmissions, strong electromagnetic fields around the wires, or multiple encoding-decoding steps. Cleaning out video noise significantly improves the video quality.

3. Video Contrast and Color Enhancement
This application aims to improve the video contrast and colors for better viewing and compression. Examples are shown in the figure above.

4. Deinterlacing or Decombing
This application cleans up the combing effects from moving objects in analog cameras. Analog cameras capture 60 interlaced fields per second rather than 30 progressive frames per second. Hence the boundary of a moving object will have jagged edges or comb-like serrations due to motion between the fields. This filter removes these effects and improves the video quality.

5. Backlight Compensation and Scene Adjustment
This application provides backlight compensation and corrects overly dark or bright spots in the video caused by automatic gain control (AGC), automatic white balance (AWB) or other factors.

6. Camera Status Checking
This application continuously checks that the camera is operating well: that it has not been obstructed, moved, defocused, blocked or reoriented. It is very important for security applications to ensure the camera is not made ineffective or sabotaged.
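As an illustration of camera status checking (item 6), the sketch below compares the current frame against a stored reference frame using three simple measures. It is a toy approach under stated assumptions, not the method used in any particular product: frames are lists of grayscale pixel rows, and the function names and threshold values (`dark_ratio`, `blur_ratio`, `move_diff`) are hypothetical.

```python
from typing import List

# Assumption: a frame is a list of rows of 0-255 grayscale pixel values.
Frame = List[List[int]]


def mean_brightness(frame: Frame) -> float:
    return sum(sum(row) for row in frame) / (len(frame) * len(frame[0]))


def sharpness(frame: Frame) -> float:
    """Mean absolute difference between horizontally adjacent pixels."""
    total = sum(abs(row[x + 1] - row[x])
                for row in frame for x in range(len(row) - 1))
    return total / (len(frame) * (len(frame[0]) - 1))


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two same-size frames."""
    total = sum(abs(pa - pb)
                for row_a, row_b in zip(a, b)
                for pa, pb in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


def camera_status(reference: Frame, current: Frame,
                  dark_ratio: float = 0.3,
                  blur_ratio: float = 0.5,
                  move_diff: float = 40.0) -> str:
    """Classify the camera as 'blocked', 'defocused', 'moved' or 'ok'."""
    if mean_brightness(current) < dark_ratio * mean_brightness(reference):
        return "blocked"      # the scene became much darker than the reference
    if sharpness(current) < blur_ratio * sharpness(reference):
        return "defocused"    # edges are much weaker than in the reference
    if mean_abs_diff(reference, current) > move_diff:
        return "moved"        # the content no longer matches the reference
    return "ok"


reference = [[0, 200, 0, 200], [200, 0, 200, 0],
             [0, 200, 0, 200], [200, 0, 200, 0]]
covered = [[5] * 4 for _ in range(4)]       # lens covered: nearly black
blurred = [[100] * 4 for _ in range(4)]     # same brightness, no edges
shifted = [row[1:] + row[:1] for row in reference]  # view shifted one pixel
```

A real system would also have to tolerate gradual lighting changes, for example by refreshing the reference frame over time; that is left out of this sketch.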

Intelligent Encoders

Intelligent encoders, or video servers, are devices that add video enhancement and quality improvement features before the video is encoded. These enhanced encoders let users select some or all of these video enhancement tools to reduce the network bandwidth and decrease the file storage space. In effect, an intelligent encoder is an intelligent camera without the image sensor.

Video Analytics Applications

Intelligent video or video analytics applications are becoming increasingly popular. Video analytics are solutions that perform intelligent video analysis and automate video monitoring. They automatically track and identify objects, analyze motion and extract video intelligence from analog or digital video streams. The server or camera can output analysis and video data mining as real-time events or store such information in a database. These applications are focused on automating video analysis and security alerts, thus enabling real-time responses while eliminating the need for manual labor and huge monitoring costs. They also increase the productivity and efficiency of video surveillance systems and the people who monitor them. Some popular applications and examples are illustrated below:

1. Face Detection and Recognition
This application detects and recognizes human faces; its popularity is growing, particularly among indoor cameras, where people are the primary objects of interest. A typical application provides these features:
 Detection of human faces
 Recognition of a person and identification
 Search for people by similar facial features or by name

2. License Plate Recognition
License plate recognition is a popular application for outdoor cameras that detects and records the license plates of vehicles in the camera's field of view. The application requires clear visibility of the plates and places certain requirements on the size of the plates in the video and their angle with respect to the camera. Similar to the face recognition application, the license plate application also provides detection, recognition and search capabilities for vehicle license plates.

3. Intelligent Motion Detection
Many customers request intelligent video motion detection (VMD), a key factor in triggering video recording for forensic analysis and archival purposes. In the real world, there are many false alarms from motion detection due to lighting changes, shadows, movements of water, trees and wind, camera shakes, noise in the video, and much more. Intelligent VMD eliminates false motion and reliably detects the motion of people and/or vehicles that is of interest for forensic analysis and archival. When video recording and archival are controlled reliably by true motion detection, bandwidth utilization drops greatly and video archival and recording become more useful and efficient.

4. Intrusion Detection and Perimeter Protection
This module looks for valid intrusions by people or vehicles within a closed area, indoor or outdoor. This system avoids false detections from lighting changes and severe weather conditions; it detects and raises alert signals based only upon true intrusions into the defined perimeter.

5. Object Counting
Counting is a popular application for many industries such as retail, transportation, access control, etc. This system counts people, vehicles or other notable objects in the video; it reports all data by a user-defined time increment (hour, day, week, month, etc.).

6. Object Left and Removed Detection
Customers want to detect and send alerts for objects left behind for too long or removed from the scene. For instance, in airports, train stations or corporate lobbies, when an object such as a bag has been left unattended in the scene beyond a certain length of time, the system sends out an alert for "abandoned object." Similarly, if an object of interest is taken away from the scene, such as a painting from an art museum or items from a warehouse, the system will immediately send an alert that an object has been removed. These intelligent and thus valuable alerts will act as automated operators continuously watching the scene.

7. Abnormal Activities
Abnormal activities include: people loitering, persons or vehicles moving in the wrong direction, stopped objects, people deviating from the normal path, etc. The application continuously monitors the scene and sends automatic alerts upon encountering these abnormal situations.
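Two of the abnormal activities named above, loitering and movement in the wrong direction, can be sketched as simple rules over an object's track. This is a minimal illustration under stated assumptions rather than a production algorithm: it presumes an upstream tracker already supplies timestamped positions, and the names `TrackPoint`, `is_loitering`, `is_wrong_direction` and `abnormal_alerts`, along with all thresholds, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrackPoint:
    t: float  # time in seconds
    x: float  # horizontal position in pixels
    y: float  # vertical position in pixels


def is_loitering(track: List[TrackPoint],
                 max_dwell_s: float = 60.0,
                 radius: float = 50.0) -> bool:
    """True if the object stays within `radius` of its first position
    for longer than `max_dwell_s` seconds."""
    first, last = track[0], track[-1]
    stayed_close = all(math.hypot(p.x - first.x, p.y - first.y) <= radius
                       for p in track)
    return stayed_close and (last.t - first.t) > max_dwell_s


def is_wrong_direction(track: List[TrackPoint],
                       allowed: Tuple[float, float] = (1.0, 0.0),
                       min_travel: float = 20.0) -> bool:
    """True if the net displacement opposes the allowed direction vector."""
    dx = track[-1].x - track[0].x
    dy = track[-1].y - track[0].y
    if math.hypot(dx, dy) < min_travel:
        return False  # too little movement to judge a direction
    return dx * allowed[0] + dy * allowed[1] < 0


def abnormal_alerts(track: List[TrackPoint]) -> List[str]:
    alerts = []
    if is_loitering(track):
        alerts.append("loitering")
    if is_wrong_direction(track):
        alerts.append("wrong_direction")
    return alerts


lingering = [TrackPoint(0, 100, 100), TrackPoint(40, 110, 105),
             TrackPoint(90, 95, 102)]
against_flow = [TrackPoint(0, 300, 100), TrackPoint(5, 100, 100)]
with_flow = [TrackPoint(0, 0, 100), TrackPoint(5, 200, 100)]
```

Here the allowed direction defaults to left-to-right; in practice it would be configured per camera or per zone.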

Events and Metadata from the Camera

Intelligent cameras with analytics will stream out events in IP format. These events are like IP messages that can either be sent out as separate IP data streams or be embedded into the video streams as MPEG-7 metadata. Although MPEG-7 has been around for quite some time, it has not been widely adopted due to the unavailability of metadata. Typical events carry the following information: type, date and time, camera details, location, and a sample image for validation. Each event can also contain and send the objects that caused it. Object information can include object type, size, location, center, speed, direction, shape, color, height, width, path and history. Metadata tends to be very small in size, typically only 256 to 1024 bytes per event, which is minimal compared to the several megabytes of video data.
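An event of this kind might be modeled as shown below. This is only a sketch: the field names, the JSON encoding and the sample values are assumptions for illustration, not a format defined by MPEG-7 or by any vendor. The sample image is referenced by a hypothetical file name rather than embedded, since image data would not fit within the byte range quoted above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DetectedObject:
    object_type: str
    width: int            # pixels
    height: int           # pixels
    center: List[int]     # [x, y] in pixels
    speed: float          # pixels per second
    direction_deg: float  # heading in degrees
    color: str


@dataclass
class CameraEvent:
    event_type: str
    timestamp: str         # date and time of the event
    camera_id: str         # camera details
    location: str
    sample_image_ref: str  # assumption: a reference to the image, not the image
    objects: List[DetectedObject] = field(default_factory=list)


def encode_event(event: CameraEvent) -> bytes:
    """Serialize an event to compact UTF-8 JSON for sending as an IP message."""
    return json.dumps(asdict(event), separators=(",", ":")).encode("utf-8")


event = CameraEvent(
    event_type="abandoned_object",
    timestamp="2007-07-12T10:15:30Z",
    camera_id="cam-01",
    location="Lobby, north entrance",
    sample_image_ref="snap-000123.jpg",
    objects=[DetectedObject("bag", 40, 30, [320, 240], 0.0, 0.0, "black")],
)
payload = encode_event(event)
decoded = json.loads(payload)
```

With a single object attached, a payload like this stays well under 1024 bytes, consistent with the point that event metadata is tiny next to the video it describes.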


Benefits for vendors of intelligent cameras include:

 Meet strong demand from the market/customers

 Be positioned well in the fastest growing segment of the camera market

 Competitive advantage and strong product differentiation

 Higher market share

 Enhanced brand image and market reputation as a provider of leading-edge solutions

 Intelligent cameras provide clear ROI to customers; great value proposition

Benefits for end users or customers of intelligent cameras:

 Intelligent cameras save money for customers

 Reduced cost of manual labor and of tedious, error-prone manual monitoring

 Computers are more reliable; they work 24/7 (more consistently than manual labor)

 An intelligent, proactive and cost-effective security system

 Intelligent cameras can scale up with thousands of cameras, while manual labor cannot


Despite the numerous benefits and strong customer demand, there are notable challenges to implementing and marketing intelligent cameras:

 Many companies talk about intelligent cameras, but few have any initiative or clear plans. There is a need for a company-wide commitment, plan and strategic goal to move toward intelligent cameras. Without clear corporate and management dedication, this will not be a reality.

 Price challenges need to be resolved and prices kept lower. The cost of adding extra hardware, processors and software will be in the range of US$100 to $300 per channel, based on various options and features. This will increase significantly when it passes through OEM and distribution channels.

 Companies must select the right video processor and hardware architecture from the many options available. Careful design planning should be done in terms of bandwidth, memory and data throughput (input and output) requirements.

 Vendors need to work with video processing and intelligent video companies to include their software.

 End user market adoption needs to increase and grow faster.

 The technologies need to mature to ensure a higher percentage of true alarms and fewer false alarms.

Summary and Conclusions

Intelligent or smart cameras are clearly the future of video cameras in the security market. There is an increasing need and demand for these cameras, and several companies have recently released intelligent cameras or encoders. This article has highlighted the types of applications to consider and how to "architect" an intelligent camera. Intelligent cameras are of two types: those that process and enhance the video for better quality and lower bandwidth, and those that perform video analytics to detect events and send alerts in real time; a camera may also do both. There is an immediate need and trend towards implementing the first type in order to enhance video quality, decrease video size and reduce storage requirements.

Advanced video analytics solutions will follow as prices decrease and volume and accuracy increase. IntelliVision highly recommends that camera vendors consider making their cameras more intelligent, as they have clear ROI and benefits, such as higher revenues, competitive advantages and generally lower security system costs, to both manufacturers and end users.

Messe Frankfurt New Era Business Media Ltd. All rights reserved. 2016/10/22