Video analytic solutions often don't require high-resolution cameras. This is especially true when deep learning is involved.
Camera manufacturers across industries like to compete and market their products based on resolution. We see this in mobile phones, DSLR/point-and-shoot cameras, and even in the surveillance camera industry. If HD was once the gold standard, we are way past that now, with 4K and 8K.
But do you really need such high resolution? The answer obviously is "it depends." Higher-resolution cameras can capture more detail, and that's a necessity in some scenarios. But one specific application that's still quite happy with low-resolution footage is video analytics.
Not all applications need high-quality video
According to Lee Shelford, Sales Engineering Manager at Genetec, applications like people counting, object detection, intrusion, and cross-line detection need minimal resolution. And where deep learning is used, the system scales the footage resolution down before processing.
"Deep learning helps us gain better analytic results because of the data and accuracy that's fed to the computer," Shelford said. "But deep learning functions with a fixed resolution that the neural network is set with. This is often 512x512 pixels, although you may find less/more in some cases. So, even if we feed in footage that's higher than 720p, for example, the first thing that the system does is transcode the video down to 512x512 to meet the requirements of the neural network."
Abhishek Mishra, Data Scientist at Dragonfruit AI, gave more background on the technical factors behind this. The resolution depends on the model in use. A popular object detection model, YOLO v4, offers the option of training and inference at three different resolutions. But a standard resolution to select is 416x416.
"Even if you scale down a higher resolution footage to 416x416, the algorithm can process it well unless your task is very specific," Mishra explained. "You may need a higher resolution for applications like license plate recognition, but if the requirement is only for object detection and applications built upon object detection, like people counting, this standard resolution is adequate."
What it means to the customer
Processing higher-resolution footage requires more processing power, which increases costs. High-resolution cameras are also more expensive, which adds to the overall cost of the project. If the purpose is basic video analytics, the customer can opt for a low-resolution camera. But if they also want a live feed for human monitoring, a high-resolution feed would be needed as well.
"Let's not forget as well that most modern camera technology is capable of sending multiple streams," Shelford added. "We can send that high-resolution feed to the operators so that they can see what's going on in HD or 4K. We can record at a medium resolution and a slightly lower frame rate to save on storage, and then a third stream for video analytics at the required resolution, which may be VGA or 512x512, according to what the neural network is set to."
This is a boon for customers with legacy systems. They don't need to upgrade their hardware to take advantage of the power of video analytics.
"Unless there is a problem with the cameras, I would not advise them to upgrade the hardware," explained Amol Kulkarni, VP and Country Head of Dragonfruit AI. "They can use smart analytic platforms which will work with all kinds of cameras - even analog – and obtain meaningful results without incurring high costs."
Camera manufacturers will continue to increase footage quality. The appeal of a higher pixel count and the detail that comes with it cannot be denied. But not all customers and not all scenarios require it. Most basic video analytics need only low resolution. Taking this into consideration while implementing analytics can help the customer reduce their expenses.