Active video monitoring is igniting a new round of innovation in security. a&s explores an auxiliary sense for modern-day surveillance systems: auditory perception and analytics.
How often do you get distracted by hearing an anomaly in your environment? Currently, most cameras are “deaf,” said Derek van der Vorst, CEO of Sound Intelligence. “Humans are in many cases first triggered by a sound, after which they will turn their heads to see what's going on.” What if a security system could do the same? Car alarms, gunshots, footsteps, signs of aggression and breaking glass are just some of the sounds worth checking out.
Only recently have sophisticated algorithms begun to show the potential to deliver considerable value in mass-market security applications, Mitchell said. “Early attempts to introduce more advanced functions have resulted in complex and extremely expensive solutions, requiring dedicated servers and high-performance microphones. Cost and complexity issues presented real barriers to wide-scale adoption.”
The ongoing transition to IP-based security systems provides a unique opportunity to breathe new life into audio analytics, and recent developments in design, hardware and software have improved cost-effectiveness and overall value, from manufacturers down to end users.
RECENT STRIDES
Modern network cameras with built-in microphones provide a simple and scalable way to fully integrate audio analytics into security systems. “This combination leads to systems that deliver considerable value for the end user and signal an area of significant potential growth within the industry,” Mitchell said.
The latest batch of audio analytics solutions can work with microphones built into many network cameras and intercoms, or with low-cost, stand-alone microphones. In contrast, earlier solutions required separate, high-end microphones or preamps to be installed and configured. Furthermore, edge devices are becoming increasingly capable, with technical advances driven by demand from the consumer market. Manufacturers are embedding audio analytics in edge devices, running on DSPs inside cameras, intercoms, DVRs or other devices, Van der Vorst said. “Currently, the demand is mostly for server-based analytics, but this will shift toward embedded systems in about a year.”
Software development has also seen progress. It was not until recently that sophisticated solutions could run on edge devices or provide multiple channels of processing on an existing VMS, Mitchell said. “Sophisticated signal-processing algorithms combined with innovative implementations have enabled real-time audio analytics via lightweight software components that can be easily integrated into edge or centralized devices, such as VMS.”
Performance-wise, the greatest advances are in the ability to distinguish different sounds in difficult environments, Van der Vorst said. Other improvements include reducing false alarms, recognizing the buildup of an incident and detecting additional events, such as stressed voices, graffiti spray cans, gunshots, vehicle presence and classification, epileptic seizures and many more. The possibilities seem endless.
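Detection of impulsive events such as gunshots or breaking glass is often built on simple acoustic features. As a rough illustration only (not any vendor's actual algorithm), the sketch below flags audio frames whose short-term energy jumps well above a slowly adapting background estimate; all function names, frame sizes and thresholds here are hypothetical.

```python
import math
import random

def rms(frame):
    """Root-mean-square level of one audio frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def detect_impulses(samples, frame_len=256, margin_db=15.0):
    """Return start indices of frames whose level exceeds an
    adaptive background estimate by more than margin_db decibels."""
    events = []
    background = None
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        level = rms(samples[i:i + frame_len])
        if background is None:
            background = level  # seed the estimate with the first frame
        if level > background * 10 ** (margin_db / 20):
            events.append(i)    # level jumped sharply: impulsive event
        else:
            # Slowly track the ambient level so gradual changes
            # (traffic, HVAC) do not trigger alarms.
            background = 0.95 * background + 0.05 * level
    return events

# Synthetic signal: quiet noise with a short loud burst at sample 2048.
random.seed(0)
samples = [random.uniform(-0.01, 0.01) for _ in range(4096)]
for j in range(2048, 2176):
    samples[j] += 0.5 * random.uniform(-1.0, 1.0)

print(detect_impulses(samples))  # reports the frame containing the burst
```

The adaptive background is what keeps a detector like this usable outdoors: the alarm criterion is relative to the current ambient level rather than a fixed absolute threshold.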
Running on newer devices that take advantage of increasingly powerful processors, these solutions take up very little memory and use only a fraction of the available processing power, enabling them to run on existing system designs. They have also been shown to detect low-level audio events in noisy environments with acceptable accuracy.
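Distinguishing one sound class from another, rather than merely detecting loudness, typically relies on frequency-domain cues: breaking glass concentrates energy at high frequencies, while an idling engine sits low. The sketch below computes a spectral centroid with a naive DFT to separate two such tones; it is purely illustrative (real systems use FFTs and many more features), and the test tones are invented for the example.

```python
import math

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted average frequency of one frame (naive O(n^2) DFT)."""
    n = len(frame)
    weighted = 0.0
    total = 0.0
    for k in range(1, n // 2):  # skip DC, stop below Nyquist
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mag = math.hypot(re, im)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total

# Two bin-aligned test tones at an 8 kHz sample rate:
rate, n = 8000, 256
low = [math.sin(2 * math.pi * 250 * t / rate) for t in range(n)]    # rumble-like
high = [math.sin(2 * math.pi * 3000 * t / rate) for t in range(n)]  # glass-like
print(spectral_centroid(low, rate), spectral_centroid(high, rate))
# The high tone's centroid lands far above the low tone's.
```

In practice a classifier would combine the centroid with other low-cost features (energy envelope, zero-crossing rate, band ratios), which is part of why such analytics can fit within a camera DSP's modest memory and compute budget.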