Natural language-based rules engines allow users to create alarms and algorithms simply by inputting a text command. Related solutions have already emerged in video surveillance. While the technology offers users convenience, challenges remain to be resolved.
Setting up rules or alarms with natural language has gained attention in security, especially amid the rise of natural language models, large-scale AI models and ChatGPT. With natural language-based rules engines, alarms can be created more quickly and by people without specialized training. All they have to do is input a text prompt, for example: “Alert me if a car stops in the loading zone for more than 5 minutes.” The technology can benefit end user organizations in a variety of vertical markets.
“Early use cases include video search and alert configuration in security operations,” said Albert Stepanyan, President and CEO of Scylla AI, adding that verticals like healthcare, education, retail, and critical infrastructure stand to benefit as non-technical staff can efficiently configure and manage surveillance without specialized training.
Related solutions
Given the popularity and potential of natural language-based rules engines, related solutions have already emerged. Below we take a look at some of them.
Dahua
Dahua has launched its Xinghan large-scale AI models, of which Text-defined Alarms is a main feature. According to Dahua, developing new algorithms with conventional AI technologies is expensive, requires substantial investment of human and material resources, and is subject to long customization cycles. By contrast, Xinghan’s Text-defined Alarms supports custom arming via text descriptions: new algorithms can be developed through prompt text, greatly lowering the development threshold. Further, after creating a new algorithm using Text-defined Alarms in recorders (IVSS), the user can perform local training directly within the same device to save time and cost.
Network Optix
Network Optix (Nx) has announced Nx AI Manager, the newest addition to Nx’s Toolkit technology. Nx AI Manager is designed to help developers deploy, manage, and optimize AI models across a wide range of hardware accelerators. Among its various features, Nx AI Manager can pair with CLIP to enable flexible natural language detection directly at the edge. CLIP, or Contrastive Language–Image Pretraining, is a vision-language model developed by OpenAI that can evaluate how well an image matches a given text prompt. Once CLIP recognizes something that matches a prompt, the event rules engine takes over, enabling the system to automatically trigger an alarm, send an email, or display a notification in real time.
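To make the hand-off from prompt matching to the rules engine concrete, here is a minimal sketch in Python. It is not Nx's API: the names `PromptRule` and `evaluate_frame`, the score scale, and the threshold values are all assumptions for illustration, and the per-prompt match scores are assumed to come from an image-text model such as CLIP that runs elsewhere.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PromptRule:
    """Hypothetical rule: run an action when a text prompt matches a frame."""
    prompt: str                      # text the user typed
    threshold: float                 # assumed minimum match score to fire
    action: Callable[[str], None]    # e.g. raise an alarm or send an email


def evaluate_frame(scores: Dict[str, float], rules: List[PromptRule]) -> List[str]:
    """Fire every rule whose prompt score meets its threshold.

    `scores` maps prompt text to an image-text match score for one frame;
    producing those scores is assumed to be the model's job, not this code's.
    Returns the prompts whose rules fired, in rule order.
    """
    fired = []
    for rule in rules:
        # A prompt with no score for this frame is treated as no match.
        if scores.get(rule.prompt, 0.0) >= rule.threshold:
            rule.action(rule.prompt)
            fired.append(rule.prompt)
    return fired
```

In use, each incoming frame would be scored against the configured prompts and passed to `evaluate_frame`, with `action` standing in for whatever the system does on a match. Choosing the threshold is a tuning decision that this sketch leaves open.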
Hikvision
Search by text is another natural language application that has gained popularity in security of late. In this regard, Hikvision has unveiled the AcuSeek NVR powered by their Guanlan large-scale AI models. The solution seeks to address the critical industry challenge where security personnel often spend hours manually reviewing footage frame by frame during incident investigations. With Hikvision’s solution, users can simply input a single phrase or keyword such as "white van" or "person walking a dog," and the system rapidly extracts subject features from video footage, enabling precise video and image retrieval within seconds.
Challenges
Despite the benefits offered by natural language-based rules engines, several challenges remain. One is ambiguity in language: for example, “near the entrance” is vague compared with exact coordinates. Translating user queries into precise system actions can also be complex.
“These can be addressed through guided NLP interfaces that help users phrase rules more clearly, along with strong computer vision backbones to ensure reliable detection,” Stepanyan said.
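One way to picture the ambiguity problem is that a phrase such as “near the entrance” has to be pinned to something measurable before a system can act on it. The sketch below, in Python, shows one possible resolution: treat “near” as a distance threshold around a known landmark. The entrance coordinates, the 3-metre radius, and the function name `is_near` are all hypothetical; a guided interface could, for instance, ask the user to confirm such a radius.

```python
import math
from typing import Tuple

Point = Tuple[float, float]

# Hypothetical site calibration: entrance position in metres on a floor plan.
ENTRANCE: Point = (12.0, 3.5)

# Assumed meaning of "near"; in practice the user would confirm or adjust it.
NEAR_RADIUS_M = 3.0


def is_near(position: Point, landmark: Point, radius_m: float) -> bool:
    """Return True if `position` lies within `radius_m` of `landmark`."""
    return math.dist(position, landmark) <= radius_m
```

For example, `is_near((12.0, 5.5), ENTRANCE, NEAR_RADIUS_M)` is true because the point is 2 metres from the entrance, while a point 8 metres away would not count as near under this assumed radius.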
Challenges aside, using natural language to create alarms, rules and algorithms is poised to become a trend in video surveillance. “It’s part of a clear trend toward more accessible and user-friendly analytics. The focus is shifting to making surveillance systems easier to use without over-relying on heavy NLP models – with computer vision and hybrid edge-cloud setups ensuring accuracy and efficiency,” Stepanyan said.