The surveillance technology industry is in the spotlight today, but not for the best of reasons. With controversy surrounding US Immigration and Customs Enforcement tapping into Flock’s network of cameras to monitor people, and home camera maker Ring under fire for creating new features that would allow law enforcement to ask homeowners for video of their neighborhoods, there’s currently a wide-ranging debate about security, privacy and who can monitor whom.
Controversy isn’t erasing markets, however, and the continued improvement of vision language models only puts more wind in the sails of startups building new ways to help companies monitor what’s going on in their facilities.
According to Matan Goldner, co-founder and CEO of the video surveillance startup Conntour, the ethics around this issue are important enough that his company is quite selective about which customers it sells to. That might not seem like good business sense for a startup just two years in, but Goldner says he can afford to do it because Conntour already has several large government and publicly traded clients, one of which is Singapore’s Central Narcotics Bureau.
“The fact that we have such large clients allows us to choose them and be in control […] We really have control over who uses it, what the use case is, and we can choose what we think is ethical and, of course, legal. We use all our judgment and make decisions based on specific clients that we are okay with [to work with] because we know how they’re going to use it,” Goldner told TechCrunch in an exclusive interview.
That leverage has allowed Conntour to be picky. Investors have taken notice: The startup recently raised a $7 million seed round from General Catalyst, Y Combinator, SV Angel and Liquid 2 Ventures.
Goldner said the round closed within 72 hours. “I think I scheduled about 90 meetings in about eight days, and just after three days – we started on Monday and by Wednesday afternoon, we were done,” he said.
Regardless, Conntour may be right to be picky, especially given how powerful AI tools have become in this space. The company’s video platform uses artificial intelligence models to allow security personnel to search camera feeds using natural language to find any object, person or situation in the footage, in real time — a Google-like search engine specifically designed for security video feeds. It can also monitor and detect threats on its own based on predefined rules and automatically surface alerts.
Unlike legacy systems that depend on predefined definitions or parameters to detect specific objects, movement patterns or behaviors, Conntour claims its system uses natural language and vision language models, which give it a high degree of flexibility and usability. A user can ask, “Find instances of someone wearing sneakers passing a bag in the lobby,” and Conntour’s system will quickly search through all recorded footage or live video streams to return relevant results.
And because the platform is based on artificial intelligence models, users can simply ask questions about the material and receive answers in text, accompanied by the relevant video streams, as well as create incident reports.
The company’s selling point, however, is its scalability. Goldner explained that the platform primarily differs from other AI video search services because it is designed to scale efficiently to systems that include thousands of camera feeds. In fact, he said, Conntour’s system can monitor up to 50 camera feeds from a single consumer GPU like Nvidia’s RTX 4090.
The company does this by using multiple models and logic systems, then determining for each query which of them to run so that it delivers the best results with the least computing power.
Conntour claims its system can be deployed entirely on premises, entirely in the cloud, or a combination of both. It can be connected to most security systems already in use or can serve as a complete monitoring platform on its own.
But there’s a long-standing problem in the video surveillance industry: The quality of the surveillance is only as good as the footage being captured. It’s hard to make out details from footage of a poorly lit parking lot captured by a low-resolution camera with a dirty lens, for example.
Goldner says Conntour compensates for this limitation by providing a trust score alongside its search results. If a camera’s feed is not of good enough quality, the system will return results with low confidence levels.
Moving forward, Goldner says the biggest technical problem to solve is bringing the full level of LLM capability to Conntour’s system while maintaining its efficiency.
“We have two things we want to do at the same time that are in conflict with each other. On the one hand, we want to provide full natural language flexibility, LLM style, to let you ask anything. And on the other hand there’s efficiency, so we want to make it use very few resources, because again, the processing [thousands] of feed is just crazy. This contradiction is the biggest technical hurdle and technical problem in our space and one that we are working really, really hard to solve.”
