Advanced Computing in the Age of AI | Tuesday, September 17, 2024

5 Things Developers Need to Know to Process Computer Vision Data 

The demand for computer vision technologies is growing rapidly as companies employ new AI and automation strategies to stay competitive and provide an excellent customer experience. As adoption of computer vision platforms continues to soar, the market is on track to become a $17.4 billion industry by 2024.

Computer vision is proving to be exceptionally valuable in the retail sector, where it is being leveraged to deliver no-touch checkout experiences, bolster security, and provide real-time store management capabilities. The market for self-checkout technologies is expected to double by 2024, in part because of a labor shortage coupled with rising labor costs. The current COVID-19 pandemic has further underscored the need for no-touch retail experiences.

However, developing AI-powered contactless checkout systems for brick-and-mortar retailers can be particularly challenging and complex. These platforms must be able to respond, in real time, to a massive influx of constantly changing data from a variety of sources.

As developers work to create computer vision platforms that provide unparalleled value for retailers, here are five key things they should keep top of mind:

Velocity

Computer vision platforms must be able to adequately handle a large amount of data that is being created at a high velocity.

Developers should consider data strategies to handle the rapid rate at which vision data is generated. For example, a team might keep high-frequency (~500 Hz) data on the store premises, immediately process low-frequency (~5 Hz) data in the cloud, and stream medium-frequency (~50 Hz) data for application development.
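As a rough illustration of that tiering, the sketch below routes a data stream to one of the three destinations based on its sampling frequency. The cutoff values and the names `Tier` and `route_by_frequency` are assumptions made for this example; the strategy described above gives only the approximate frequencies, not exact boundaries.

```python
from enum import Enum


class Tier(Enum):
    ON_PREMISES = "keep on store premises"
    STREAM = "stream for application development"
    CLOUD = "process immediately in the cloud"


# Hypothetical cutoffs chosen only to separate the ~500 Hz, ~50 Hz and
# ~5 Hz examples; real boundaries would depend on the deployment.
HIGH_FREQUENCY_HZ = 200.0
MEDIUM_FREQUENCY_HZ = 20.0


def route_by_frequency(frequency_hz: float) -> Tier:
    """Pick a destination tier for a stream sampled at frequency_hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    if frequency_hz >= HIGH_FREQUENCY_HZ:
        return Tier.ON_PREMISES
    if frequency_hz >= MEDIUM_FREQUENCY_HZ:
        return Tier.STREAM
    return Tier.CLOUD


if __name__ == "__main__":
    for hz in (500, 50, 5):
        print(f"{hz} Hz -> {route_by_frequency(hz).value}")
```

Keeping the cutoffs as named constants makes it straightforward to retune the tiers as stores add cameras or sensors with different sampling rates.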

Volume

The volume of data that the computer vision platform is contending with should also be taken into consideration.

Aside from video data from each camera-equipped store, these platforms will need to effectively process other data sets such as transactional data, store inventory data that arrive in different formats from different retailers, and metadata derived from the extensive video captured by store cameras.

Variety

Another hurdle to consider when developing computer vision platforms is managing the variety of vision data that will be processed.

The schema of the metadata may change on a daily basis, so development teams should carefully select databases that are equipped to handle frequent schema changes, mixed data types, and complex objects.
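To make the problem concrete, here is a minimal sketch of how a team might inspect drifting metadata before choosing a data store. It walks a batch of records and reports every field path along with the types observed for it. The function name `summarize_schema` and the sample field names (`event_id`, `item`, `sku`, `confidence`, `bbox`) are hypothetical and stand in for whatever a real vision pipeline emits.

```python
from collections import defaultdict
from typing import Any


def summarize_schema(records: list[dict[str, Any]]) -> dict[str, set[str]]:
    """Map each dotted field path to the set of type names seen for it."""
    seen: dict[str, set[str]] = defaultdict(set)

    def walk(value: Any, path: str) -> None:
        if isinstance(value, dict):
            # Descend into nested objects, extending the dotted path.
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        else:
            # Leaf value (including lists): record its type name.
            seen[path].add(type(value).__name__)

    for record in records:
        walk(record, "")
    return dict(seen)


if __name__ == "__main__":
    # Hypothetical metadata from two days with a changed schema.
    records = [
        {"event_id": 1, "item": {"sku": "A1", "confidence": 0.93}},
        {"event_id": "2", "item": {"sku": "B7", "confidence": 0.88,
                                   "bbox": [10, 20, 30, 40]}},
    ]
    for path, types in sorted(summarize_schema(records).items()):
        print(path, sorted(types))
```

A field that shows more than one type, or that appears in only some records, is exactly the kind of change a rigid schema would reject and a flexible one needs to absorb.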

As is common with fast-growing markets, data and analytics requirements are constantly evolving. Adding external data sources, each with a different schema, can require significant effort building and maintaining ETL pipelines. Testing new functionality on transactional data stores is costly and can impact production, while ad hoc queries to measure the accuracy of the checkout process in real time are not possible with traditional data architectures.

To overcome these challenges and support rapid iteration on the product, development teams must rely on flexible and state-of-the-art tools for their prototyping and internal analytics.

Accuracy

Accuracy and precision are paramount when it comes to computer-vision-aided checkout for retail.

Were shoppers charged for the correct number of items? How accurate were the AI models compared to human-resolved events? Any discrepancies can result in revenue loss and a less-than-stellar customer experience.

Engineering teams need to pull from multiple data sets — event streams from the stores, data from vendors, store inventory information, and debug logs — to generate correct accuracy metrics. From there, teams may run ad hoc queries to join across data sets and analyze metrics in real time, rather than wait for asynchronous data lake jobs.
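One simple form such a metric could take is sketched below: join the item counts produced by the AI model with the human-resolved counts on a shared event identifier, then measure how often they agree. The function name `checkout_accuracy`, the event IDs, and the choice to score only events present in both data sets are assumptions for illustration, not a description of any particular retailer's pipeline.

```python
def checkout_accuracy(model_counts: dict[str, int],
                      human_counts: dict[str, int]) -> float:
    """Fraction of jointly observed events where the model's item count
    matches the human-resolved item count."""
    # Inner join on event ID: only events present in both data sets.
    shared = model_counts.keys() & human_counts.keys()
    if not shared:
        raise ValueError("no events appear in both data sets")
    correct = sum(
        1 for event_id in shared
        if model_counts[event_id] == human_counts[event_id]
    )
    return correct / len(shared)


if __name__ == "__main__":
    # Hypothetical per-event item counts keyed by event ID.
    model = {"e1": 3, "e2": 1, "e3": 2}
    human = {"e1": 3, "e2": 2, "e4": 5}
    # e1 matches, e2 does not; e3 and e4 have no counterpart.
    print(checkout_accuracy(model, human))  # 0.5
```

In practice the same join would typically be expressed as a query over the event streams and logs, but the logic is the same: align the data sets on a common key, then compare.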

Ease-of-use

Developers of computer vision platforms require tools that allow them to iterate rapidly and build data-powered applications on production datasets. Teams must be able to drive product improvements from conception to production faster than ever, which means being able to run experiments and analyze real-time metrics easily.

Development teams should consider working with a data stack that can handle data at different frequencies and from different sources, and that is easily accessible for ad hoc analysis, prototyping, and moving new features into production. For these capabilities, many teams are turning to real-time databases in the cloud that offer speed and simplicity when developing new features and verifying the accuracy of AI models.

The need for computer vision platforms that power contactless checkout experiences will only continue to grow across the retail space in the coming months and years. Developers who create these technologies can rely on the points above to ensure they’re building products that give retailers the capabilities they need to stay competitive.

About the Author 

Dhruba Borthakur is CTO and co-founder of Rockset, responsible for the company's technical direction. He was an engineer on the database team at Facebook, where he was the founding engineer of the RocksDB data store. Earlier at Yahoo, he was one of the founding engineers of the Hadoop Distributed File System. He was also a contributor to the open source Apache HBase project. Dhruba previously held various roles at Veritas Software, founded an e-commerce startup, Oreceipt.com, and contributed to the Andrew File System (AFS) at IBM-Transarc Labs.

AIwire