Demand for AI Training Data Spurs a Cottage Industry
Growing concerns about AI replacing workers are being offset by a cottage industry of startups providing entry-level employment in developing countries to tag, label and otherwise prepare data used to train AI models. Demand for those data sets is, of course, soaring.
Among the startups is CloudFactory, which employs workers around the world on “virtual assembly lines” to meet the demand for labeled data. The U.K. company announced this week it has raised an additional $65 million in venture funding.
According to the web site Crunchbase.com, CloudFactory has so far raised $78 million in five funding rounds.
The latest funding round was led by FTV Capital, with participation from Weatherford Capital.
CloudFactory operations in Kenya and Nepal use more than 100 annotation tools to produce high-quality data that can be used to train a variety of AI and machine learning algorithms. Along with proprietary tools, the company’s “delivery centers” use tools from partners Dataloop, Deepen, Hivemind and Onepanel.
The company said it would use the new funding for workforce development and training. Those efforts include introducing new automation tools to offload routine tasks so local workers can concentrate on adding more value to their data preparation work.
“The future of AI and machine learning innovation will be driven by people, and we believe our focus on people has been the biggest contributor to our growth,” said Mark Sears, CloudFactory’s founder and CEO.
“A company can have the best tools, but people are the critical factor in determining the quality of data feeding these algorithms,” Sears added. “By valuing our workforce above all else they become more engaged in their work and, using the best tools, deliver the highest quality data for our customers.”
Data annotation services are booming. In August, for example, startup Scale AI announced a $100 million funding round, along with a company valuation of $1 billion. The company develops software used to annotate images for training models used in robotics, autonomous vehicles and drones.
The software organizes images that are then handed off to contract workers for tagging and labeling.