Advanced Computing in the Age of AI | Friday, May 17, 2024

Datasaur Secures $4M and Unveils New Feature Dinamic to Train Custom Natural Language Processing Models

SAN FRANCISCO, Aug. 3, 2023 -- Datasaur, a leading natural language processing (NLP) platform that helps annotators train AI algorithms, today announced the close of a $4 million seed funding round and launched a new feature, Datasaur Dinamic, allowing users to easily train custom NLP models. The round was led by Initialized Capital with participation from HNVR, Gold House Ventures, and TenOneTen, bringing Datasaur's total funding to date to $7.9 million. The latest investment will be used to democratize access to the latest advancements in NLP and LLM technology.

As NLP model training processes and platform capabilities have advanced and converged, it is increasingly proprietary datasets that power the unique capabilities of the resulting models. Datasaur has spent the last four years building an intuitive and efficient platform that enables companies to label their own data, transforming raw data into valuable AI datasets.

With Datasaur's new product Dinamic, users can take this labeled data one step further and train a custom NLP model with a single click. As more data is labeled, the model automatically learns and becomes more accurate and powerful. With this streamlined process, teams can quickly build and iterate on models. Dinamic turns a complex, multi-step process spanning multiple platforms and technologies into a simple two-step process. Companies can now annotate their data according to business requirements and automatically receive a fully trained NLP model, saving millions of dollars in data science costs along the way.

"I've long observed that the primary differentiating factor between NLP models is the underlying training data," said Ivan Lee, CEO and founder of Datasaur. "We initially founded Datasaur with a focus on the labeling platform because that was the most painful, complex, and time-consuming step in the NLP development cycle. We've built a configurable and comprehensive interface for labeling the petabytes of raw text and audio data companies have accumulated. Today we are in a perfect storm between the dizzying advancements in LLM technology alongside renewed vigor from business stakeholders in translating AI into cost savings and accelerated revenue generation. At this key inflection point, we're excited to accelerate our product development and help our customers tap into the full potential of NLP."

With OpenAI's president Greg Brockman as an early investor, Datasaur has helped companies such as Spotify, Google, and Qualtrics label a vast array of data, ranging from Word documents to PDFs to audio clips. The platform employs state-of-the-art techniques such as weak supervision and LLM-labeling to save customers up to 80% of their time and costs. Datasaur's workforce management platform and Conflict Review mode also support teams in scaling their efforts and utilizing best practices to identify errors in their training dataset.

"The NLP space is clearly primed for growth," said Brett Gibson, Managing Partner at Initialized Capital. "We're seeing companies in every industry and vertical rushing to discover how to apply ChatGPT-like technology to their own processes. Over the last few years, we've been impressed by the Datasaur team's ability to take complex technical workflows and condense them into an intuitive experience for data scientists and non-technical annotators alike. The current LLM space is highly fragmented and evolving rapidly. Products like Datasaur Dinamic simplify and standardize the process for those new to the NLP space. We saw the potential in the NLP space in 2020 when we first invested in this team, and the time is ripe to capture the rapidly growing market."

Datasaur has built the NLP industry's most efficient data labeling tool and will leverage that foundation to expand into a full-fledged, all-in-one NLP platform. The company's mission has been to increase accessibility to NLP technologies and support NLP development in international languages for a global audience. Datasaur Dinamic now allows non-technical teams to build and develop their own proprietary NLP solutions.

To learn more about Datasaur, please visit https://datasaur.ai.

About Datasaur

Datasaur leads the NLP industry with its comprehensive and automated data labeling solution. Founded in 2019 and headquartered in Silicon Valley, the company helps financial, legal, and healthcare companies turn raw unstructured data into valuable ML datasets. Prior to Datasaur, CEO Ivan Lee sold his first company, Loki Studios, to Yahoo and led ML teams at Yahoo and Apple. Datasaur graduated from the Stanford StartX (F19) and Y Combinator (W20) accelerators and is backed by Initialized Capital, Greg Brockman (OpenAI President), and Calvin French-Owen (Segment CTO).


Source: Datasaur

EnterpriseAI