Advanced Computing in the Age of AI | Monday, April 22, 2024

Cash Prizes for COVID-19 AI Contest Testing Data Scientists’ NLP Smarts 

Ten thousand dollars in prize money is available to data scientists with natural language processing expertise competing in a contest designed to help rid the world of coronavirus.

Kaggle, a web-based information sharing site and public arena for data scientists out to prove they’re the smartest of the smart, has initiated new challenges designed to advance machine learning-based analysis of, and access to, more than 44,000 scholarly articles about COVID-19 and related coronavirus problems.

Called the COVID-19 Open Research Dataset Challenge (CORD-19), it comes in the form of 10 questions covering topics such as risk factors, virus genetics and therapeutics. One question, “What is known about transmission, incubation, and environmental stability?” already has 40 submissions competing for the $1,000 prize awarded for each of the 10 questions.

Kaggle said the CORD-19 dataset, presented in machine-readable format so that text and data mining techniques can “find answers to questions within, and connect insights across, this content,” was prepared by a coalition of research groups in coordination with the White House. It includes more than 29,000 full-text articles about COVID-19, SARS-CoV-2 and related coronaviruses.

The goal is to reduce the time required for healthcare and public policy professionals to find the coronavirus-related information they need.

“We are issuing a call to action to the world's artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high priority scientific questions,” Kaggle stated. “This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease. There is a growing urgency for these approaches because of the rapid acceleration in new coronavirus literature, making it difficult for the medical research community to keep up.”

Kaggle said the dataset, presented in machine-readable form, was created by the Allen Institute for AI in partnership with the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft Research and the National Library of Medicine - National Institutes of Health, working with the White House Office of Science and Technology Policy.

Kaggle today finished accepting entries for another coronavirus competition, a COVID-19 forecasting challenge to help answer questions developed by the World Health Organization and the National Academies of Sciences, Engineering, and Medicine.

Kaggle noted that instead of accepting cash awards, winners of the challenge may choose to deliver prize money to charitable organizations conducting COVID-19 research or relief efforts.