Advanced Computing in the Age of AI | Thursday, May 2, 2024

Anthropic Unveils Claude Version 2.1 

Nov. 22, 2023 -- Anthropic has announced that its latest model, Claude 2.1, is now available over API in the Anthropic Console and is powering the Company's claude.ai chat experience. Claude 2.1 delivers advancements in key capabilities for enterprises—including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts and our new beta feature: tool use. Anthropic is also updating pricing to improve cost efficiency for customers across models.

200K Context Window

Since launch earlier this year, Claude has been used by millions of people for a wide range of applications—from translating academic papers to drafting business plans and analyzing complex contracts. Users have asked for larger context windows and more accurate outputs when working with long documents.

In response, Anthropic is doubling the amount of information you can relay to Claude with a limit of 200,000 tokens, translating to roughly 150,000 words, or over 500 pages of material. Claude users can now upload technical documentation like entire codebases, financial statements like S-1s, or even long literary works like The Iliad or The Odyssey. By being able to talk to large bodies of content or data, Claude can summarize, perform Q&A, forecast trends, compare and contrast multiple documents, and much more.

Processing a 200K length message is a complex feat and an industry first. While the Company is excited to get this powerful new capability into the hands of users, tasks that would typically require hours of human effort to complete may take Claude a few minutes. We expect the latency to decrease substantially as the technology progresses.

2x Decrease in Hallucination Rates

Claude 2.1 has also made significant gains in honesty, with a 2x decrease in false statements compared to the previous Claude 2.0 model. This enables enterprises to build high-performing AI applications that solve concrete business problems and deploy AI across their operations with greater trust and reliability.

Credit: Anthropic

Anthropic tested Claude 2.1’s honesty by curating a large set of complex, factual questions that probe known weaknesses in current models. Using a rubric that distinguishes incorrect claims (“The fifth most populous city in Bolivia is Montero”) from admissions of uncertainty (“I’m not sure what the fifth most populous city in Bolivia is”), Claude 2.1 was significantly more likely to demur rather than provide incorrect information.

Claude 2.1 has also made meaningful improvements in comprehension and summarization, particularly for long, complex documents that demand a high degree of accuracy, such as legal documents, financial reports and technical specifications. In evaluations, Claude 2.1 demonstrated a 30% reduction in incorrect answers and a 3-4x lower rate of mistakenly concluding a document supports a particular claim.

Credit: Anthropic

While these accuracy improvements are encouraging, enhancing the precision and dependability of outputs for users remains a top priority for Anthropic's product and research teams.

API Tool Use

By popular demand, the Company has also added tool use, a new beta feature that allows Claude to integrate with users' existing processes, products, and APIs. This expanded interoperability aims to make Claude more useful across users’ day-to-day operations.

Claude can now orchestrate across developer-defined functions or APIs, search over web sources, and retrieve information from private knowledge bases. Users can define a set of tools for Claude to use and specify a request. The model will then decide which tool is required to achieve the task and execute an action on their behalf, such as:

  • Using a calculator for complex numerical reasoning
  • Translating natural language requests into structured API calls
  • Answering questions by searching databases or using a web search API
  • Taking simple actions in software via private APIs
  • Connecting to product datasets to make recommendations and help users complete purchases

Tool use is currently in early development— Anthropic is building developer features and prompting guidelines for easier integration into applications. Anthropic encourages users to share feedback on tool use to help shape and improve the product.

Developer Experience

Anthropic has been working to simplify the developer Console experience for Claude API users while making it easier to test new prompts for faster learning. The new Workbench product enables developers to iterate on prompts in a playground-style experience and access new model settings to optimize Claude’s behavior. They can create multiple prompts and navigate between them for different projects, and revisions are saved as they go to retain historical context. Developers can also generate code snippets to use their prompts directly in one of our SDKs.

Anthropic is also introducing system prompts, which allow users to provide custom instructions to Claude in order to improve performance. System prompts set helpful context that enhances Claude’s ability to take on specified personalities and roles or structure responses in a more customizable, consistent way aligned with user needs.

Claude 2.1 is available now in Anthropic's API, and is also powering the chat interface at claude.ai for both the free and Pro tiers. Usage of the 200K token context window is reserved for Claude Pro users, who can now upload larger files than ever before. The Company can't wait to see the use cases these new features inspire as we work to build the safest and most technically sophisticated AI systems in the industry.


Source: Anthropic

EnterpriseAI