Claude 3.5: What you need to know about Anthropic's AI models and chatbot

As impressive as today's AI chatbots are, interacting with them might not leave you with an I, Robot level of existential sci-fi dread (yet).

But according to Dario Amodei, the CEO of Anthropic, an AI research company, there's a real risk that AI models become too autonomous—especially as they start accessing the internet and controlling robots. (Anthropic's offices even feature a framed picture of a giant robot destroying a city.) Hundreds of other AI leaders and scientists have also acknowledged the existential risk posed by AI.

To help address this risk, Anthropic did something counterintuitive: they decided to develop a safer large language model (LLM) on their own. Claude 2, an earlier version of Anthropic's model, was hailed as a potential "ChatGPT killer." Since its release, progress has happened fast—Anthropic's latest update to its LLM, known as Claude 3.5, now surpasses ChatGPT's latest model, GPT-4o, on a range of benchmarks.

In this article, I'll outline Claude's capabilities, show how it stacks up against other AI models, and explain how you can try it for yourself.

What is Claude?

Claude, Anthropic's AI chatbot

Claude is an AI chatbot powered by Anthropic's flagship LLM, Claude 3.5. 

If you've used ChatGPT or Google Gemini, you know what to expect when launching Claude: a powerful, flexible chatbot that collaborates with you, writes for you, and answers your questions.

Anthropic, the company behind Claude, was started in 2021 by a group of ex-OpenAI employees who helped develop OpenAI's GPT-2 and GPT-3 models. In 2023, Anthropic's founders launched Claude 2, the first widely available version of their LLM, which stood out from rivals by focusing on AI safety. 

While Claude 2 lagged behind OpenAI's GPT-4, Anthropic's latest model—Claude 3.5 Sonnet, released in June 2024—now beats OpenAI's latest model, GPT-4o, across a range of capabilities.
A chart showing Claude 3's capabilities compared to GPT-4

With the release of Claude 3.5, one of the most significant enhancements to Claude is its coding prowess. A new feature called Artifacts serves as a user interface for coding projects, allowing you to instantly see the results of your code and even interact with it directly in Claude's interface. For example, I was able to quickly create (and play!) a clone of the classic game Snake.

Ryan using Claude to create and play the game Snake

Claude 3.5 also features what Anthropic terms "vision capabilities": it can interpret photos, charts, and diagrams in a variety of formats. This is perfect for enterprise customers looking to extract insights from PDFs and presentations, but even casual users like me will get a kick out of seeing Claude interact with images.

For example, check out Claude's flawless analysis of this photo of a breakfast spread by a pond.

Claude AI describing a photo in detail

The Claude model family

LLMs take up a staggering amount of computing resources. Because more powerful models are more expensive, Anthropic has released multiple Claude models, each optimized for a different purpose.

Claude 3.5 Sonnet

In June 2024, Anthropic upgraded Sonnet—originally its mid-range model—to Claude 3.5, the latest version of its LLM. That makes it, for now, the most powerful and intelligent model Claude has to offer. (Anthropic will gradually upgrade the rest of its models to Claude 3.5, too).

In Anthropic's tests, Claude 3.5 Sonnet outperforms the latest models from OpenAI, Google, and Meta (GPT-4o, Gemini, and Llama), as well as other Claude models. Despite its increased capabilities, Claude 3.5 Sonnet costs $3 per million input tokens—80% cheaper than Claude 3 Opus, which until recently was the most powerful Claude model.

Claude 3 Haiku

At just $0.25 per million input tokens, Haiku is 98% cheaper than Claude's priciest model. It also boasts nearly instant response times, which is crucial if you're using Claude for time-sensitive applications like customer support chats. If you're manipulating large quantities of data, translating documents, or moderating content, this is the model you want.

Claude 3 Opus

With a price of $15 per million input tokens, Opus is a resource-intensive model but performs well on challenging multi-step tasks. Because the cost of using Opus can quickly add up, it's best reserved for complex tasks like financial modeling, drug discovery, research and development, and strategic analysis. Apart from a few specialty use cases, most users will be better served by Claude 3.5 Sonnet, which is five times cheaper and performs better on most benchmarks.
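The pricing gaps between the three models are easy to quantify. Here's a minimal sketch using the input-token prices quoted above (output tokens are priced separately, so treat this as a lower bound on real costs):

```python
# Input-token prices quoted above, in dollars per million tokens.
PRICES = {"haiku": 0.25, "sonnet": 3.00, "opus": 15.00}

def input_cost(model: str, tokens: int) -> float:
    """Estimated input cost in dollars for a given token count."""
    return PRICES[model] * tokens / 1_000_000

# Sonnet is 80% cheaper than Opus; Haiku is roughly 98% cheaper.
sonnet_discount = 1 - PRICES["sonnet"] / PRICES["opus"]  # 0.8
haiku_discount = 1 - PRICES["haiku"] / PRICES["opus"]    # ~0.983
```

So processing a 200K-token document costs about $0.05 with Haiku, $0.60 with Sonnet, and $3.00 with Opus—which is why matching the model to the task matters.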

How to try Claude for yourself

For access, sign up at Claude.ai. From there, you can start a conversation or use one of Claude's default prompts to get started. As a free user, you'll get access to Claude 3.5 Sonnet, Anthropic's most powerful model—though free users only get limited usage.

Upgrading to one of Claude's paid plans gives you higher usage limits, access to additional models (Opus and Haiku), and priority access even during times of high traffic.

You can also use the Claude API if you want to connect Claude to your own app.
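To give a sense of what that looks like, here's a minimal sketch of the request body the Claude API (Anthropic's Messages API) expects. The model identifier below is the June 2024 Claude 3.5 Sonnet release; check Anthropic's documentation for current model names and for the official SDKs:

```python
import json

# A minimal Messages API request body. With an API key set, this payload
# is POSTed to https://api.anthropic.com/v1/messages; the official
# `anthropic` Python SDK wraps the same fields in client.messages.create().
request_body = {
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Summarize this support ticket in one sentence."}
    ],
}

print(json.dumps(request_body, indent=2))
```

The response comes back as a structured message with the model's reply and a count of input and output tokens used, which is what Anthropic bills against.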

How is Claude different from other AI models?

An infographic showing the names of OpenAI's, Google's, Anthropic's, and Meta's LLMs and chatbots

All AI models are prone to some degree of bias and inaccuracy. Hallucinations are a frequent occurrence: when an AI model doesn't know the answer, it often prefers to invent something and present it as fact rather than say "I don't know." (In that respect, AI may have more in common with humans than we think.)

Even worse, an AI-powered chatbot may unwittingly aid in illegal activities—for example, giving users instructions on how to commit a violent act or helping them write hate speech. (Bing's chatbot ran into some of these issues upon its launch in February 2023.)

With Claude, Anthropic's primary goal is to avoid these issues by creating a "helpful, harmless, and honest" LLM with carefully designed safety guardrails. 

While Google, OpenAI, Meta, and other AI companies also consider safety, there are three unique aspects to Anthropic's approach.

Constitutional AI

To fine-tune large language models, most AI companies use human contractors to review multiple outputs and pick the most helpful, least harmful option. That data is then fed back into the model, training it and improving future responses.

One challenge with this human-centric approach is that it's not particularly scalable. But more importantly, it also makes it hard to identify the values that drive the LLM's behavior—and to adjust those values when needed.

Anthropic took a different approach. In addition to using humans to fine-tune Claude, the company also developed a technique called Constitutional AI, in which a second AI model evaluates Claude's outputs against a set of written principles. Intended to discourage toxic, biased, or unethical answers and maximize positive impact, the constitution includes rules borrowed from the United Nations' Universal Declaration of Human Rights and Apple's terms of service. It also includes simple rules that Claude's researchers found improved the safety of Claude's output, like "Choose the response that would be most unobjectionable if shared with children."

The constitution's principles use plain English and are easy to understand and amend. For example, Anthropic's developers found that early editions of the model tended to be judgmental and annoying, so they added principles to reduce this tendency (e.g., "try to avoid choosing responses that are too preachy, obnoxious, or overly-reactive").
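The selection step can be pictured with a deliberately simplified sketch. In real Constitutional AI, a second model critiques and revises candidate responses against the principles; here each principle is reduced to a toy keyword check purely for illustration:

```python
# Toy illustration of constitution-guided response selection.
# Real Constitutional AI uses an AI model to judge responses against
# plain-English principles; these keyword checks just stand in for that.
PRINCIPLES = [
    ("avoid preachy language", lambda r: "you really should" not in r.lower()),
    ("avoid excessive hedging", lambda r: r.lower().count("maybe") < 3),
]

def score(response: str) -> int:
    """Number of principles a candidate response satisfies."""
    return sum(check(response) for _, check in PRINCIPLES)

def choose(candidates: list[str]) -> str:
    """Pick the candidate that satisfies the most principles."""
    return max(candidates, key=score)

best = choose([
    "You really should stop doing that.",
    "Here's one approach you could try.",
])
# best -> "Here's one approach you could try."
```

The point of the design is that the principles themselves stay human-readable: to change the model's values, you edit the list, not the labeling pipeline.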

Red teaming

Anthropic's pre-release process includes significant "red teaming," where researchers intentionally try to provoke a response from Claude that goes against its benevolent guardrails. Any deviations from Claude's typical harmless responses become data points that update the model's safety mitigations. 

While red teaming is standard practice at AI companies, Anthropic also works with the Alignment Research Center (ARC) for third-party safety assessments of its model. The ARC evaluates Claude's safety risk by giving it goals like replicating autonomously, gaining power, and "becoming hard to shut down." It then assesses whether Claude could actually complete the tasks necessary to accomplish those goals, like using a crypto wallet, spinning up cloud servers, and interacting with human contractors.

While Claude is able to complete many of the subtasks requested of it, it's (fortunately) not able to execute them reliably due to errors and hallucinations, and the ARC concluded that its current version is not a safety risk.

Public benefit corporation

Unlike others in the AI space, Anthropic is a public benefit corporation. That empowers the company's leaders to make decisions that aren't only for the financial benefit of shareholders. 

That's not to say that the company doesn't have commercial ambitions—Anthropic partners with large companies like Google and Zoom and recently raised $7.3 billion from investors—but its structure does give it more latitude to focus on safety at the expense of profits.

Creativity

Anthropic says Claude has been built to work well at answering open-ended questions, providing helpful advice, and searching, writing, editing, outlining, and summarizing text. And generally, folks agree that it's much better for creative tasks than, say, ChatGPT.

Claude context window

One of Claude's original selling points was its large context window: Claude 3.5 can handle up to 200K tokens per prompt, which is the equivalent of around 150,000 words. And 200K tokens is just the start: for certain customers, Anthropic is approving 1 million token context windows (the equivalent of the entire Lord of the Rings series).

But other AI models are catching up to Claude in this area: GPT-4o's official context window of 128K tokens can unofficially be pushed far beyond that, and Gemini 1.5 now blows Claude out of the water with a 2 million token context window.
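To make those token counts concrete, a common rule of thumb is that one token is roughly 0.75 English words (the exact ratio varies by text and by model's tokenizer, so treat this as an estimate):

```python
# Rough rule of thumb: one token is about 0.75 English words.
# Actual tokenization varies by text and by model.
WORDS_PER_TOKEN = 0.75

def tokens_to_words(tokens: int) -> int:
    """Estimate the word count a given context window can hold."""
    return round(tokens * WORDS_PER_TOKEN)

tokens_to_words(200_000)    # ~150,000 words: Claude 3.5's standard window
tokens_to_words(1_000_000)  # ~750,000 words: the expanded window
```

By this estimate, a 2 million token window fits around 1.5 million words—several long novels in a single prompt.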

Claude's impact on the AI safety conversation

The CEO of Anthropic argues that to truly advocate safety in the development of AI systems, his organization can't just release research papers. Instead, it has to compete commercially, influencing competitors by continuing to raise the bar for safety.

It may be too early to say if Anthropic's release of Claude is influencing other AI companies to tighten their safety protocols or encouraging governments to engage in AI oversight. But Anthropic has certainly secured a seat at the table: its leaders were invited to brief U.S. President Joe Biden at a White House AI summit in May 2023, and in July 2023 Anthropic was one of seven leading AI companies that agreed to abide by shared safety standards. Anthropic, along with Google DeepMind and OpenAI, has also committed to providing the U.K.'s AI Safety Taskforce with early access to its models.

It's ironic that a group of researchers scared of an existential threat from AI would start a company that develops a powerful AI model. But that's exactly what's happening at Anthropic—and right now, that looks like a positive step forward for AI safety.

Automate Anthropic

If you decide to use Claude as your AI chatbot of choice, you can connect it to Zapier, so you can initiate conversations in Claude whenever you take specific actions in your other apps. Learn more about how to automate Claude with Zapier, or get started with one of these pre-made templates.

Zapier is a no-code automation tool that lets you connect your apps into automated workflows, so that every person and every business can move forward at growth speed. Learn more about how it works.

This article was originally published in September 2023. The most recent update was in July 2024.