#anthropic


Claude LLM: Introducing web search on the Anthropic API
anthropic.com/news/web-search-
news.ycombinator.com/item?id=4

... Web search is now available on the Anthropic API for Claude 3.7 Sonnet, the upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku at $10 per 1,000 searches plus standard token costs.
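As a rough illustration of the pricing quoted above, here is a minimal sketch of the arithmetic. The function name and parameters are hypothetical, and the token cost is left as a caller-supplied figure because the post does not state token prices.

```python
from decimal import Decimal

# Price quoted in the post: $10 per 1,000 searches.
SEARCH_PRICE_PER_1000 = Decimal("10")


def web_search_cost(num_searches: int, token_cost: Decimal = Decimal("0")) -> Decimal:
    """Estimate total cost in USD: search fee plus a caller-supplied token cost.

    `token_cost` is an assumption of this sketch; the post only says
    "standard token costs" without giving numbers.
    """
    if num_searches < 0:
        raise ValueError("num_searches must be non-negative")
    search_fee = SEARCH_PRICE_PER_1000 * num_searches / 1000
    return search_fee + token_cost
```

At that rate a single search works out to one cent, so search fees only become noticeable at volume; token costs are billed on top.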

The current free version (limited, yet still awesome to use) is Claude 3.7.
It is a very high-performing LLM.

Same comments re:
* DeepSeek DeepThink (R1)
* (Google) Gemini 2.0 Flash


The #FairUse question hangs over lawsuits brought by #authors, #news outlets & other #copyright owners against companies including #Meta, #OpenAI & #Anthropic. The #legal doctrine allows for the use of copyrighted work without the copyright owner's permission under some circumstances.

The authors in the Meta case sued in 2023, arguing the company used pirated versions of their books to train #LLaMa without permission or compensation.

#law #tech #AI

"To test this out, the Carnegie Mellon researchers instructed artificial intelligence models from Google, OpenAI, Anthropic, and Meta to complete tasks a real employee might carry out in fields such as finance, administration, and software engineering. In one, the AI had to navigate through several files to analyze a coffee shop chain's databases. In another, it was asked to collect feedback on a 36-year-old engineer and write a performance review. Some tasks challenged the models' visual capabilities: One required the models to watch video tours of prospective new office spaces and pick the one with the best health facilities.

The results weren't great: The top-performing model, Anthropic's Claude 3.5 Sonnet, finished a little less than one-quarter of all tasks. The rest, including Google's Gemini 2.0 Flash and the one that powers ChatGPT, completed about 10% of the assignments. There wasn't a single category in which the AI agents accomplished the majority of the tasks, says Graham Neubig, a computer science professor at CMU and one of the study's authors. The findings, along with other emerging research about AI agents, complicate the idea that an AI agent workforce is just around the corner — there's a lot of work they simply aren't good at. But the research does offer a glimpse into the specific ways AI agents could revolutionize the workplace."

tech.yahoo.com/ai/articles/nex

Yahoo Tech: Carnegie Mellon staffed a fake company with AI agents. It was a total disaster. By Shubham Agarwal

🤖👨‍💻 A new Microsoft #Research study found that even the advanced AI models o1 from #OpenAI and Claude 3.7 Sonnet from #Anthropic can fix bugs in code in no more than half of cases. The testing was based on the SWE-bench benchmark.

In the experiment, AI agents attempted to solve 300 code-debugging tasks. Claude 3.7 Sonnet led with a success rate of 48.4%, OpenAI o1 came second (30.2%), and o3-mini third (22.1%).

Reuters: Anthropic wins early round in music publishers’ AI copyright case. “Artificial intelligence company Anthropic convinced a California federal judge on Tuesday to reject a preliminary bid to block it from using lyrics owned by Universal Music Group (UMG.AS) and other music publishers to train its AI-powered chatbot Claude.”

https://rbfirehose.com/2025/03/27/reuters-anthropic-wins-early-round-in-music-publishers-ai-copyright-case/
