Anthropic's latest flagship model, Claude Sonnet 4.6, is out now.
Although large language models (LLMs) have the potential to transform biomedical research, their ability to reason accurately across complex, data-rich domains remains unproven. To address this ...
Seed-2.0, the latest version of its Doubao large language model series. The company said the Pro variant is benchmarked ...
Bengaluru-based AI startup Sarvam AI on February 18 announced the launch of two new large language models, a 30-billion-parameter model and a 105-billion-parameter model, both trained from scratch, ...
As large language models (LLMs) continue to improve at coding, the benchmarks used to evaluate their performance are steadily becoming less useful. That's because though many LLMs have similar high ...
Artificial intelligence has traditionally advanced through automatic accuracy tests in tasks meant to approximate human knowledge. Carefully crafted benchmark tests such as The General Language ...
Capable of reasoning, designed for voice, and fluent in Indian languages, the model would be ready for population-scale deployment ...
Sarvam launches 30B- and 105B-parameter indigenous LLMs trained on Indian languages, positioning India closer to a sovereign, voice-first AI ecosystem ...
The company said the model is optimised for “efficient thinking”, delivering stronger responses while using fewer tokens, a key factor in reducing inference costs in production environments.
What if the tools we trust to measure progress are actually holding us back? In the rapidly evolving world of large language models (LLMs), AI benchmarks and leaderboards have become the gold standard ...