Large Language Model Benchmarks

1d on MSN · Opinion

AI’s most important benchmark in 2026? Trust

‘My own trust of chatbots grew in 2025. But it has also diminished.’ In 2026 (and beyond) the best benchmark for large ...

Morningstar

Logical Intelligence Achieves 76 Percent on Putnam Benchmark, Highlighting Shift Beyond Large Language Models to Language-free, Mathematically Grounded Models

Over the last decade, artificial ...

Unlocking Business Value With Open-Weight Large Language Models

Open-weight LLMs can unlock significant strategic advantages, delivering customization and independence in an increasingly AI ...

How 2025 Recalibrated AI Models Race

In 2025, large language models moved beyond benchmarks to efficiency, reliability, and integration, reshaping how AI is ...

4d on MSN · Opinion

AI agents arrived in 2025 -- here's what's next for 2026

AI agents have emerged from the lab, bringing promise and peril. A Carnegie Mellon University researcher explains what's ...

Morning Overview on MSN

China’s open AI models are neck-and-neck with the West. What’s next

China’s latest generation of open large language models has moved from catching up to actively challenging Western leaders on ...

ZDNet

With AI models clobbering every benchmark, it's time for human evaluation

Artificial intelligence has traditionally advanced through automatic accuracy tests in tasks meant to approximate human knowledge. Carefully crafted benchmark tests such as The General Language ...

Fox21Online

Z.ai Open-Sources GLM-4.7, a New Generation Large Language Model Built for Real Development Workflows

Z.ai released GLM-4.7 ahead of Christmas, marking the latest iteration of its GLM large language model family. As open-source ...
