In 2025, large language models moved beyond benchmarks to efficiency, reliability, and integration, reshaping how AI is ...
Logical Intelligence Achieves 76 Percent on Putnam Benchmark, Highlighting Shift Beyond Large Language Models to Language-free, Mathematically Grounded Models. Over the last decade, artificial ...
MLCommons has launched AILuminate, a benchmark designed to assess the safety of large language models and promote standardized AI safety measures.
Researchers from the University of Edinburgh and NVIDIA have introduced a new method that helps large language models reason ...
Z.ai released GLM-4.7 ahead of Christmas, marking the latest iteration of its GLM large language model family. As open-source models move beyond chat-based applications and into production ...
Artificial intelligence has traditionally advanced through automatic accuracy tests in tasks meant to approximate human knowledge. Carefully crafted benchmark tests such as The General Language ...
OpenAI today detailed o3, its new flagship large language model for reasoning tasks. The model’s introduction caps off a 12-day product announcement series that started with the launch of a new ...
China’s latest generation of open large language models has moved from catching up to actively challenging Western leaders on ...
Instead of a single, massive LLM, Nvidia's new 'orchestration' paradigm uses a small model to intelligently delegate tasks to a team of tools and specialized models.
Since April, Xiaomi has released a series of open-source foundation models covering language, multimodal and voice ...