北航、人大和九坤投资共同撰写的论文 《Scaling Laws for Code: Every Programming Language Matters》 整理而成。 在代码大模型(Code LLMs)的预训练中,行业内长期存在一种惯性思维,即把所有编程语言的代码都视为同质化的文本数据,主要关注数据总量的堆叠。然而,现代软件开发本质上是多语言混合的,不同语言的语法特性、语料规模和应用场景差异巨大。
在代码大模型(Code LLMs)的预训练中,行业内长期存在一种惯性思维,即把所有编程语言的代码都视为同质化的文本数据,主要关注数据总量的堆叠。然而,现代软件开发本质上是多语言混合的,不同语言的语法特性、语料规模和应用场景差异巨大。如果忽略这些差异,笼统地应用通用的 Scaling Laws,往往会导致性能预测偏差和算力浪费。
You might be staring at your budget, wondering how you’re supposed to cover rent, debt, and everything else on $20–$25 an ...
Java will be 30 years old in 2025. This is a good time to look back, but also forward.
科技行者 on MSN
北航团队首次揭秘多语言编程的奥秘:为什么Python比Rust更“饿”数据?
这项由北京航空航天大学的杨健、国鑫、林静等研究者联合优矿公司和中国人民大学人工智能学院团队完成的突破性研究,发表于2025年12月的arXiv预印本(论文编号:2512.13472v1),是全球首次系统性探索多语言编程训练规律的重要成果。
Coding Dojo published data on the programming languages and frameworks that the top unicorns use, like WeWork, Juul, Airbnb, and SpaceX.
There is a common misconception that AI applications can be sufficiently tested and derisked by running a pilot in a ...
No, Microsoft is not rewriting Windows in Rust. The clarification comes after a LinkedIn post by a Microsoft Distinguished ...
Newer languages might soak up all the glory, but these die-hard languages have their place. Here are eight languages ...
Start 2026 strong with 25+ practical New Year resolution ideas for Indian students. Boost your academic performance, personal ...
Whether you're a scientist brainstorming research ideas or a CEO hoping to automate a task in human resources or finance, you'll find that artificial intelligence (AI) tools are becoming the ...
Understanding the core principles of computer programming is the first step to writing effective code. Learning about algorithms and data structures helps you solve problems more efficiently. Writing ...
一些您可能无法访问的结果已被隐去。
显示无法访问的结果