If mHC scales the way early benchmarks suggest, it could reshape how we think about model capacity, compute budgets and the ...
Explore how Indian firms are training Large Language Models, overcoming challenges with data, capital, and innovative ...
ByteDance's Doubao AI team has open-sourced COMET, a Mixture of Experts (MoE) optimization framework that improves large language model (LLM) training efficiency while reducing costs. Already ...
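As background on the architecture COMET targets, a Mixture of Experts layer routes each token to a small top-k subset of expert networks, so only a fraction of the parameters are active per token; frameworks like COMET then try to overlap the resulting communication and compute. The sketch below is a generic top-k routing illustration under those assumptions, not COMET's own code; all sizes and weights are made up for the example.

```python
import numpy as np

# Generic top-k MoE routing sketch (illustrative only, not COMET's implementation).
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

gate_w = rng.normal(size=(d_model, n_experts))        # router (gating) weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ gate_w                                # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]      # ids of the k best experts
    sel = np.take_along_axis(logits, top, axis=-1)     # logits of selected experts
    w = np.exp(sel - sel.max(-1, keepdims=True))       # softmax over selected only
    w /= w.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                        # per-token dispatch
        for j in range(top_k):
            out[t] += w[t, j] * (x[t] @ experts[top[t, j]])
    return out

print(moe_layer(rng.normal(size=(4, d_model))).shape)  # (4, 64)
```

Because only `top_k` of the `n_experts` experts run per token, parameter count grows without a proportional increase in per-token compute; the engineering cost is the dispatch and gather traffic that MoE training frameworks optimize.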
Chinese artificial intelligence developer DeepSeek today open-sourced DeepSeek-V3, a new large language model with 671 billion parameters. The LLM can generate text, craft software code and perform ...
Users running a quantized 7B model on a laptop expect 40+ tokens per second. A 30B MoE model on a high-end mobile device ...
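A rough way to sanity-check such throughput figures is the memory-bandwidth bound on autoregressive decoding: each generated token requires streaming the model's weights once, so tokens/sec is at most bandwidth divided by the weight bytes read per token. A minimal sketch, where the bandwidth and quantization numbers are illustrative assumptions rather than measurements:

```python
# Back-of-envelope memory-bandwidth bound on decode speed.
# Assumption: decoding streams all active weights once per token,
# ignoring KV-cache traffic, compute time, and scheduling overhead.

def decode_tokens_per_sec(active_params_billions: float,
                          bits_per_weight: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound: tokens/sec = bandwidth / bytes of weights read per token."""
    weight_bytes = active_params_billions * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / weight_bytes

# Hypothetical laptop: 7B model at 4-bit, ~200 GB/s unified memory -> ~57 tok/s.
print(f"{decode_tokens_per_sec(7, 4, 200):.0f} tok/s")

# Hypothetical phone: 30B MoE with ~3B active params, 4-bit, ~60 GB/s -> ~40 tok/s.
print(f"{decode_tokens_per_sec(3, 4, 60):.0f} tok/s")
```

Under these assumed numbers, the 40+ tokens/sec expectation for a quantized 7B model is plausible, and MoE sparsity is what keeps a 30B-parameter model within reach of mobile memory bandwidth.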
The enterprise AI integrator unveils next-generation custom LLMs built on Moonshot AI’s Kimi K2 architecture, bringing trillion-parameter performance and domain-specific intelligence to private and ...
Sarvam AI launches two advanced LLMs, at 30B and 105B parameters, outperforming competitors on key benchmarks with a focus on Indian language support.