
























Baseten Inc., a startup with a platform for running artificial intelligence inference workloads, is raising $1.5 billion in funding.
The Wall Street Journal reported today that Altimeter Capital, Conviction, Spark Capital, Sands Capital and Wellington Management are co-leading the deal. It’s unclear whether there are additional participants. Some of the investors are buying shares at an $11 billion valuation while the other backers’ term sheets specify a $13 billion valuation.
Setting up a cloud-based inference cluster involves a significant amount of work. Developers have to provision graphics cards, configure them, link them together and install a large number of software tools. Baseten provides a platform that automates the workflow. The software is available as a managed service and as a standalone application that companies can deploy in their public cloud environments.
Baseten’s platform is powered by three core modules the company calls inference engines. They optimize the performance of customers’ AI models and collect data about technical issues.
The first inference engine, BIS-LLM, is designed to power large language models with a mixture of experts architecture. A mixture of experts LLM comprises multiple neural networks that are each geared towards different tasks. BIS-LLM improves the efficiency of such models by optimizing their KV cache, a data structure that stores information necessary for inference. When a model’s token usage increases, BIS-LLM automatically provisions more hardware.
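To make the KV cache idea concrete, here is a minimal sketch in plain Python. It is not Baseten's implementation: the `KVCache` class, the `project` stand-in for learned projections and the two-dimensional vectors are all assumptions made for illustration. What it shows is that the keys and values of earlier tokens are stored once and reused, so each decoding step computes only one new key/value pair.

```python
import math

class KVCache:
    """Toy per-sequence store of attention keys and values (hypothetical)."""
    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)

def project(token_id, salt):
    # Stand-in for a learned projection: a deterministic 2-d vector.
    return [math.sin(token_id + salt), math.cos(token_id + salt)]

def attend(query, cache):
    # Softmax-weighted average of every cached value.
    scores = [sum(q * k for q, k in zip(query, key)) for key in cache.keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(cache.values[0])
    return [sum(w * v[i] for w, v in zip(weights, cache.values)) / total
            for i in range(dim)]

def decode_step(token_id, cache):
    # Only the new token's key and value are computed; older ones are reused.
    cache.append(project(token_id, 1.0), project(token_id, 2.0))
    return attend(project(token_id, 0.0), cache)

cache = KVCache()
outputs = [decode_step(t, cache) for t in [3, 1, 4, 1, 5]]
```

The cache grows by one entry per token, which is why its memory footprint becomes a target for optimization as prompts and outputs get longer.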
The second inference engine is called Engine-Builder-LLM. It’s optimized for dense LLMs, which are models that comprise a monolithic collection of artificial neurons rather than multiple neural networks. AI models usually generate output one token at a time. Engine-Builder-LLM uses a technology called lookahead decoding to generate multiple tokens at once, which speeds up processing.
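The multi-token idea can be sketched with a toy model. The code below is a simplified draft-and-verify loop that reuses previously seen token sequences as guesses; it captures the verification half of lookahead decoding but omits the parallel Jacobi-iteration step that real implementations use to produce those guesses. The `next_token` function, the `pool` dictionary and the `guess_len` parameter are hypothetical and not drawn from Engine-Builder-LLM.

```python
def next_token(context):
    """Toy deterministic 'model': the greedy next token depends on the last one."""
    return (context[-1] * 3 + 1) % 7

def greedy_decode(prompt, n_new):
    # Baseline: one model call per generated token.
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(next_token(seq))
    return seq

def ngram_decode(prompt, n_new, guess_len=3):
    seq = list(prompt)
    pool = {}  # token -> tokens that most recently followed it
    steps = 0
    target = len(prompt) + n_new
    while len(seq) < target:
        steps += 1  # one (conceptually parallel) verification pass
        guess = pool.get(seq[-1], [])[:guess_len]
        accepted = []
        context = list(seq)
        for g in guess:
            if g != next_token(context):
                break  # first mismatch ends the accepted prefix
            accepted.append(g)
            context.append(g)
        # The pass always yields one correct token after the accepted prefix.
        accepted.append(next_token(context))
        seq.extend(accepted)
        # Refresh the pool with the continuations observed so far.
        for i in range(len(seq) - 1):
            pool[seq[i]] = seq[i + 1:i + 1 + guess_len]
    return seq[:target], steps

baseline = greedy_decode([0], 12)
fast, steps = ngram_decode([0], 12)
```

Because every guessed token is checked against what the model would have produced anyway, the output matches plain greedy decoding; the saving comes from accepting several tokens per verification pass when the guesses are right.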
The third core inference engine, BEI, is geared towards simpler AI models. It can power embedding models, which turn raw data into a format that LLMs understand, as well as data classification and search models.
Baseten uses a software module called MCM to spread inference workloads across multiple public clouds. If one of the clouds experiences an outage, MCM reroutes prompts to the platforms that are still online. According to Baseten, the technology’s ability to switch providers is also handy when a company’s main public cloud has a shortage of graphics cards.
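The rerouting behavior can be pictured as an ordered failover loop. This is a minimal sketch under assumed names (`CloudUnavailable`, `make_backend`, `route` and the cloud labels are all hypothetical), not a description of how MCM is built; it only shows a prompt falling through to the next cloud when one is down or out of GPU capacity.

```python
class CloudUnavailable(Exception):
    """Raised by a backend that cannot serve the request (hypothetical)."""

def make_backend(name, healthy=True, has_gpus=True):
    # Returns a fake 'send prompt to this cloud' function for the demo.
    def send(prompt):
        if not healthy:
            raise CloudUnavailable(f"{name}: outage")
        if not has_gpus:
            raise CloudUnavailable(f"{name}: no GPU capacity")
        return f"[{name}] response to: {prompt}"
    return send

def route(prompt, backends):
    # Try each cloud in priority order and return the first success.
    errors = []
    for send in backends:
        try:
            return send(prompt)
        except CloudUnavailable as exc:
            errors.append(str(exc))
    raise RuntimeError("all clouds unavailable: " + "; ".join(errors))

backends = [
    make_backend("cloud-a", healthy=False),   # simulated outage
    make_backend("cloud-b", has_gpus=False),  # simulated GPU shortage
    make_backend("cloud-c"),
]
result = route("hello", backends)
```

A production router would also weigh latency, cost and health checks rather than relying on a fixed order.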
The platform provides out-of-the-box support for several dozen open-source AI models. Additionally, customers can deploy custom algorithms using a tool called Truss. It automates the task of packaging an LLM into a Baseten-compatible format.
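For a sense of what packaging involves, the sketch below follows the `load`/`predict` class layout described in Truss's public documentation, as best understood here; treat the exact conventions as an assumption and check the current docs before relying on them. The uppercase "model" is a placeholder, not a real LLM.

```python
class Model:
    """Plain-Python class in the load/predict shape a Truss package wraps."""

    def __init__(self, **kwargs):
        # Configuration arrives as keyword arguments; unused in this sketch.
        self._model = None

    def load(self):
        # Runs once at startup; a real package would load weights here.
        self._model = lambda text: text.upper()  # placeholder "model"

    def predict(self, model_input):
        # Runs per request with the parsed request body.
        return {"output": self._model(model_input["text"])}

model = Model()
model.load()
result = model.predict({"text": "hello"})
```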
Baseten can not only perform inference with custom LLMs but also train them. According to the company, its platform includes a backup feature that periodically saves copies of a neural network while it’s being trained. If a technical issue crops up, developers can restore the most recent backup copy instead of starting the training workflow from scratch.
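The backup-and-restore pattern described here is commonly called checkpointing. The following is a minimal sketch, assuming a JSON file as the checkpoint format and a fake weight update in place of real training; the function names and the `fail_at` parameter are hypothetical and exist only to simulate a crash.

```python
import json
import os
import tempfile

def save_checkpoint(path, step, weights):
    # Write to a temp file first so a crash never leaves a half-written file.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "weights": weights}, f)
    os.replace(tmp, path)

def load_checkpoint(path):
    # Start from scratch if no checkpoint exists yet.
    if not os.path.exists(path):
        return 0, [0.0, 0.0]
    with open(path) as f:
        state = json.load(f)
    return state["step"], state["weights"]

def train(path, total_steps, checkpoint_every, fail_at=None):
    step, weights = load_checkpoint(path)
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated crash")
        weights = [w + 0.5 for w in weights]  # stand-in for a gradient update
        step += 1
        if step % checkpoint_every == 0:
            save_checkpoint(path, step, weights)
    return step, weights

ckpt = os.path.join(tempfile.mkdtemp(), "model.json")
try:
    train(ckpt, total_steps=10, checkpoint_every=3, fail_at=7)
except RuntimeError:
    pass  # the run died; the last saved checkpoint survives on disk
resumed_from, _ = load_checkpoint(ckpt)
final_step, final_weights = train(ckpt, total_steps=10, checkpoint_every=3)
```

The second call to `train` picks up from the last saved step rather than step zero, so only the work done since that checkpoint is repeated.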
The funding comes less than six months after Baseten’s previous raise. That $300 million investment included contributions from Nvidia Corp. and CapitalG, Alphabet Inc.’s growth-stage startup investment arm.