Introducing the First Hallucination-Free LLM
Edo Liberty · 2024-04-01 · via Pinecone

While Pinecone is best known for the vector database that helps reduce hallucinations through Retrieval-Augmented Generation (RAG), we’re also investing in finding other ways to reduce hallucinations. Today, we’re excited to announce a breakthrough in our research: the first-ever LLM that never hallucinates. Ever.

It’s called Luna, and we will open-source the model eventually, but for now, due to the far-reaching implications of an AI model that never hallucinates, we’re only sharing the model’s source and weights with vetted institutions.

The motivation: LLMs hallucinate without access to company data

Hallucinations are the predominant reason why most AI applications never reach production. While LLMs answer most questions about public information, they don’t have sufficient knowledge to answer questions that require access to private data. While this is already being addressed with RAG — using a vector database to retrieve and feed relevant context to the LLM — we wondered if there was an even easier way.

Our novel approach targets the root issue causing all other LLMs to hallucinate: They don’t know the limits of their knowledge, so they often fail to admit when they don’t know the answer. And so they make something up. And therein lies the key insight: A model will never hallucinate if it always admits what it does not know.

O Light Eternal, in Thyself contained!
Thou only know Thyself, and in Thyself
Both known and knowing, smile on Thyself!

How it works: Information-free training

The result of many months of research — conducted in a previously undisclosed satellite Pinecone office in Bowling Green, Kentucky — and many millions of dollars spent on GPUs is a 122B-parameter AI model designed to address hallucinations without access to domain-specific knowledge.

The model was developed with a novel technique we call information-free training. Just as AlphaZero made history [1] by becoming the best chess engine in the world merely by playing itself, without knowledge of historical games, our model does the same for factual question-answering tasks. Rather than being trained on public, semi-public, accidentally public, and questionably public data, the model was trained by endlessly asking itself questions and measuring the resulting answer quality. The technique also draws on ideas from Chang et al. [3] and other work on zero-shot learning.

Our scientists noticed a strong correlation between trying to answer questions factually and hallucination. We define the assumed knowledge factor (AKF) as the confidence level set by the model when it forms factual content. High levels of AKF indicate high confidence that factual sentences contain correct information. Low AKF makes the model more unsure about its answers’ factual contents. Note that AKF correlates positively with hallucinations.

LLM Hallucinations vs Assumed Knowledge Factor

IMAGE 1: Rate of hallucinations when training Luna as a function of AKF.

The key insight with training Luna is to consider the other extreme value of AKF. That is, what happens when you set AKF to zero?

Zero hallucinations at low range of Assumed Knowledge Factor

IMAGE 2: Rate of hallucinations when training Luna as a function of AKF, over the low range of the AKF scale.

Amazingly enough, slowly adjusting AKF all the way to zero while training Luna reduced hallucinations to precisely 0%. To our knowledge, this is the first LLM to achieve this feat.
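The relationship between AKF and hallucinations can be illustrated with a toy model. The sketch below is not Luna's training code. It assumes, purely for illustration, that AKF lies in [0, 1], that the model asserts a factual answer with probability equal to AKF and otherwise replies "I don't know," and that an asserted answer is wrong at a fixed rate. The names `hallucination_rate`, `sweep`, and `error_rate` are hypothetical.

```python
def hallucination_rate(akf: float, error_rate: float = 0.3) -> float:
    """Expected fraction of answers that are asserted and wrong.

    Toy assumptions (not from the article): the model asserts a factual
    answer with probability `akf` and abstains otherwise; an asserted
    answer is wrong with probability `error_rate`.
    """
    if not 0.0 <= akf <= 1.0:
        raise ValueError("akf must be in [0, 1]")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be in [0, 1]")
    return akf * error_rate


def sweep(steps: int = 5, error_rate: float = 0.3) -> list[tuple[float, float]]:
    """Lower AKF from 1.0 to 0.0 in equal steps, recording the rate at each."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    values = [1.0 - i / steps for i in range(steps + 1)]
    return [(akf, hallucination_rate(akf, error_rate)) for akf in values]


if __name__ == "__main__":
    for akf, rate in sweep():
        print(f"AKF={akf:.2f}  hallucination rate={rate:.3f}")
```

Under these assumptions the rate falls linearly with AKF and reaches zero only when the model never asserts anything, which is the extreme the text describes.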

EQUATION (image not reproduced): Optimizing AKF for minimal hallucination of the AI model

Based on our experiments, the equation above gives the best-performing adjustment schedule for AKF (denoted by Zeta). Here, t is the training epoch index, and the values of X are only loosely defined to have some relation to the factuality and correctness of the output. Note that conditional probabilities over loosely defined variables make training much more complicated and compute-intensive. We will elaborate on these technical difficulties and consequent solutions in a future technical report.

Performance: Zero hallucinations... at a cost

Luna is not (yet) the best model in the world on all fronts. Achieving zero hallucinations comes at a steep price of significantly diminished performance on other tasks. When reviewing results, we found Luna tends to answer pretty much all questions with some version of “I don’t know.” Therefore, the results are relatively poor on coding (0%) and task completion (0%), as well as usefulness (0%).

While this might diminish the magnitude of the achievements, one must remember that Luna achieved these results without access to any information.

It is not clear whether these results can be improved.
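The trade-off in this section can be reproduced with a stand-in for Luna. The sketch below assumes a hypothetical model, `always_abstain`, that answers every question with "I don't know," and a hypothetical scorer, `evaluate`, that counts a wrong non-abstaining answer as a hallucination and an exact match as useful. Neither is Pinecone's actual evaluation harness, and the question set is made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

ABSTAIN = "I don't know."


def always_abstain(question: str) -> str:
    """Hypothetical stand-in for Luna: abstain on every question."""
    return ABSTAIN


@dataclass
class Scores:
    hallucination: float  # fraction of answers that are asserted and wrong
    usefulness: float     # fraction of answers that match the expected answer


def evaluate(model: Callable[[str], str],
             qa_pairs: Sequence[tuple[str, str]]) -> Scores:
    """Score a model on (question, expected answer) pairs."""
    if not qa_pairs:
        raise ValueError("qa_pairs must not be empty")
    wrong = correct = 0
    for question, expected in qa_pairs:
        answer = model(question)
        if answer == ABSTAIN:
            continue  # an abstention is neither a hallucination nor useful
        if answer == expected:
            correct += 1
        else:
            wrong += 1
    total = len(qa_pairs)
    return Scores(hallucination=wrong / total, usefulness=correct / total)


# Illustrative questions only, not a real benchmark.
QA = [("What is 2 + 2?", "4"), ("What is the capital of France?", "Paris")]

if __name__ == "__main__":
    print(evaluate(always_abstain, QA))
```

A model that abstains on everything scores zero on both measures, so a zero hallucination rate on its own says nothing about whether the model is useful.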

Latency vs memory for different LLM architectures

IMAGE 3: Results from the Hugging Face LLM-Perf Leaderboard [2]. The results clearly show how different models perform on latency vs. memory.

IMAGE 4: Interacting with Luna, the hallucination-free AI model, in a chatbot.

Future Research

Pinecone is heavily invested in AI research as a whole and knowledgeable AI specifically. We’re deeply committed to advancing the state of the art in this field, and we’re hiring.

That said, we will probably halt further research on information-free training. If you want to reliably improve the quality, performance, and commercial viability of your AI applications, you can pair any other LLM of your choice with the Pinecone vector database.


References

[1] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

[2] LLM-Perf Leaderboard

[3] Ming-Wei Chang, Lev Ratinov, Dan Roth and Vivek Srikumar: Importance of Semantic Representation: Dataless Classification