Understanding Embeddings easily.
Daniel Odii · 2026-05-23 · via DEV Community

I've been hearing about embeddings for a while now, and even as someone who uses LLMs daily and integrates them into smart systems, I wasn't really sure what exactly embeddings were or how they connected with everything else.

In this writeup, I'll be unpacking some of the things I've been able to learn about embeddings — what they are and how to use them as a software developer/engineer.


Turning Meanings into Coordinates

Think of embeddings as turning meanings into coordinates. LLMs are not built to — and cannot — understand words the same way humans do, so they convert text into lists of numbers that represent meaning.

Take the word "dog" for example. A model can't work with the raw word directly, so it first converts it into a list of numbers (a vector):

"dog" → [0.21, -0.88, 0.44, ...]


What the Numbers Are NOT Based On

The number of values in an embedding has nothing to do with:

  • Word length
  • Number of letters
  • Number of characters

This is because embeddings don't encode spelling — they encode meaning and features. The embedding size is determined by:

  • The embedding model's architecture
  • How much semantic information the model is designed to represent

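So the embedding dimension is a fixed property of the embedding model: every input, short or long, maps to a vector of the same length. Larger models often use more dimensions, but the dimension is a design choice and is not strictly proportional to model size.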

Key Properties

  • Similar meanings end up close together
  • Different meanings end up farther apart

You could say that embeddings are basically "a mathematical location for meaning."


Real-World Analogy

Imagine a large city map:

  • Tailors live in one district
  • Doctors live in another district
  • Developers live in a separate district

Now replace people with words, sentences, documents, or even images. That's basically embeddings!

A few more examples to drive it home:

Pair | Relationship
"JavaScript" and "React" | Close together
"Needle and thread" and "fashion design" | Close together
"Dog" and "cat" | Close together
"Bank" (money) and "banana" | Far apart
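
To make the map analogy concrete, here is a minimal sketch that places a few of these words on a 2-D "map" and looks up each word's nearest neighbour. The coordinates are invented by hand purely for illustration; they are not the output of any real embedding model, and real embeddings have hundreds or thousands of dimensions rather than two.

```python
import math

# Hypothetical, hand-picked 2-D "embeddings" (invented for illustration only).
toy_embeddings = {
    "dog": (0.90, 0.80),
    "cat": (0.85, 0.75),
    "JavaScript": (-0.70, 0.60),
    "React": (-0.65, 0.70),
    "bank (money)": (0.10, -0.90),
    "banana": (-0.90, -0.20),
}


def distance(a, b):
    """Straight-line distance between the coordinates of two words."""
    return math.dist(toy_embeddings[a], toy_embeddings[b])


def nearest(word):
    """Return the other word whose coordinates are closest to `word`."""
    others = (w for w in toy_embeddings if w != word)
    return min(others, key=lambda w: distance(word, w))


print(nearest("dog"))
print(nearest("React"))
print(distance("dog", "cat") < distance("bank (money)", "banana"))
```

With these made-up coordinates, "dog" lands next to "cat" and "React" next to "JavaScript", while "bank (money)" and "banana" sit much farther apart. A real model learns positions like these from data instead of having them assigned by hand.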

Why Do Embeddings Matter?

Embeddings are what make AI able to:

  • Search semantically — find results based on meaning, not just keywords
  • Recommend similar content
  • Retrieve relevant context
  • Power Retrieval-Augmented Generation (RAG) systems
  • Compare meanings instead of exact words

Case Study: Semantic Search

Without embeddings, AI search would behave like old-school keyword search — returning results based on exact phrase matching.

With embeddings, a query like "How to fix app crashing" would also surface results like:

  • "Application keeps closing"
  • "React Native app freezes"
  • "Unexpected mobile app shutdown"

...because the meanings are close, even if the words are different.
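
Here is a rough sketch of that difference. The `toy_embed` function and its `CONCEPTS` table are hypothetical stand-ins for a real embedding model: words are assigned to concept axes by hand, which is something a real model learns from data. The example only shows the shape of the comparison, not how a production system computes it.

```python
import math

# Hypothetical stand-in for an embedding model: each known word is mapped to
# a concept axis by hand (0 = "app", 1 = "stops working", 2 = "food").
CONCEPTS = {
    "app": 0, "application": 0, "mobile": 0,
    "crashing": 1, "closing": 1, "freezes": 1, "shutdown": 1,
    "banana": 2, "bread": 2, "recipe": 2,
}


def toy_embed(text):
    """Count how many words of `text` fall on each concept axis."""
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        axis = CONCEPTS.get(word)
        if axis is not None:
            vec[axis] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_match(query, doc):
    """Old-school search: the document must contain the exact query phrase."""
    return query.lower() in doc.lower()


query = "How to fix app crashing"
docs = [
    "Application keeps closing",
    "React Native app freezes",
    "Unexpected mobile app shutdown",
    "Banana bread recipe",
]

for doc in docs:
    score = cosine(toy_embed(query), toy_embed(doc))
    print(f"{doc!r}: keyword={keyword_match(query, doc)}, similarity={score:.2f}")
```

Exact phrase matching finds none of the documents, while the similarity score ranks the three app-related documents well above the unrelated one.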


What Can Be Embedded?

Almost anything:

  • Words — e.g., "King"
  • Sentences — e.g., "How to build a React app"
  • Entire documents — e.g., PDFs, docs, chats, codebases, etc.
  • Images, which is the general idea behind reverse image search tools

What Happens Behind the Scenes?

The system compares embeddings using similarity/distance metrics:

  • Cosine similarity — measures how similar two embeddings are based on their direction, regardless of size. If two vectors point almost the same way, they likely have similar meaning.
  • Euclidean distance — measures the actual straight-line distance between two embeddings in vector space. A smaller distance means the meanings are closer together.
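
Both metrics fit in a few lines of plain Python. This is a minimal sketch using the standard textbook definitions; the example vectors are made up, and the functions assume non-zero vectors of equal length.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


v = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]   # same direction as v, twice the length
other = [-3.0, 0.0, 1.0]   # at a right angle to v

print(cosine_similarity(v, scaled))   # approximately 1: same direction
print(euclidean_distance(v, scaled))  # greater than 0: the points differ
print(cosine_similarity(v, other))    # 0: unrelated directions
```

Scaling a vector leaves its cosine similarity to the original at about 1 but moves it away in Euclidean terms. When embeddings are normalized to unit length, the two metrics rank neighbours in the same order.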

Applying Embeddings in RAG

Let's look at how embeddings fit into a RAG (Retrieval-Augmented Generation) pipeline. Here's an example: building an enhanced search engine for a company website.

Step 1 → Convert documents into embeddings
         (e.g., PDFs, notes, product catalogs, support docs)

Step 2 → Store them in a vector database
         (e.g., Pinecone, Weaviate, Chroma, PGVector)

Step 3 → A user asks: "How do suppliers onboard?"

Step 4 → The question is converted into an embedding too

Step 5 → The system searches for nearby embeddings
         (semantically similar documents)

Step 6 → Relevant chunks are sent to the LLM


This is essentially how most "Chat with your docs" implementations work.
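
The six steps can be sketched end to end in plain Python. Everything named here is an assumption made for illustration: `toy_embed` is a crude word-count stand-in for a real embedding model, `InMemoryVectorStore` stands in for a real vector database, the sample documents are invented, and the final LLM call is left out so the sketch stops at building the prompt.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]

# Hypothetical vocabulary for a word-count "embedder" (not a real model).
VOCAB = ["supplier", "onboard", "invoice", "refund", "shipping"]


def toy_embed(text: str) -> Vector:
    """Count words starting with each vocabulary term (crude prefix match)."""
    words = text.lower().replace("?", "").replace(".", "").split()
    return [float(sum(w.startswith(term) for w in words)) for term in VOCAB]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Stand-in for a vector database: a list of (embedding, chunk) pairs."""

    def __init__(self, embed: Callable[[str], Vector]) -> None:
        self.embed = embed
        self.items: List[Tuple[Vector, str]] = []

    def add(self, chunk: str) -> None:
        # Steps 1 and 2: embed the chunk and store it.
        self.items.append((self.embed(chunk), chunk))

    def search(self, question: str, k: int = 2) -> List[str]:
        # Steps 4 and 5: embed the question, then rank stored chunks by similarity.
        q = self.embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


store = InMemoryVectorStore(toy_embed)
for chunk in [
    "New suppliers complete onboarding by submitting tax forms.",
    "Refunds are issued within five business days.",
    "Shipping rates depend on package weight.",
]:
    store.add(chunk)

question = "How do suppliers onboard?"  # Step 3
context = store.search(question, k=1)
# Step 6: the retrieved chunks become context in the prompt sent to the LLM.
prompt = "Answer using only this context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

In a real system, the embedder would be one of the models listed later in this post and the store would be a proper vector database, but the flow of data stays the same.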


A Common Misconception

Some people think embeddings store knowledge — but that's not quite right.

Embeddings store:

  • Semantic relationships
  • Meaning patterns

The actual reasoning still happens in the LLM. Embeddings mainly help the model find relevant information, not process it.


Embedding Models

Open-Source / Free

These can be downloaded, run locally, fine-tuned, and used without API costs:

Model | Notes
BGE Embeddings | Strong general-purpose embeddings
E5 Embeddings | Great for retrieval tasks
Sentence Transformers | Very popular for semantic search
Hugging Face models | Wide variety available

Closed / Paid APIs

These are accessed through APIs and are typically billed per token or request:

Provider | Notes
OpenAI Embeddings | Widely used, easy to integrate
Cohere Embeddings | Strong multilingual support
Voyage AI Embeddings | Optimized for retrieval

If you've read to this point — congratulations, you're already on your way to becoming a pro RAG engineer. (Just kidding.)

Thanks for reading!