Node, Index, Shard in Elasticsearch

2025-11-15 · via jdhao's digital space

Relationship between cluster, node, index, shard, and segment

Explanations of basic terminology:

An Elasticsearch cluster consists of one or more nodes, which can have different roles, for example data nodes, ML nodes, etc.
A node is a JVM instance that is running Elasticsearch.
An index is a collection of documents. An index can have multiple primary shards and replica shards.
A shard is placed on a node in the Elasticsearch cluster.
A shard is an Apache Lucene index.
A Lucene index consists of multiple segments (an internal structure used by Lucene).
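
The containment relationships above can be sketched as a small data model. This is only an illustration: the class and function names (Segment, Shard, shards_by_node) and the sample values are made up here and are not part of any Elasticsearch API.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    name: str        # hypothetical Lucene segment name, e.g. "_0"
    doc_count: int   # number of documents stored in this segment


@dataclass
class Shard:
    index: str       # index the shard belongs to
    number: int      # shard number within the index
    primary: bool    # True for a primary copy, False for a replica copy
    node: str        # name of the node hosting this shard copy
    segments: List[Segment] = field(default_factory=list)

    def doc_count(self) -> int:
        # A shard is a Lucene index, so its documents are spread over its segments.
        return sum(segment.doc_count for segment in self.segments)


def shards_by_node(shards: List[Shard]) -> Dict[str, List[Shard]]:
    """Group shard copies by the node that hosts them."""
    grouped = defaultdict(list)
    for shard in shards:
        grouped[shard.node].append(shard)
    return dict(grouped)


# Made-up layout: one index with two primaries and one replica on two nodes.
shards = [
    Shard("my-index", 0, True, "node-1", [Segment("_0", 10), Segment("_1", 5)]),
    Shard("my-index", 0, False, "node-2", [Segment("_0", 15)]),
    Shard("my-index", 1, True, "node-2"),
]
layout = shards_by_node(shards)
```

Here layout maps each node name to the shard copies it holds, which mirrors what the cat shards API reports per row.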

You can use the cat APIs to get information about nodes, indices, shards, and segments:

GET _cat/nodes?v=true

GET _cat/indices/my-index

GET _cat/shards/my-index

GET _cat/segments/my-index

# or you can also use the following api to get segments info about an index
GET my-index/_segments

ref:

https://discuss.elastic.co/t/relation-between-shards-and-nodes/104562
Elasticsearch node: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html
cat node API: https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-nodes.html
understanding segments in Elastic: https://stackoverflow.com/q/15426441/6064933

Shards and replicas (primary and replica)

An index can have multiple primary shards and replica shards. Primary shards accept both read and write requests, while replica shards only serve read requests (writes reach them through replication from the primary).

The cat shards API can be used to check the status of shards:

# v=true will show the column header
GET _cat/shards/my_index?v=true&h=index,shard,prirep,state,docs,unassigned.for,unassigned.reason&s=state
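
When scripting such a check, the cat APIs can also return JSON through the format=json query parameter. Below is a minimal sketch, assuming the response of GET _cat/shards/my_index?format=json&h=index,shard,prirep,state has already been fetched as text. The sample payload is made up for illustration, and count_by_state is a hypothetical helper, not something Elasticsearch provides.

```python
import json
from collections import Counter

# Made-up sample of a `_cat/shards?format=json` response restricted to the
# columns index,shard,prirep,state (cat APIs return the values as strings).
sample = """
[
  {"index": "my_index", "shard": "0", "prirep": "p", "state": "STARTED"},
  {"index": "my_index", "shard": "0", "prirep": "r", "state": "STARTED"},
  {"index": "my_index", "shard": "0", "prirep": "r", "state": "UNASSIGNED"}
]
"""


def count_by_state(cat_shards_json: str) -> Counter:
    """Count shard copies per (prirep, state) pair, e.g. ('r', 'UNASSIGNED')."""
    rows = json.loads(cat_shards_json)
    return Counter((row["prirep"], row["state"]) for row in rows)


counts = count_by_state(sample)
for (prirep, state), n in sorted(counts.items()):
    kind = "primary" if prirep == "p" else "replica"
    print(f"{kind:8} {state:12} {n}")
```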

Number of shards and number of replicas

For guidance on choosing a proper number of shards, refer to the official documentation.

Constraints on the number of replicas: a replica shard cannot be placed on the same node as its primary shard, and two replicas of the same primary cannot share a node either. This effectively means that the number of replicas must be less than or equal to num_node - 1. For example, with 3 nodes, if primary1 is on node 1, its replica shards can only go to node 2 and node 3. If you break this constraint and set the number of replicas to a larger value, the extra replicas cannot be assigned, and when you check the info of this index, you will see that its health status is yellow instead of green.

GET _cat/indices/my_index?v

If you check the shard info for this index (GET _cat/shards/my_index), you will see that some replicas are in the UNASSIGNED state.
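
The replica constraint described above boils down to simple arithmetic. The following is a rough sketch with a hypothetical helper, unassigned_replicas. It only models the "one copy of a shard per node" rule and ignores everything else that can block allocation, such as disk watermarks or allocation filtering.

```python
def unassigned_replicas(num_nodes: int, num_primaries: int, num_replicas: int) -> int:
    """Return how many replica copies cannot be placed on any node.

    A primary and all of its replicas must sit on different nodes, so at most
    num_nodes - 1 replicas per primary shard can be assigned.
    """
    if num_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    assignable_per_primary = min(num_replicas, num_nodes - 1)
    return num_primaries * (num_replicas - assignable_per_primary)


# 3 nodes, 1 primary, 2 replicas: every copy finds its own node.
fits = unassigned_replicas(num_nodes=3, num_primaries=1, num_replicas=2)

# 3 nodes, 1 primary, 3 replicas: one replica has nowhere to go (yellow index).
too_many = unassigned_replicas(num_nodes=3, num_primaries=1, num_replicas=3)
```

A non-zero result corresponds to the situation in which the index stays yellow and the cat shards API lists UNASSIGNED replicas.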

The number of shards is a static index setting and can only be set at index creation time. The number of replicas is a dynamic setting that can be changed for an index without interrupting search and indexing requests. You can set the number of shards and replicas using the create index API:

# number_of_shards is static, so drop the old index (and its data) before re-creating it
DELETE my_index

PUT my_index
{
  "settings": {
    "index.number_of_shards": "1",
    "index.number_of_replicas": "2"
  }
}

As explained, the number of replicas is a dynamic setting, so you can change its value after index creation with the update index settings API:

PUT my_index/_settings
{
  "settings": {
    "index.number_of_replicas": "1"
  }
}

When you decrease the number of replicas, Elasticsearch deletes the extra replicas. When you increase the number of replicas, Elasticsearch automatically copies the primary shards to suitable nodes. For some time, you will see that the index status is yellow. If you use the cat shards API, you will see that the state of the new replica shards is INITIALIZING. After some time, the state of these replica shards becomes STARTED, and the index status becomes green.
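
How the shard states relate to the index status can be sketched as follows. This is a simplified model rather than Elasticsearch's actual implementation: index_health is a hypothetical helper that takes (prirep, state) pairs as shown by the cat shards API, and it also covers the red status (a primary shard that is not active), which is not discussed above.

```python
from typing import Iterable, Tuple

# Shard copies in these states are active; RELOCATING copies still serve requests.
ACTIVE_STATES = {"STARTED", "RELOCATING"}


def index_health(shard_copies: Iterable[Tuple[str, str]]) -> str:
    """Derive 'green', 'yellow' or 'red' from (prirep, state) pairs."""
    copies = list(shard_copies)
    primaries_ok = all(state in ACTIVE_STATES for prirep, state in copies if prirep == "p")
    replicas_ok = all(state in ACTIVE_STATES for prirep, state in copies if prirep == "r")
    if not primaries_ok:
        return "red"      # at least one primary is not active
    if not replicas_ok:
        return "yellow"   # all primaries active, some replica is not
    return "green"        # every shard copy is active


# Right after increasing the replica count: the new replica is still copying data.
during_copy = index_health([("p", "STARTED"), ("r", "STARTED"), ("r", "INITIALIZING")])

# Once the copy has finished.
after_copy = index_health([("p", "STARTED"), ("r", "STARTED"), ("r", "STARTED")])
```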

ref:

Elastic shard and replica guide: https://www.elastic.co/search-labs/blog/elasticsearch-shards-and-replicas-guide

Shard write and read model

When we index documents into an index, the operation is first performed on the primary shard and then replicated to its replica shards. If you have a large number of documents to index, this is usually slower than writing to the primary shards only. So the official Elasticsearch documentation recommends setting the number of replicas to 0 for the initial large load. After indexing, you can set the number of replicas back to its original value, and Elasticsearch will then create the replica copies under the hood.
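
The bulk-load workflow above amounts to two settings updates around the load. Here is a minimal sketch that only builds the two requests and does not send them. bulk_load_plan is a hypothetical helper, and the body shape simply mirrors the PUT my_index/_settings example shown earlier; how the requests are sent (Kibana console, curl, a client library) is left open.

```python
import json
from typing import Tuple

Request = Tuple[str, str, str]  # (HTTP method, path, JSON body)


def replicas_request(index: str, replicas: int) -> Request:
    """Build an update index settings request that sets number_of_replicas."""
    body = {"settings": {"index.number_of_replicas": replicas}}
    return ("PUT", f"/{index}/_settings", json.dumps(body))


def bulk_load_plan(index: str, original_replicas: int) -> Tuple[Request, Request]:
    """Return the request to send before the load and the one to send after it."""
    before = replicas_request(index, 0)                   # write to primaries only
    after = replicas_request(index, original_replicas)    # restore the replicas
    return before, after


before, after = bulk_load_plan("my_index", original_replicas=2)
for method, path, body in (before, after):
    print(method, path, body)
```

Between the two requests the index has no redundancy, so the source data should remain available until the replicas are back and the index is green again.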

Having multiple replicas helps Elasticsearch prevent data loss and also lets it handle more search requests, because it can distribute read operations across the nodes holding copies of a shard. When Elasticsearch receives a search/read request, the request is routed to nodes that contain the relevant data; see the search shard routing documentation.

Explain why a shard is unassigned or assigned to a certain node

If you see that a shard is unassigned in the cat shards API output and want more detailed info, the cluster allocation explain API can explain why a shard is unassigned or why it is assigned to a certain node. The API: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-allocation-explain.html

GET _cluster/allocation/explain
{
  "index": "my_custom_index",
  "shard": 0,
  "primary": true
}

For the index parameter, it seems we cannot use an alias and have to use the actual index name. shard refers to the shard number. primary indicates whether to explain the primary shard (true) or a replica shard (false).

Note that when we want to explain an unassigned shard, we should not set current_node:

To explain an unassigned shard, omit this parameter.
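
The rule about current_node can be captured when building the request body. A small sketch, with build_explain_body as a hypothetical helper and "node-1" as a made-up node name:

```python
import json
from typing import Any, Dict, Optional


def build_explain_body(
    index: str,
    shard: int,
    primary: bool,
    current_node: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the body for GET _cluster/allocation/explain.

    current_node is only added when given; leave it out for unassigned shards.
    """
    body: Dict[str, Any] = {"index": index, "shard": shard, "primary": primary}
    if current_node is not None:
        body["current_node"] = current_node
    return body


# Unassigned replica of shard 0: no current_node in the body.
unassigned_body = build_explain_body("my_custom_index", 0, primary=False)

# Shard copy currently located on a specific (made-up) node.
assigned_body = build_explain_body("my_custom_index", 0, primary=True, current_node="node-1")

print(json.dumps(unassigned_body, indent=2))
```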

References

no shard available exception: https://stackoverflow.com/a/54019924/6064933
Elastic shards and replicas: https://stackoverflow.com/q/15694724/6064933
