Highest Random Weight in Elixir


2026-05-21 · via Hacker News

Consistent hashing is a common building block for distributed Elixir. It enables low-complexity, high-value design patterns such as a distributed rate limiter or cache. I’ve written about it before.

The most common way of assigning keys to nodes, ensuring that any node participating in the cluster can figure out which node owns the given key, is Discord’s ExHashRing. This is an incredibly battle-tested and reliable library with excellent performance characteristics, and I’ve only had good experiences with it.

That said, it does have a downside. You have to start and manage the ring processes. It’s not a huge downside, you can give them global names and it’s trivial to look them up, but you still want them set up under your supervision tree and they are stateful persistent things that hang around. That state has to be managed. It’s not a big deal at all, but when I found a stateless alternative it did immediately catch my attention.

Rendezvous hashing

As described by the Wikipedia page, rendezvous hashing (also called HRW, or Highest Random Weight) is both much simpler and more general than consistent hashing. In practice, you can use it very much like you would ExHashRing.

ExHashRing example.

alias ExHashRing.Ring

{:ok, ring} = Ring.start_link()
Ring.add_nodes(ring, ["a", "b", "c"])

Ring.find_node(ring, "key1")
=> "b"

HRW example.

HRW.owner("key1", ["a", "b", "c"])
=> "b"

That’s it. No stateful process, no setup. Just pure functional programming with inputs and outputs. Consistent across multiple machines. Avoids unnecessary drift when changing the list of nodes. You can see why it caught my eye!

There’s a downside, of course. HRW.owner runs in linear time, O(n) in the number of nodes, so it doesn’t do well with larger lists of nodes. That’s definitely something to take into account when considering using it. But to be honest, looking back at the times I’ve used ExHashRing, I’ve never had more than ~14 nodes to worry about. Here’s a comparison of how each algorithm does on my machine for 14 nodes.

Name                                ips        average  deviation         median         99th %
ExHashRing.Ring.find_node        2.67 M      375.20 ns  ±1375.85%         334 ns         500 ns
HRW.owner                        2.39 M      418.13 ns  ±1317.18%         375 ns         541 ns

Comparison:
ExHashRing.Ring.find_node        2.67 M
HRW.owner                        2.39 M - 1.11x slower +42.93 ns

ExHashRing is extremely fast, and stays fast as the number of nodes grows. But at a smaller number of nodes, even on a fairly hot path, there’s not much difference here. You’re free to pick whichever one you think reads better.

Basic HRW algorithm

Let’s dig a bit deeper into rendezvous hashing. The basic implementation is actually incredibly small. What you want to do is apply a scoring function to the key together with each of the nodes separately, and then return the node with the highest score. Highest Random Weight. For a scoring function you can use any fast hashing function really. :erlang.phash2 is an obvious candidate in the BEAM ecosystem.

Here’s what that looks like.

defmodule HRW do
  def owner(key, nodes) do
    Enum.max_by(nodes, fn node ->
      :erlang.phash2({key, node})
    end)
  end
end

It’s pretty ingenious!
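To make the "avoids unnecessary drift" property concrete, here is a small sketch that reuses the same scoring approach as the basic implementation above. The module name HRWSketch and the helper moved_keys are made up for this illustration and are not part of the hrw library. It removes one node and checks that the only keys that change owner are the ones the removed node used to own.

```elixir
defmodule HRWSketch do
  # Same scoring approach as the basic implementation: highest hash wins.
  def owner(key, nodes) do
    Enum.max_by(nodes, fn node -> :erlang.phash2({key, node}) end)
  end

  # Hypothetical helper: the keys whose owner differs between two node lists.
  def moved_keys(keys, before_nodes, after_nodes) do
    Enum.filter(keys, fn key ->
      owner(key, before_nodes) != owner(key, after_nodes)
    end)
  end
end

keys = Enum.map(1..1_000, &"key-#{&1}")
before_nodes = ["a", "b", "c", "d"]
after_nodes = ["a", "b", "d"]

moved = HRWSketch.moved_keys(keys, before_nodes, after_nodes)

# Every key that moved was previously owned by the removed node "c".
only_c_moved? =
  Enum.all?(moved, fn key -> HRWSketch.owner(key, before_nodes) == "c" end)

IO.inspect(only_c_moved?)
```

This holds because a key whose top-scoring node is still in the list keeps the same winner: removing a node never changes the scores of the others.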

Linear growth

Just to demonstrate how that affects performance as the number of nodes grows, here’s a benchmark run with 10K nodes. HRW.owner is about 4,200x slower than ExHashRing. Although, to put things into perspective, it’s still taking only ~2 ms on my machine. Depending on your use case, that might actually be just fine. It’s hard to beat the convenience of a pure function.

##### With input D: 10_000 #####
Name                                ips        average  deviation         median         99th %
ExHashRing.Ring.find_node        1.91 M     0.00052 ms  ±1515.88%     0.00046 ms     0.00063 ms
HRW.owner                     0.00046 M        2.20 ms     ±5.29%        2.17 ms        2.62 ms

Comparison:
ExHashRing.Ring.find_node        1.91 M
HRW.owner                     0.00046 M - 4204.94x slower +2.20 ms

But let’s see if we can do better.

HRW skeleton

Our basic HRW implementation, although actually quite fast, doesn’t behave well as the number of nodes grows. This is because, for every lookup, it has to hash the key against every node. That same Wikipedia page describes a way around that: arrange the nodes into an efficient data structure, which brings the complexity of owner down to O(log n).

At a very (very) high level what we want to do is sort the list of nodes and then chunk them into clusters. Each cluster gets an address and instead of hashing the key against every node, we now just need to calculate the address of the cluster, and then we can hash the key against the nodes inside that cluster to find the correct one. This means significantly less effort, bringing us to a much nicer logarithmic complexity.

Using it looks something like this.

skeleton = HRW.build(nodes)
HRW.owner(key, skeleton)
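To give a feel for the cluster idea, here is a deliberately simplified two-level sketch. It is not the hrw library’s HRW.Skeleton and not the full hierarchical skeleton from the Wikipedia page: it picks the cluster address with a single ranged hash and then runs plain HRW inside that cluster. The module name SkeletonSketch, the cluster size of 4, and the tuple layout are all assumptions made for the illustration.

```elixir
defmodule SkeletonSketch do
  # Assumed cluster size, chosen only for the example.
  @cluster_size 4

  # Sort the nodes, chunk them into clusters, and store the clusters in a
  # tuple so that a cluster can be fetched by index in constant time.
  # Expects a non-empty list of nodes.
  def build(nodes) do
    nodes
    |> Enum.sort()
    |> Enum.chunk_every(@cluster_size)
    |> List.to_tuple()
  end

  def owner(key, clusters) do
    # Cluster address: a ranged hash of the key, in 0..tuple_size - 1.
    index = :erlang.phash2(key, tuple_size(clusters))

    # Plain HRW, but only against the few nodes in the chosen cluster.
    clusters
    |> elem(index)
    |> Enum.max_by(fn node -> :erlang.phash2({key, node}) end)
  end
end

nodes = Enum.map(1..100, &"node#{&1}")
skeleton = SkeletonSketch.build(nodes)
IO.inspect(SkeletonSketch.owner("key1", skeleton))
```

Each lookup here hashes the key once for the address and at most four more times inside the cluster, regardless of how many nodes there are. The price is that the cluster address depends on the number of clusters, so changing the node list can remap many keys.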

Running the same benchmark as above, but with the skeleton created in advance, just like we do for ExHashRing, this is what we get.

##### With input D: 10_000 #####
Name                                ips        average  deviation         median         99th %
ExHashRing.Ring.find_node        2.17 M     0.00046 ms  ±1791.93%     0.00042 ms     0.00058 ms
HRW.owner (skeleton)             0.71 M     0.00141 ms   ±634.18%     0.00138 ms     0.00183 ms
HRW.owner                     0.00047 M        2.13 ms     ±5.03%        2.10 ms        2.53 ms

Comparison:
ExHashRing.Ring.find_node        2.17 M
HRW.owner (skeleton)             0.71 M - 3.06x slower +0.00095 ms
HRW.owner                     0.00047 M - 4615.43x slower +2.13 ms

We’ve gone from ~2 ms per lookup to ~1.4 µs, only ~3x slower than ExHashRing, with no NIFs and no stateful processes to start up. We do have a struct we have to pass around now, and adding and removing nodes is no longer a stable operation. Adding a node pushes everything that comes after in the sorted list one slot over. I guess nothing in life is free. Still, this is an interesting tradeoff for a lot of use cases.
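The "one slot over" effect is easy to see with a toy example. This sketch assumes clusters are formed by chunking a sorted node list, as described at a high level above; the cluster size of 2 and the node names are made up, and the library’s actual layout may differ.

```elixir
cluster_size = 2
before_nodes = ["a", "b", "d", "e", "f", "g"]
after_nodes = ["c" | before_nodes]

# Sort, then chunk into fixed-size clusters.
cluster = fn nodes ->
  nodes |> Enum.sort() |> Enum.chunk_every(cluster_size)
end

before_clusters = cluster.(before_nodes)
# [["a", "b"], ["d", "e"], ["f", "g"]]

after_clusters = cluster.(after_nodes)
# [["a", "b"], ["c", "d"], ["e", "f"], ["g"]]

IO.inspect({before_clusters, after_clusters})
```

Adding "c" leaves the first cluster alone but changes the membership of every cluster after it, which is why keys can move between nodes that were never added or removed.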

Distribution

The other thing you probably want to know about a mechanism for distributing work, keys, or load across a set of nodes is how well it distributes. It wouldn’t be very useful if every key mapped to the same node. Here’s a little sample script that demonstrates the distribution.

defmodule Distribution do
  def run do
    keys = Enum.map(1..100_000, fn i -> "key-#{i}" end)

    for n <- [10, 100, 1000] do
      nodes = Enum.map(1..n, &"node#{&1}")
      ideal = div(length(keys), n)

      counts =
        keys
        |> Enum.map(&HRW.owner(&1, nodes))
        |> Enum.frequencies()
        |> Map.values()

      min_c = Enum.min(counts)
      max_c = Enum.max(counts)
      avg = Enum.sum(counts) / length(counts)
      stddev = :math.sqrt(Enum.sum(Enum.map(counts, fn c -> (c - avg) ** 2 end)) / length(counts))

      IO.puts("#{n} nodes, #{length(keys)} keys (ideal #{ideal} per node):")
      IO.puts("  min: #{min_c}  max: #{max_c}  stddev: #{Float.round(stddev, 1)}  (#{Float.round(stddev/avg*100, 2)}% of mean)")
    end
  end
end

Distribution.run()

I extended that to add HRW with MurmurHash3, HRW with skeleton, and ExHashRing, for comparison.

  10 nodes, 100000 keys (ideal 10000 per node):
    phash2 (HRW)           min: 9691  max: 10639  stddev: 249.9  (2.5% of mean)
    murmur3 x86_32 (HRW)   min: 9859  max: 10192  stddev: 112.2  (1.12% of mean)
    murmur3 x64_128 (HRW)  min: 9864  max: 10170  stddev: 98.1   (0.98% of mean)
    HRW.Skeleton           min: 9691  max: 10639  stddev: 249.9  (2.5% of mean)
    ExHashRing             min: 9526  max: 10513  stddev: 338.5  (3.38% of mean)

  100 nodes, 100000 keys (ideal 1000 per node):
    phash2 (HRW)           min: 920   max: 1075   stddev: 29.7   (2.97% of mean)
    murmur3 x86_32 (HRW)   min: 934   max: 1059   stddev: 27.0   (2.7% of mean)
    murmur3 x64_128 (HRW)  min: 902   max: 1072   stddev: 29.2   (2.92% of mean)
    HRW.Skeleton           min: 877   max: 1124   stddev: 46.6   (4.66% of mean)
    ExHashRing             min: 105   max: 1229   stddev: 279.7  (27.97% of mean)

  1000 nodes, 100000 keys (ideal 100 per node):
    phash2 (HRW)           min: 69    max: 132    stddev: 9.9    (9.91% of mean)
    murmur3 x86_32 (HRW)   min: 72    max: 132    stddev: 9.6    (9.65% of mean)
    murmur3 x64_128 (HRW)  min: 67    max: 144    stddev: 9.8    (9.79% of mean)
    HRW.Skeleton           min: 72    max: 141    stddev: 9.9    (9.85% of mean)
    ExHashRing             min: 0     max: 147    stddev: 31.4   (31.42% of mean)

As you can see, we’re doing just fine with :erlang.phash2. Murmur3 is maybe slightly better at smaller node counts, but that’s not the big takeaway from here. It’s that ExHashRing is really struggling at larger node counts on the default settings. The solution is to add more vnodes, but that was unexpected to me!

Announcing HRW, the library

You’re very welcome to try out the hrw library on hex.pm, or take a look at the GitHub repository at https://github.com/joladev/hrw. For a very large number of nodes, you’ll want to use ExHashRing or HRW.Skeleton. For anything else, why not stick with plain HRW.owner?

The library comes with additional strategies not described here, like HRW.Weighted, which lets you assign more key space to specific nodes (useful for heterogeneous clusters where some machines are bigger), and HRW.Bounded, which gives you greater precision in how keys are distributed when you know the keys up front.
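For a rough idea of how weighting can work, here is a sketch of the logarithmic scoring commonly used for weighted rendezvous hashing: map each hash to a number h strictly between 0 and 1 and score the node as -weight / ln(h). This is not the hrw library’s HRW.Weighted, whose API and internals may well differ. The module name WeightedSketch and the {name, weight} tuple format are assumptions for the illustration.

```elixir
defmodule WeightedSketch do
  # :erlang.phash2/1 returns integers in 0..2^27 - 1.
  @range 134_217_728

  # nodes is a list of {name, weight} tuples, with weight > 0.
  def owner(key, nodes) do
    {name, _weight} =
      Enum.max_by(nodes, fn {name, weight} ->
        # Map the hash into the open interval (0, 1).
        h = (:erlang.phash2({key, name}) + 1) / (@range + 1)

        # :math.log(h) is negative, so the score is positive and scales
        # linearly with the weight.
        -weight / :math.log(h)
      end)

    name
  end
end

keys = Enum.map(1..10_000, &"key-#{&1}")
weighted_nodes = [{"big", 3}, {"small", 1}]

counts =
  keys
  |> Enum.map(&WeightedSketch.owner(&1, weighted_nodes))
  |> Enum.frequencies()

IO.inspect(counts)
```

With these weights, "big" should end up owning roughly three times as many keys as "small", and the function stays pure: same inputs, same owner, on any machine.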

Let me know how you find it.
