GitHub - expanso-io/log-simulators: Realistic log generators for testing data pipelines at volume - web, IoT, syslog, Windows, Cisco ASA, CEF/LEEF, JSON app, cloud audit, Kubernetes, PostgreSQL. Requires only uv.
TheIronYuppi · 2026-06-14 · via Show HN

Realistic log generators for testing data pipelines at volume. Ten simulators covering the device types that matter for SIEM and observability pipelines — each one a single command that needs only uv.

uvx --from git+https://github.com/expanso-io/log-simulators logsim-web --rate 100

No clone, no install, no Docker. Pipe the output anywhere — a file, a TCP/UDP collector, or straight into an Expanso Edge pipeline.

The simulators

| Tool | Generates | Demo scenario |
|---|---|---|
| logsim-web | Apache/nginx access + error logs (NCSA combined/common/JSON), session-coherent visitors | error-storm: recurring 5xx spikes |
| logsim-iot | IoT sensor telemetry NDJSON: temperature, humidity, pressure, vibration, voltage with drift + diurnal cycles | sensor-fault: spikes, stuck values, dropouts |
| logsim-syslog | RFC 3164 and RFC 5424 syslog with realistic facility/severity mix | auth-burst: failed-login floods |
| logsim-windows | Windows Security Event XML (4624/4625/4688/4672) | brute-force: 4625 password-spray bursts |
| logsim-asa | Cisco ASA firewall syslog: paired build/teardown with consistent connection IDs, denies | port-scan: deny storms from one source |
| logsim-cef | CEF and LEEF security events (firewall/IPS style) | malware-burst: high-severity event waves |
| logsim-app | Structured JSON app logs with trace IDs and realistic embedded PII (for redaction demos) | error-storm, pii-leak |
| logsim-cloud | AWS CloudTrail JSON and VPC Flow Logs | suspicious-login: off-region console logins |
| logsim-k8s | Kubernetes CRI container logs: multi-pod node, klog + JSON apps, partial-line mechanics | crash-loop: restarting pod |
| logsim-postgres | PostgreSQL server logs incl. multiline ERROR/DETAIL/STATEMENT and slow queries | deadlock: lock-contention windows |

Every tool shares the same CLI contract:

--rate N            average events/sec (Poisson-paced, like real traffic)
--count N           stop after N events (0 = run forever)
--duration 5m       stop after a wall-clock duration
--backfill 24h      synthesize 24h of history at full speed, then exit
--follow            ...then keep streaming live
--start-time ISO    anchor the backfill window (deterministic with --seed)
--seed N            fully reproducible output
--diurnal           overnight trough, midday peak
--output DEST       '-' stdout (default) | file path | tcp://host:port | udp://host:port
--rotate-mb N       rotate + gzip file output
--scenario NAME     inject recurring anomaly windows (per-tool)
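
The README does not show how --rate pacing is implemented. A common way to get Poisson-paced events is to sleep for exponentially distributed gaps with mean 1/rate, which the Python sketch below illustrates. The names poisson_gaps and diurnal_factor are hypothetical, and the cosine day/night curve is only an assumption about what a --diurnal shape could look like.

```python
import math
import random
from typing import Iterator, Optional


def poisson_gaps(rate: float, seed: Optional[int] = None) -> Iterator[float]:
    """Yield inter-arrival gaps (seconds) for a Poisson process at `rate` events/sec."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    rng = random.Random(seed)
    while True:
        # Exponential gaps with mean 1/rate produce Poisson-distributed counts.
        yield rng.expovariate(rate)


def diurnal_factor(hour: float, trough: float = 0.3) -> float:
    """Hypothetical rate multiplier: `trough` at 00:00, rising to 1.0 at 12:00."""
    return trough + (1.0 - trough) * 0.5 * (1.0 - math.cos(2.0 * math.pi * hour / 24.0))


# Usage: the mean of many gaps should sit close to 1/rate (0.01 s at 100 events/sec).
gaps = poisson_gaps(rate=100, seed=42)
sample = [next(gaps) for _ in range(10_000)]
mean_gap = sum(sample) / len(sample)
```

Multiplying the base rate by diurnal_factor(hour) before drawing each gap would give an overnight trough and a midday peak while keeping arrivals bursty rather than evenly spaced.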

Quick start

# Stream Apache combined logs at 50/sec forever
uvx --from git+https://github.com/expanso-io/log-simulators logsim-web --rate 50

# 24 hours of historical IoT telemetry, then exit
uvx --from git+https://github.com/expanso-io/log-simulators logsim-iot --backfill 24h --output sensors.ndjson

# A brute-force attack inside normal Windows event noise, to a UDP collector
uvx --from git+https://github.com/expanso-io/log-simulators logsim-windows \
    --scenario brute-force --rate 20 --output udp://localhost:5514

# Reproducible test fixture: same command, byte-identical output
uvx --from git+https://github.com/expanso-io/log-simulators logsim-asa \
    --seed 42 --count 1000 --backfill 1h --start-time 2026-01-15T12:00:00+00:00

# Umbrella command works too
uvx --from git+https://github.com/expanso-io/log-simulators logsim k8s --rate 30
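
To try the UDP example above without a real collector, a throwaway listener is enough. The Python sketch below uses only the standard library. open_listener and receive_datagrams are hypothetical helper names, and it assumes each event arrives as one UTF-8 datagram, which the README does not state. It binds 127.0.0.1, so a sender that resolves localhost to IPv6 would need a matching bind address.

```python
import socket
from typing import List


def open_listener(host: str = "127.0.0.1", port: int = 5514) -> socket.socket:
    """Bind a UDP socket; port 5514 matches the quick-start example."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_datagrams(sock: socket.socket, count: int, timeout: float = 5.0) -> List[str]:
    """Read up to `count` datagrams from a bound socket, decoded as UTF-8."""
    sock.settimeout(timeout)
    events: List[str] = []
    try:
        for _ in range(count):
            data, _addr = sock.recvfrom(65535)
            events.append(data.decode("utf-8", errors="replace"))
    except socket.timeout:
        pass  # return whatever arrived before the timeout
    return events
```

Running receive_datagrams(open_listener(), count=5, timeout=30.0) in one terminal while the logsim-windows command streams in another should return the first few events for inspection.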

Single-file versions of the most-used tools live in standalone/ — each is a self-contained PEP 723 script:

uv run https://raw.githubusercontent.com/expanso-io/log-simulators/main/standalone/web_access_sim.py --rate 10
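
For readers new to PEP 723: a script declares its Python version and dependencies in a leading comment block, which uv run reads before executing the file. The toy script below shows the shape only. It is not the contents of web_access_sim.py, its log fields are invented, and its --count defaults to 5 (rather than 0) so the example terminates on its own.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
"""Toy single-file generator; a stand-in, not the real web_access_sim.py."""
import argparse
import random
import time


def make_line(rng: random.Random) -> str:
    """Build one fake request line (invented fields, for illustration only)."""
    path = rng.choice(["/", "/login", "/api/items"])
    status = rng.choice([200, 200, 200, 404, 500])
    return f"GET {path} {status}"


def main(argv=None) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument("--rate", type=float, default=10.0)
    parser.add_argument("--count", type=int, default=5)  # toy default; 0 = run forever
    parser.add_argument("--seed", type=int, default=None)
    args = parser.parse_args(argv)
    rng = random.Random(args.seed)
    emitted = 0
    while args.count == 0 or emitted < args.count:
        print(make_line(rng), flush=True)
        emitted += 1
        time.sleep(rng.expovariate(args.rate))  # Poisson-style pacing
    return 0


if __name__ == "__main__":
    main()
```

Saved as toy_sim.py (a hypothetical filename), this would run with uv run toy_sim.py --rate 10, and third-party packages listed under dependencies would be resolved into a temporary environment first.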

Why these formats

The May 2025 joint CISA/NSA/ACSC guidance, Priority logs for SIEM ingestion, names the sources practitioners should prioritize: OS logs, network devices, firewalls/IDS, and cloud audit trails — and explicitly recommends against shipping everything raw into the SIEM. This suite generates exactly those sources, so you can build and demo the filtering/routing layer in front of the SIEM with realistic volume, then prove zero-loss delivery (seeded, countable output) end to end.
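
One way to turn "seeded, countable output" into a check is to compare the --count sent with the events that arrive at the far end, and to hash two runs of the same seeded command. The Python sketch below assumes one event per line, which holds for NDJSON output but not for multiline formats such as the PostgreSQL logs. delivery_report and stream_digest are hypothetical helper names.

```python
import hashlib
from typing import Dict, Iterable


def delivery_report(expected: int, received_lines: Iterable[str]) -> Dict[str, int]:
    """Compare an expected event count with the non-empty lines that arrived."""
    received = sum(1 for line in received_lines if line.strip())
    return {
        "expected": expected,
        "received": received,
        "lost": max(expected - received, 0),
        "extra": max(received - expected, 0),
    }


def stream_digest(chunks: Iterable[bytes]) -> str:
    """SHA-256 over a byte stream, independent of how it was chunked."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()
```

For example, delivery_report(1000, open("received.ndjson")) on a hypothetical sink file should report lost == 0 after a --count 1000 run, and two captures of the same --seed/--start-time command should produce equal stream_digest values.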

What makes the output realistic rather than random:

  • Entity consistency — the same hosts, users, IPs, and devices recur coherently (a firewall's teardown matches its build; a session keeps its IP).
  • Skewed distributions — Zipf popularity for paths/IPs, long-tail response sizes, Poisson inter-arrival times.
  • Scenario injection — a baseline of boring traffic with deterministic, recurring anomaly windows you can catch in a pipeline.
  • Seeded determinism: --seed + --start-time reproduce byte-identical streams for tests and fixtures.
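
The skewed-distribution and seeding ideas in the list above can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's code: zipf_weights and sample_paths are hypothetical names, and the exponent s = 1.0 is an arbitrary choice.

```python
import random
from typing import List, Sequence


def zipf_weights(n: int, s: float = 1.0) -> List[float]:
    """Weights proportional to 1/rank**s for ranks 1..n."""
    return [1.0 / (rank ** s) for rank in range(1, n + 1)]


def sample_paths(paths: Sequence[str], count: int, seed: int) -> List[str]:
    """Draw `count` paths with Zipf-skewed popularity; same seed, same draws."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return rng.choices(paths, weights=zipf_weights(len(paths)), k=count)


paths = ["/", "/login", "/search", "/admin"]
first_run = sample_paths(paths, count=1000, seed=42)
second_run = sample_paths(paths, count=1000, seed=42)
```

Here first_run and second_run are identical, and "/" appears far more often than "/admin". A private random.Random instance per generator keeps the draws reproducible regardless of what else uses the global random state.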

Development

git clone https://github.com/expanso-io/log-simulators
cd log-simulators
uv sync            # installs everything incl. dev tools
uv run pytest      # full test suite
uv run ruff check . && uv run ruff format --check .
uv run logsim list # see all tools

The layout is a single distribution with one subpackage per simulator plus a shared core (src/log_simulators/core/) providing pacing, sinks, entity pools, and scenario scheduling. This keeps uvx --from git+... working verbatim — a multi-package workspace would not survive git installation (see uv issues #16328 / #10728).

Lineage

Aggregates and supersedes bacalhau-project/access-log-generator, bacalhau-project/sensor-log-generator, and several smaller internal generators. CLI ergonomics inspired by mingrammer/flog.

License

Apache-2.0