GNU Parallel, where have you been all my life?
2023-08-20 · via Home on Alex Plescan

I was recently trying to figure out how likely a bunch of end-to-end tests were to be flaky, and wanted to gather some stats about their pass/fail rates on my local machine before including them in a broader test suite.

These tests run for a long time, as they execute extensive scenarios against a live service over HTTP. In this post I’ll share the approach I ended up with using GNU Parallel.


A quick aside: If you wanna follow along and run the upcoming examples in your own terminal, use this command to generate some test files. Each one emulates a flaky test by sleeping for 5 to 14 seconds, then randomly exiting with a failure (exit code 1) or success (exit code 0):

parallel "echo 'sleep \$((\$RANDOM%10+5)) && [ \$((\$RANDOM%2)) = 1 ] \
  && printf PASS \
  || (printf FAIL && exit 1)'" \
  '>' potentially_flaky_{1}.sh ::: {1..5}
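
If GNU Parallel isn't installed yet, a plain bash loop can write equivalent files. This is a minimal sketch rather than the method used above: it assumes the same file names and writes the same one-line test body through a quoted here-document, so nothing is expanded at generation time.

```shell
#!/usr/bin/env bash
# Write five stand-in "flaky test" scripts without using GNU Parallel.
# The quoted 'EOF' delimiter keeps $RANDOM and $((...)) literal in the files.
for i in {1..5}; do
  cat > "potentially_flaky_${i}.sh" <<'EOF'
sleep $((RANDOM % 10 + 5)) && [ $((RANDOM % 2)) = 1 ] \
  && printf PASS \
  || (printf FAIL && exit 1)
EOF
done
```

Running `bash potentially_flaky_1.sh; echo " exit=$?"` a few times should show both outcomes.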

Typically to gather flakiness stats I’d use a couple of nested loops, one for each test I want to run, and another for each attempt. I like doing this kind of stuff in bash for its simplicity/portability:

tests=(
  potentially_flaky_1.sh
  potentially_flaky_2.sh
  potentially_flaky_3.sh
  potentially_flaky_4.sh
  potentially_flaky_5.sh
)

# For each test
for test in "${tests[@]}"; do
  # For each attempt of that test (1 through to 10)
  for attempt in $(seq 1 10); do
    # Capture timestamp for when the test started
    start=$(date "+%s")
    # Run the test and capture its exit code
    bash "$test" > /dev/null
    status=$?
    # Calculate duration of the test run
    end=$(date "+%s")
    duration=$((end - start))

    # Print results (variables are passed as arguments, not embedded in the format string)
    printf '%s attempt %s took %ss ' "$test" "$attempt" "$duration"
    if [[ $status -eq 0 ]]; then
      printf "PASS\n"
    else
      printf "FAIL\n"
    fi
  done
done

This approach ended up being tediously slow though… since the tests take a while to execute, running them sequentially wasn’t gonna cut it.

I knew about GNU Parallel, but had never used it before. One man parallel and 15 minutes later, I was “living life in the parallel lane” (as the GNU Parallel book encourages you to do!).

Rewriting the above to work using parallel ended up looking like:

tests=(
  potentially_flaky_1.sh
  potentially_flaky_2.sh
  potentially_flaky_3.sh
  potentially_flaky_4.sh
  potentially_flaky_5.sh
)

parallel --progress --jobs 5 --delay 2 --timeout 3600 --shuf --results out.csv \
  bash {1} ::: "${tests[@]}" ::: {1..10}

The joy of finding the right tool for the job can’t be beat: more performance and functionality, with less code!

Let’s go into a bit more detail…

Passing inputs

In GNU Parallel, you specify a command that is to be executed in parallel. In the example provided, the command is bash {1}. The {1} is a placeholder that gets replaced with a value from the first input source (if you have more than one input source you can use {2}, {3} etc. for the others).

The inputs to the command are specified after the ::: operator. In this case, the inputs are the array of test scripts (${tests[@]}) and a sequence of numbers from 1 to 10 ({1..10}). These inputs are provided to the command in all possible combinations. The command never uses {2}, so the second input only serves to repeat each script 10 times.

So in this case, with 5 test scripts each run 10 times, parallel will execute 50 jobs in total.
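
To make the combinations concrete, here's a plain-bash sketch that enumerates the same set of pairs without running anything. The helper name list_jobs is made up for illustration, and the order shown is just the loop order, not necessarily the order parallel uses. If you'd rather ask the tool directly, parallel's --dry-run flag prints the commands it would run instead of executing them.

```shell
#!/usr/bin/env bash
tests=(
  potentially_flaky_1.sh
  potentially_flaky_2.sh
  potentially_flaky_3.sh
  potentially_flaky_4.sh
  potentially_flaky_5.sh
)

# list_jobs (hypothetical helper): print one line per (script, attempt) pair,
# showing what {1} and {2} would hold and the command that results.
list_jobs() {
  local test attempt
  for test in "${tests[@]}"; do
    for attempt in {1..10}; do
      printf '{1}=%s {2}=%s -> bash %s\n' "$test" "$attempt" "$test"
    done
  done
}

list_jobs | head -n 3   # peek at the first few pairs
list_jobs | wc -l       # 5 scripts x 10 attempts = 50 lines
```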

Controlling concurrency

parallel provides a number of options that can be used to avoid resource contention. Here are a few that I found useful for my tests:

  • --jobs 5: Caps the number of concurrent jobs at 5 (by default parallel will try to execute as many jobs as you have CPU cores).
  • --delay 2: Waits 2 seconds between starting one job and the next, staggering start-up and preventing a thundering herd problem.
  • --timeout 3600: Terminates any jobs that have been running for over an hour.
  • --shuf: Runs the jobs in a shuffled order.
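
A rough back-of-envelope estimate shows how these settings interact. This is only a sketch: the mean runtime of 10 seconds is an assumption (roughly the middle of the generated scripts' 5 to 14 second sleeps), and real runs will vary.

```shell
#!/usr/bin/env bash
total_jobs=$((5 * 10))   # 5 test scripts x 10 attempts
slots=5                  # --jobs 5
delay=2                  # --delay 2, in seconds
mean_runtime=10          # ASSUMPTION: rough mean job runtime, in seconds

# Sequential nested loops: every job waits for the previous one.
sequential=$((total_jobs * mean_runtime))
# Lower bound from having 5 slots kept permanently busy.
slot_bound=$((total_jobs * mean_runtime / slots))
# Lower bound from the stagger: the last job cannot start before this.
delay_bound=$(( (total_jobs - 1) * delay ))

echo "sequential loop:        ~${sequential}s"
echo "bound from --jobs 5:    ~${slot_bound}s"
echo "bound from --delay 2:   ~${delay_bound}s"
```

Under that assumption the two bounds land close together, so raising --jobs on its own wouldn't buy much here: the 2 second stagger already takes about as long as the slots do.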

Capturing output

By default the output of your command is printed to your terminal. Since I wanted to capture stats, though, having parallel write a CSV file instead was very helpful:

  • --results out.csv: Writes job results to the given file, including duration, exit code, and captured stdout/stderr.
  • --progress: Prints live progress while the jobs are executing.

The CSV file ends up looking like this (showing only the first few lines for brevity):

Seq,Host,Starttime,JobRuntime,Send,Receive,Exitval,Signal,Command,V1,V2,Stdout,Stderr
4,:,1692491267.732,6.025,0,4,1,0,"bash potentially_flaky_5.sh",potentially_flaky_5.sh,9,FAIL,
2,:,1692491263.646,12.025,0,4,0,0,"bash potentially_flaky_3.sh",potentially_flaky_3.sh,3,PASS,
1,:,1692491261.604,14.067,0,4,1,0,"bash potentially_flaky_5.sh",potentially_flaky_5.sh,2,FAIL,
5,:,1692491269.779,6.023,0,4,1,0,"bash potentially_flaky_1.sh",potentially_flaky_1.sh,3,FAIL,
3,:,1692491265.686,11.055,0,4,1,0,"bash potentially_flaky_4.sh",potentially_flaky_4.sh,8,FAIL,

It’d be trivial to use this output to aggregate/chart stats.
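
As one possible way to do that aggregation, here's a small awk sketch that counts runs and failures per test and averages the runtime. It makes a few assumptions: the column positions match the header above (JobRuntime is field 4, Exitval is field 7, V1 is field 10), and no field contains an embedded comma or newline, which holds for these rows but not for arbitrary stdout. The helper name summarize_results is made up, and the sample file just reuses the rows shown above.

```shell
#!/usr/bin/env bash
# Sample rows copied from the CSV above; in practice, pass out.csv instead.
csv=$(mktemp)
cat > "$csv" <<'EOF'
Seq,Host,Starttime,JobRuntime,Send,Receive,Exitval,Signal,Command,V1,V2,Stdout,Stderr
4,:,1692491267.732,6.025,0,4,1,0,"bash potentially_flaky_5.sh",potentially_flaky_5.sh,9,FAIL,
2,:,1692491263.646,12.025,0,4,0,0,"bash potentially_flaky_3.sh",potentially_flaky_3.sh,3,PASS,
1,:,1692491261.604,14.067,0,4,1,0,"bash potentially_flaky_5.sh",potentially_flaky_5.sh,2,FAIL,
5,:,1692491269.779,6.023,0,4,1,0,"bash potentially_flaky_1.sh",potentially_flaky_1.sh,3,FAIL,
3,:,1692491265.686,11.055,0,4,1,0,"bash potentially_flaky_4.sh",potentially_flaky_4.sh,8,FAIL,
EOF

# summarize_results (hypothetical helper): one summary line per test script.
# ASSUMPTION: plain comma splitting is safe because no field contains a comma.
summarize_results() {
  awk -F, '
    NR == 1 { next }                      # skip the header row
    {
      runs[$10]++                         # V1: the test script
      if ($7 != 0) fails[$10]++           # Exitval: non-zero means failure
      runtime[$10] += $4                  # JobRuntime, in seconds
    }
    END {
      for (t in runs)
        printf "%s runs=%d fails=%d mean=%.1fs\n", t, runs[t], fails[t], runtime[t] / runs[t]
    }
  ' "$1" | sort
}

summarize_results "$csv"
```

With the full 50-row file, the fails/runs ratio per script is the flakiness rate I was after.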

Exploring further…

This is barely scratching the surface of what parallel can do. I strongly recommend the excellent, free, and funny book by Parallel’s author Ole Tange. The first chapter takes 15 minutes to get through and covers 80% of what you’re likely going to use.

The book covers things such as:

  • Distributing jobs across different hosts using SSH, a powerful feature for leveraging multiple machines.
  • Monitoring the mean time for job completion and setting timeouts for jobs based on a percentage of the mean, providing more control over long-running tasks.
  • Retrying if jobs are known to be failure prone.
  • Resuming jobs if parallel execution stops midway, ensuring you don’t lose progress.
  • Limiting the number of jobs that can run based on CPU utilization (or other signals).
  • Limiting concurrency with a semaphore (and an excellent analogy about toilets).

Happy paralleling!