惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

博客园 - 【当耐特】
Help Net Security
Help Net Security
P
Proofpoint News Feed
J
Java Code Geeks
爱范儿
爱范儿
Last Week in AI
Last Week in AI
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
F
Full Disclosure
Google DeepMind News
Google DeepMind News
H
Help Net Security
G
Google Developers Blog
Jina AI
Jina AI
Vercel News
Vercel News
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
L
Lohrmann on Cybersecurity
S
Schneier on Security
Microsoft Azure Blog
Microsoft Azure Blog
IT之家
IT之家
Security Archives - TechRepublic
Security Archives - TechRepublic
阮一峰的网络日志
阮一峰的网络日志
N
News and Events Feed by Topic
GbyAI
GbyAI
B
Blog
O
OpenAI News
博客园_首页
Cisco Talos Blog
Cisco Talos Blog
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
Hacker News: Ask HN
Hacker News: Ask HN
TaoSecurity Blog
TaoSecurity Blog
腾讯CDC
MongoDB | Blog
MongoDB | Blog
M
MIT News - Artificial intelligence
C
Cybersecurity and Infrastructure Security Agency CISA
Cyberwarzone
Cyberwarzone
Webroot Blog
Webroot Blog
Simon Willison's Weblog
Simon Willison's Weblog
Y
Y Combinator Blog
C
Cisco Blogs
A
Arctic Wolf
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
T
The Exploit Database - CXSecurity.com
Security Latest
Security Latest
AI
AI
W
WeLiveSecurity
aimingoo的专栏
aimingoo的专栏
The Register - Security
The Register - Security
Project Zero
Project Zero
H
Hackread – Cybersecurity News, Data Breaches, AI and More
N
Netflix TechBlog - Medium
Blog — PlanetScale
Blog — PlanetScale

OCI Registry As Storage Blog

🚀 Announcing ORAS v1.3.0 - Elevate your artifact and registry management workflows 🚀 Announcing ORAS v1.3.0-beta.3 - Enrich formatted output of oras discover Announcing ORAS v1.2.0 - OCI Spec v1.1.0 support, formatted output, and more! ORAS Welcomes New Owners, Maintainers, and Thanks Emeritus Owners ORAS 0.15 - A Fully Functional OCI Registry Client Bundle, test and deploy Gatekeeper policies as OCI image ORAS - Looking back on 2022 and forward to 2023 ORAS 0.14 and Future - Empower Container Secure Supply Chain ORAS Artifacts Draft Specification Release – Adding Secure Supply Chain Artifacts References
Lightweight Registry with Oras OCI-Layouts and Object Storage in The Cloud
ORAS Community · 2023-10-14 · via OCI Registry As Storage Blog

ORAS Lightweight Cloud Registry

Setting up a centralized cloud OCI registry is sometimes too much for small scale tests and proof of concepts!
What if I told you there was a lightweight alternative?

Everything Should Be a File

One of the great Unix philosophies is the concept of making everything a file. Modems, keyboards, printers, disk drives, serial interfaces, configuration, kernel parameters, IPC, everything and anything is a file that you can read() and/or write().

In a nutshell, this separates a many concerns to the file system layer instead of your program.
It allows, among other things, plugging stuff into any program without writing special integration code. That code was already written in the "driver" for that "device", so why repeat yourself in the program?

In General: programs write and read files
Making everything a file means that you can generally interface with every program.

Oras is EIAF Compatible

oras has a --oci-layout

Instead of talking to an upstream service over http/https, oras can create the OCI artifact structure as files and directories.

Most (if not every) operation that can be performed on an upstream registry can also be performed locally as an OCI directory structure.

Cloud Object Storage is EIAF Compatible

The file system is the foundation on which EIAF rests upon.
They are special components that convert media (CD, Tape, HDD, Controllers, etc.) into interface layers of which you can read/write files.

If you can mount it as a file system, you got EIAF.

Your preferred Cloud provider should have its own FUSE driver for their object storage offering, and if it doesn't, there is probably an alternative.


It is useful to think of Object Storage as key: data databases instead of the file systems they pretend to be.

Mounting one as a file system should be seen as "unintended functionality" and it breaks some assumptions programs make about file systems.

Example: Instant nanosecond retrieval times, turning into millisecond or even second long affairs.

Putting it together

Now that we have the all the parts, it's just a matter of combining them!

For this tutorial we are using s3 with s3fs, but feel free to use a different provider and follow along!

Prerequisites

  • Install ORAS
  • Install s3fs
  • Obtain an AWS account and create a S3 bucket

Creating some test artifacts

Let us first create some artifacts!

echo "Hello World!" > hello.txt

echo "Goodbye World!" > goodbye.txt

Mounting your s3 bucket

Now we should mount our s3 bucket so that we can use it as an oci-layout target.
Checkout s3fs --help for the different ways you can authenticate, here we are using credential profiles.

mkdir s3

s3fs my-bucket s3 -o profile=MY_AWS_PROFILE

Using your new "registry"

Pushing our newly created artifacts is as easy as:

oras push --oci-layout s3/hello:latest hello.txt:text/plain

oras push --oci-layout s3/goodbye:latest goodbye.txt:text/plain

Pulling them is just as simple!

rm hello.txt

rm goodbye.txt

oras pull --oci-layout s3/hello:latest

oras pull --oci-layout s3/goodbye:latest

cat hello.txt

cat goodbye.txt

Congratulations!

Now you are running your own lightweight OCI registry on the cloud!

Caveats

You can't delete yet!

You can't delete your artifacts yet!

Currently1 you can't delete your local oci-layout artifacts with oras manifest delete or oras blob delete

Normally it is the registry's responsibility to garbage collect blobs without any references, but in this case the cli does not have the functionality (yet).

It is still possible to just do rm -rf s3/my-repo, but that is probably not what you want to do.

Concurrency?

Same concurrency rules of your chosen Object Store apply, generally last-write-wins. Please check your documentation

Using this trick in your application

You can use this trick in your applications as well!

Instead of integrating in your application code (hard dependency, vendor lock-in, complicates your interface: "how do I pass in creds?") you instead mount the Object Store in the container and then "just use the file system" (transferable, simple, external, interfaceable)

There are some caveats to this approach and your application might leverage the capabilities of the Native Object Storage API extremely well, but if it's just a simple interaction, why complicate things?

Upgrading to Registry

AWS Simple Storage Service (S3) to ZOT

You can just point ZOT at the s3 bucket as it is already expecting an oci-layout

Other

If you have a "normal" registry you'll need to export all your repos and their tags to that registry

Script

NOTE: won't work with any repo containing a space or tab

ls -1 s3/ | xargs -P0 -n1 sh -c 'oras repo tags --oci-layout "s3/$1" | xargs -P0 -n1 -I{} oras copy --recursive --from-oci-layout "s3/$1:{}" "yourregistry/$1:{}"' _

There is no --oci-layout on oras repo ls, so we are substituting with: ls -1 s3/

Performance

Goofys was the definitive champion utilizing the strengths of a local caching file system with the powers of a remote Object Store, it was the fastest in the cloud and local.

s3fs made a commendable effort and was faster than the native API on small objects, but started to lose performance on the bigger objects.

Findings

tl:dr

Local performance:
Goofys made testing very difficult because it so effectively cached and reused objects, but installing it was a pain.
s3fs installed easily, but suffered around 512MiB, was bad at 1Gib, and was awful at 2Gib.

Cloud performance:
Amazing throughout.
Use Goofys.

Note: goofys is cheating!. It is a caching fs, so the numbers below are not an accurate representation of all use-cases.

Local

Done inside WSL Ubuntu over WiFi.

This test was done over Wifi and was overall pretty slow especially with the bigger objects. Chunking the artifact did not lead to any substantial speed increases.

goofys isn't displayed, but you should try it out yourself!
It's asynchronous update and caching nature is so powerful and led to testing complications.

local plot

AWS

Welp it's WAY faster over here!

goofys is an entire order of magnitude faster, but normalized around 2GiB.
s3fs was also faster than native up until about 2GiB.

Chunking the artifacts did not lead to any substantial speed increases... Except for goofys, which showed an upwards of 2x speed increase!

aws plot

AWS2023 has removed the Extra Packages for Enterprise Linux (EPEL) repositories, so I used Debian instead. I used the apt distribution of s3fs and the go install of goofys.

Scripts used

Feel free to modify and test how your flavors of fuse drivers stack up!

test-lib.sh

#!/bin/bash

# bucket - the s3 bucket

# s3dir - the s3 mounted directory

# size - bytes size. e.g. 15MiB, 1GiB, 2MB. should be valid for `head -c `

# count - a integer

# datafile - the file that contains the time information

# jtag - a valid `junk` repo tag

# bucket dir

s3fsSetup() {

dirname="$2"

mkdir -p "$dirname"

s3fs "$1" "$dirname" -o iam_role; #setup for cloud, you should modify for local testing

echo "$dirname"

}

# bucket dir

goofysSetup() {

dirname="$2"

mkdir -p "$dirname"

goofys "$1" "$dirname" #setup for cloud, you should modify for local testing

echo "$dirname"

}

# s3dir

unmountS3() {

umount "$1" &&

rmdir "$1"

}

# size

genSingle() {

base64 < /dev/random | head -c "$1" > "${1}.junk"

echo "${1}.junk"

}

# size count

genMulti() {

seq 1 "$2" | xargs -n1 sh -c 'base64 < /dev/random | head -c "$1" > "${1}.${2}.junk"; echo "${1}.${2}.junk"' _ "$1"

}

# <(time)

time2line() {

grep -Eo '[0-9]+m.+s' | tr '\n' '\t'

}

# s3dir size

singleBlobPush() {

file=$(genSingle "$2")

oras push --oci-layout "${1}/junk:${2}" "$file"

rm "$file"

}

# s3dir size count

multiBlobPush() {

files=$(genMulti "$2" "$3")

echo "$files" | xargs -d'\n' -n"${3}" oras push --oci-layout "${1}/junk:${2}x${3}"

echo "$files" | xargs -d'\n' rm -f

}

# datafile

resetDataFile() {

printf '#%s\t%s\t%s\t%s\t%s\t\n' "type" "sizelabel" "realtime" "usertime" "systime" > "$1"

}

# s3dir jtag datafile

timeOrasPull() {

exec 3>&1

{ time oras pull --oci-layout "${1}/junk:${2}"; } 2>&1 1>&3 | printf '%s\t%s\t%s\n' "${1}" "${2}" "$(cat | time2line)" >> "$3"

rm -f *.junk

}

# bucket jtag datafile

timeS3Junk() {

exec 3>&1

tmpdir=$(mktemp -d)

{ time { aws s3 cp --recursive "s3://$1/junk/blobs/sha256/" "${tmpdir}/" 1>&1 2>&1 ; } ; } 2>&1 1>&3 | printf '%s\t%s\t%s\n' "aws-cli-s3" "${2}" "$(cat | time2line)" >> "$3"

rm -rf "$tmpdir"

}

# s3dir

deleteJunkRepo() {

rm -rf "$1/junk"

}

oci-test.sh

Remember to set the AWS ENV Variables

#!/bin/bash

. test-lib.sh

set -u

set -e

set -o pipefail

: "${S3_BUCKET:="$1"}"

: "${DAT_FILE:="orastime.dat"}"

: "${S3FS_TDIR:="s3fs3"}"

: "${S3FS_TEST:="true"}"

: "${S3FS_S3_BUCKET:="$S3_BUCKET"}"

: "${GOOFYS_TDIR:="goofys3"}"

: "${GOOFYS_TEST:="true"}"

: "${GOOFYS_S3_BUCKET:="$S3_BUCKET"}"

: "${SINGLE_LIST:="8MiB 16MiB 32MiB 64MiB 128MiB 256MiB 512MiB 1GiB 2GiB"}"

: "${MULTI_LIST:="16MiBx8 64MiBx8 64MiBx16"}"

# time postmessage

sleepy() {

echo "Sleeping for ${1}"

sleep "$1"

echo "$2"

}

# dir bucket datafile singlelist multilist

testit() {

dir="$1"

bucket="$2"

datfile="$3"

singlelist=($4)

multilist=($5)

# size

trippleTimePull() {

s="$1"

echo "Pull 1 Starting..." &&

timeOrasPull "$dir" "$s" "$datfile" && sleepy 1.3 "Pull 2 Starting..." &&

timeOrasPull "$dir" "$s" "$datfile" && sleepy 1.3 "Pull 3 Starting..." &&

timeOrasPull "$dir" "$s" "$datfile"

}

trippleTimeS3() {

s="$1"

echo "Copy 1 Starting..." &&

timeS3Junk "$bucket" "$s" "$datfile" && sleepy 1.3 "Copy 2 Starting..." &&

timeS3Junk "$bucket" "$s" "$datfile" && sleepy 1.3 "Copy 3 Starting..." &&

timeS3Junk "$bucket" "$s" "$datfile"

}

echo "Starting: Oras S3 test on ${dir} into ${datfile}"

for size in "${singlelist[@]}"

do

echo "Testing ${size}" &&

singleBlobPush "$dir" "$size" &&

trippleTimePull "$size" &&

trippleTimeS3 "$size" && echo "Cleaning up..." &&

deleteJunkRepo "$dir"

done

for sizexcount in "${multilist[@]}"

do

count="${sizexcount#*x}"

size="${sizexcount%x*}"

echo "Testing ${count} chunks of ${size}" &&

multiBlobPush "$dir" "$size" "$count" &&

trippleTimePull "$sizexcount" &&

trippleTimeS3 "$sizexcount" && echo "Cleaning up..." &&

deleteJunkRepo "$dir"

done

}

if [ "S3FS-TDIR" = "s3fs" ]; then

echo "IDFK Why but if you make the s3fs directory 's3fs' it breaks"

exit 1

fi

echo "Init Data File"

resetDataFile "$DAT_FILE"

if [ "$S3FS_TEST" = "true" ]; then

echo "Setting up s3fs..."

echo "$S3FS_S3_BUCKET"

s3fsSetup "$S3FS_S3_BUCKET" "$S3FS_TDIR"

echo "Lets do this..."

testit "$S3FS_TDIR" "$S3FS_S3_BUCKET" "$DAT_FILE" "$SINGLE_LIST" "$MULTI_LIST"

echo "Unmounting..."

unmountS3 "$S3FS_TDIR"

fi

if [ "$GOOFYS_TEST" = "true" ]; then

echo "Setting up goofys..."

goofysSetup "$GOOFYS_S3_BUCKET" "$GOOFYS_TDIR"

echo "Lets do this..."

testit "$GOOFYS_TDIR" "$GOOFYS_S3_BUCKET" "$DAT_FILE" "$SINGLE_LIST" "$MULTI_LIST"

echo "Unmounting..."

unmountS3 "$GOOFYS_TDIR"

fi

  1. It will probably be available in the next releases > v1.1.0. See this merged issue. ↩