惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

N
News | PayPal Newsroom
Security Archives - TechRepublic
Security Archives - TechRepublic
Hacker News: Ask HN
Hacker News: Ask HN
H
Hacker News: Front Page
Apple Machine Learning Research
Apple Machine Learning Research
TaoSecurity Blog
TaoSecurity Blog
Help Net Security
Help Net Security
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
V
V2EX
Hugging Face - Blog
Hugging Face - Blog
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
人人都是产品经理
人人都是产品经理
博客园 - 三生石上(FineUI控件)
Security Latest
Security Latest
Cloudbric
Cloudbric
WordPress大学
WordPress大学
S
SegmentFault 最新的问题
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
www.infosecurity-magazine.com
www.infosecurity-magazine.com
Know Your Adversary
Know Your Adversary
A
Arctic Wolf
L
LangChain Blog
Application and Cybersecurity Blog
Application and Cybersecurity Blog
The GitHub Blog
The GitHub Blog
P
Proofpoint News Feed
W
WeLiveSecurity
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
M
MIT News - Artificial intelligence
Google DeepMind News
Google DeepMind News
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
The Cloudflare Blog
小众软件
小众软件
NISL@THU
NISL@THU
云风的 BLOG
云风的 BLOG
P
Privacy & Cybersecurity Law Blog
S
Security @ Cisco Blogs
博客园 - 【当耐特】
I
InfoQ
Vercel News
Vercel News
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
P
Proofpoint News Feed
O
OpenAI News
Google DeepMind News
Google DeepMind News
N
News and Events Feed by Topic
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
K
Kaspersky official blog
T
Threat Research - Cisco Blogs
量子位
宝玉的分享
宝玉的分享

Reimu's blog

端午节游玩:徒步、骑行、跑山 考驾照和买二手车 使用 BuildKit on Kubernetes 构建多架构容器镜像 写在上海封城一年之后 使用 Packer 构建虚拟机镜像踩的坑 使用 Redfish 自动化安装 ESXi OS 搬砖工具 govc 使用记录 使用 kubectl 自动归档 argo workflow 日志 VMware Tanzu kubernetes 发行版部署尝鲜
流水线中使用 docker in pod 方式构建容器镜像
Reimu · 2022-03-15 · via Reimu's blog

上个月参加了 Rancher 社区举办的 《Dockershim 即将被移除,你准备好了么?》直播分享后,得知自 1.24 版本之后,Kubernetes 社区将正式放弃对 docker CRI 的支持,docker CRI 这部分代码则由 cri-dockerd 项目来接盘。目前众多主流的 Kubernetes 私有化部署工具(比如 kubespraykubekeysealos)也逐渐地不再将 docker 作为首选的容器运行时而纷纷转向了 containerd,去 docker 也成为了目前云原生社区热门的讨论话题。

docker 虽然作为一个 CRI 在 Kubernetes 社区一直被人诟病,但我们要知道 CRI 仅仅是 docker 的一部分功能而已。对于本地开发测试或者 CI/CD 流水线镜像构建来讲,依然有很多地方严重地依赖着 docker。比如 GitHub 上容器镜像构建的 Action 里, docker 官方的 build-push-action 是众多项目首选的方式。即便是 docker 的竞争对手 podman + skopeo + buildah 三剑客它们自身的容器镜像也是采用 docker 来构建的 multi-arch-build.yaml

jobs:
  multi:
    name: multi-arch image build
    env:
      REPONAME: buildah  # No easy way to parse this out of $GITHUB_REPOSITORY
      # Server/namespace value used to format FQIN
      REPONAME_QUAY_REGISTRY: quay.io/buildah
      CONTAINERS_QUAY_REGISTRY: quay.io/containers
      # list of architectures for build
      PLATFORMS: linux/amd64,linux/s390x,linux/ppc64le,linux/arm64
      # Command to execute in container to obtain project version number
      VERSION_CMD: "buildah --version"

    # build several images (upstream, testing, stable) in parallel
    strategy:
      # By default, failure of one matrix item cancels all others
      fail-fast: false
      matrix:
        # Builds are located under contrib/<reponame>image/<source> directory
        source:
          - upstream
          - testing
          - stable
    runs-on: ubuntu-latest
    # internal registry caches build for inspection before push
    services:
      registry:
        image: quay.io/libpod/registry:2
        ports:
          - 5000:5000
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
        with:
          driver-opts: network=host
          install: true

      - name: Build and locally push image
        uses: docker/build-push-action@v2
        with:
          context: contrib/${{ env.REPONAME }}image/${{ matrix.source }}
          file: ./contrib/${{ env.REPONAME }}image/${{ matrix.source }}/Dockerfile
          platforms: ${{ env.PLATFORMS }}
          push: true
          tags: localhost:5000/${{ env.REPONAME }}/${{ matrix.source }}

Jenkins 流水线

我们的 CI/CD 流水线是使用 Jenkins + Kubernetes plugin 的方式在 Kubernetes 上动态地创建 Pod 作为 Jenkins Slave。在使用 docker 作为容器时的情况下,Jenkins Slave Pod 将宿主机上的 /var/run/docker.sock 文件通过 hostPath 的方式挂载到 pod 容器内,容器内的 docker CLI 就能通过该 sock 与宿主机的 docker 守护进程进行通信,这样在 pod 容器内就可以无缝地使用 docker build 、push 等命令了。

// Kubernetes pod template to run.
podTemplate(
    cloud: "kubernetes",
    namespace: "default",
    name: POD_NAME,
    label: POD_NAME,
    yaml: """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: debian
    image: "${JENKINS_POD_IMAGE_NAME}"
    imagePullPolicy: IfNotPresent
    tty: true
    volumeMounts:
    - name: dockersock
      mountPath: /var/run/docker.sock
  - name: jnlp
    args: ["\$(JENKINS_SECRET)", "\$(JENKINS_NAME)"]
    image: "jenkins/inbound-agent:4.3-4-alpine"
    imagePullPolicy: IfNotPresent
  volumes:
    - name: dockersock
      hostPath:
        path: /var/run/docker.sock
""",
)

当不再使用 docker 作为 Kubernetes 的容器运行时之后,宿主机上则就没有了 docker 守护进程,挂载 /var/run/docker.sock 的方式也就凉凉了,因此我们需要找到一些替代的方法。

目前能想到的有两种方案:方案一是替代掉 docker 使用其他镜像构建工具比如 podman + skopeo + buildah。陈少文博主在《基于 Kubernetes 的 Jenkins 服务也可以去 Docker 了》详细地讲过该方案。但我们的 Makefile 里缝合了一些 docker buildKit 的特性参数并不能地通过 alias docker=podman 别名的方式简单粗暴地给替换掉 😂。

比如 podman 构建镜像就不支持 --output type=local,dest=path Support custom build outputs #3789 这种特性。目前看来 podman 想要完全取代掉 docker 的老大哥地位仍还有很长的路要走,尤其 podman 还没有解决自身的镜像是由 docker 来构建的这个尴尬难题。

方案二就是继续使用 docker 作为镜像构建工具,虽然集群节点上没有了 docker 守护进程,但这并不意味着在 Kubernetes 集群里就无法使用 docker 了。我们可以换种方式将 docker 作为一个 pod 运行在 kubernetes 集群中,而非以 systemd 的方式部署在节点上。然后通过 service IP 或 Node IP 访问 docker 的 TCP 端口进行通信,这样也能无缝地继续使用 docker 。于是在 dind (docker-in-docker) 的基础上就有了 dinp (docker-in-pod) 的套娃操作,其实二者本质上都是相同的,只不过是部署方式和访问方式不太相同而已。

对比一下这两种方案,方案一通过 alias docker=podman 使用 podman 替代 docker 有点投机取巧,在正式的生产环境流水线中应该很少会被采用,除非你的 Makefile 或者镜像构建脚本中没有依赖 docker 的特性参数,能够完全兼容 podman;方案二比较稳定可靠,它无非就是将之前的宿主机节点上的 docker 守护进程替换成了集群内的 Pod,对于使用者而言只需要修改一下访问 docker 的方式,即 DOCKER_HOST 环境变量即可。因此本文选用方案二来给大家介绍几种在 K8s 集群里部署和使用 dind/dinp 的方式。

不同于 docker in docker,docker in pod 并不关心底层的容器运行时是什么,可以是 docker 也可以是 containerd。在 pod 内运行和使用 docker 个人总结出以下三种比较合适的方式,可以根据不同的场景选择一个合适的:

sidecar

将 dind 容器作为 sidecar 容器 来运行,主容器通过 localhost 的方式访问 docker 的 2375/2376 TCP 端口。这种方案的好处就是如果创建了多个 Pod,各个 Pod 之间是相互独立的,dind 容器不会共享给其他 pod 使用,隔离性比较好。缺点也比较明显,每一个 Pod 都带一个 dind 容器占用的系统资源比较多,有点大材小用的感觉;

apiVersion: v1
kind: Pod
metadata:
  name: dinp-sidecar
spec:
  containers:
  - image: docker:20.10.12
    name: debug
    command: ["sleep", "3600"]
    env:
    - name: DOCKER_TLS_VERIFY
      value: ""
    - name: DOCKER_HOST
      value: tcp://localhost:2375
  - name: dind
    image: docker:20.10.12-dind-rootless
    args: ["--insecure-registry=$(REGISTRY)"]
    env:
    # 如果镜像仓库域名为自签证书,需要在这里配置 insecure-registry
    - name: REGISTRY
      value: hub.k8s.li
    - name: DOCKER_TLS_CERTDIR
      value: ""
    - name: DOCKER_HOST
      value: tcp://localhost:2375
    securityContext:
      privileged: true
    tty: true
    # 使用 docker info 命令就绪探针来确保 dind 容器正常启动 
    readinessProbe:
      exec:
        command: ["docker", "info"]
      initialDelaySeconds: 10
      failureThreshold: 6

daemonset

daemonset 则是在集群的每一个 Node 节点上运行一个 dind Pod,并且使用 hostNetwork 方式来暴露 2375/2376 TCP 端口。使用者则通过 status.hostIP 访问宿主机的 2375/2376 TCP 端口来与 docker 进行通信;另外再通过 hostPath 挂载的方式来将 dind 容器内的 /var/lib/docker 数据持久化存储下来,能够缓存一些数据提高镜像构建的效率。

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dinp-daemonset
  namespace: default
spec:
  selector:
    matchLabels:
      name: dinp-daemonset
  template:
    metadata:
      labels:
        name: dinp-daemonset
    spec:
      hostNetwork: true
      containers:
      - name: dind
        image: docker:20.10.12-dind
        args: ["--insecure-registry=$(REGISTRY)"]
        env:
        - name: REGISTRY
          value: hub.k8s.li
        - name: DOCKER_TLS_CERTDIR
          value: ""
        securityContext:
          privileged: true
        tty: true
        volumeMounts:
        - name: docker-storage
          mountPath:  /var/lib/docker
        readinessProbe:
          exec:
            command: ["docker", "info"]
          initialDelaySeconds: 10
          failureThreshold: 6
        livenessProbe:
          exec:
            command: ["docker", "info"]
          initialDelaySeconds: 60
          failureThreshold: 10
      volumes:
      - name: docker-storage
        hostPath:
          path: /var/lib/docker

deployment

Deployment 方式则是在集群中部署一个或多个 dind Pod,使用者通过 service IP 来访问 docker 的 2375/2376 端口,如果是以非 TLS 方式启动 dind 容器,使用 service IP 来访问 docker 要比前面的 daemonset 使用 host IP 安全性要好一些。

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dinp-deployment
  namespace: default
  labels:
    name: dinp-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      name: dinp-deployment
  template:
    metadata:
      labels:
        name: dinp-deployment
    spec:
      containers:
      - name: dind
        image: docker:20.10.12-dind
        args: ["--insecure-registry=$(REGISTRY)"]
        env:
        - name: REGISTRY
          value: hub.k8s.li
        - name: DOCKER_TLS_CERTDIR
          value: ""
        - name: DOCKER_HOST
          value: tcp://localhost:2375
        securityContext:
          privileged: true
        tty: true
        volumeMounts:
        - name: docker-storage
          mountPath:  /var/lib/docker
        readinessProbe:
          exec:
            command: ["docker", "info"]
          initialDelaySeconds: 10
          failureThreshold: 6
        livenessProbe:
          exec:
            command: ["docker", "info"]
          initialDelaySeconds: 60
          failureThreshold: 10
      volumes:
      - name: docker-storage
        hostPath:
          path: /var/lib/docker
---
kind: Service
apiVersion: v1
metadata:
  # 定义 service name,使用者通过它来访问 docker 的 2375 端口
  name: dinp-deployment
spec:
  selector:
    name: dinp-deployment
  ports:
  - protocol: TCP
    port: 2375
    targetPort: 2375

Jenkinsfile

在 Jenkins 的 podTemplate 模版里,可以根据 dinp 部署方式的不同选用以下几种不同的模版:

sidecare

Pod 内容器共享同一个网络协议栈,因此可以通过 localhost 来访问 docker 的 TCP 端口,另外最好使用 rootless 模式启动 dind 容器,这样能在同一节点上运行多个这样的 Pod 实例。

def JOB_NAME = "${env.JOB_BASE_NAME}"
def BUILD_NUMBER = "${env.BUILD_NUMBER}"
def POD_NAME = "jenkins-${JOB_NAME}-${BUILD_NUMBER}"
def K8S_CLUSTER = params.k8s_cluster ?: kubernetes

// Kubernetes pod template to run.
podTemplate(
    cloud: K8S_CLUSTER,
    namespace: "default",
    name: POD_NAME,
    label: POD_NAME,
    yaml: """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: runner
    image: golang:1.17-buster
    imagePullPolicy: IfNotPresent
    tty: true
    env:
    - name: DOCKER_HOST
      vaule: tcp://localhost:2375
    - name: DOCKER_TLS_VERIFY
      value: ""
  - name: jnlp
    args: ["\$(JENKINS_SECRET)", "\$(JENKINS_NAME)"]
    image: "jenkins/inbound-agent:4.11.2-4-alpine"
    imagePullPolicy: IfNotPresent
  - name: dind
    image: docker:20.10.12-dind-rootless
    args: ["--insecure-registry=$(REGISTRY)"]
    env:
    - name: REGISTRY
      value: hub.k8s.li
    - name: DOCKER_TLS_CERTDIR
      value: ""
    securityContext:
      privileged: true
    tty: true
    readinessProbe:
      exec:
        command: ["docker", "info"]
      initialDelaySeconds: 10
      failureThreshold: 6
""",
) {
    node(POD_NAME) {
        container("runner") {
            stage("Checkout") {
                retry(10) {
                    checkout([
                        $class: 'GitSCM',
                        branches: scm.branches,
                        doGenerateSubmoduleConfigurations: scm.doGenerateSubmoduleConfigurations,
                        extensions: [[$class: 'CloneOption', noTags: false, shallow: false, depth: 0, reference: '']],
                        userRemoteConfigs: scm.userRemoteConfigs,
                    ])
                }
            }
            stage("Build") {
                sh """
                # make docker-build
                docker build -t app:v1.0.0-alpha.1 .
                """
            }
        }
    }
}

daemonset

由于使用的是 hostNetwork,因此可以通过 host IP 来访问 docker 的 TCP 端口,当然也可以像 deployment 那样通过 service Name 来访问,在这里就不演示了。

def JOB_NAME = "${env.JOB_BASE_NAME}"
def BUILD_NUMBER = "${env.BUILD_NUMBER}"
def POD_NAME = "jenkins-${JOB_NAME}-${BUILD_NUMBER}"
def K8S_CLUSTER = params.k8s_cluster ?: kubernetes

// Kubernetes pod template to run.
podTemplate(
    cloud: K8S_CLUSTER,
    namespace: "default",
    name: POD_NAME,
    label: POD_NAME,
    yaml: """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: runner
    image: golang:1.17-buster
    imagePullPolicy: IfNotPresent
    tty: true
    env:
    - name: DOCKER_HOST
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
    - name: DOCKER_TLS_VERIFY
      value: ""
  - name: jnlp
    args: ["\$(JENKINS_SECRET)", "\$(JENKINS_NAME)"]
    image: "jenkins/inbound-agent:4.11.2-4-alpine"
    imagePullPolicy: IfNotPresent
""",
) {
    node(POD_NAME) {
        container("runner") {
            stage("Checkout") {
                retry(10) {
                    checkout([
                        $class: 'GitSCM',
                        branches: scm.branches,
                        doGenerateSubmoduleConfigurations: scm.doGenerateSubmoduleConfigurations,
                        extensions: [[$class: 'CloneOption', noTags: false, shallow: false, depth: 0, reference: '']],
                        userRemoteConfigs: scm.userRemoteConfigs,
                    ])
                }
            }
            stage("Build") {
                sh """
                # make docker-build
                docker build -t app:v1.0.0-alpha.1 .
                """
            }
        }
    }
}

deployment

通过 service name 访问 docker,其他参数和 daemonset 都是相同的

def JOB_NAME = "${env.JOB_BASE_NAME}"
def BUILD_NUMBER = "${env.BUILD_NUMBER}"
def POD_NAME = "jenkins-${JOB_NAME}-${BUILD_NUMBER}"
def K8S_CLUSTER = params.k8s_cluster ?: kubernetes

// Kubernetes pod template to run.
podTemplate(
    cloud: K8S_CLUSTER,
    namespace: "default",
    name: POD_NAME,
    label: POD_NAME,
    yaml: """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: runner
    image: golang:1.17-buster
    imagePullPolicy: IfNotPresent
    tty: true
    env:
    - name: DOCKER_HOST
       value: tcp://dinp-deployment:2375
    - name: DOCKER_TLS_VERIFY
      value: ""
  - name: jnlp
    args: ["\$(JENKINS_SECRET)", "\$(JENKINS_NAME)"]
    image: "jenkins/inbound-agent:4.11.2-4-alpine"
    imagePullPolicy: IfNotPresent
""",
) {
    node(POD_NAME) {
        container("runner") {
            stage("Checkout") {
                retry(10) {
                    checkout([
                        $class: 'GitSCM',
                        branches: scm.branches,
                        doGenerateSubmoduleConfigurations: scm.doGenerateSubmoduleConfigurations,
                        extensions: [[$class: 'CloneOption', noTags: false, shallow: false, depth: 0, reference: '']],
                        userRemoteConfigs: scm.userRemoteConfigs,
                    ])
                }
            }
            stage("Build") {
                sh """
                # make docker-build
                docker build -t app:v1.0.0-alpha.1 .
                """
            }
        }
    }
}

其他

readinessProbe

有些时候 dind 无法正常启动,所以一定要设置就绪探针,来确保 diind 容器能够正常启动

readinessProbe:
  exec:
    command: ["docker", "info"]
  initialDelaySeconds: 10
  failureThreshold: 6

2375/2376 端口

docker 默认是以 TLS 方式启动,监听端口为 2376,如果设置环境变量 DOCKER_TLS_CERTDIR 为空则就以非 TLS 模式启动,监听端口为 2375,这时就不会校验 TLS 证书。如果使用 2376 端口,则就需要一个持久化存储来将 docker 生成的证书共享给客户端,这点比较麻烦。因此如果不想瞎折腾还是使用 2375 非 TLS 方式吧 😂。

dinp 必须以开启 privileged: true

以 pod 方式运行 docker,无论是否是 rootless 模式,都要在 pod 容器的 securityContext 中设置 privileged: true,否则 pod 无法正常启动。而且 rootless 模式也有一定的限制,需要依赖一些内核的特性,目前也只是实验阶段,没有特殊的需求还是尽量不要使用 rootless 特性吧。

[root@localhost ~]# kubectl logs -f dinp-sidecar
error: a container name must be specified for pod dinp-sidecar, choose one of: [debug dind]
[root@localhost ~]# kubectl logs -f dinp-sidecar -c dind
Device "ip_tables" does not exist.
ip_tables              27126  4 iptable_raw,iptable_mangle,iptable_filter,iptable_nat
modprobe: can't change directory to '/lib/modules': No such file or directory
WARN[0000] failed to mount sysfs, falling back to read-only mount: operation not permitted
WARN[0000] failed to mount sysfs: operation not permitted
open: No such file or directory
[rootlesskit:child ] error: executing [[ip tuntap add name tap0 mode tap] [ip link set tap0 address 02:50:00:00:00:01]]: exit status 1

rootless user.max_user_namespaces

rootless 模式下需要依赖一些内核参数 Run the Docker daemon as a non-root user (Rootless mode)。在 CentOS 7.9 上会出现 dind-rootless: failed to start up dind rootless in k8s due to max_user_namespaces 问题。解决方案是在修改一下 user.max_user_namespaces=28633 内核参数。

Add user.max_user_namespaces=28633 to /etc/sysctl.conf (or /etc/sysctl.d) and run sudo sysctl -p

[root@localhost ~]# kubectl get pod -w
NAME                              READY   STATUS   RESTARTS     AGE
dinp-deployment-cf488bfd8-g8vxx   0/1     CrashLoopBackOff   1 (2s ago)   4s
[root@localhost ~]# kubectl logs -f dinp-deployment-cf488bfd8-m5cms
Device "ip_tables" does not exist.
ip_tables              27126  5 iptable_raw,iptable_mangle,iptable_filter,iptable_nat
modprobe: can't change directory to '/lib/modules': No such file or directory
error: attempting to run rootless dockerd but need 'user.max_user_namespaces' (/proc/sys/user/max_user_namespaces) set to a sufficiently large value

非 rootless 模式下同一 node 节点只能运行一个 dinp

如果是使用 deployment 方式部署 dinp,一个 node 节点上只能有一个 dinp Pod,多余的 Pod 无法正常启动。因此如果想要运行多个 dinp Pod,建议使用 daemonset 方式运行它;

[root@localhost ~]# kubectl get deploy
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
dinp-deployment   1/3     3            1           4m16s
[root@localhost ~]# kubectl get pod -w
NAME                               READY   STATUS    RESTARTS      AGE
dinp-deployment-547bd9bb6d-2mn6c   0/1     Running   3 (61s ago)   4m9s
dinp-deployment-547bd9bb6d-8ht8l   1/1     Running   0             4m9s
dinp-deployment-547bd9bb6d-x5vpv   0/1     Running   3 (61s ago)   4m9s
[root@localhost ~]# kubectl logs -f dinp-deployment-547bd9bb6d-2mn6c
INFO[2022-03-14T14:14:10.905652548Z] Starting up
WARN[2022-03-14T14:14:10.906986721Z] could not change group /var/run/docker.sock to docker: group docker not found
WARN[2022-03-14T14:14:10.907249071Z] Binding to IP address without --tlsverify is insecure and gives root access on this machine to everyone who has access to your network.  host="tcp://0.0.0.0:2375"
WARN[2022-03-14T14:14:10.907269951Z] Binding to an IP address, even on localhost, can also give access to scripts run in a browser. Be safe out there!  host="tcp://0.0.0.0:2375"
WARN[2022-03-14T14:14:11.908057635Z] Binding to an IP address without --tlsverify is deprecated. Startup is intentionally being slowed down to show this message  host="tcp://0.0.0.0:2375"
WARN[2022-03-14T14:14:11.908103696Z] Please consider generating tls certificates with client validation to prevent exposing unauthenticated root access to your network  host="tcp://0.0.0.0:2375"
WARN[2022-03-14T14:14:11.908114541Z] You can override this by explicitly specifying '--tls=false' or '--tlsverify=false'  host="tcp://0.0.0.0:2375"
WARN[2022-03-14T14:14:11.908125477Z] Support for listening on TCP without authentication or explicit intent to run without authentication will be removed in the next release  host="tcp://0.0.0.0:2375"
INFO[2022-03-14T14:14:26.914587276Z] libcontainerd: started new containerd process  pid=41
INFO[2022-03-14T14:14:26.914697125Z] parsed scheme: "unix"                         module=grpc
INFO[2022-03-14T14:14:26.914710376Z] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2022-03-14T14:14:26.914785052Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc
INFO[2022-03-14T14:14:26.914796039Z] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2022-03-14T14:14:26.930311832Z] starting containerd                           revision=7b11cfaabd73bb80907dd23182b9347b4245eb5d version=v1.4.12
INFO[2022-03-14T14:14:26.953641900Z] loading plugin "io.containerd.content.v1.content"...  type=io.containerd.content.v1
INFO[2022-03-14T14:14:26.953721059Z] loading plugin "io.containerd.snapshotter.v1.aufs"...  type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960295816Z] skip loading plugin "io.containerd.snapshotter.v1.aufs"...  error="aufs is not supported (modprobe aufs failed: exit status 1 \"ip: can't find device 'aufs'\\nmodprobe: can't change directory to '/lib/modules': No such file or directory\\n\"): skip plugin" type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960329840Z] loading plugin "io.containerd.snapshotter.v1.btrfs"...  type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960524514Z] skip loading plugin "io.containerd.snapshotter.v1.btrfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (xfs) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960537441Z] loading plugin "io.containerd.snapshotter.v1.devmapper"...  type=io.containerd.snapshotter.v1
WARN[2022-03-14T14:14:26.960558843Z] failed to load plugin io.containerd.snapshotter.v1.devmapper  error="devmapper not configured"
INFO[2022-03-14T14:14:26.960569516Z] loading plugin "io.containerd.snapshotter.v1.native"...  type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960593224Z] loading plugin "io.containerd.snapshotter.v1.overlayfs"...  type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960678728Z] loading plugin "io.containerd.snapshotter.v1.zfs"...  type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960814844Z] skip loading plugin "io.containerd.snapshotter.v1.zfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2022-03-14T14:14:26.960827133Z] loading plugin "io.containerd.metadata.v1.bolt"...  type=io.containerd.metadata.v1
WARN[2022-03-14T14:14:26.960839223Z] could not use snapshotter devmapper in metadata plugin  error="devmapper not configured"
INFO[2022-03-14T14:14:26.960848698Z] metadata content store policy set             policy=shared
WARN[2022-03-14T14:14:27.915528371Z] grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout". Reconnecting...  module=grpc
WARN[2022-03-14T14:14:30.722257725Z] grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout". Reconnecting...  module=grpc
WARN[2022-03-14T14:14:35.549453706Z] grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout". Reconnecting...  module=grpc
WARN[2022-03-14T14:14:41.759010407Z] grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout". Reconnecting...  module=grpc
failed to start containerd: timeout waiting for containerd to start

/var/lib/docker 不支持共享存储

陈少文博主曾在 《/var/lib/docker 能不能挂载远端存储》提到过 docker 目前并支持将 /var/lib/docker 挂载远程存储使用,因此建议使用 hostPath 的方式保存 docker 的持久化存储数据。

本次测试使用的 Docker 版本为 20.10.6,不能将 /var/lib/docker 挂载远程存储使用。主要原因是容器的实现依赖于内核的能力(xttrs),而类似 NFS Server 这种远程存储无法提供这些能力。如果采用 Device Mapper 进行映射,使用磁盘挂载存在可行性,但只能用于迁移而不能实现共享。

INFO[2022-03-13T13:43:08.750810130Z] ClientConn switching balancer to "pick_first"  module=grpc
ERRO[2022-03-13T13:43:08.781932359Z] failed to mount overlay: invalid argument     storage-driver=overlay2
ERRO[2022-03-13T13:43:08.782078828Z] exec: "fuse-overlayfs": executable file not found in $PATH  storage-driver=fuse-overlayfs
ERRO[2022-03-13T13:43:08.793311119Z] AUFS was not found in /proc/filesystems       storage-driver=aufs
ERRO[2022-03-13T13:43:08.813505621Z] failed to mount overlay: invalid argument     storage-driver=overlay
ERRO[2022-03-13T13:43:08.813529990Z] Failed to built-in GetDriver graph devicemapper /var/lib/docker
INFO[2022-03-13T13:43:08.897769363Z] Loading containers: start.
WARN[2022-03-13T13:43:08.919252078Z] Running modprobe bridge br_netfilter failed with message: ip: can't find device 'bridge'
[root@localhost dinp]# kubectl exec -it dinp-sidecar -c debug sh
/ # docker pull alpine
Using default tag: latest
Error response from daemon: error creating temporary lease: file resize error: truncate /var/lib/docker/containerd/daemon/io.containerd.metadata.v1.bolt/meta.db: bad file descriptor: unknown

参考