Training and Fine-Tuning ControlNet on Your Own Dataset
About the Author StudyingLover · 2023-04-28 · via StudyingLover's Blog

Training and fine-tuning are the same thing here, so from now on we will simply say "training".

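Update, 2024.1.20: ControlNet has been out for almost a year and diffusers now has a mature ecosystem around it. I recommend going straight to the second approach below and using diffusers for both training and inference.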

Training from the official repository

Official tutorial: https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md

Environment setup

First, check whether a GPU is available

nvidia-smi

Start by cloning the whole repository

git clone https://github.com/lllyasviel/ControlNet.git

Then create the conda virtual environment (optional; any method that gets you a working environment is fine)

cd ControlNet
conda env create -f environment.yaml
conda activate control

Next we need to download Stable Diffusion and a training set, because what we are doing is fine-tuning the Stable Diffusion model.

Download SD 1.5 into the models directory

cd ./models
wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned.ckpt

Download the training dataset into the training folder

# go back to the repository root first, so that training/ sits next to models/
cd ..
mkdir training
cd ./training
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/training/fill50k.zip

Unpack the dataset

unzip fill50k.zip

This dataset is quite large, though, so we can also use a much smaller one

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png

Then rename conditioning_image_1.png to 0.png and put it in the ./source directory, and rename conditioning_image_2.png to 0.png and put it in the ./target directory

mkdir -p source target
mv conditioning_image_1.png ./source/0.png
mv conditioning_image_2.png ./target/0.png

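Then create a file named prompt.json with the following content. The official tutorial_dataset.py parses this file one line at a time, so keep each sample on a single line.

{"source": "source/0.png", "target": "target/0.png", "prompt": "pale golden rod circle with old lace background"}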
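If you have more than a handful of samples, writing prompt.json by hand gets tedious. Below is a minimal sketch that generates it, assuming the loader expects one JSON object per line, as the official tutorial's dataset class appears to. The helper names `write_prompt_file` and `read_prompt_file` are illustrative and not part of the ControlNet repository.

```python
import json
from pathlib import Path


def write_prompt_file(samples, path):
    """Write one JSON object per line (JSON Lines)."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample) + "\n")
    return path


def read_prompt_file(path):
    """Read the file back the same way: one json.loads call per non-empty line."""
    with Path(path).open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# The single-sample dataset from this post.
samples = [
    {
        "source": "source/0.png",
        "target": "target/0.png",
        "prompt": "pale golden rod circle with old lace background",
    }
]
```

Calling `write_prompt_file(samples, "prompt.json")` from inside the dataset folder produces the file; `read_prompt_file` is only there to check the round trip.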

Either way, the final file structure is the one shown in the screenshot (image.png).
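Before starting a long run it can be worth checking that every file named in prompt.json actually exists. The sketch below assumes a layout of `<dataset root>/prompt.json` with `source/` and `target/` next to it, and one JSON object per line; the function name `missing_files` is made up for this example.

```python
import json
from pathlib import Path


def missing_files(dataset_root):
    """Return the files referenced by prompt.json that do not exist on disk."""
    root = Path(dataset_root)
    missing = []
    with (root / "prompt.json").open("r", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            entry = json.loads(line)
            for key in ("source", "target"):
                if not (root / entry[key]).is_file():
                    missing.append(entry[key])
    return missing
```

An empty list means the layout is consistent with prompt.json; anything else lists the paths you still need to fix.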

Training

First adjust batch_size in tutorial_train.py; if you hit an out of memory error during training, lower it.

Then run tutorial_train.py, close your eyes, and wait for training to finish

python tutorial_train.py

With the full dataset one epoch takes roughly 6 hours; with a single image it finishes very quickly.
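To get a feel for what the 6-hour figure means per step, here is a back-of-the-envelope sketch. It assumes the full fill50k set has 50,000 samples (as the name suggests) and that the epoch time stays at roughly 6 hours; in practice the time per epoch also depends on your GPU and batch size, so treat the numbers as rough.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


def seconds_per_step(epoch_hours, num_samples, batch_size):
    """Average wall-clock time per step for a given epoch duration."""
    return epoch_hours * 3600 / steps_per_epoch(num_samples, batch_size)


# Assumed values: 50,000 samples and a 6-hour epoch.
for bs in (1, 2, 4):
    print(bs, steps_per_epoch(50_000, bs), round(seconds_per_step(6, 50_000, bs), 3))
```

Lowering batch_size to avoid out of memory errors increases the number of steps per epoch accordingly.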

To keep a flaky network from dropping the ssh session and interrupting training, we can run it inside screen

screen -S train
conda activate control
python tutorial_train.py

The training results show up in image_log

image.png

Inference

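The original author did not provide inference code, but someone has published a script (linked on GitHub) that converts the model you trained into diffusers format; after that you can run inference the diffusers way described below.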

Troubleshooting

Out of memory (OOM)

First turn on save_memory mode by changing False to True in config.py

Also lower batch_size
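If you would rather script the config.py change than edit it by hand, a small text substitution is enough. This sketch assumes config.py contains a line of the form `save_memory = False`; the function name `enable_save_memory` is hypothetical, and the function only transforms text, leaving reading and writing the file to you.

```python
import re


def enable_save_memory(config_text):
    """Flip `save_memory = False` to `save_memory = True` in the given text."""
    return re.sub(
        r"^(\s*save_memory\s*=\s*)False\b",
        r"\1True",
        config_text,
        flags=re.MULTILINE,
    )
```

Text that already says `save_memory = True` passes through unchanged, so running it twice is harmless.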

No operator found for memory_efficient_attention_backward

Uninstall xformers

pip uninstall xformers

TypeError: on_train_batch_start() missing 1 required positional argument: 'dataloader_idx'

This one is nasty: the problem is in the paper's code, and a small edit to the source fixes it

  1. Line 591 of ControlNet/ldm/models/diffusion/ddpm.py
def on_train_batch_start(self, batch, batch_idx, dataloader_idx):

Remove dataloader_idx so that it becomes

def on_train_batch_start(self, batch, batch_idx):
  2. Line 74 of ControlNet/cldm/logger.py
def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx):

Remove dataloader_idx so that it becomes

def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
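The error message says the hook was called without dataloader_idx while the repository's definition requires it. An alternative to deleting the parameter (not the fix used in this post) is to give it a default value, so the hook accepts both call styles. The sketch below demonstrates the idea on a plain stand-in class, `TolerantHooks`, which is hypothetical and does not depend on PyTorch Lightning.

```python
class TolerantHooks:
    """Stand-in for a module or callback; only the signature matters here."""

    def __init__(self):
        self.calls = []

    # dataloader_idx is optional, so callers may pass it or leave it out.
    def on_train_batch_start(self, batch, batch_idx, dataloader_idx=0):
        self.calls.append((batch_idx, dataloader_idx))


hooks = TolerantHooks()
hooks.on_train_batch_start("batch", 0)     # call without dataloader_idx
hooks.on_train_batch_start("batch", 1, 0)  # call with dataloader_idx
```

Both calls succeed, which is the property you want if the same code has to run against more than one version of the training framework.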

Training with Diffusers

Diffusers is Hugging Face's library that wraps diffusion models, and it wraps ControlNet as well: https://github.com/huggingface/diffusers/tree/main/examples/controlnet

Training

Getting the code running is also very simple: clone the whole diffusers repository, then install the dependencies

git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -e .
cd examples/controlnet
pip install -r requirements.txt

You may run into warnings like these

  WARNING: The scripts accelerate, accelerate-config and accelerate-launch are installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script transformers-cli is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script ftfy is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script tensorboard is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script datasets-cli is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.

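Don't panic: the dependencies installed fine, they just landed in a directory that is not on PATH. To use any of the tools mentioned in the warnings, give the full path. For example, we are about to use accelerate, which is normally invoked as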

accelerate <what you want to run>

We only need to change that to

/home/ubuntu/.local/bin/accelerate <what you want to run>
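The install location differs between machines, so hard-coding /home/ubuntu/.local/bin is brittle. Here is a minimal sketch that looks on PATH first and then falls back to extra directories; the helper name `find_tool` and the default fallback `~/.local/bin` are assumptions for illustration.

```python
import os
import shutil


def find_tool(name, extra_dirs=("~/.local/bin",)):
    """Return the full path of `name`, searching PATH first and then extra_dirs."""
    found = shutil.which(name)
    if found:
        return found
    # Build a PATH-style string from the fallback directories.
    fallback = os.pathsep.join(os.path.expanduser(d) for d in extra_dirs)
    return shutil.which(name, path=fallback)
```

`find_tool("accelerate")` returns the full path to use in place of the bare command name, or None if the tool cannot be found in either place.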

Next, run accelerate config

accelerate config

Answering NO to every question is fine; if you have multiple GPUs or a similar setup, see the official documentation

We also need some test images

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png

Next, set the base model and the output directory for the trained model

export OUTPUT_DIR="./out_models"
export MODEL_DIR="runwayml/stable-diffusion-v1-5"

Run the training script; here epoch=1 and steps=1

/home/ubuntu/.local/bin/accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path=$MODEL_DIR \
  --output_dir=$OUTPUT_DIR \
  --dataset_name=fusing/fill50k \
  --resolution=512 \
  --learning_rate=1e-5 \
  --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
  --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
  --train_batch_size=4 \
  --num_train_epochs=1 \
  --max_train_steps=1
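If you launch training from a Python script, it helps to build the argument list in one place. The sketch below assembles the same flags as the command above and nothing more; the function name `build_train_command` and its parameters are illustrative, and requiring one prompt per validation image is a simplifying assumption of this sketch.

```python
def build_train_command(model_dir, output_dir, validation_images, validation_prompts,
                        accelerate="accelerate", batch_size=4, max_train_steps=1):
    """Return the training command as an argument list (nothing is executed)."""
    if len(validation_images) != len(validation_prompts):
        raise ValueError("this sketch expects one prompt per validation image")
    return [
        accelerate, "launch", "train_controlnet.py",
        f"--pretrained_model_name_or_path={model_dir}",
        f"--output_dir={output_dir}",
        "--dataset_name=fusing/fill50k",
        "--resolution=512",
        "--learning_rate=1e-5",
        "--validation_image", *validation_images,
        "--validation_prompt", *validation_prompts,
        f"--train_batch_size={batch_size}",
        "--num_train_epochs=1",
        f"--max_train_steps={max_train_steps}",
    ]


cmd = build_train_command(
    "runwayml/stable-diffusion-v1-5",
    "./out_models",
    ["./conditioning_image_1.png", "./conditioning_image_2.png"],
    ["red circle with blue background", "cyan circle with brown floral background"],
)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids quoting problems with the prompts.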

Inference

Create a new file named inference.py

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image
import torch

base_model_path = "path to model"
controlnet_path = "path to controlnet"

controlnet = ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    base_model_path, controlnet=controlnet, torch_dtype=torch.float16
)

# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()

pipe.enable_model_cpu_offload()

control_image = load_image("./conditioning_image_1.png")
prompt = "pale golden rod circle with old lace background"

# generate image
generator = torch.manual_seed(0)
image = pipe(
     prompt, num_inference_steps=20, generator=generator, image=control_image
).images[0]

image.save("./output.png")

Set base_model_path and controlnet_path to the MODEL_DIR and OUTPUT_DIR configured earlier (mind the order)
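One way to avoid mixing up the two paths is to read them from the same environment variables that were exported for training. This is a sketch, not part of the original script: the function name `resolve_paths` is made up, and the defaults simply repeat the values exported earlier in this post.

```python
import os


def resolve_paths(env=os.environ):
    """Map MODEL_DIR -> base_model_path and OUTPUT_DIR -> controlnet_path."""
    base_model_path = env.get("MODEL_DIR", "runwayml/stable-diffusion-v1-5")
    controlnet_path = env.get("OUTPUT_DIR", "./out_models")
    return base_model_path, controlnet_path


base_model_path, controlnet_path = resolve_paths()
```

These two assignments can replace the hard-coded strings at the top of inference.py, as long as the script runs in a shell where the variables are still exported or the defaults match your setup.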

Then just run it

python inference.py

The result is saved to output.png

Troubleshooting

WARNING: The scripts accelerate, accelerate-config and accelerate-launch are installed in '/home/ubuntu/.local/bin' which is not on PATH. Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.


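As before, the dependencies were installed successfully; they just went into a directory that has not been added to PATH, so give the full path when you run them. For example, accelerate is normally invoked as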

accelerate <what you want to run>

We only need to change that to

/home/ubuntu/.local/bin/accelerate <what you want to run>