惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
T
Troy Hunt's Blog
Schneier on Security
Schneier on Security
N
News | PayPal Newsroom
Hacker News: Ask HN
Hacker News: Ask HN
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
Google DeepMind News
Google DeepMind News
www.infosecurity-magazine.com
www.infosecurity-magazine.com
N
News and Events Feed by Topic
V
Vulnerabilities – Threatpost
Cyberwarzone
Cyberwarzone
K
Kaspersky official blog
P
Privacy & Cybersecurity Law Blog
P
Privacy International News Feed
WordPress大学
WordPress大学
U
Unit 42
PCI Perspectives
PCI Perspectives
S
Schneier on Security
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
V
Visual Studio Blog
Engineering at Meta
Engineering at Meta
The Cloudflare Blog
I
Intezer
宝玉的分享
宝玉的分享
N
News and Events Feed by Topic
Martin Fowler
Martin Fowler
B
Blog
美团技术团队
T
The Blog of Author Tim Ferriss
C
Cisco Blogs
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
酷 壳 – CoolShell
酷 壳 – CoolShell
The Last Watchdog
The Last Watchdog
J
Java Code Geeks
博客园_首页
A
About on SuperTechFans
Vercel News
Vercel News
Attack and Defense Labs
Attack and Defense Labs
H
Heimdal Security Blog
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
IT之家
IT之家
小众软件
小众软件
H
Help Net Security
D
Darknet – Hacking Tools, Hacker News & Cyber Security
Exploit-DB.com RSS Feed
Exploit-DB.com RSS Feed
T
The Exploit Database - CXSecurity.com
Y
Y Combinator Blog
Recent Commits to openclaw:main
Recent Commits to openclaw:main
Webroot Blog
Webroot Blog
T
Tenable Blog

StudyingLover's Blog

Diffusion Policy笔记 rwkv笔记 act笔记 nanovllm-block_manager opencode多智能体 nanobot-pre-train nanobot-rl nanobot-sft nanobot-checkpoint_manager nanobot-gpt nanobot-mid-train Vision Mamba (Vim)笔记 BPE演示 最后一遍学习Transformer YOLOv5 目标检测笔记 下载根服务器解析记录 Dynaseal A Backend-Controlled LLM API Key Distribution Scheme with Constrained Invocation Parameters 判断链表有环 王道25数据结构勘误 关于perplexity的open-sourcing-r1-1776 AI为什么不像人类一样进行多轮对话 新博客改造日记和功能测试 linuxqq只显示登陆背景图 数字设计和计算机体系结构(机械工业出版社)勘误(自制) Dynaseal:面向未来端侧llm agent的llm api key分发机制 A Definitive Guide to Markdown Style This post is using MDX, Where you can embed JSX and Astro components RT-Patch学习 pydantic实现的LLM ReAct fastapi 和 uvicorn 设置监听 ipv6 pydantic+openai+json 控制大模型输出的最佳范式 解决 Matplotlib Scatter 不支持 Marker 列表的问题:mscatter 实现 roofline model zhipuAI接口兼容openai 在docker部署fastapi宝塔里使用nginx反代套上cloudflare获取请求的真实ip clion搭建libbpf-bootstrap开发环境 coze+coze-discord-proxy+ChatNextWebUI实现AI自由 安卓内核时间使用的是UTC时间 colab运行google最新开源模型Gemma Sora技术报告 视频生成模型作为世界模拟器 笔记 archlinux flutter开发踩坑 fastapi集成google auth登录 linux下NTFS磁盘报错输入输出错误 Venn-Abers 预测器 基于Venn-Abers预测器的系统日志异常检测方法_顾兆军 手机平板远程访问kvm虚拟机的windows phi-2弱智吧测评 poe的gemini pro或是百度开发 google gemini api使用 google gemini api申请 构建用于复杂数据处理的高效UDP服务器和客户端 matplotlib中文字体渲染 TruFor笔记和代码复现 深入分析:GitHub Trending 项目 "multipleWindow3dScene" pua大模型 ggml教程|mnist手写体识别量化推理 xgboost2.0最佳实践 xgboost使用GPU最佳实践 马踏棋盘 cloudlflare推理llama2 docker搭建elasticsearch并使用python连接 FreeU-文字生成图片的免费午餐笔记 使用xgboost的c接口推理模型 Archlinux使用CMake调用xgboost的c接口 m2cgen生成机器学习c语言推理代码 xgboost模型序列化存储并推理 speculative-sampling笔记 prompt2model笔记 RoboTAP笔记 自建obsidian同步服务 MediaPipe即将推出图像生成服务 Dual-Stream Diffusion Net for Text-to-Video Generation笔记 ViT在DDPM取代UNet(DiT) arch4edu搞崩了我的flutter LISA(推理分割)笔记 在终端绘制GPU显存使用曲线 GPTBot介绍 arch蓝牙无法连接 GPU部署llama-cpp-python(llama.cpp通用) 花式求GCD 使用llama构建一个蜜罐(前端) 使用llama构建一个蜜罐(后端) llama-cpp-python快速上手 快速上手llama2.c(更新版) Paper Gestalt笔记 DINO-v2笔记 快速上手llama2.c AnyDoor笔记 Archlinux安装scrcpy加载共享库出错 error while loading shared libraries:libusb-1.0.so.0:wrong ELF class:ELFCLASS32 npc_gzip笔记 python调用c++函数 Filesystem type ntfs3,ntfs not configured in kernel open_clip编码图像和文本 PicGo配置CloudflareR2图片储存 ArchlinuxGnome快捷键打开终端 clip-interrogator代码解析 GroundingDINO安装报错解决 2023华为鲲鹏畅想日暨西安高新国际会议中心零食午饭测评 RoboMaster开源仓库汇总(长期更新) 没有手都可以在腾讯云创建镜像
ControlNet训练和微调自己数据集
About the Author StudyingLover · 2023-04-28 · via StudyingLover's Blog

训练和微调在这里是一件事情,我们下面就统一用训练这个词。

2024.1.20 更新 controlnet 发布快一年了,diffusers 已经有了很完整的生态,建议直接使用第二种方式 diffusers 进行训练+推理

从官方仓库训练

官方教程 https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md

环境配置

先看一下有没有显卡

nvidia-smi

首先下载整个仓库

git clone https://github.com/lllyasviel/ControlNet.git

然后创建 conda 虚拟环境(选做,只要你能配好环境)

conda env create -f environment.yaml
conda activate control

接下来需要下载 stable diffusion 和训练集,因为我们是对 stable diffusion 模型做微调。

下载 sd1.5 到,models 目录

cd ./models
wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned.ckpt

下载训练数据集到 training 文件夹

mkdir training
cd ./training
wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/training/fill50k.zip

解压数据集

unzip fill50k.zip

当然这个数据集非常大,我们也可以选择小一点的

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png

然后将 conditioning_image_1.png 改名 0.png 放到./source 目录下,conditioning_image_2.png 改名放到./target 目录下

mv conditioning_image_1.png 0.png
mv 0.png ./source

mv conditioning_image_2.png 0.png
mv 0.png ./target

然后创建一个prompt.json 的文件写入

{
	"source": "source/0.png",
	"target": "target/0.png",
	"prompt": "pale golden rod circle with old lace background"
}

无论是哪种方式,最后的文件结构是这样的image.png

训练

首先调一下tutorial_train.py 里的 batch_size,训练过程中如果出现 out of memory 的情况可以调小。

接下来运行 tutorial_train.py,闭上眼睛等待训练完成即可

python tutorial_train.py

如果是完整数据集,大概 6 个小时一个 epoch,如果是单张图片会很快。

当然,为了不要出现网不好 ssh 断掉导致训练终端,我们可以使用 screne

screen -S train
conda activate control
python tutorial_train.py

训练出的结果可以在image_log 中看到

image.png

推理

原作者没有给出怎么推理代码的方式,但是有人给出了一个脚本 GitHub 将你训练出来的模型转换成 diffusers,接着你就可以中下面 diffusers 的方式推理模型了。

踩坑解决

out of memory(oom)

首先开启save_memory模式,将config.py 中 False 改为 True

同时调低 batch_size

No operator found for memory_efficient_attention_backward

卸载  xformers

pip uninstall  xformers

TypeError: on_train_batch_start() missing 1 required positional argument: ‘dataloader_idx’

这个比较坑,是论文代码有问题,改一下源码就好

  1. ControlNet/ldm/models/diffusion/ddpm.py 文件 591 行
def on_train_batch_start(self, batch, batch_idx, dataloader_idx):

删除 dataloader_idx,改为

def on_train_batch_start(self, batch, batch_idx):
  1. ControlNet/cldm/logger.py 文件 74 行
def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx):

删除 dataloader_idx,改为

def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):

Diffusers 训练

Diffusers 是一个 huggingface 推出的扩散模型的封装库,同时也对 ControlNet 做了封装,https://github.com/huggingface/diffusers/tree/main/examples/controlnet

训练

代码跑起来其实也非常简单,首先下载 diffusers 整个仓库,然后安装依赖

git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -r requirements.txt

你可能会发现这样的报错 image.png

  WARNING: The scripts accelerate, accelerate-config and accelerate-launch are installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script transformers-cli is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script ftfy is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script tensorboard is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script datasets-cli is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.

别慌,依赖已经下载成功了,只是下载到了一个不在 PATH 的路径,接下来如果要使用这些被提到的库就需要指明路径,例如下面我们要使用 accelerate,正常的用法是

accelerate 你要执行的东西

我们只需要改成

/home/ubuntu/.local/bin/accelerate 你要执行的东西

接下来运行 tutorial_train

accelerate config

全部选 NO 就好,如果你有多卡什么的可以参考官方文档

我们需要测试数据集

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png

接着运行,设置基础模型和模型输出目录

export OUTPUT_DIR="./out_models"
export MODEL_DIR="runwayml/stable-diffusion-v1-5"

运行代码,这里 epoch=1,steps=1

/home/ubuntu/.local/bin/accelerate launch train_controlnet.py   --pretrained_model_name_or_path=$MODEL_DIR  --output_dir=$OUTPUT_DIR   --dataset_name=fusing/fill50k   --resolution=512   --learning_rate=1e-5   --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png"   --validation_prompt "red circle with blue background" "cyan circle with brown floral background"   --train_batch_size=4 --num_train_epochs=1 --max_train_steps=1

推理

新建一个文件inference.py

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image
import torch

base_model_path = "path to model"
controlnet_path = "path to controlnet"

controlnet = ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    base_model_path, controlnet=controlnet, torch_dtype=torch.float16
)

# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()

pipe.enable_model_cpu_offload()

control_image = load_image("./conditioning_image_1.png")
prompt = "pale golden rod circle with old lace background"

# generate image
generator = torch.manual_seed(0)
image = pipe(
     prompt, num_inference_steps=20, generator=generator, image=control_image
).images[0]

image.save("./output.png")

这里的 base_model_path 和 controlnet_path 改成之前设置的 MODEL_DIR 和 OUTPUT_DIR(注意顺序)

接下来运行就可

python inference.py

结果会被保存到output.png

踩坑解决

WARNING: The scripts accelerate, accelerate-config and accelerate-launch are installed in ‘/home/ubuntu/.local/bin’ which is not on PATH.Consider adding this directory to PATH or, if you prefer to suppress this warning, use —no-warn-script-location.

  WARNING: The scripts accelerate, accelerate-config and accelerate-launch are installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script transformers-cli is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script ftfy is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script tensorboard is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script datasets-cli is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.

类似的问题,这里的依赖已经安装成功了,只是被安装到了未被添加到 PATH 的目录,接下来运行的时候只需要指明目录即可。例如下面我们要使用 accelerate,正常的用法是

accelerate 你要执行的东西

我们只需要改成

/home/ubuntu/.local/bin/accelerate 你要执行的东西