AUTOMATIC1111 / stable-diffusion-webui


Stable Diffusion web UI

A web interface for Stable Diffusion, implemented using the Gradio library.

Features

Detailed feature showcase with images:

  • Original txt2img and img2img modes
  • One click install and run script (but you still must install python and git)
  • Outpainting
  • Inpainting
  • Color Sketch
  • Prompt Matrix
  • Stable Diffusion Upscale
  • Attention, specify parts of text that the model should pay more attention to:
    • a man in a ((tuxedo)) - will pay more attention to tuxedo
    • a man in a (tuxedo:1.21) - alternative syntax
    • select text and press Ctrl+Up or Ctrl+Down (or Command+Up or Command+Down on macOS) to automatically adjust attention to selected text (code contributed by anonymous user)
  • Loopback, run img2img processing multiple times
  • X/Y/Z plot, a way to draw a three-dimensional plot of images with different parameters
  • Textual Inversion:
    • have as many embeddings as you want and use any names you like for them
    • use multiple embeddings with different numbers of vectors per token
    • works with half precision floating point numbers
    • train embeddings on 8GB (also reports of 6GB working)
  • Extras tab with:
    • GFPGAN, neural network that fixes faces
    • CodeFormer, face restoration tool as an alternative to GFPGAN
    • RealESRGAN, neural network upscaler
    • ESRGAN, neural network upscaler with a lot of third party models
    • SwinIR and Swin2SR (see here), neural network upscalers
    • LDSR, Latent diffusion super resolution upscaling
  • Resizing aspect ratio options
  • Sampling method selection
  • Adjust sampler eta values (noise multiplier)
  • More advanced noise setting options
  • Interrupt processing at any time
  • 4GB video card support (also reports of 2GB working)
  • Correct seeds for batches
  • Live prompt token length validation
  • Generation parameters:
    • parameters you used to generate images are saved with that image in PNG chunks for PNG, in EXIF for JPEG
    • can drag the image to PNG info tab to restore generation parameters and automatically copy them into UI
    • can be disabled in settings
    • drag and drop an image/text-parameters to promptbox
    • Read Generation Parameters Button, loads parameters in promptbox to UI
  • Settings page
  • Running arbitrary python code from UI (must run with --allow-code to enable)
  • Mouseover hints for most UI elements
  • Possible to change default/min/max/step values for UI elements via text config
  • Tiling support, a checkbox to create images that can be tiled like textures
  • Progress bar and live image generation preview
  • Can use a separate neural network to produce previews with almost no VRAM or compute requirements
  • Negative prompt, an extra text field that allows you to list what you don’t want to see in generated image
  • Styles, a way to save part of prompt and easily apply them via dropdown later
  • Variations, a way to generate same image but with tiny differences
  • Seed resizing, a way to generate same image but at slightly different resolution
  • CLIP interrogator, a button that tries to guess prompt from an image
  • Prompt Editing, a way to change prompt mid-generation, say to start making a watermelon and switch to anime girl midway
  • Batch Processing, process a group of files using img2img
  • Img2img Alternative, reverse Euler method of cross attention control
  • Highres Fix, a convenience option to produce high resolution pictures in one click without usual distortions
  • Reloading checkpoints on the fly
  • Checkpoint Merger, a tab that allows you to merge up to 3 checkpoints into one
  • Custom scripts with many extensions from community
  • Composable-Diffusion, a way to use multiple prompts at once:
    • separate prompts using uppercase AND
    • also supports weights for prompts: a cat :1.2 AND a dog AND a penguin :2.2
  • No token limit for prompts (original stable diffusion lets you use up to 75 tokens)
  • DeepDanbooru integration, creates danbooru style tags for anime prompts
  • xformers, major speed increase for select cards (add --xformers to commandline args)
  • via extension:
    • History tab: view, navigate to, and delete images conveniently within the UI
    • Generate forever option
    • Training tab: hypernetworks and embeddings options
    • Preprocessing images: cropping, mirroring, autotagging using BLIP or deepdanbooru (for anime)
    • Clip skip
    • Hypernetworks
    • Loras (same as Hypernetworks but prettier)
    • A separate UI where you can choose, with preview, which embeddings, hypernetworks or Loras to add to your prompt
    • Can select to load a different VAE from settings screen
    • Estimated completion time in progress bar
    • API
    • Support for dedicated inpainting model by RunwayML
    • via extension: Aesthetic Gradients, a way to generate images with a specific aesthetic by using CLIP image embeds
  • Stable Diffusion 2.0 support
  • Alt-Diffusion support
  • Load checkpoints in safetensors format
  • Eased resolution restriction: generated image’s dimensions must be a multiple of 8 rather than 64
  • Reorder elements in the UI from settings screen
  • Segmind Stable Diffusion support
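The attention syntax above can be illustrated with a small parser. This is a simplified sketch of the weighting rules as described (each pair of parentheses multiplies the weight by 1.1, and `(text:1.21)` sets an explicit weight); it is not the webui's actual parser, which also handles square brackets and escaping.

```python
import re

def parse_attention(prompt):
    """Split a prompt into (text, weight) pairs using the attention
    syntax: '(...)' multiplies weight by 1.1 per nesting level, and
    '(text:1.21)' sets an explicit weight. Simplified sketch only."""
    result = []
    stack = [1.0]          # weight multiplier at each nesting depth
    buf = ""

    def flush():
        nonlocal buf
        if buf:
            result.append((buf, round(stack[-1], 4)))
            buf = ""

    i = 0
    while i < len(prompt):
        ch = prompt[i]
        if ch == "(":
            flush()
            stack.append(stack[-1] * 1.1)
        elif ch == ":" and len(stack) > 1:
            # explicit weight: consume digits up to the closing paren
            m = re.match(r"([0-9.]+)\)", prompt[i + 1:])
            if m:
                result.append((buf, float(m.group(1))))
                buf = ""
                stack.pop()
                i += 1 + m.end()
                continue
            buf += ch
        elif ch == ")":
            flush()
            if len(stack) > 1:
                stack.pop()
        else:
            buf += ch
        i += 1
    flush()
    return result
```

For example, `parse_attention("a man in a ((tuxedo))")` and `parse_attention("a man in a (tuxedo:1.21)")` both give `tuxedo` a weight of 1.21, matching the two equivalent syntaxes listed above.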
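The generation parameters mentioned above are stored in PNG text chunks; the webui uses the keyword `parameters`. A minimal stdlib reader for that convention might look like this (tEXt chunks only; iTXt/zTXt and CRC validation are omitted for brevity):

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_text(data: bytes) -> dict:
    """Extract tEXt chunks (keyword -> value) from PNG bytes.
    Generation parameters are stored under the 'parameters' keyword."""
    assert data[:8] == PNG_SIGNATURE, "not a PNG file"
    texts = {}
    pos = 8
    while pos < len(data):
        # each chunk: 4-byte length, 4-byte type, data, 4-byte CRC
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        chunk = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, value = chunk.partition(b"\x00")
            texts[keyword.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length
    return texts
```

Usage: `read_png_text(open("output.png", "rb").read()).get("parameters")` returns the prompt and settings string saved with the image, the same text the PNG info tab restores.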
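The Composable-Diffusion syntax above (uppercase `AND` with optional per-prompt weights) can be sketched as a small splitter. This is a simplified illustration of the described syntax, not the webui's actual implementation:

```python
import re

def split_subprompts(prompt):
    """Split a Composable-Diffusion prompt on the uppercase AND keyword,
    reading an optional trailing ':weight' from each part (default 1.0)."""
    out = []
    for part in prompt.split(" AND "):
        m = re.search(r":\s*([0-9.]+)\s*$", part)
        if m:
            out.append((part[:m.start()].strip(), float(m.group(1))))
        else:
            out.append((part.strip(), 1.0))
    return out
```

For the example above, `split_subprompts("a cat :1.2 AND a dog AND a penguin :2.2")` yields three subprompts with weights 1.2, 1.0 (the default), and 2.2.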
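The API listed above exposes the generation features over HTTP; `/sdapi/v1/txt2img` is the commonly documented txt2img endpoint. The field names below are assumptions based on that documentation, so check your server's `/docs` page for the authoritative schema:

```python
import base64
import json
from urllib import request

def txt2img_payload(prompt, negative_prompt="", steps=20, width=512, height=512):
    """Build a request body for the txt2img endpoint. Field names are
    assumptions; verify against the running server's /docs page."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,
        "height": height,
    }

def generate(base_url, payload):
    """POST the payload and decode the first base64-encoded image."""
    req = request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return base64.b64decode(body["images"][0])
```

Usage (against a locally running server with the API enabled): `png_bytes = generate("http://127.0.0.1:7860", txt2img_payload("a cat"))`.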

Installation and Running

Make sure the required dependencies are met and follow the instructions available for:

  • NVidia (recommended)
  • AMD GPUs
  • Intel CPUs, Intel GPUs (both integrated and discrete) (external wiki page)
  • Ascend NPUs (external wiki page)

Alternatively, use online services (like Google Colab): List of Online Services
