All In One – Anime Illust Diffusion – AIDV2.6
Model Name | All In One – Anime Illust Diffusion |
Model Type | Checkpoint |
Base Model | Other |
Version Name | AIDV2.6 |
File Name | allInOneAnimeIllust_aidv26.safetensors |
SHA256 | 432c408373 |
NSFW | False |
Trigger Words | |
Tags | anime, character, style, illustration |
Metadata | fp: fp32, size: pruned, format: SafeTensor |
Size | 4.79 GB |
Version Description |
AIDV2.6 was trained for a further 3.5 million steps on top of AIDV2.5. Compared with AIDV2.5, it has a stronger brushstroke feel and better composition. |
Source | https://civitai.com/models/16828 |
Author | EG |
Model Introduction
I Introduction
AnimeIllustDiffusion (AID) is a pre-trained, non-commercial, multi-style anime illustration model. You can use special trigger words (see Appendix A) to generate images in specific styles. It should be used together with my negative text embedding "bad" series [1]; otherwise you will get poor results. For the VAE, I recommend sd-vae-ft-mse-original [5]. Part II of this introduction briefly describes how the model was made; Part III presents my negative text embeddings; Appendix A provides a partial list of keywords.
The AID model has over 50 stable anime illustration styles and 100 anime characters. The special trigger words for styles are listed in Appendix A; to generate a character, simply use the character's name directly as the prompt. The AID model works like a palette: you can create new styles by combining prompts in any way you like.
II Model
This model is a fusion of three different models: two that I trained, and the Pretty 2.5D model merged by GoldSun [2].
1 Model Training
I used 4,300+ manually cropped, tagged, 512×512 anime illustration images as the training set, and used DreamBooth to fine-tune the Naifu 7G model to learn styles. I trained for 100 epochs per training image at a relatively high learning rate. I did not use regularization images. I also trained the text encoder. If you are interested, you can find detailed parameter information at [3].
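As a rough illustration of this kind of setup, a comparable fine-tune with the diffusers DreamBooth script might look like the following. The paths and hyperparameter values are placeholders, not the author's actual settings (those are documented at [3]):

```shell
# Hypothetical diffusers DreamBooth invocation approximating the setup above:
# 512x512 data, no prior-preservation (regularization) images, and the
# text encoder trained alongside the UNet. All paths/values are placeholders.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=/path/to/naifu-7g-base \
  --instance_data_dir=./dataset_512 \
  --output_dir=./aid-style-model \
  --resolution=512 \
  --train_text_encoder \
  --learning_rate=2e-6 \
  --max_train_steps=430000   # roughly 100 epochs over 4,300 images at batch size 1
```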
2 Model Merging
I merged the three models using the Merge Block Weighted extension to create this AnimeIllustDiffusion model. One model provides the style and text encoder (base alpha and all OUT layers), one optimizes hand details (IN layers 00 – 05), and the third (Pretty 2.5D [2]) provides better composition (IN layers 06 – 11 and the M layer).
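The merge described above can be sketched as a per-block weighted average of the UNet parameters: each block gets its own interpolation weight instead of one global alpha. The block labels and weights below are illustrative toy values, not the actual recipe:

```python
# Minimal sketch of Merge Block Weighted-style merging. Each UNet block
# (IN00-IN11, M, OUT00-OUT11) gets its own interpolation weight; parameters
# not covered by a block weight fall back to the "base alpha".

def block_of(param_name: str) -> str:
    """Map a parameter name to a coarse block label."""
    if param_name.startswith("input_blocks."):
        return f"IN{int(param_name.split('.')[1]):02d}"
    if param_name.startswith("middle_block."):
        return "M"
    if param_name.startswith("output_blocks."):
        return f"OUT{int(param_name.split('.')[1]):02d}"
    return "BASE"  # e.g. time embedding; covered by base alpha

def merge(model_a: dict, model_b: dict, alphas: dict, base_alpha: float) -> dict:
    """Per-parameter interpolation: (1 - a) * A + a * B, with a chosen by block."""
    return {name: (1 - alphas.get(block_of(name), base_alpha)) * wa
                  + alphas.get(block_of(name), base_alpha) * model_b[name]
            for name, wa in model_a.items()}

# Toy "state dicts" with one scalar weight per block.
A = {"input_blocks.0.w": 0.0, "middle_block.w": 0.0, "output_blocks.3.w": 0.0}
B = {"input_blocks.0.w": 1.0, "middle_block.w": 1.0, "output_blocks.3.w": 1.0}

# Take IN00 entirely from B (e.g. a hand-detail model), keep M from A.
alphas = {"IN00": 1.0, "M": 0.0}
m = merge(A, B, alphas, base_alpha=0.5)
print(m)  # {'input_blocks.0.w': 1.0, 'middle_block.w': 0.0, 'output_blocks.3.w': 0.5}
```

In the real extension the same idea is applied tensor-by-tensor over full checkpoint state dicts rather than scalars.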
III Negative Text Embedding
This model is recommended to be used with badv3, a textual inversion embedding of negative prompts. It not only simplifies prompt writing but also unlocks the model's potential and improves the quality of generated images. Usually badv3 alone is sufficient, and you do not need to add extra quality prompts, but it does not solve every image problem.
1 How to Use It
Place the downloaded negative text embedding file, badv3.pt, in the embeddings folder of your Stable Diffusion directory. After that, simply enter badv3 in the negative prompt field.
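AUTOMATIC1111-style webuis register every embedding file found in the embeddings folder under its filename stem, which is why dropping badv3.pt there makes the token badv3 available in prompts. A minimal sketch of that lookup, using a temporary directory as a stand-in for the real folder:

```python
import pathlib
import tempfile

# A1111-style webuis scan the embeddings folder and expose each
# *.pt / *.safetensors file as a prompt token named after its filename stem.
def registered_tokens(embeddings_dir: pathlib.Path) -> set:
    return {p.stem for p in embeddings_dir.iterdir()
            if p.suffix in (".pt", ".safetensors")}

# Simulate dropping badv3.pt into the folder.
with tempfile.TemporaryDirectory() as d:
    emb = pathlib.Path(d)
    (emb / "badv3.pt").touch()
    tokens = registered_tokens(emb)
    print(tokens)  # {'badv3'}
```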
2 How It Was Made
My idea was to train a concept of bad images and put it into the negative prompt to avoid generating such images.
I trained the negative text embedding badv3 on a few hundred bad images generated by the model; it works on the same principle as EasyNegative [4]. I deliberately trained it to overfit, which seems to mitigate the effect traditional negative text embeddings have on the model's art style.
For this model, badv3 works better than EasyNegative. I have not yet compared it against other negative text embeddings.
badv3 is the nth negative text embedding I have trained since deformityv6. Such embeddings are very easy to make, but the results are fairly random. I have also tried removing the weights of another model trained on bad images via add-difference merging, so far without promising results. My next plan is to train a negative LoRA instead of a negative text embedding, to directly "remove" part of the weights from the model rather than "avoid" them.
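Mechanically, a negative embedding acts through classifier-free guidance: it conditions the "unconditional" branch, so each sampling step is pushed away from the bad-image concept. A toy numeric sketch, with scalars standing in for the model's noise predictions:

```python
# Classifier-free guidance with a negative prompt:
#   eps = eps_neg + scale * (eps_cond - eps_neg)
# The sampler steps toward the prompt-conditioned prediction and away from
# the negative-conditioned one. Scalars stand in for real noise tensors.

def cfg(eps_cond: float, eps_neg: float, scale: float = 7.0) -> float:
    return eps_neg + scale * (eps_cond - eps_neg)

eps_cond = 0.40   # prediction conditioned on the prompt
eps_empty = 0.30  # "unconditional" prediction (empty negative prompt)
eps_bad = 0.10    # prediction conditioned on a bad-image embedding like badv3

# A trained negative embedding shifts the guidance direction further away
# from the bad-image concept than an empty negative prompt does.
print(cfg(eps_cond, eps_empty))  # ~1.0
print(cfg(eps_cond, eps_bad))    # ~2.2
```

This is why a negative embedding can only steer sampling away from the concept; the planned negative LoRA would instead subtract the concept from the weights themselves.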
IV Declarations
This model was made to test multi-style model training. It is a hobby project, neither for profit nor for commercial use. If it infringes any rights, it will be deleted immediately.
Users are authorized only to generate images with this model; reposting it without permission is not allowed.
Any commercial use of this model is strictly prohibited!
The showcase images for Appendix A illustrate the broad categories of this model's special trigger words and are for reference only; the exact listed prompts are not required to reproduce those styles.
Please do not use this model to generate gory, violent, or pornographic images, or any infringing content! For that reason, Appendix A can list only some of the trained keywords.
V Referenced Pages
[1] Useful Quality Embeddings – AnimeIllustDiffusion | Stable Diffusion TextualInversion | Civitai
[2] Pretty 2.5D | Stable Diffusion Checkpoint | Civitai
[3] 多风格模型 – 赛璐璐风格科幻插画 – AI加速器社区 (acceleratori.com)
[4] EasyNegative | Stable Diffusion TextualInversion | Civitai
[5] vae-ft-mse-840000-ema-pruned.ckpt · stabilityai/sd-vae-ft-mse-original at main (huggingface.co)
Appendix A
Through AIDV2.5: by 35s00, by agm, by ajimita, by akizero, by ask, by chicken utk, by demizu posuka, by dino, by fadingz, by fuzichico, by hamukukka, by hitomio16, by ichigo ame, by key999, by kooork55, by matcha, by mika pikazo, by modare, by myung yi, by naji yanagida, by nezukonezu32, by nico tine, by nikuzume, by ninev, by oda non, by palow, by qooo003, by rolua, by samip, by serie niai, by shirentutu, by sho, by silver, by sonomura00, by void, by wlop, by xilmo, by yoneyama mai, by yosk6000, by zumizumi
Added in AIDV2.6: by caaaarrot, by hinaki, by homutan, by kazari tayu, by kitada mo, by roitz, by teffish, by ukiatsuya, by yejji, by ziyun
Added in AIDV2.7: by poharo, by jnthed, by 7thknights, by some1else45, by yohan, by yomu, by tsvbvra