ComfyUI – TrilightLabs

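A partial and subjective review of Flux training tools

The Flux models from Black Forest Labs have become widely popular for their large parameter count and strikingly detailed output. They now get heavy exposure across AIGC platforms; even the libulibu (LiblibAI) homepage currently features Flux.1 LoRA models. A big part of the appeal is that fine-tuning on a small dataset is enough to get good results.

Because of my job, I have also tried training Flux LoRA models at work. Below I compare two tools: pinokio-fluxgym and flux train-aitoolkit.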

pinokio-fluxgym

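Pros: simple to install and easy to pick up

Cons: it can only train Flux LoRA models. To train SD1.5 or XL models you need to find the Kohya training tool in the community.

Pinokio is an AI community browser that bundles many AI tools and workflows. Its advantage is that even a beginner with no programming background can set up a Flux LoRA training environment with little effort: after installing Pinokio, find fluxgym under Discover and install it with one click. The Python environment, third-party dependencies, and model configuration are all handled for you; you just wait for setup to finish.

Under the hood it is built on Kohya Scripts, so all the parameter settings are the same. The front end uses a three-column layout that simplifies the workflow; the 1-2-3 step layout is clear at a glance and lowers the barrier to entry. When starting out you do not even need to understand the individual parameters: supply a high-quality training set and you get a good model. Thumbs up.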

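Training progress can be previewed live, and the preview images can be controlled with several options:

1. Negative prompt

2. Width and height of the generated image

3. Seed of the generated image

4. CFG scale of the generated image

5. Number of sampling steps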
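Since fluxgym is built on Kohya Scripts, these five options most likely correspond to the inline flags that Kohya's sd-scripts accept in sample prompts (`--n`, `--w`, `--h`, `--d`, `--l`, `--s`). The sketch below assembles such a prompt line. The helper name and every example value are hypothetical, and the flag set is an assumption to check against the sd-scripts version bundled with your install.

```python
def build_sample_prompt(prompt, negative=None, width=None, height=None,
                        seed=None, cfg_scale=None, steps=None):
    """Assemble one sample-prompt line using Kohya-style inline flags."""
    parts = [prompt]
    # Pair each optional setting with its flag; unset options are skipped.
    options = [("--n", negative), ("--w", width), ("--h", height),
               ("--d", seed), ("--l", cfg_scale), ("--s", steps)]
    for flag, value in options:
        if value is not None:
            parts.append(f"{flag} {value}")
    return " ".join(parts)


# Hypothetical values, for illustration only.
line = build_sample_prompt(
    "chahua_nvhai, a girl in a blue dress",
    negative="blurry, low quality",
    width=768, height=1024, seed=42, cfg_scale=3.5, steps=20,
)
print(line)
```

Fixing the seed makes previews from different checkpoints comparable, because only the LoRA weights change between images.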

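Supported GPU VRAM tiers: 12G, 16G, and 20G or more

Supported base models: flux-dev, flux schnell, flux-dev2pro (in practice, flux-dev1.0 or flux-dev2pro works best for training)

To change the maximum number of training images in FluxGYM, open app.py and edit the MAX_IMAGES = line: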

import os
import sys
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
os.environ['GRADIO_ANALYTICS_ENABLED'] = '0'
sys.path.insert(0, os.getcwd())
sys.path.append(os.path.join(os.path.dirname(__file__), 'sd-scripts'))
import subprocess
import gradio as gr
from PIL import Image
import torch
import uuid
import shutil
import json
import yaml
from slugify import slugify
from transformers import AutoProcessor, AutoModelForCausalLM
from gradio_logsview import LogsView, LogsViewRunner
from huggingface_hub import hf_hub_download, HfApi
from library import flux_train_utils, huggingface_util
from argparse import Namespace
import train_network
import toml
import re
MAX_IMAGES = 650  # maximum number of training images; change as needed
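If you would rather not edit the file by hand, the change can be scripted. This is a minimal sketch that rewrites the `MAX_IMAGES` assignment in a string of source text; `set_max_images` is a hypothetical helper, not part of fluxgym, and the 150 in the sample text is only a placeholder value.

```python
import re


def set_max_images(source: str, new_limit: int) -> str:
    """Return app.py source text with the MAX_IMAGES assignment replaced."""
    pattern = re.compile(r"^MAX_IMAGES\s*=\s*\d+.*$", flags=re.MULTILINE)
    if not pattern.search(source):
        raise ValueError("no MAX_IMAGES assignment found")
    # Replace only the first match and drop any trailing comment on that line.
    return pattern.sub(f"MAX_IMAGES = {new_limit}", source, count=1)


example = "import re\nMAX_IMAGES = 150\n"
print(set_max_images(example, 650))
```

To apply it for real, read app.py into a string, pass it through the function, and write the result back, keeping a backup copy of the original file first.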

flux train-aitoolkit

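Pros: remote training, aimed at advanced users, professional UI

Cons: unstable; requires basic code-reading ability and is best used alongside ChatGPT or DeepSeek

flux train-aitoolkit is still at an early version, which means its stability and features may not be very polished. It is not built on Kohya Scripts, and its directory layout differs from the ones I have seen. I am not expert enough in this area to say more.

Its defining feature is that it trains by accessing models remotely through Hugging Face. That means you need a Hugging Face account and a READ token from Hugging Face before you can train.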
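Before starting a long training run it helps to confirm that the token is actually readable. The sketch below assumes the token is stored as an `HF_TOKEN=...` entry in a `.env`-style file; that file name and key are assumptions to verify against the tool's own README, and `read_hf_token` is a hypothetical helper.

```python
def read_hf_token(env_text: str, key: str = "HF_TOKEN") -> str:
    """Extract a token from .env-style text (KEY=value, one entry per line)."""
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            token = value.strip().strip('"').strip("'")
            if not token:
                raise ValueError(f"{key} is present but empty")
            return token
    raise KeyError(f"{key} not found")


# Placeholder text; a real token comes from your Hugging Face account settings.
sample = "# Hugging Face access\nHF_TOKEN=hf_example_not_a_real_token\n"
print(read_hf_token(sample))
```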

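It supports training on two models, FLUX.1-schnell and Flux-dev. Because the model configuration has to be written locally beforehand, setting up the remote access takes some patience. Honestly, just installing this tool from GitHub gave me a headache; without patience and curiosity it is easy to give up.

I ran into failed clones of diffusers while setting up both tools, so enable a global git proxy when configuring either of them; otherwise the clone is very likely to fail with an error.

If it still fails, use the mirror hosted in China: https://gitee.com/opensource-customization/diffusers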
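The two fixes above can be written down as git commands. The sketch below only builds and prints them (a dry run) unless `dry_run=False` is passed. The proxy address is a hypothetical local proxy, so substitute your own; the mirror URL is the one given above.

```python
import subprocess


def proxy_commands(proxy_url: str) -> list[list[str]]:
    """Git commands that route all HTTP(S) traffic through a proxy."""
    return [
        ["git", "config", "--global", "http.proxy", proxy_url],
        ["git", "config", "--global", "https.proxy", proxy_url],
    ]


def clone_command(repo_url: str, target_dir: str) -> list[str]:
    """Git command that clones a repository into target_dir."""
    return ["git", "clone", repo_url, target_dir]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


PROXY = "http://127.0.0.1:7890"  # hypothetical local proxy address
MIRROR = "https://gitee.com/opensource-customization/diffusers"
run_all(proxy_commands(PROXY) + [clone_command(MIRROR, "diffusers")])
```

When the proxy is no longer needed, `git config --global --unset http.proxy` (and the same for `https.proxy`) removes the setting.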

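Captioning the training set

Flux is built around natural-language captions, so caption your training images with a trigger word plus a natural-language description; this gives good training results. It also means you need GPT or a captioning-model workflow to process your training set. I tried captioning with tags only and found the training results poor (using Qiuye's (秋叶) lora-script).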
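As a concrete illustration, here is a minimal sketch that writes one caption file per image in the "trigger word, then description" form. It assumes the common Kohya-style convention of a `.txt` file with the same base name next to each image; the helper name, file names, and caption text are hypothetical.

```python
import tempfile
from pathlib import Path


def write_caption(image_path, trigger_word, description):
    """Write '<trigger>, <description>' to a .txt file beside the image."""
    caption = f"{trigger_word}, {description.strip()}"
    caption_path = Path(image_path).with_suffix(".txt")
    caption_path.write_text(caption, encoding="utf-8")
    return caption_path


# Demo in a throwaway directory so no real dataset is touched.
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "girl_001.png"
    image.touch()  # empty placeholder standing in for a real training image
    caption_file = write_caption(
        image,
        "chahua_nvhai",
        "a flat illustration of a young woman with long brown hair "
        "wearing a blue dress, smiling in front of a pink hotel",
    )
    print(caption_file.name, "->", caption_file.read_text(encoding="utf-8"))
```

In a real pipeline the description would come from GPT or a captioning model, as described above, with the trigger word prepended in the same way.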

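Resolution

Flux has no special resolution requirement: anything from 512 up to 1024 works, and non-square sizes such as 768*1024 are fine too.

Generating images in practice

In actual use, a LoRA weight of 0.7-0.9 again works best, combined with the trigger word and a natural-language description. The more you describe, the more detailed the output. This also means you need a more specific idea of the image you want; when your idea is still vague, a tool like DeepSeek can help you flesh out the prompt.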

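Raising the repeat count above 100 for XL training

In the previous post I shared a LoRA-XL model of flat-illustration girls, trained with a repeat count of 10-20. This time I raised repeat to 100-150. With the same prompt, the results look like this: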

masterpiece:(1.2),chahua_nvhai,,British girl,Exquisite facial details,long hair,1girl,illustration style,brown hair,wear blue dress,illustration, 5 fingers,8K,hud,Grand Budapest Hotel background,happy

Hair detail, facial detail, and generalization all improved noticeably. If your training results seem stuck, especially for XL models, try raising the repeat count above 100.
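To see what a higher repeat count costs, it helps to work out the step count. In Kohya-style trainers each image is seen `repeats` times per epoch, so total optimizer steps are roughly ceil(images × repeats / batch size) × epochs. The image count, epoch count, and batch size below are hypothetical, since the post does not state them.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1):
    """Approximate optimizer steps for a Kohya-style LoRA run."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# Hypothetical dataset: 30 images, 10 epochs, batch size 2.
for repeats in (10, 20, 100, 150):
    steps = total_steps(num_images=30, repeats=repeats, epochs=10, batch_size=2)
    print(f"repeat={repeats:>3} -> {steps} steps")
```

Going from repeat 10 to repeat 100 multiplies training time by ten for the same number of epochs, so it is worth lowering the epoch count or saving intermediate checkpoints to compare.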

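Ripping the “Ultra-Natural AI Face Swap Tutorial” from Bilibili

The author's demo looked genuinely good, but when I looked closer it turned out to come from a training school. To get the workflow you have to like, coin, and favorite the video and then add them on WeChat, and even then they had not sent it to me after a while. To get more familiar with ComfyUI, I rebuilt it 1:1 by imitation.

Original video: https://www.bilibili.com/video/BV1eXsheNEon?p=2&vd_source=b6c524e3d38fe874f7e2148d9ca2d1bc

Why call it "ripping", and why the quotation marks? Because I worked it out from screenshots of the video saved locally. Whether the result is really as good remains to be verified.

Here are the problems I ran into while setting up this workflow:

Install the following custom-node plugins as the author requires (my list is incomplete; install every plugin mentioned in the video):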

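1. mixlab (most of my errors came from here)

2. ipadapter (skip if already installed)

3. comfyroll (upscale node for the source image; it fixes a loaded image that is not sharp enough, and can be thought of as a hi-res fix)

4. instantID (skip if already installed; remember to download the ipadapter.bin model)

The overall pipeline is simple to build and has three parts. The input image (upscaled with the comfyroll node if it is blurry) goes to ReActor for the face swap; the result at this point may be mediocre. Next the instantID node is brought in, and finally ipadapter transfers the style to give the final result. (The base model is XL.)

About the mixlab errors: this plugin has to be switched to its April version, and instantID has to be switched to its April 27 version as well; otherwise the two are incompatible. The latest mixlab in particular fails on import. Switching versions fixed it.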
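If your ComfyUI manager has no version switcher, a plugin that was installed with git can be pinned to a date from the command line. The sketch below builds the two git commands involved and, in `pin_to_date`, runs them. The folder name and the year 2024 are assumptions (the post gives only "April 27"), so adjust both to your setup.

```python
import subprocess


def rev_list_command(repo_dir, before_date, ref="HEAD"):
    """Command that prints the newest commit on `ref` older than a date."""
    return ["git", "-C", repo_dir, "rev-list", "-n", "1",
            f"--before={before_date}", ref]


def checkout_command(repo_dir, commit):
    """Command that checks out one specific commit (detached HEAD)."""
    return ["git", "-C", repo_dir, "checkout", commit]


def pin_to_date(repo_dir, before_date):
    """Look up the commit for a date and check it out; returns the hash."""
    found = subprocess.run(rev_list_command(repo_dir, before_date),
                           check=True, capture_output=True, text=True)
    commit = found.stdout.strip()
    if not commit:
        raise RuntimeError(f"no commit before {before_date} in {repo_dir}")
    subprocess.run(checkout_command(repo_dir, commit), check=True)
    return commit


# Dry run: only print the lookup command; nothing is executed here.
# The folder name and the year are assumptions, not taken from the post.
print(" ".join(rev_list_command("custom_nodes/ComfyUI_InstantID",
                                "2024-04-28 00:00")))
```

Restart ComfyUI after pinning both plugins so the older node definitions are loaded.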

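Workflow download: https://trilightlab.com/wp-content/uploads/2024/09/segment-ipadapter-anything换脸.zip

Demo:

I want to graft Liu Yifei's face onto the woman in the middle image, giving the final face-swap result in image 3.

Two things to note here:

In the ControlNet loader, select the instantid/diffusion_pytorch_model.safetensors model; otherwise the result is very poor.

KSampler parameters:

Another demo:

The face of the girl in image 1 is swapped onto image 2, giving the result in image 3.

Conclusion: overall this met my expectations. The quality is high, and compared with the Flux workflow I shared last time it is faster, more accurate, and more stable.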

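Notes on the problems I hit trying a Flux face-swap workflow

This one ended up making me laugh at myself. Here is a reconstruction of what went wrong while I set up the environment for this workflow.

The workflow is a face-swap workflow shared by a user on civitai. After importing it into my local ComfyUI I installed the missing nodes as usual, including the following:

Why so many nodes to install? When I first imported the workflow it only reported three missing nodes: GGUF, everywhere, and reactor.

After installing them and restarting, I got this error: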

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\nodes.py", line 1993, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\comfyui_face_parsing\__init__.py", line 18, in <module>
    download_url("https://huggingface.co/jonathandinu/face-parsing/resolve/main/config.json?download=true", face_parsing_path, "config.json")
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torchvision\datasets\utils.py", line 134, in download_url
    url = _get_redirect_url(url, max_hops=max_redirect_hops)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torchvision\datasets\utils.py", line 82, in _get_redirect_url
    with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as response:
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 519, in open
    response = self._open(req, data)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 496, in _call_chain
    result = func(*args)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10060] 由于连接方在一段时间后没有正确答复或连接的主机没有反应，连接尝试失败。>

Cannot import D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\comfyui_face_parsing module for custom nodes: <urlopen error [WinError 10060] 由于连接方在一段时间后没有正确答复或连接的主机没有反应，连接尝试失败。>
FizzleDorf Custom Nodes: Loaded
FaceDetailer: Model directory already exists
FaceDetailer: Model doesnt exist
FaceDetailer: Downloading model

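This shows that the face_parsing node was installed, but the model download that facedetailer depends on failed (WinError 10060 is a connection timeout). GPT pointed me to the nodes.py file in the plugin folder to see which models it needs and where it downloads them from. The code is: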

def get_restorers():
    models_path = os.path.join(models_dir, "facerestore_models/*")
    models = glob.glob(models_path)
    models = [x for x in models if (x.endswith(".pth") or x.endswith(".onnx"))]
    if len(models) == 0:
        fr_urls = [
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GFPGANv1.3.pth",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GFPGANv1.4.pth",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/codeformer-v0.1.0.pth",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-512.onnx",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-1024.onnx",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-2048.onnx",
        ]
        for model_url in fr_urls:
            model_name = os.path.basename(model_url)
            model_path = os.path.join(dir_facerestore_models, model_name)
            download(model_url, model_path, model_name)
        models = glob.glob(models_path)
        models = [x for x in models if (x.endswith(".pth") or x.endswith(".onnx"))]
    return models

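Well, the URLs are all correct, and the models can be downloaded by hand and placed in models/facerestore_models. The background download just kept timing out. After downloading them manually and restarting, the problem went away.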
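When the automatic download keeps timing out, it helps to know exactly which files are still missing before fetching them by hand. A minimal sketch, using two of the URLs from the excerpt above; `missing_models` is a hypothetical helper and the `models/facerestore_models` path is relative to wherever your ComfyUI lives.

```python
import os


def missing_models(models_dir, urls):
    """Return (url, target_path) pairs for files not yet in models_dir."""
    missing = []
    for url in urls:
        target = os.path.join(models_dir, os.path.basename(url))
        if not os.path.isfile(target):
            missing.append((url, target))
    return missing


BASE = "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/"
FR_URLS = [BASE + "GFPGANv1.4.pth", BASE + "codeformer-v0.1.0.pth"]

# Lists what still has to be downloaded manually (in a browser or via a proxy).
for url, target in missing_models("models/facerestore_models", FR_URLS):
    print(f"download {url}\n  save as {target}")
```

Note that the original `get_restorers` only starts downloading when the folder contains no `.pth` or `.onnx` file at all, which would explain why placing the models there by hand stops the failing downloads.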

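Then came the big one: the AV_Facedetailer node could not be found. This baffled me for 2 hours. Why? Google returned nothing definitive: the only matching result had no download link, and GitHub came up empty. So now you know why my first screenshot shows such a long list of installed nodes. I kept assuming it was a dependency node of impact-pack, and after all that effort I was stuck exactly where I started.

Back to civitai to read the comments, and sure enough everyone had hit the same problem.

As shown in the screenshot:

The node name AV_Facedetailer is nowhere near the name art-venture! Without this kind soul posting it, nobody would ever have found the node. In the end it took more than 3 hours to get the workflow installed.

So if you run into something similar, read the comments and read the manual; what a pain. This is just a record for now. I will add the face-swap results later, because I found that the GGUF node depends on pre-trained models.

September 28 update: an issue with the face-swap workflow nodes:

Today I wanted to actually run the workflow to see whether it works. After uploading a photo I got this error: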

Error occurred when executing KSampler: cast_to() got an unexpected keyword argument 'copy'

  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\nodes.py", line 1429, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\nodes.py", line 1396, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 9, in informative_sample
    return original_sample(*args, **kwargs)  # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 420, in motion_sample
    return orig_comfy_sample(model, noise, *args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\sample.py", line 43, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 829, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 729, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 716, in sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 695, in inner_sample
    samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 600, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\k_diffusion\sampling.py", line 144, in sample_euler
    denoised = model(x, sigma_hat * s_in, **extra_args)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 299, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 682, in __call__
    return self.predict_noise(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 685, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 279, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 228, in calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\model_base.py", line 142, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\model.py", line 159, in forward
    out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance, control)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\model.py", line 118, in forward_orig
    img, txt = block(img=img, txt=txt, vec=vec, pe=pe)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\layers.py", line 148, in forward
    img_mod1, img_mod2 = self.img_mod(vec)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\layers.py", line 110, in forward
    out = self.lin(nn.functional.silu(vec))[:, None, :].chunk(self.multiplier, dim=-1)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ops.py", line 67, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-GGUF\ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-GGUF\ops.py", line 174, in forward_ggml_cast_weights
    weight, bias = self.cast_bias_weight(input)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-GGUF\ops.py", line 144, in cast_bias_weight
    bias = comfy.ops.cast_to(bias, bias_dtype, device, non_blocking=non_blocking, copy=False)

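Clearly this error is a compatibility problem between the GGUF and AnimateDiff plugins. I noticed that my local AnimateDiff plugin had no models installed and guessed that the missing models caused the error, so I downloaded two, one ckpt and one LoRA. That did not fix it; the same error appeared.

Asking GPT:

I sent it the error above along with this follow-up: I have seen others report that this error is caused by the animatediff plugin (https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved) and that it means the plugin is not up to date, but I have upgraded everything to the latest version and still get this error.

The answer: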

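The error you are seeing, cast_to() got an unexpected keyword argument 'copy', is indeed related to a compatibility problem between the latest version of ComfyUI and the AnimateDiff-Evolved plugin.

The root cause appears to be a mismatch between the ComfyUI update and the AnimateDiff-Evolved plugin. Recent ComfyUI updates introduced changes that older versions of some custom nodes do not fully support, including how they handle arguments such as copy in the cast_to() function.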

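To fix the problem:

Update AnimateDiff-Evolved: make sure you are running the latest version of the AnimateDiff-Evolved plugin. The plugin's developer, Kosinkadink, has mentioned that a small refactor was needed to make the plugin compatible with the latest changes in ComfyUI. Updating both the main ComfyUI and AnimateDiff-Evolved should resolve the problem.

Check node dependencies: if the problem persists after updating, check whether any node- or sampler-specific dependencies (such as ComfyUI-GGUF) also need updating, since some nodes may still reference deprecated arguments such as copy.

Following these steps should help eliminate the error and get the plugin running smoothly.

So I updated the ComfyUI main program to the latest version, and the error was resolved! That fits the traceback, whose last frame is ComfyUI-GGUF calling comfy.ops.cast_to with copy=False: the plugin expected a newer core than the one I had installed.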
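The failure mode is easy to reproduce in isolation. In the sketch below, `cast_to_old` and `cast_to_new` are hypothetical stand-ins for the core helper before and after it gained a `copy` parameter; they are not ComfyUI's real implementations. `plugin_call` mimics a plugin that always passes `copy=False`.

```python
import inspect


def cast_to_old(weight, dtype=None, device=None, non_blocking=False):
    """Stand-in for an older core helper with no `copy` parameter."""
    return weight


def cast_to_new(weight, dtype=None, device=None, non_blocking=False, copy=False):
    """Stand-in for a newer core helper that accepts `copy`."""
    return weight


def accepts_copy(func):
    """True if the function's signature has a parameter named `copy`."""
    return "copy" in inspect.signature(func).parameters


def plugin_call(cast_to):
    # A plugin written against the newer core always passes copy=False.
    return cast_to(1.0, "float16", "cpu", non_blocking=False, copy=False)


for fn in (cast_to_old, cast_to_new):
    try:
        plugin_call(fn)
        print(fn.__name__, "ok")
    except TypeError as exc:
        print(fn.__name__, "failed:", exc)
```

The same reasoning gives a quick triage rule: when a custom node raises "unexpected keyword argument" on a core function, update the core first and the plugin second.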

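ComfyUI-Flux, with text rendered directly in the image: impressions from running the model

I expect the Flux models to stay at the top of the visual-model rankings for a long time: the variety of styles and the control over limbs and poses are beyond what other models can match. Below I show 2 workflows to take a closer look at generation quality.

Case 1: Disney-style movie poster

First, I ran this poster through an image-to-text tool to extract its keywords. The tool: https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha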

The keywords I got: This is a digital promotional poster for the Disney+ animated film “The Ice Age Adventures of Buck Wild.” The image is a vibrant, colorful cartoon depiction set in a lush, jungle-like environment. The background features dense foliage, tall trees with broad leaves, and a variety of greenery, creating a sense of depth and immersion.

In the foreground, two anthropomorphic ground sloths, Buck and Crash, are prominently featured. They are standing on a large, gnarled tree branch, with Buck on the left and Crash on the right. Buck is holding a stick in his right hand and has a playful expression, while Crash is smiling and has his arms outstretched, as if excited. Both characters have light brown fur with darker brown stripes, and their eyes are large and expressive.

The title “The Ice Age Adventures of Buck Wild” is prominently displayed in large, bold, yellow letters in the center of the poster. Above the title, the text “Disney+ + gets wild” is written in white. Below the title, the Disney+ logo is visible, along with the phrase “Original movie from 20th Century Studios.” The poster’s overall style is bright and cheerful, with a playful, adventurous tone.

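Do not underestimate this captioning tool: it gives the best experience of any I have used so far, and its image recognition is very strong. It could be worth turning into a plugin if you are interested. But I digress; here is my workflow.

I used the flux_bnb_nf4_v2 checkpoint in a plain text-to-image workflow. The result:

I replaced the two characters in the poster with a cat and a dog and changed the poster text, which produced the image above.

Case 2: 3D-style portrait

Again a poster collected from the web, this time in a cyberpunk style.

Again I used the reverse-prompt tool, with one caveat: perhaps because of limitations in the captioning model, it describes characters like this one in an anime style. So I edited the prompt to give it a 3D, Blender-rendered look.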

The prompt: This image is a digital illustration, likely created in a comic book style, featuring a futuristic, cyberpunk aesthetic. The central figure is a young woman with pale blue skin and striking, large, orange eyes. Her hair is platinum blonde and styled in a sleek, high ponytail. She is dressed in a high-tech, form-fitting outfit with metallic accents, giving her a futuristic, robotic appearance. Her left hand, which is gloved in a black, mechanical-looking glove, is holding a clear glass filled with a refreshing drink, which she is sipping through a straw.

The background is predominantly black, with vibrant yellow and orange accents, creating a striking contrast that highlights the central figure. The magazine cover title, “FAVR,” is prominently displayed in large, bold letters at the top, with additional Japanese text on the left side. The word “SMOOTHIE” is written in bold, white letters at the bottom, emphasizing the theme of the cover. The overall color palette is a mix of cool blues and warm oranges, contributing to the high-tech, futuristic vibe of the artwork. The image is detailed, with a focus on the woman’s expressive face and the sleek, futuristic design of her outfit.

The result: