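Kuaishou recently open-sourced Kolors (可图), a text-to-image generation model with a strong understanding of both English and Chinese that can generate high-quality, photorealistic images. The technical report highlights several key contributions:
First, Kolors uses the General Language Model (ChatGLM) as its text encoder, rather than the T5 language model used by Imagen and Stable Diffusion 3, which strengthens its understanding of English and Chinese. The team also used the multimodal large language model CogVLM to re-caption the images in the training set with more detailed descriptions.
Second, Kolors training is split into two phases, a concept learning phase and a quality improvement phase. It trains on specific datasets to improve visual appeal, and improves image quality by introducing high-quality data and optimizing high-resolution training techniques.
Finally, the Kolors team proposes KolorsPrompts, a category-balanced benchmark dataset, to guide the training and evaluation of Kolors.
Experiments show that even with a U-Net backbone, Kolors performs well: in human evaluation it surpasses existing open-source models and reaches a level comparable to Midjourney-v6. The Kolors code and weights are open source.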
Code: https://github.com/Kwai-Kolors/Kolors
Model: https://modelscope.cn/models/Kwai-Kolors/Kolors
Technical report: https://github.com/Kwai-Kolors/Kolors/blob/master/imgs/Kolors_paper.pdf
Download options:
SDK download:
# Download the model
from modelscope import snapshot_download
model_dir = snapshot_download('Kwai-Kolors/Kolors')
Git download:
git clone https://www.modelscope.cn/Kwai-Kolors/Kolors.git
CLI download:
modelscope download --model=Kwai-Kolors/Kolors --local_dir ./Kolors/
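Following the open-source project https://github.com/kijai/ComfyUI-KwaiKolorsWrapper, we set up a ComfyUI environment for Kolors and tried it out on the free GPU compute provided by the ModelScope community.
Environment
Run the Kolors model in a ModelScope community Notebook:
Set up ComfyUI
Install from the latest ComfyUI source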
# #@title Environment Setup

from pathlib import Path

OPTIONS = {}

UPDATE_COMFY_UI = True  #@param {type:"boolean"}
INSTALL_COMFYUI_MANAGER = True  #@param {type:"boolean"}
INSTALL_KOLORS = True  #@param {type:"boolean"}
INSTALL_CUSTOM_NODES_DEPENDENCIES = True  #@param {type:"boolean"}
OPTIONS['UPDATE_COMFY_UI'] = UPDATE_COMFY_UI
OPTIONS['INSTALL_COMFYUI_MANAGER'] = INSTALL_COMFYUI_MANAGER
OPTIONS['INSTALL_KOLORS'] = INSTALL_KOLORS
OPTIONS['INSTALL_CUSTOM_NODES_DEPENDENCIES'] = INSTALL_CUSTOM_NODES_DEPENDENCIES

current_dir = !pwd
WORKSPACE = f"{current_dir[0]}/ComfyUI"

%cd /mnt/workspace/

![ ! -d $WORKSPACE ] && echo -= Initial setup ComfyUI =- && git clone https://github.com/comfyanonymous/ComfyUI
%cd $WORKSPACE

if OPTIONS['UPDATE_COMFY_UI']:
    !echo "-= Updating ComfyUI =-"
    !git pull

if OPTIONS['INSTALL_COMFYUI_MANAGER']:
    %cd custom_nodes
    ![ ! -d ComfyUI-Manager ] && echo -= Initial setup ComfyUI-Manager =- && git clone https://github.com/ltdrdata/ComfyUI-Manager
    %cd ComfyUI-Manager
    !git pull

if OPTIONS['INSTALL_KOLORS']:
    %cd ../
    ![ ! -d ComfyUI-KwaiKolorsWrapper ] && echo -= Initial setup KOLORS =- && git clone https://github.com/kijai/ComfyUI-KwaiKolorsWrapper.git
    %cd ComfyUI-KwaiKolorsWrapper
    !git pull

%cd $WORKSPACE

if OPTIONS['INSTALL_CUSTOM_NODES_DEPENDENCIES']:
    !pwd
    !echo "-= Install custom nodes dependencies =-"
    ![ -f "custom_nodes/ComfyUI-Manager/scripts/colab-dependencies.py" ] && python "custom_nodes/ComfyUI-Manager/scripts/colab-dependencies.py"
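The setup cell repeats one pattern for each repository: clone it if the directory is missing, then run git pull to bring it up to date. Outside a notebook, where the `!` and `%cd` magics are unavailable, the same pattern could be sketched in plain Python roughly as follows. The helper names `plan_git_sync` and `run_git_sync` are hypothetical and not part of ComfyUI or the wrapper project; only standard git subcommands (`clone`, `-C <dir> pull`) are assumed.

```python
import subprocess
from pathlib import Path


def plan_git_sync(repo_url, parent_dir):
    """Return the git commands that clone repo_url under parent_dir if it is
    missing and then pull, mirroring the clone-or-update logic of the cell."""
    name = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    target = Path(parent_dir) / name

    commands = []
    if not target.is_dir():
        # First run: the checkout does not exist yet.
        commands.append(["git", "clone", repo_url, str(target)])
    # Always finish with a pull so an existing checkout is updated.
    commands.append(["git", "-C", str(target), "pull"])
    return commands


def run_git_sync(repo_url, parent_dir):
    """Execute the planned commands, stopping at the first failure."""
    for cmd in plan_git_sync(repo_url, parent_dir):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: only print what would be executed (no network access).
    for cmd in plan_git_sync(
        "https://github.com/kijai/ComfyUI-KwaiKolorsWrapper.git", "custom_nodes"
    ):
        print(" ".join(cmd))
```

Separating the plan from its execution makes the logic easy to inspect before anything touches the network; calling `run_git_sync` with the same arguments would perform the actual clone or update.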
Download the model weights
#@markdown ###Download standard resources

OPTIONS = {}

#@markdown **unet**
!wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/unet/diffusion_pytorch_model.fp16.safetensors" -P ./models/diffusers/Kolors/unet/
!wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/unet/config.json" -P ./models/diffusers/Kolors/unet/

#@markdown **encoder**
!modelscope download --model=ZhipuAI/chatglm3-6b-base --local_dir ./models/diffusers/Kolors/text_encoder/

#@markdown **vae**
!wget -c "https://modelscope.cn/models/AI-ModelScope/sdxl-vae-fp16-fix/resolve/master/sdxl.vae.safetensors" -P ./models/vae/ #sdxl-vae-fp16-fix.safetensors

#@markdown **scheduler**
!wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/scheduler/scheduler_config.json" -P ./models/diffusers/Kolors/scheduler/

#@markdown **modelindex**
!wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/model_index.json" -P ./models/diffusers/Kolors/
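Because the weights arrive through several separate downloads, it can help to confirm that everything landed where the workflow expects before launching ComfyUI. The sketch below checks only the paths named in the download cell above; it assumes the cell was run from the ComfyUI root directory, and the helper name `find_missing` is hypothetical.

```python
from pathlib import Path

# Files fetched by the wget commands, relative to the ComfyUI root.
EXPECTED_FILES = [
    "models/diffusers/Kolors/unet/diffusion_pytorch_model.fp16.safetensors",
    "models/diffusers/Kolors/unet/config.json",
    "models/diffusers/Kolors/scheduler/scheduler_config.json",
    "models/diffusers/Kolors/model_index.json",
    "models/vae/sdxl.vae.safetensors",
]

# Directory filled by `modelscope download` (chatglm3-6b-base text encoder).
EXPECTED_DIRS = ["models/diffusers/Kolors/text_encoder"]


def find_missing(comfyui_root):
    """Return the expected paths that are absent (or empty, for directories)."""
    root = Path(comfyui_root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    for d in EXPECTED_DIRS:
        path = root / d
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(d + "/")
    return missing


if __name__ == "__main__":
    for path in find_missing("."):
        print("missing:", path)
```

This only checks presence, not integrity; a partially downloaded file would still pass, which is one reason the original commands use `wget -c` to resume interrupted transfers.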
Launch ComfyUI via cloudflared
!wget "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/cloudflared-linux-amd64.deb"
!dpkg -i cloudflared-linux-amd64.deb

%cd /mnt/workspace/ComfyUI
import subprocess
import threading
import time
import socket
import urllib.request

def iframe_thread(port):
    while True:
        time.sleep(0.5)
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        result = sock.connect_ex(('127.0.0.1', port))
        if result == 0:
            break
        sock.close()
    print("\nComfyUI finished loading, trying to launch cloudflared (if it gets stuck here cloudflared is having issues)\n")

    p = subprocess.Popen(["cloudflared", "tunnel", "--url", "http://127.0.0.1:{}".format(port)], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    for line in p.stderr:
        l = line.decode()
        if "trycloudflare.com " in l:
            print("This is the URL to access ComfyUI:", l[l.find("http"):], end='')
        #print(l, end='')

threading.Thread(target=iframe_thread, daemon=True, args=(8188,)).start()

!python main.py --dont-print-server
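The launch cell does two things in its background thread: it polls the local port until ComfyUI accepts connections, then scans cloudflared's stderr for the public URL. A minimal standalone sketch of both steps is shown below, with two changes that are our own assumptions rather than part of the original cell: the polling loop gives up after a timeout instead of running forever, and the URL is extracted with a regular expression. The function names and the sample log line are made up for illustration.

```python
import re
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=120.0, interval=0.5):
    """Poll host:port until a TCP connection succeeds or timeout elapses.

    Returns True once the port accepts connections, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # The context manager closes the socket on every attempt.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(interval)
            if sock.connect_ex((host, port)) == 0:
                return True
        time.sleep(interval)
    return False


# Matches a quick-tunnel hostname of the form https://<name>.trycloudflare.com
_TUNNEL_URL = re.compile(r"https://[A-Za-z0-9.-]+\.trycloudflare\.com")


def extract_tunnel_url(log_line):
    """Return the trycloudflare.com URL found in a log line, or None."""
    match = _TUNNEL_URL.search(log_line)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Hypothetical log line; the real cloudflared output format may differ.
    sample = "INF |  https://example-words.trycloudflare.com  |"
    print(extract_tunnel_url(sample))
```

In the notebook, `wait_for_port(8188)` could stand in for the `while True` loop, and `extract_tunnel_url` could be applied to each decoded stderr line in place of the substring check.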
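Click Load on the right to load the workflow provided by the ComfyUI-KwaiKolorsWrapper project.
Text-to-image:
Image-to-image (a white car):
GPU memory usage:
Quality tests
Simple prompts
Complex prompts
Multi-entity generation is strong: colors can be controlled per object, and spatial relationships come out largely correct.
Multiple styles
Style coverage is strong.
Text
It can render simple text.
Diversity
Diversity is decent.
Performance test
At 1024 resolution on an A10, generating one image (25 steps) takes 7 seconds.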
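For comparison with other setups, the reported figure can be converted into per-step latency and throughput. The short calculation below uses only the two numbers stated above (25 steps, 7 seconds); it folds text encoding and VAE decoding into the per-step figure, so the true denoising step time is somewhat lower. The function name `throughput` is ours.

```python
def throughput(steps, seconds_per_image):
    """Derive average per-step latency and images per minute."""
    return {
        "seconds_per_step": seconds_per_image / steps,
        "images_per_minute": 60.0 / seconds_per_image,
    }


stats = throughput(steps=25, seconds_per_image=7.0)
print(f"{stats['seconds_per_step']:.2f} s/step, "
      f"{stats['images_per_minute']:.1f} images/min")
# prints: 0.28 s/step, 8.6 images/min
```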