Ideogram 4, a Rising Star in Text-to-Image (T2I): Hands-On Impressions and Review - 司云有崖


司云有崖 · 2026-06-24 · via BlogFinder

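After my exams, alongside some Research Assistant work, I was delighted to find a new arrival among open-source AI image generation models: Ideogram 4. The last model that impressed me this much was z-image, but on closer inspection I have more confidence in Ideogram 4's performance and its future, especially since z-image has been all but shelved by Alibaba and its edit model still has not been released.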

Ideogram 4 claims to rank first among open-source models, losing only to closed-source models such as gpt-image-2:

I cannot vouch for how accurate that leaderboard is, but these are the points where the model stands out for me:

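It takes JSON prompts, which let you split the scene into individual objects (obj) and describe the position of each one, so fine-grained control over the image becomes possible.
Its overall look is photorealistic, in the same vein as gpt-image-2. A major reason I stopped using z-image and flux2 klein is that their portraits always carry an AI-smoothed skin texture, the so-called "麦橘" look, whereas Ideogram's images come much closer to real photography.
LoRA support is also good: roughly 4000-6000 training steps are usually enough to get a good likeness of a person.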
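
The JSON prompt structure mentioned in the first point can be assembled programmatically rather than written by hand. Below is a minimal Python sketch of that idea. The field names (`aspect_ratio`, `high_level_description`, `compositional_deconstruction`, `background`, `elements`, `type`, `bbox`, `desc`) are copied from the example prompts in this post; the helper names `make_element` and `build_prompt` are hypothetical, the 0-1000 range check is an assumption based on the values seen in those examples, and the description strings are abbreviated for illustration.

```python
import json


def make_element(bbox, desc, kind="obj"):
    """Build one entry of the `elements` list (hypothetical helper)."""
    if len(bbox) != 4:
        raise ValueError("bbox must contain exactly four numbers")
    # Assumption: coordinates live on a 0-1000 scale, as in the post's examples.
    if not all(0 <= value <= 1000 for value in bbox):
        raise ValueError("bbox values are expected to lie within 0..1000")
    return {"type": kind, "bbox": list(bbox), "desc": desc}


def build_prompt(description, background, elements, aspect_ratio=None):
    """Assemble a prompt dict using the field names seen in the post."""
    prompt = {}
    if aspect_ratio is not None:
        prompt["aspect_ratio"] = aspect_ratio
    prompt["high_level_description"] = description
    prompt["compositional_deconstruction"] = {
        "background": background,
        "elements": list(elements),
    }
    return prompt


# Abbreviated version of the bamboo-forest example from this post.
prompt = build_prompt(
    description="A cinematic wuxia photograph of a swordswoman in a bamboo forest.",
    background="A dense bamboo grove with diffused natural daylight.",
    elements=[
        make_element([45, 405, 995, 865], "Young woman in pale hanfu robes."),
        make_element([465, 20, 985, 570], "Long straight Chinese sword."),
    ],
    aspect_ratio="16:9",
)

# Serialize to the text that would be pasted in as the prompt.
prompt_json = json.dumps(prompt, ensure_ascii=False, indent=4)
print(prompt_json)
```

Building the prompt from small helpers keeps each object's box and description together, which makes it easier to adjust one element without disturbing the rest of the JSON.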

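Of course, Ideogram 4 has its share of problems. The biggest is the built-in censorship, and the other is the commercial license; both have drawn plenty of criticism.
Let's look at some concrete usage examples:
JSON prompt: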

{
    "high_level_description": "A cinematic historical-drama scene featuring a young East Asian noblewoman seated in an elegant palace interior, dressed in luxurious white and red ceremonial robes beneath glowing autumn foliage.",
    "compositional_deconstruction": {
        "background": "The setting is a refined palace hall decorated with warm wooden furnishings, softly illuminated screens, and a large tree covered in vivid golden-orange leaves. The shallow depth of field softens the background while preserving the rich autumnal atmosphere. Warm amber lighting contrasts with the cool blue tones visible through the windows.",
        "elements": [
            {
                "type": "obj",
                "bbox": [216, 371, 1000, 949],
                "desc": "A young East Asian woman seated slightly right of center, shown in a composed and upright posture. She has fair skin, delicate features, and a serious, attentive expression while looking slightly to the left of the camera. Her black hair is arranged in an elaborate high court hairstyle decorated with a small gold ornament and a slender hairpin. She wears an intricately patterned ivory robe with a high layered collar, partly covered by a voluminous red outer garment embroidered with gold floral motifs and trimmed with soft white fur. Her hands rest together in her lap."
            },
            {
                "type": "obj",
                "bbox": [19, 94, 799, 754],
                "desc": "A large tree with a dark trunk and dense golden-orange foliage filling much of the upper-left and central background, creating a dramatic autumn canopy."
            },
            {
                "type": "obj",
                "bbox": [516, 0, 1000, 1000],
                "desc": "A pale upholstered wooden seat positioned across the left midground, with carved dark wood framing and softly patterned fabric."
            },
            {
                "type": "obj",
                "bbox": [622, 254, 857, 509],
                "desc": "A small dark wooden side table placed in front of the seated woman, holding a decorative dish with small orange-colored food and green garnish."
            },
            {
                "type": "obj",
                "bbox": [151, 829, 828, 983],
                "desc": "A tall rectangular floor lantern with a wooden frame and softly glowing cream-colored panels, positioned in the right background."
            }
        ]
    }
}

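Result:

The person comes from a half-finished LoRA I trained on Zhao Jinmai (赵今麦). You can see that the model's grasp of classical Chinese furniture is still weak, although the problem may also lie in my prompt.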
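
The post does not explain the `bbox` format. Comparing the boxes in the prompt above with their descriptions (for instance, the lantern "positioned in the right background" has `[151, 829, 828, 983]`, and the woman "slightly right of center" has `[216, 371, 1000, 949]`) suggests the order `[ymin, xmin, ymax, xmax]` on a 0-1000 scale. That is an inference from this one example, not a documented fact. Under that assumption, a small Python sketch for converting a box into pixel coordinates might look like this; the function name `bbox_to_pixels` and the 1024x1024 canvas are hypothetical.

```python
def bbox_to_pixels(bbox, width, height, scale=1000):
    """Convert a box to (left, top, right, bottom) in pixels.

    Assumption: bbox is [ymin, xmin, ymax, xmax] normalized to 0..scale.
    """
    ymin, xmin, ymax, xmax = bbox
    left = round(xmin / scale * width)
    top = round(ymin / scale * height)
    right = round(xmax / scale * width)
    bottom = round(ymax / scale * height)
    return left, top, right, bottom


# The floor lantern from the prompt above, mapped onto a 1024x1024 canvas.
lantern = [151, 829, 828, 983]
left, top, right, bottom = bbox_to_pixels(lantern, width=1024, height=1024)
print(left, top, right, bottom)

# Under the assumed ordering, the box is taller than it is wide and sits
# in the right half of the canvas, matching its description.
print(bottom - top > right - left, left > 1024 // 2)
```

Drawing the converted rectangles over a generated image is a quick way to check whether this reading of the coordinates holds for a given prompt.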
Here is another example:

{"aspect_ratio":"16:9","high_level_description":"A cinematic wuxia photograph of a young East Asian swordswoman in pale traditional robes holding a straight sword defensively in a bamboo forest, framed off-center with the weapon extending diagonally toward the foreground.","compositional_deconstruction":{"background":"A dense bamboo grove with tall green stalks, layered dark foliage, and an earthy woodland floor. The distant vegetation is softly out of focus, creating strong depth behind the subject. Diffused natural daylight filters through the canopy with a cool-neutral color balance and subdued green tones.","elements":[{"type":"obj","bbox":[45,405,995,865],"desc":"Young East Asian woman with fair skin and long black hair styled half-up with a simple silver hairpin. She wears layered pale blue-gray and white hanfu robes with wide sleeves. Her expression is focused and serious, eyes directed forward, body angled slightly left while both hands brace the sword hilt."},{"type":"obj","bbox":[465,20,985,570],"desc":"Long straight Chinese sword extending diagonally from the woman's hands toward the lower-left foreground. The narrow dark steel blade is strongly foreshortened, with a rounded brass guard and a dark wrapped grip held firmly in both hands."}]}}

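Both the composition and the shadows turned out well. The person comes from a LoRA I trained on Wen Qi (文淇).
For comparison, here is an image I generated with gpt-image-2: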

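gpt-image-2 remains very capable in its own right, especially in its built-in knowledge, but once you factor in model size and LoRA support, Ideogram 4's results are also very impressive.
Two other new models, krea2 and boogu, appeared in the past few days as well. From a quick look, neither seems to surpass Ideogram 4. All in all, Ideogram 4 really is the most impressive model of the past few months.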
