























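After my exams, besides some Research Assistant work, I was delighted to discover a new rising star among open-source AI image-generation models: ideogram 4. The last model that impressed me this much was z-image, but on closer inspection I have more confidence in ideogram 4's performance and future, especially since Alibaba has all but dropped z-image and its edit model still has not been released.
ideogram 4 claims to be the top open-source model, losing only to closed-source models such as gpt-image-2: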

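For me, it is hard to say how trustworthy this leaderboard is, but the model's strengths are:
It uses JSON prompts, which let you split the scene into individual objects (obj) and describe each one's position. This makes fine-grained control over the image possible.
Its overall look is photorealistic, similar to gpt-image-2. A major reason I stopped using z-image and flux2 klein is that their portraits always have that AI skin-smoothing look, what some call the 麦橘 look, whereas ideogram 4's images are much closer to real photography.
LoRA support is also good: roughly 4000-6000 steps are usually enough to get a good portrait likeness.
Of course, ideogram 4 has quite a few flaws too. The biggest is its built-in censorship, and another is its commercial-use license. Both have drawn plenty of criticism.
Let's look at some concrete usage examples:
JSON prompt: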
{
"high_level_description": "A cinematic historical-drama scene featuring a young East Asian noblewoman seated in an elegant palace interior, dressed in luxurious white and red ceremonial robes beneath glowing autumn foliage.",
"compositional_deconstruction": {
"background": "The setting is a refined palace hall decorated with warm wooden furnishings, softly illuminated screens, and a large tree covered in vivid golden-orange leaves. The shallow depth of field softens the background while preserving the rich autumnal atmosphere. Warm amber lighting contrasts with the cool blue tones visible through the windows.",
"elements": [
{
"type": "obj",
"bbox": [216, 371, 1000, 949],
"desc": "A young East Asian woman seated slightly right of center, shown in a composed and upright posture. She has fair skin, delicate features, and a serious, attentive expression while looking slightly to the left of the camera. Her black hair is arranged in an elaborate high court hairstyle decorated with a small gold ornament and a slender hairpin. She wears an intricately patterned ivory robe with a high layered collar, partly covered by a voluminous red outer garment embroidered with gold floral motifs and trimmed with soft white fur. Her hands rest together in her lap."
},
{
"type": "obj",
"bbox": [19, 94, 799, 754],
"desc": "A large tree with a dark trunk and dense golden-orange foliage filling much of the upper-left and central background, creating a dramatic autumn canopy."
},
{
"type": "obj",
"bbox": [516, 0, 1000, 1000],
"desc": "A pale upholstered wooden seat positioned across the left midground, with carved dark wood framing and softly patterned fabric."
},
{
"type": "obj",
"bbox": [622, 254, 857, 509],
"desc": "A small dark wooden side table placed in front of the seated woman, holding a decorative dish with small orange-colored food and green garnish."
},
{
"type": "obj",
"bbox": [151, 829, 828, 983],
"desc": "A tall rectangular floor lantern with a wooden frame and softly glowing cream-colored panels, positioned in the right background."
}
]
}
}
Result:

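The character comes from my half-finished Zhao Jinmai (赵今麦) LoRA. You can see that the model still does not understand classical Chinese furniture well, though that may also be down to my prompt.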
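Writing prompts like the one above by hand is error-prone, so here is a minimal Python sketch that assembles the same structure and sanity-checks each box. The key names are copied from the prompt above. The helper names `make_element` and `build_prompt` are hypothetical, and the 0 to 1000 coordinate range is an assumption based on the values seen in this post, not a documented rule.

```python
import json


def make_element(bbox, desc, kind="obj"):
    """Return one entry for the "elements" list after basic bbox checks."""
    if len(bbox) != 4:
        raise ValueError("bbox needs exactly four numbers")
    # Assumption: coordinates live on a 0..1000 grid, as in the prompts in this post.
    if any(not 0 <= value <= 1000 for value in bbox):
        raise ValueError("bbox values are expected to lie in 0..1000")
    # The first pair must be the smaller corner, whichever axis comes first.
    if bbox[0] >= bbox[2] or bbox[1] >= bbox[3]:
        raise ValueError("bbox min corner must be smaller than max corner")
    return {"type": kind, "bbox": list(bbox), "desc": desc}


def build_prompt(summary, background, elements):
    """Assemble the nested prompt dictionary used in this post."""
    return {
        "high_level_description": summary,
        "compositional_deconstruction": {
            "background": background,
            "elements": list(elements),
        },
    }


prompt = build_prompt(
    "A cinematic historical-drama scene in a palace interior.",
    "A refined palace hall with warm amber lighting.",
    [
        make_element(
            [622, 254, 857, 509],
            "A small dark wooden side table holding a decorative dish.",
        ),
    ],
)
print(json.dumps(prompt, indent=2, ensure_ascii=False))
```

The printed JSON can then be pasted wherever the prompt is entered. The checks only catch malformed boxes; they say nothing about whether the model will honor the layout.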
Here is another example:
{"aspect_ratio":"16:9","high_level_description":"A cinematic wuxia photograph of a young East Asian swordswoman in pale traditional robes holding a straight sword defensively in a bamboo forest, framed off-center with the weapon extending diagonally toward the foreground.","compositional_deconstruction":{"background":"A dense bamboo grove with tall green stalks, layered dark foliage, and an earthy woodland floor. The distant vegetation is softly out of focus, creating strong depth behind the subject. Diffused natural daylight filters through the canopy with a cool-neutral color balance and subdued green tones.","elements":[{"type":"obj","bbox":[45,405,995,865],"desc":"Young East Asian woman with fair skin and long black hair styled half-up with a simple silver hairpin. She wears layered pale blue-gray and white hanfu robes with wide sleeves. Her expression is focused and serious, eyes directed forward, body angled slightly left while both hands brace the sword hilt."},{"type":"obj","bbox":[465,20,985,570],"desc":"Long straight Chinese sword extending diagonally from the woman's hands toward the lower-left foreground. The narrow dark steel blade is strongly foreshortened, with a rounded brass guard and a dark wrapped grip held firmly in both hands."}]}}
Both the composition and the shadows look great. The character comes from my Wen Qi (文淇) LoRA.
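To check a layout before generating, the bbox values can be mapped to pixel coordinates. The sketch below assumes the four numbers are ordered [y_min, x_min, y_max, x_max] on a 0 to 1000 grid. That ordering is my reading of the prompts in this post (for example, the sword described as reaching toward the lower left), not something confirmed by documentation. The function name `bbox_to_pixels` and the 1920x1080 canvas are likewise only illustrative choices for a 16:9 image.

```python
def bbox_to_pixels(bbox, width, height, grid=1000):
    """Convert an assumed [y_min, x_min, y_max, x_max] box on a 0..grid
    scale to pixel coordinates (left, top, right, bottom)."""
    y_min, x_min, y_max, x_max = bbox
    left = round(x_min * width / grid)
    top = round(y_min * height / grid)
    right = round(x_max * width / grid)
    bottom = round(y_max * height / grid)
    return (left, top, right, bottom)


# bbox values copied from the bamboo-forest prompt above.
elements = {
    "swordswoman": [45, 405, 995, 865],
    "sword": [465, 20, 985, 570],
}

for name, bbox in elements.items():
    print(name, bbox_to_pixels(bbox, width=1920, height=1080))
# swordswoman (778, 49, 1661, 1075)
# sword (38, 502, 1094, 1064)
```

Under this reading the sword occupies the lower-left part of the frame and the woman sits right of center, which matches the text descriptions in the prompt.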
For comparison, here is an image I generated with gpt-image-2:


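gpt-image-2 itself is still very capable, especially in its built-in knowledge, but considering model size and LoRA support, ideogram 4 is very impressive too.
Two other new models, krea2 and boogu, also appeared in the past few days. From a quick look, neither seems to surpass ideogram 4. All in all, ideogram 4 really is the most impressive model of the last few months.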