[1] The photo of the outfit: https://share.google/mHJbchlsTNJ771yBa
1. I’d wager that given their previous release history, this will be open‑weight within 3-4 weeks.
2. It looks like they’re following suit with other models like Z-Image Turbo (6B parameters) and Flux.2 Klein (9B parameters), aiming to release models that can run on much more modest GPUs. For reference, the original Qwen-Image is a 20B-parameter model.
3. This is a unified model (both image generation and editing), so there’s no need to keep separate Qwen-Image and Qwen-Edit models around.
4. The original Qwen-Image scored the highest among local models for image editing in my GenAI Showdown (6 out of 12 points), and it also ranked very highly for image generation (4 out of 12 points).
Generative Comparisons of Local Models:
https://genai-showdown.specr.net/?models=fd,hd,kd,qi,f2d,zt
Editing Comparison of Local Models:
https://genai-showdown.specr.net/image-editing?models=kxd,og...
I'll probably be waiting until the local version drops before adding Qwen-Image-2 to the site.
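For a rough sense of why the parameter counts above matter for modest GPUs, here is a back-of-the-envelope sketch. It counts only weight memory (parameters times bytes per parameter) and ignores the text encoder, VAE, activations, and any offloading, so real requirements will differ. The parameter counts are the ones quoted above; the precision options and the function name `weight_gb` are my own assumptions for illustration.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Deliberately ignores text encoder, VAE, activations and framework overhead.

# Parameter counts (in billions) as quoted in the comment above.
PARAMS_BILLIONS = {
    "Z-Image Turbo": 6,
    "Flux.2 Klein": 9,
    "Qwen-Image (original)": 20,
}

# Assumed storage precisions; which ones a given release supports is not stated here.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weight size in GB (1 GB = 1e9 bytes), so billions x bytes/param."""
    return params_billions * bytes_per_param


for name, billions in PARAMS_BILLIONS.items():
    sizes = ", ".join(
        f"{precision}: {weight_gb(billions, nbytes):.1f} GB"
        for precision, nbytes in BYTES_PER_PARAM.items()
    )
    print(f"{name} ({billions}B) -> {sizes}")
```

By this estimate the 20B model needs about 40 GB for bf16 weights alone, versus about 12 GB for a 6B model, which is roughly the difference between a datacenter card and a consumer one.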
What Linux tools are you guys using for image generation models like Qwen's diffusion models? LM Studio only supports text generation.
LinkedIn is filled with them now.
When I used the exact prompt from the post, the chat worked. It gave me the exact output from the blog post.
Then I used Google Translate to understand the prompt format. The prompt is: A 4x6 panel comic, four lines, six panels per line. Each panel is separated by a white dividing line.
The first row, from left to right: Panel 1: Panel 2: .....
But when I try to change the inputs, the comic example fails miserably. It keeps creating random grids, sometimes 4x5 and other times 4x6, but by the third row the model gets confused and the row has only 3 panels. Other times the English dialogue is replaced with Chinese dialogue. So, not very reliable in my book.
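If anyone wants to vary the inputs systematically, here is a minimal sketch of a prompt builder that follows the translated format above. The English wording is my paraphrase of the Google Translate output, not the original Chinese prompt; `build_comic_prompt` is a hypothetical helper, and restarting the panel numbering on each row is an assumption. It only keeps the prompt internally consistent and does nothing about the model drifting to a different grid.

```python
def build_comic_prompt(rows: int, cols: int, panels: list[str]) -> str:
    """Build a grid-comic prompt in the (translated) format from the post.

    `panels` holds one description per panel, in reading order.
    Assumption: panel numbers restart at 1 on every row.
    """
    if len(panels) != rows * cols:
        raise ValueError(
            f"expected {rows * cols} panel descriptions, got {len(panels)}"
        )

    lines = [
        f"A {rows}x{cols} panel comic, {rows} rows, {cols} panels per row. "
        "Each panel is separated by a white dividing line."
    ]
    for r in range(rows):
        # Slice out the descriptions that belong to this row.
        row_panels = panels[r * cols:(r + 1) * cols]
        parts = " ".join(
            f"Panel {i}: {text}" for i, text in enumerate(row_panels, start=1)
        )
        lines.append(f"Row {r + 1}, from left to right: {parts}")
    return "\n".join(lines)


# Small 2x2 example; the descriptions are placeholders.
print(build_comic_prompt(2, 2, ["a cat wakes up", "the cat yawns",
                                "the cat eats", "the cat sleeps"]))
```

Checking the panel count up front at least rules out the prompt itself asking for a mismatched grid, so any 4x5 or 3-panel row in the output is on the model.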
"""A desolate grassland stretches into the distance, its ground dry and cracked. Fine dust is kicked up by vigorous activity, forming a faint grayish-brown mist in the low sky. Mid-ground, eye-level composition: A muscular, robust adult brown horse stands proudly, its forelegs heavily pressing between the shoulder blades and spine of a reclining man. Its hind legs are taut, its neck held high, its mane flying against the wind, its nostrils flared, and its eyes sharp and focused, exuding a primal sense of power. The subdued man is a white male, 30-40 years old, his face covered in dust and sweat, his short, messy dark brown hair plastered to his forehead, his thick beard slightly damp; he wears a badly worn, grey-green medieval-style robe, the fabric torn and stained with mud in several places, a thick hemp rope tied around his waist, and scratched ankle-high leather boots; his body is in a push-up position—his palms are pressed hard against the cracked, dry earth, his knuckles white, the veins in his arms bulging, his legs stretched straight back and taut, his toes digging into the ground, his entire torso trembling slightly from the weight. The background is a range of undulating grey-blue mountains, their outlines stark, their peaks hidden beneath a low-hanging, leaden-grey, cloudy sky. The thick clouds diffuse a soft, diffused light, which pours down naturally from the left front at a 45-degree angle, casting clear and voluminous shadows on the horse's belly, the back of the man's hands, and the cracked ground. The overall color scheme is strictly controlled within the earth tones: the horsehair is warm brown, the robe is a gradient of gray-green-brown, the soil is a mixture of ochre, dry yellow earth, and charcoal gray, the dust is light brownish-gray, and the sky is a transition from matte lead gray to cool gray with a faint glow at the bottom of the clouds. 
The image has a realistic, high-definition photographic quality, with extremely fine textures—you can see the sweat on the horse's neck, the wear and tear on the robe's warp and weft threads, the skin pores and stubble, the edges of the cracked soil, and the dust particles. The atmosphere is tense, primitive, and full of suffocating tension from a struggle of biological forces."""
What the actual fuck
"Analyze this webpage: https://en.wikipedia.org/wiki/1989_Tiananmen_Square_protests...
Generate an infographic with all the data about the main event timeline and estimated number of victims.
The background image should be this one: https://en.wikipedia.org/wiki/Tank_Man#/media/File:Tank_Man_(Tiananmen_Square_protester).jpg
Improve the background image clarity and resolution."
I've received an error:
"Oops! There was an issue connecting to Qwen3-Max. Content Security Warning: The input file data may contain inappropriate content."
I wonder whether the model they published in December has the same censorship when run locally (i.e. whether it's already trained like this), or whether the censorship required by the Chinese regime is applied only in the web service.