How to Fix Blurry, Warped, or Distorted Faces in AI-Generated Art
AI image generators create beautiful compositions but often ruin faces in group scenes. Here is how to fix them without losing the composition you spent hours getting right.
Three weeks ago I generated what I thought was the best image I had ever made. A fantasy tavern scene. Warm firelight, detailed wooden beams, a group of travelers at a corner table. The composition was perfect. The lighting was exactly what I wanted. I was ready to post it everywhere.
Then I zoomed in.
One of the travelers had a left eye floating about half an inch above where it should be. Another had a mouth that looked like it was melting sideways. The third one, the one in the back, had no nose at all. Just a smooth patch of skin where a nose should be.
I tried re-rolling. I got the same composition maybe one out of every fifteen generations, and every time, the faces were broken in different ways. Sometimes the eyes were fine but the mouth was wrong. Sometimes the proportions were okay but everything was blurry. I spent two evenings trying to get a clean generation before I realized I was solving the wrong problem.
Why group scenes always break faces
Most AI image generators create images at a base resolution of about 1024x1024 pixels. That sounds like a lot until you think about what happens in a group scene. If your image has four people in it, each face might only get a 40x40 pixel patch to work with. The further a face is from the camera, the fewer pixels it gets.
At 40x40 pixels, the AI model simply does not have enough canvas to draw two symmetrical eyes, a properly proportioned nose, and an evenly shaped mouth. It is not that the model is bad at faces. It is that you are asking it to paint a portrait on a postage stamp. The pixel budget is too small.
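To put rough numbers on that pixel budget, here is a back-of-the-envelope sketch. The function name and the 1/7.5 head-to-body ratio are my own assumptions (it is a common figure-drawing rule of thumb, not something any generator reports), so treat the output as an estimate.

```python
def face_size_px(image_height_px: int,
                 figure_height_fraction: float,
                 head_to_body_ratio: float = 1 / 7.5) -> int:
    """Estimate how many pixels tall a head is in a generated image.

    figure_height_fraction: how much of the image height the whole figure
    occupies (0.30 means the person spans 30% of the frame).
    head_to_body_ratio: assumed proportion of head height to body height.
    """
    figure_px = image_height_px * figure_height_fraction
    return round(figure_px * head_to_body_ratio)


# A traveler filling 30% of a 1024-pixel-tall frame
print(face_size_px(1024, 0.30))  # 41

# The same traveler pushed further back, filling 15% of the frame
print(face_size_px(1024, 0.15))  # 20
```

A figure that takes up under a third of the frame already lands at roughly the 40-pixel face described above, and anyone further back gets about half that.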
This is not a Midjourney problem or a Stable Diffusion problem. It is a resolution problem that affects every current AI image generator. Until base resolutions get significantly larger, small faces in group scenes and on distant figures will keep breaking.
Re-rolling is a trap
I burned through probably 40 GPU hours trying to generate that tavern scene with clean faces. Every time I got good faces, the composition changed. Every time I kept the composition, the faces broke. The math is against you here. The number of variables that have to align for perfect faces and a perfect composition in the same generation is astronomical.
The smarter approach is to separate the two problems. Get the composition you want first. Do not worry about the faces at all. Once you have a composition that works, fix the faces separately. This is faster, cheaper, and dramatically less frustrating.
How dedicated face restoration works
General upscalers sharpen everything equally. They do not know the difference between a face and a wooden beam. Face restoration models like GFPGAN were trained specifically on human faces. Tens of thousands of high-quality portraits, in GFPGAN's case. They learn facial anatomy. Where eyes should sit relative to each other. How a nose aligns with a mouth. What natural skin texture looks like versus artificial smoothing.
When you run a face restoration pass on your image, the model first detects all face regions, then enhances only those regions. The background, the lighting, the colors, the clothing. None of it changes. Only the faces get touched. And the model is conservative by design. It enhances what is there rather than inventing completely new facial features.
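The detect-then-enhance flow can be sketched in a few lines. This is a minimal illustration, not how GFPGAN or ClarifyPix is actually implemented. The image is a plain grid of grayscale values, the face boxes are passed in as if a detector had already found them, and `restore_face` is a hypothetical stand-in for the real model.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Copy the pixels inside the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def paste(image: Image, patch: Image, box: Box) -> Image:
    """Return a copy of the image with the patch written into the box."""
    left, top, right, bottom = box
    out = [row[:] for row in image]
    for dy, patch_row in enumerate(patch):
        out[top + dy][left:right] = patch_row
    return out


def restore_faces(image: Image,
                  boxes: List[Box],
                  restore_face: Callable[[Image], Image]) -> Image:
    """Run the restorer on each face box and leave every other pixel alone."""
    result = [row[:] for row in image]
    for box in boxes:
        result = paste(result, restore_face(crop(result, box)), box)
    return result


# Placeholder "model": it just brightens the patch so the effect is visible.
def fake_restorer(patch: Image) -> Image:
    return [[value + 5 for value in row] for row in patch]


scene = [[10] * 6 for _ in range(4)]
fixed = restore_faces(scene, [(1, 1, 3, 3)], fake_restorer)
print(fixed[0])  # [10, 10, 10, 10, 10, 10]  background row untouched
print(fixed[1])  # [10, 15, 15, 10, 10, 10]  only the face box changed
```

The point of the structure is that the restorer never sees the background, which is why the beams, firelight, and clothing come out identical.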
The cost is 4 credits per image on ClarifyPix Face Restoration. Processing takes three to six seconds regardless of how many faces are in the scene. That tavern image with four travelers cost me 4 credits and six seconds. The same image had cost me two evenings of re-rolling before I figured this out.
When face restoration does not help
There are limits. If a face is smaller than about 24x24 pixels, there is not enough structural information for even a specialized model to work with. The model needs at least a basic face shape to enhance. An eye, a nose outline, a mouth position. If all it can see is a flesh-colored blob, the result will still look like a blob. Maybe a slightly more detailed blob, but a blob.
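If you already have face boxes from a detector, you can sort them before spending credits. This is a small sketch under my own assumptions: the 24-pixel cutoff is the rough figure from above, not a documented limit of any model, and the function name is made up for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive

MIN_FACE_PX = 24  # rough threshold from experience, not an official limit


def split_by_size(boxes: List[Box],
                  min_px: int = MIN_FACE_PX) -> Tuple[List[Box], List[Box]]:
    """Separate faces worth restoring from faces that are too small to save."""
    restorable, too_small = [], []
    for box in boxes:
        left, top, right, bottom = box
        shortest_side = min(right - left, bottom - top)
        if shortest_side >= min_px:
            restorable.append(box)
        else:
            too_small.append(box)
    return restorable, too_small


faces = [(100, 200, 140, 240), (600, 180, 618, 200)]
ok, blobs = split_by_size(faces)
print(ok)     # [(100, 200, 140, 240)]
print(blobs)  # [(600, 180, 618, 200)]
```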
For those cases, I found a workaround. Generate the same prompt a few more times. Pick the generation where that specific face happened to come out best. Crop it out and composite it into your main image. The lighting and colors usually match well enough, and a quick levels adjustment seals the deal. Then run a single face restoration pass on the composite to clean up any minor inconsistencies.
Is this more work than just re-rolling? Yes, in the short term. But you do it once and you are done. No more praying to the RNG gods for a generation where everything aligns. You take control of the output instead of hoping the AI gets lucky.
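The crop-and-composite workaround is easy to sketch as well. In practice I do this in an image editor, so take the code as an illustration of the idea only. The brightness matching here is a crude stand-in for a levels adjustment (it shifts the donor patch so its average matches the area it replaces), and all the function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of 0-255 values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def mean(patch: Image) -> float:
    values = [value for row in patch for value in row]
    return sum(values) / len(values)


def match_brightness(donor: Image, target: Image) -> Image:
    """Shift the donor patch so its average brightness matches the target's."""
    offset = mean(target) - mean(donor)
    return [[max(0, min(255, round(value + offset))) for value in row]
            for row in donor]


def composite(image: Image, donor_patch: Image, box: Box) -> Image:
    """Paste a face from another generation into the box, brightness-matched."""
    left, top, right, bottom = box
    target = [row[left:right] for row in image[top:bottom]]
    adjusted = match_brightness(donor_patch, target)
    out = [row[:] for row in image]
    for dy, row in enumerate(adjusted):
        out[top + dy][left:right] = row
    return out


main_image = [[100] * 4 for _ in range(4)]
better_face = [[50, 60], [70, 80]]  # cropped from a different generation
merged = composite(main_image, better_face, (1, 1, 3, 3))
print(merged[1])  # [100, 85, 95, 100]
print(merged[2])  # [100, 105, 115, 100]
```

The donor face keeps its internal contrast but sits at the same overall brightness as the area it replaced, which is what makes the seam hard to spot before the final restoration pass.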
This also fixes real photos
I should mention this because a lot of people do not realize it. The same face restoration model that fixes AI-generated faces also works on real photographs. Old family photos where faces are soft or blurry. Smartphone group shots where some people were out of focus. Low-resolution digital camera images from 2005. The model does not care where the image came from. It just sees a face and tries to make it clearer.
I tested this on a blurry photo of my grandparents from the 1980s. The original was a wallet-sized print scanned at low resolution. My grandfather's face was maybe 30 pixels across. The face restoration pass recovered his eyes, his nose shape, even the slight asymmetry in his smile. My mom cried when she saw it. That alone was worth more than the 4 credits it cost.
If you are tired of throwing away great compositions because the faces are broken, try running a face restoration pass on your next generation. You might already have the image you want. It just needs the faces fixed.