A way to make AI gen less RNG perhaps? #8898
Replies: 1 comment
-
The concept you're describing with "upvote" and "downvote" is actually known as classifier-free guidance (CFG). At each step of generation, the model makes two predictions for the latent image, one conditioned on the positive prompt and one on the negative (or empty) prompt, and the final image is guided by widening the gap between these two directions according to the CFG scale. However, this approach effectively doubles the computation per step, which makes it very slow for large models like FLUX. Moreover, the released FLUX models are CFG-distilled, so the image can degrade significantly unless additional techniques like dynamic thresholding are applied.
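To make the guidance step described above concrete (it is usually written as classifier-free guidance, CFG), here is a minimal sketch in Python with NumPy. It is not taken from any real pipeline: `toy_denoiser`, `guided_step`, and `step_size` are hypothetical names, and the "model" is a trivial stand-in. It only illustrates the two model calls per step and the way the CFG scale stretches the gap between the negative and positive predictions.

```python
import numpy as np


def cfg_combine(pred_neg, pred_pos, cfg_scale):
    """Start at the negative prediction and move toward the positive one.

    cfg_scale = 0 returns the negative prediction, cfg_scale = 1 returns the
    positive prediction, and larger values extrapolate past it.
    """
    return pred_neg + cfg_scale * (pred_pos - pred_neg)


def toy_denoiser(latent, conditioning):
    # Hypothetical stand-in for a real diffusion model (assumption):
    # a real model would be a large neural network, not one line of math.
    return 0.1 * latent + conditioning


def guided_step(latent, cond_pos, cond_neg, cfg_scale, step_size=0.5):
    # Two model evaluations per step: this is where the doubled cost comes from.
    pred_pos = toy_denoiser(latent, cond_pos)
    pred_neg = toy_denoiser(latent, cond_neg)
    guided = cfg_combine(pred_neg, pred_pos, cfg_scale)
    # Simplified update rule (assumption); real samplers use a scheduler here.
    return latent - step_size * guided


latent = np.zeros(4)
cond_pos = np.ones(4)    # pretend embedding of the positive prompt
cond_neg = -np.ones(4)   # pretend embedding of the negative prompt
new_latent = guided_step(latent, cond_pos, cond_neg, cfg_scale=7.0)
```

With a scale of 1 the step simply follows the positive prompt, so a distilled model that already has guidance baked in is normally run that way, with a single model call per step.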
-
I think we've all been at the point where we keep getting close to the image we want in different ways. Maybe you want the hands to be palm up, and no matter what engine you use it's just not getting it. Sure, you can try a ControlNet to help ensure the pose is right. Maybe you're going for a specific character and you use IPAdapter and/or FaceID, but it's still only almost there, because the specific facial expression you're going for isn't easy to describe. You finally get that facial expression, but now the eyes are completely off, so you throw in a Her Eyes LoRA, but that shifts the expression away even though you're on the same seed.

What if there was a way that you could sort of 'upvote' pictures that were heading in the general direction and 'downvote' the ones that were moving further away from your goal? I have a limited understanding of how AI image gen works and have only been messing around with it for a couple of months, but what if you could apply some sort of weight to a specific feature to try to lock it in place where words from your text prompt just aren't translating well, even in Flux?

It could really cut down the 'RNG' element quite a lot if it were possible to favor certain vectors during the generation process, similar to how I think an embedding works. Maybe it could even be more general than that: "I liked elements from this one but not this." It could also give meaning to those hundreds of images you produced that just didn't meet the mark.

Just a thought. Love your work so far guys, keep it up!