-
Those are absolutely fantastic comments, thank you so much! The b64 response format should be a game changer if it's that much faster, can't wait to integrate. I grabbed your Google Drive code to try to port the relevant parts. For image compression, I could do that as an opt-in option in the env file. This way, those who prefer to keep PNGs can keep them. I like the highest quality personally, and it also allows me to do e.g. a quick rotate on-disk (using Windows Photo Viewer), knowing it is and remains lossless. But I can definitely see how one might prefer less disk space, so an option could come in handy. Maybe another route to take could be to save images as PNG but for the server to look for JPG if the PNG isn't found -- this way, anyone could use a batch converter like XnConvert at any time on old pictures, and the app would still cope with it. Will also look into your other points. For your private prompt-enforcer jailbreak -- does that then also allow one to e.g. use celebrity names? That's one issue I'm having at the moment.
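A rough sketch of the PNG-then-JPG fallback idea mentioned above, assuming images sit in a single folder and are looked up by base name. The function name `resolveImagePath` and the extension list are made up for illustration and are not taken from the app:

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: try <baseName>.png first, then fall back to other
// extensions, so files batch-converted to JPG later are still found.
function resolveImagePath(dir, baseName, extensions = ['.png', '.jpg', '.jpeg']) {
  for (const ext of extensions) {
    const candidate = path.join(dir, baseName + ext);
    if (fs.existsSync(candidate)) {
      return candidate;
    }
  }
  // Nothing on disk under any of the known extensions.
  return null;
}
```

The order of the extension list decides which file wins when both a PNG and a JPG exist, so keeping `.png` first preserves the lossless original as the default.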
-
After some delay, as your cool branch had more changes than I was able to super quickly incorporate, I had another look. Your app is really great -- I did notice, doing some quick manual stopwatching, that it wasn't faster for me though (i.e. to generate images through a binary stream instead of pulling from a live url). Both were around the 12ish+ second mark. I used the API default settings Vivid, non-HD, Square in both cases. Might depend on internet speed? Not sure. Anyway, great job!
-
Hi, thanks for your project, I've been using it a lot over the last couple of days! I have a private fork (mostly because it's easier for me to throw stuff into GPT-4 for it to edit stuff).
There are some suggestions that I think would be beneficial to everyone using the repo:
- Use `response_format` set to `b64_json` instead of the default (`url`) - since we control the whole application, we don't need a separate URL to place it somewhere. With `b64_json` format the API will reply with `b64_json` field in the image object containing the base64-encoded image, so that then it could just be decoded and saved as previously. The reason to do that is because (at least in my experience) it takes much less time for powerdalle to do that compared to getting the image generation API result + downloading from the URL - about 15-20 sec compared to 30 sec.
- Compress images (maybe optionally?). DALL-E 3 API answers with huge PNGs that are 2-3MB, the quality is virtually the same if you e.g. compress to 90% quality JPG, or even WebP. The space savings there can be about 5-10x for JPG and up to 50x for WebP. A good library for that is https://www.npmjs.com/package/jimp because it's pure-JS so that it doesn't require any binary dependencies. This can be easily done directly on the fly with both the URL download and b64_json because Jimp accepts buffers fine.
  Just to expand on this, my `images` folder was 3.2GB with ~2100 generated images, but after I resized all of them to 90% quality JPG (which is virtually the same quality), it became around 650MB. I had to manually change the local URLs in the DB, but it was an easy SQL command in sqlite3 CLI.
- Better error messages and their format. Here's the function to parse error messages based on the different responses OpenAI API gives (the 3 stages of filters), written by GPT-4:
My fork already diverged quite a bit (it's easier for me that way), e.g. I removed the prompt inspirer, changed the style with the help of GPT-4, added the gallery, and the fixes above. The styling isn't really that good, but I can't really do any better :P
Here's how it looks:
Base:
Gallery (default Spotlight preset, only has the images that are loaded on the page right now). I chose to not include descriptions, but base prompt could be included in the description, although with revised prompt it gets too wordy:
Image card separately:

Error messages:
Here's the archive (I'm using a different jailbreak to force the model to use the exact prompt, but I don't think it's a good idea for me to post it, so I removed it): Google Drive. I'm really sorry for not having an easy Git repo :(
In short, thanks for the project, it's really useful and I didn't find anything similar! By the way, what is the license of the code in the repo?
(UPD: Fixed error parsing function, rate limit parsing works now. Added the screenshot for error message output.)
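To illustrate the `b64_json` suggestion from the list above, here is a minimal sketch of the decode-and-save step. It assumes `image` is one entry of the API response's `data` array, requested with `response_format: "b64_json"`. The helper name `saveBase64Image` and its parameters are hypothetical, and this is not the code from the fork:

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: decode the base64 payload of one image object
// and write it to disk, returning the path of the saved file.
function saveBase64Image(image, dir, fileName) {
  if (!image || typeof image.b64_json !== 'string') {
    throw new Error('Expected an image object with a b64_json field');
  }
  // Buffer.from(..., 'base64') turns the base64 text back into raw bytes.
  const buffer = Buffer.from(image.b64_json, 'base64');
  fs.mkdirSync(dir, { recursive: true });
  const filePath = path.join(dir, fileName);
  fs.writeFileSync(filePath, buffer);
  return filePath;
}
```

The same `buffer` could also be handed to an image library for compression before writing, since no second HTTP request for the image is needed.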