Better handle audio and video conversion #32
These are the changes made and the reasons:
avi:rgb => avi:jyuv: as the documentation says, "This is most similar to the original PlayStation video colors"
add zscale: PSX videos use a particular color space (full range, bt601, centered chroma location), but modern browsers, AFAIK, assume the "full HD colorspace" (limited range, bt709, left chroma location); doing this conversion at the encoding stage makes sure the browser displays correct colors and levels
w=640:h=526: upscale the video and fix the AR; I'm not quite sure about the AR, I tested it using the preview from jpsxdec and counting pixels
-crf 15: a bit of a quality boost; the videos are quite small anyway, and it can be raised to lower quality/size
-preset veryslow: quality boost at the expense of time; the videos are short and at low resolution/framerate, so it won't take much longer
-ar 44100: with the default conversion ffmpeg was downsampling the 18.9 kHz sound to 16 kHz; AAC also uses an internal lowpass filter, so the sound was muffled. This upsamples to 44.1 kHz so that it sounds correct
-b:a 192k: a bit more bitrate for the sound; it can probably be lowered to 128k, but I prefer a bit more as the original sounds are already quite compressed
Regarding the videos, a better job could be done by introducing avs/vs into the mix: the original videos suffer from quite a bit of blocking, and Deblock_QED fixes almost all of it with minimal/low loss of quality. The upscale can also be done using better algorithms like nnedi3 or a specifically trained ESRGAN model
I think a better way to check the AR is to run the game in an emulator and count the pixels there. I found two resolutions, 291*224 and 291.5*224 (I used duckstation at x4 zoom), then looked for a suitable integer mod2 resolution; 582*448 is the first one
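The arithmetic behind 582*448 can be sketched as below. This is only an illustration of the "first integer mod2 resolution" step, assuming the measured 291*224 frame is scaled by whole-number factors; the function name is hypothetical and not part of the PR.

```python
def smallest_mod2_size(width, height):
    """Return the first integer multiple of (width, height) where both sides are even.

    Assumption: only whole-number scale factors are considered, starting at 1.
    """
    factor = 1
    while (width * factor) % 2 or (height * factor) % 2:
        factor += 1
    return width * factor, height * factor


# 291 is odd, so the x1 size is rejected and the x2 size is the first match.
print(smallest_mod2_size(291, 224))  # (582, 448)
```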
JPG uses, and so the PSX format should use, the color primaries specified by the Rec. 601 standard https://en.wikipedia.org/wiki/Rec._601 . There are two versions of this standard, PAL and NTSC; I expect it to use the NTSC version, the one called SMPTE 170M. I forgot to check it yesterday
Here is an example using video INS17.STR[0]:
This is how it is converted right now
This is with my settings
This instead is an experiment I made using ESRGAN and the model 4xFSDedither_Manga, taken from https://upscale.wiki/wiki/Model_Database#Dithering (in reality I haven't tried any others)
What benefit is there to upscaling the video using ffmpeg over keeping it at the original resolution? I have checked the game and the aspect ratio and resolution of the original video is correct. avi:rgb is used as it provides the most accurate colours. The rest of the PR is good, especially the audio stuff (it's kinda strange that ffmpeg did that).
I'll also ask the jpsxdec devs about that, as the preview window is not 320*240, but I'll amend my PR to not change the resolution
I think the same, my upscale was used only to fix the AR
Let me know. Personally I'd leave jyuv, but if you intend to accept this PR in the near future and prefer rgb, I can revisit the colorspace conversion to start from an rgb input
Maybe my configuration of duckstation is wrong? I'll need to check
These are the changes made and the reasons:
avi:rgb => avi:jyuv: as the documentation says, "This is most similar to the original PlayStation video colors"
add zscale: PSX videos use a particular color space (full range, bt601, centered chroma location), but modern browsers, AFAIK, assume the "full HD colorspace" (limited range, bt709, left chroma location); doing this conversion at the encoding stage makes sure the browser displays correct colors and levels
w=582:h=448: upscale the video and fix the AR; I'm not quite sure about the AR, I tested it by counting pixels in duckstation on a couple of videos
-crf 15: a bit of a quality boost; the videos are quite small anyway, and it can be raised to lower quality/size
-preset veryslow: quality boost at the expense of time; the videos are short and at low resolution/framerate, so it won't take much longer
-level 31: force the H264 level to 3.1 for better compatibility
-ar 44100: with the default conversion ffmpeg was downsampling the 18.9 kHz sound to 16 kHz; AAC also uses an internal lowpass filter, so the sound was muffled. This upsamples to 44.1 kHz so that it sounds correct
-b:a 192k: a bit more bitrate for the sound; it can probably be lowered to 128k, but I prefer a bit more as the original sounds are already quite compressed
Regarding the videos, a better job could be done by introducing avs/vs into the mix: the original videos suffer from quite a bit of blocking, and Deblock_QED fixes almost all of it with minimal/low loss of quality. The upscale can also be done using better algorithms like nnedi3 or a specifically trained ESRGAN model
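For reference, the options listed above could be assembled roughly as follows. This is a sketch and not the exact command from the PR: the codec choices (libx264, aac), the file names, the helper names, and the specific zscale values (in particular matrixin=170m standing in for bt601) are assumptions made for illustration. The script only builds the argument list and does not run ffmpeg.

```python
def build_ffmpeg_args(src, dst):
    """Build an ffmpeg argument list from the settings described in the list above."""
    zscale = ":".join([
        "w=582", "h=448",          # upscale and fix the AR
        "rangein=full",            # source: full range
        "matrixin=170m",           # source: bt601 matrix (assumed 170m spelling)
        "chromalin=center",        # source: centered chroma location
        "range=limited",           # target: limited range
        "matrix=709",              # target: bt709 matrix
        "chromal=left",            # target: left chroma location
    ])
    return [
        "ffmpeg", "-i", src,
        "-vf", "zscale=" + zscale,
        "-c:v", "libx264",         # assumed encoder, implied by -crf/-preset/-level
        "-crf", "15", "-preset", "veryslow", "-level", "31",
        "-c:a", "aac",             # assumed encoder, the text mentions AAC
        "-ar", "44100", "-b:a", "192k",
        dst,
    ]


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


print(" ".join(build_ffmpeg_args("input.avi", "output.mp4")))
# Bandwidth lost when 18.9 kHz audio is resampled to 16 kHz, before any AAC lowpass:
print(nyquist_hz(18900) - nyquist_hz(16000))
```

Resampling 18.9 kHz audio to 16 kHz already cuts the representable bandwidth from 9450 Hz to 8000 Hz, which is consistent with the muffled sound described above; upsampling to 44.1 kHz avoids that loss.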