Commit 76e993a

small idct code improvement
See #79

The new code is both more readable and faster.

Old assembly:

    mov eax, edi
    cmp edi, 256
    jb .LBB1_2
    sar eax, 31
    not al
    .LBB1_2:
    ret

New assembly:

    xor ecx, ecx
    test edi, edi
    cmovns ecx, edi
    cmp ecx, 255
    mov eax, 255
    cmovl eax, ecx
    ret

Benchmark results:

    Benchmarking decode a 512x512 JPEG: Warming up for 3.0000 s
    Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.5s or reduce sample count to 40
    decode a 512x512 JPEG
        time:   [2.4692 ms 2.4873 ms 2.5106 ms]
        change: [-18.558% -17.141% -15.659%] (p = 0.00 < 0.05)
        Performance has improved.
    Found 7 outliers among 100 measurements (7.00%)
        1 (1.00%) high mild
        6 (6.00%) high severe

    Benchmarking decode a 512x512 progressive JPEG: Warming up for 3.0000 s
    Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 28.0s or reduce sample count to 20
    decode a 512x512 progressive JPEG
        time:   [5.5010 ms 5.5212 ms 5.5459 ms]
        change: [-12.718% -11.746% -10.721%] (p = 0.00 < 0.05)
        Performance has improved.
    Found 9 outliers among 100 measurements (9.00%)
        3 (3.00%) high mild
        6 (6.00%) high severe

    extract metadata from an image
        time:   [1.3028 us 1.3110 us 1.3207 us]
        change: [+1.8341% +2.8787% +3.8439%] (p = 0.00 < 0.05)
        Performance has regressed.
    Found 9 outliers among 100 measurements (9.00%)
        7 (7.00%) high mild
        2 (2.00%) high severe

1 parent dab997e commit 76e993a

1 file changed: +1 -7 lines changed

src/idct.rs

Lines changed: 1 addition & 7 deletions
@@ -297,13 +297,7 @@ fn dequantize_and_idct_block_1x1(coefficients: &[i16], quantization_table: &[u16
 // take a -128..127 value and stbi__clamp it and convert to 0..255
 fn stbi_clamp(x: i32) -> u8
 {
-    // trick to use a single test to catch both cases
-    if x as u32 > 255 {
-        if x < 0 { return 0; }
-        if x > 255 { return 255; }
-    }
-
-    x as u8
+    x.max(0).min(255) as u8
 }
 
 fn stbi_f2f(x: f32) -> i32 {
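
To check that the one-line replacement behaves the same as the branching version it removes, the two can be compared side by side. The following is a minimal sketch, not part of the commit: the names `stbi_clamp_old`, `stbi_clamp_new`, and `stbi_clamp_std` are hypothetical, the first two bodies are copied from the diff above, and the third assumes a toolchain where `Ord::clamp` is available (Rust 1.50 or later).

```rust
/// Pre-commit body, copied from the removed lines of the diff.
fn stbi_clamp_old(x: i32) -> u8 {
    // One unsigned comparison catches both x < 0 and x > 255,
    // because negative values wrap to large u32 values.
    if x as u32 > 255 {
        if x < 0 { return 0; }
        if x > 255 { return 255; }
    }

    x as u8
}

/// Post-commit body, copied from the added line of the diff.
fn stbi_clamp_new(x: i32) -> u8 {
    x.max(0).min(255) as u8
}

/// Hypothetical alternative (not in the commit) using `Ord::clamp`.
fn stbi_clamp_std(x: i32) -> u8 {
    x.clamp(0, 255) as u8
}

fn main() {
    // Sweep a range well beyond 0..=255, plus the i32 extremes.
    let extremes = [i32::MIN, i32::MAX];
    for x in (-100_000..=100_000).chain(extremes.iter().copied()) {
        assert_eq!(stbi_clamp_old(x), stbi_clamp_new(x));
        assert_eq!(stbi_clamp_new(x), stbi_clamp_std(x));
    }
    println!("all three variants agree");
}
```

The sweep only demonstrates that the results match; whether the `clamp` variant compiles to the same `cmov` sequence as the `max`/`min` version would need to be confirmed by inspecting the generated assembly.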
