@Michael II, Nice! This is the right way. Instead of reading and writing in two different places, which stresses the CPU cache, copy everything once with a fast sequential Move and then perform the operation locally in one place. A CPU can perform 2-3 reads and one write per cycle, but let's be real: if the Delphi compiler manages to emit code that performs one read every 3 cycles, I consider that a Delphi win.
This approach will win with Delphi every time:
Code:
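// Copy the whole image in one sequential block. This assumes a bottom-up DIB
// (negative deltascan), where ScanLine[Height - 1] is the lowest address.
Move(InBmp.ScanLine[Height - 1]^, OutBmp.ScanLine[Height - 1]^, Abs(deltascan) * Height);
// DstScanline must point at OutBmp.ScanLine[0] before this loop.
for y := 0 to Height - 1 do
begin
  OutPixel := DstScanline;
  for x := 0 to Width - 1 do
  begin
    // Clamp each channel in place in the destination.
    if OutPixel^.Blue > Threshold then OutPixel^.Blue := Threshold;
    if OutPixel^.Red > Threshold then OutPixel^.Red := Threshold;
    if OutPixel^.Green > Threshold then OutPixel^.Green := Threshold;
    Inc(OutPixel);
  end;
  Inc(PByte(DstScanline), deltascan);
end;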
One thing though:
Check whether InBmp and OutBmp are the same bitmap (InBmp = OutBmp); if they are, skip the copy altogether.
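To make that last check concrete, here is a rough sketch rather than a drop-in routine. ClampChannels is a hypothetical name, and I am assuming both bitmaps are pf24bit with identical dimensions. It also copies row by row instead of using one big Move, which is a more conservative variant that does not depend on the scanline order in memory.

Code:
```delphi
uses
  Winapi.Windows, Vcl.Graphics;

// Hypothetical helper, not from the original thread.
// Assumption: InBmp and OutBmp are both pf24bit and have the same size.
procedure ClampChannels(InBmp, OutBmp: TBitmap; Threshold: Byte);
var
  x, y, RowBytes: Integer;
  Pixel: PRGBTriple;
begin
  RowBytes := OutBmp.Width * SizeOf(TRGBTriple);
  for y := 0 to OutBmp.Height - 1 do
  begin
    Pixel := OutBmp.ScanLine[y];
    // Skip the copy when source and destination are the same bitmap.
    if InBmp <> OutBmp then
      Move(InBmp.ScanLine[y]^, Pixel^, RowBytes);
    for x := 0 to OutBmp.Width - 1 do
    begin
      // Clamp each channel in place.
      if Pixel^.rgbtBlue > Threshold then Pixel^.rgbtBlue := Threshold;
      if Pixel^.rgbtGreen > Threshold then Pixel^.rgbtGreen := Threshold;
      if Pixel^.rgbtRed > Threshold then Pixel^.rgbtRed := Threshold;
      Inc(Pixel);
    end;
  end;
end;
```

Calling ClampChannels(Bmp, Bmp, 200) then clamps in place with no copy at all, while ClampChannels(Src, Dst, 200) copies each row once and clamps it while it is still hot in the cache.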