Delphi-PRAXiS - View Single Post - Delphi Floyd-Steinberg Dithering

**Amateurprofi**

Hi all,
Please bear with me a little. I really don't mean my name-calling of the Delphi compiler as an offense toward anyone. It is just that I know it: I used to earn, and still earn, my bread from optimizing old and new code, and I know the compiler intimately enough to call it a fucking centuries-old brick.

I will explain just this piece of code, and you can expand on it if you want. But first and foremost, please at least consider that what you know about the code Delphi generates is wrong (outdated or naively trusting), or that the generated code is unoptimized (inefficient), and we will go from there to a proof by contradiction. How about that? It is my best method, as a mathematician by education.

So many agreed that div 16 is always shr 4, or, to be more precise, should be sar 4. I agree, and I know that the compiler wrongly does it for unsigned integers, but that is not the case here, as I was talking about the specific case at hand. Maybe it is my mistake not to have written an essay for each line I wrote.

So here is a proof that the compiler doesn't use sar 4 or shr 4 for div 16: just look at the x64 version of it!
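
As background for the div 16 versus sar 4 discussion: for signed values the two are not interchangeable, because div truncates toward zero while an arithmetic shift rounds toward negative infinity. An optimizer that wants to use a shift for a signed div 16 typically adds a bias of 15 to negative operands first. The following is a minimal sketch of that difference, not a statement about what the Delphi compiler emits. `FloorShift4` and `BiasedShift4` are hypothetical helpers that emulate sar, since Delphi has no arithmetic-shift operator, and the sketch ignores the edge case Low(Integer).

```pascal
program DivVersusSar;
{$APPTYPE CONSOLE}

// Hypothetical helper: emulates an arithmetic shift right by 4 ("sar x, 4"),
// which rounds toward negative infinity. Low(Integer) is not handled.
function FloorShift4(X: Integer): Integer;
begin
  if X >= 0 then
    Result := X shr 4
  else
    Result := -((-X + 15) shr 4);
end;

// Hypothetical helper: signed division by 16 as a biased shift. Adding 15 to
// negative inputs makes the shifted result round toward zero, like "div".
function BiasedShift4(X: Integer): Integer;
begin
  if X < 0 then
    Inc(X, 15);
  Result := FloorShift4(X);
end;

var
  X: Integer;
begin
  X := -70;
  Writeln('X div 16        = ', X div 16);         // -4 (truncates toward zero)
  Writeln('FloorShift4(X)  = ', FloorShift4(X));   // -5 (rounds down)
  Writeln('BiasedShift4(X) = ', BiasedShift4(X));  // -4 (matches div)
end.
```

For non-negative values and for exact multiples of 16 all three agree, so the difference only shows up with negative error terms.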

Try this function in the optimized version above:

Code:

    procedure SetPixel(XOffset, YOffset, Factor: NativeInt);
    var
      AP: TPBGR;
    begin
      // XOffset = horizontal offset in pixels
      // YOffset = vertical offset in bytes
      AP := P;
      Inc(AP, XOffset);
      Inc(NativeInt(AP), YOffset);
      with AP^, Delta do begin
        Blue := EnsureRange(Blue + B * Factor shr 4, 0, 255);
        Green := EnsureRange(Green + G * Factor shr 4, 0, 255);
        Red := EnsureRange(Red + R * Factor shr 4, 0, 255);
      end;
    end;

My timing shows that it is faster by 18% on Win32 and 29% on Win64! Please try it: it is not slower by 200%, and these values are positive, so there is no problem here.
Also, if you look at the generated assembly code, this is it:
Attachment 56355

Also try this

Code:

    procedure SetPixel(XOffset, YOffset, Factor: NativeInt);
    var
      AP: TPBGR;
      v: NativeInt;
    begin
      AP := P;
      Inc(AP, XOffset);
      Inc(NativeInt(AP), YOffset);
      with AP^, Delta do
      begin
        v := Blue + B * Factor shr 4;
        if v < 0 then
          Blue := 0
        else if v > 255 then
          Blue := 255
        else
          Blue := v;
        v := Green + G * Factor shr 4;
        if v < 0 then
          Green := 0
        else if v > 255 then
          Green := 255
        else
          Green := v;
        v := Red + R * Factor shr 4;
        if v < 0 then
          Red := 0
        else if v > 255 then
          Red := 255
        else
          Red := v;
      end;
    end;

The function above makes the whole process run at around double the speed on both platforms.
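
To check a claim like this independently, one option is a small console program that first confirms the if/else chain clamps exactly like EnsureRange and then times both in isolation. This is only a sketch under assumptions: `ClampToByte`, the iteration count `Rounds` and the input pattern are made up for illustration, the unit names assume Delphi XE2 or later (older versions use `Math`), and a loop like this measures the clamp alone, not the whole dithering pass.

```pascal
program ClampCheck;
{$APPTYPE CONSOLE}

uses
  System.Math,         // EnsureRange
  System.Diagnostics;  // TStopwatch

const
  Rounds = 50000000;   // arbitrary iteration count for the timing loops

// Hypothetical helper: branch-based clamp to 0..255, mirroring the
// if/else chain in the function above.
function ClampToByte(V: Integer): Byte; inline;
begin
  if V < 0 then
    Result := 0
  else if V > 255 then
    Result := 255
  else
    Result := V;
end;

var
  V, I, Mismatches: Integer;
  Sum: Int64;
  SW: TStopwatch;
begin
  // 1. Both clamps must agree before their speed is compared.
  Mismatches := 0;
  for V := -1000 to 1000 do
    if ClampToByte(V) <> EnsureRange(V, 0, 255) then
      Inc(Mismatches);
  Writeln('Mismatches: ', Mismatches);

  // 2. Time each variant. The checksum is printed so that the loops
  //    cannot simply be discarded.
  Sum := 0;
  SW := TStopwatch.StartNew;
  for I := 1 to Rounds do
    Inc(Sum, EnsureRange((I and 1023) - 384, 0, 255));
  Writeln('EnsureRange: ', SW.ElapsedMilliseconds, ' ms, checksum ', Sum);

  Sum := 0;
  SW := TStopwatch.StartNew;
  for I := 1 to Rounds do
    Inc(Sum, ClampToByte((I and 1023) - 384));
  Writeln('if/else    : ', SW.ElapsedMilliseconds, ' ms, checksum ', Sum);
end.
```

The two checksums should be identical; the timings depend on compiler version, target platform and optimization settings, so they are best compared on the same machine and build configuration.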

And again, I am not saying that I know it all. I do make mistakes, but not in this case. I would love to be proven wrong, but with factual code done right, not with your assumptions based on something you didn't see for sure.

My test is attached here as Attachment 56359, and I hope it works, unlike the project attached above, which is empty.

As for "with": without it, the compiler might fail to generate nice assembly and fall back to shuffling the data and accessing it repeatedly on the stack, which introduces unneeded memory accesses. This also happens with complex loops. With "with", it will in many cases resolve the pointer once and reuse it from a register. Alas, there seems to be no gain in the function mentioned above, but once the function has a few more local variables it goes into 90s Turbo Pascal mode, especially on x64 platforms. I am not in the mood to sit down and tweak such a case for you right now, but the effect is there.

PS: Re-reading this before posting, I sound stupid and offended, and I am sorry for that. I don't mean to offend anyone and never meant to. I just had a very bad experience on a neighboring forum and am trying not to be triggered by personal remarks.

Delphi source code:

    with AP^, Delta do
    begin
      v := Blue + B * Factor shr 4;

Please be aware that the values in Delta (R, G, B) may be negative.
Assume AP.Blue = 200, Delta.B = -10 and Factor = 7.
Then V will be evaluated as follows (the figures assume 32-bit integers):
V := Blue + ((B * Factor) shr 4);
V := 200 + ((-10 * 7) shr 4);
V := 200 + (-70 shr 4);
V := 200 + 268435451;
V := 268435651;
The correct calculation is
V := Blue + ((B * Factor) div 16);
V := 200 + ((-10 * 7) div 16);
V := 200 + (-70 div 16);
V := 200 + (-4);
V := 196;
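
The two calculations above can be reproduced with a small console program. This is a sketch only: `AddErrorShr` and `AddErrorDiv` are hypothetical names, and the operands are declared as 32-bit Integer to match the figures above. With 64-bit operands such as NativeInt on Win64, the wrapped shr result is a different and even larger positive number.

```pascal
program ShrVersusDiv;
{$APPTYPE CONSOLE}

// Hypothetical helper: adds the weighted error term using "shr 4".
// Wrong for negative products, because shr is a logical shift that
// fills the upper bits with zeros.
function AddErrorShr(Channel, Delta, Factor: Integer): Integer;
begin
  Result := Channel + (Delta * Factor) shr 4;
end;

// Hypothetical helper: the same step using "div 16", which truncates
// toward zero and therefore keeps the sign of the error term.
function AddErrorDiv(Channel, Delta, Factor: Integer): Integer;
begin
  Result := Channel + (Delta * Factor) div 16;
end;

begin
  // Negative error term: the two variants disagree.
  Writeln('shr 4 : ', AddErrorShr(200, -10, 7));  // 268435651
  Writeln('div 16: ', AddErrorDiv(200, -10, 7));  // 196

  // Positive error term: both variants agree.
  Writeln('shr 4 : ', AddErrorShr(200, 10, 7));   // 204
  Writeln('div 16: ', AddErrorDiv(200, 10, 7));   // 204
end.
```

Because the wrapped value is far above 255, a subsequent clamp to 0..255 turns it into 255 instead of 196, so the error is silent rather than a crash.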

Re: Floyd-Steinberg Dithering