I found this interesting:
Quote:
Note: LUT (lookup table) timings are probably rather optimistic here. Due to running in tight loop the whole table was sucked into L1 CPU cache. In real computations this function most probably would be called much less frequently and L1 cache would not keep the table entirely.
In other words, even if the code using the LUT is faster on its own, it can still make other code slower, because the LUT evicts other data from the cache.
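To get a rough feel for how much cache a table can take away from the rest of the program, here is a small back-of-the-envelope sketch. The 64-byte cache line and the 32 KiB L1 data cache are assumptions on my part (common values, but check your own CPU), and the function names are made up for illustration. It only does arithmetic on the table size and does not measure anything.

```python
# Assumed hardware parameters; these are NOT from the quoted benchmark.
CACHE_LINE_BYTES = 64      # assumed cache line size
L1D_BYTES = 32 * 1024      # assumed L1 data cache size (32 KiB)


def lut_cache_lines(entries, bytes_per_entry, line_bytes=CACHE_LINE_BYTES):
    """Cache lines occupied by a contiguous, line-aligned table (rounded up)."""
    total_bytes = entries * bytes_per_entry
    return -(-total_bytes // line_bytes)  # integer ceiling division


def l1_fraction(entries, bytes_per_entry, l1_bytes=L1D_BYTES):
    """Table size relative to the assumed L1 data cache (1.0 = fills it)."""
    return entries * bytes_per_entry / l1_bytes


if __name__ == "__main__":
    # Hypothetical table shapes: 256 and 65536 entries of 4 bytes each.
    for entries in (256, 65536):
        print(entries, lut_cache_lines(entries, 4), l1_fraction(entries, 4))
```

Under these assumptions a 256-entry table of 4-byte values takes 16 cache lines, about 3% of L1, so it mostly costs a few evictions. A 65536-entry table is 8 times larger than L1 and cannot stay resident at all, so every pass over it pushes out whatever the surrounding code had cached.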