Lieber Roland
Ja genausowas in der Art.
So schlau war ich auch schon. Aber der Algorhitmus bildet alle Permutationen und prüft auf Quersumme 4.
Das ist leider sehr ineffizient.
Mich interessiert das ganze konkret für N = 256.
Und eben effizient...
Danke und GrRuss
Reto (Möbius)
Hmmmmm.....
...... hmmmmmmmmmm again
Are you trying to attack AES (or something similar in size) at 256Bit ?!!
You already explained in later post what are you trying to do, but... at 256bit it is near impossible to any consumer PC to do this and, simply put it is too much !
To explain on this, no matter how you do it with integers and their arrays, it will be inefficient because it will grow exponentially, this matrix will be very huge no matter what type of integers you are handling ( byte, word, integer...)
So the cheapest and fastest way is to switch to minimum element, and use bits, where there indexes will represent their value, example , lets represent your example with 5 balls
Code:
balls Vector as bits
XX345 (0,0,3) 00111
X2X45 (0,1,2) 01011
X23X5 (0,2,1) 01101
X234X (0,3,0) 01110
1XX45 (1,0,2) 10011
1X3X5 (1,1,1) 10101
1X34X (1,2,0) 10110
12XX5 (2,0,1) 11001
12X4X (2,1,0) 11010
123XX (3,0,0) 11100
What do we see here:
1) With bits, we instead of building all the permutations, we simply can use an integer value going from 0 to 2^5 (where 5 bits is needed).
2) simple loop with an integer (5 bits integer , it could be a byte or an integer) will let us skip the permutation building in full.
3) In the above example notice that 3 is the sum of all 1 bits !, and there is specific assembly instruction on modern CPUs that can return this sum POPCNT
https://www.felixcloutier.com/x86/popcnt
4) I might be mistaken though about what are you really need, is it the separation of the bits by some count ?, like 3 small vectors, in your example only one is there "1X3X5 (1,1,1) 10101" this will complicate the thing but still way easier than handling integers and matrices.
In any case my suggestion is simply to use increased integer (or Int64) or even some BigInt, this at least will save you speed and space.
Auxiliary assembly instructions that might help here:
LZCNT
https://www.felixcloutier.com/x86/lzcnt
TZCNT
https://www.felixcloutier.com/x86/tzcnt
and of course there is SIMD version for only two them, there is no trailing zero counting SIMD instruction
https://www.felixcloutier.com/x86/vpopcnt
https://www.felixcloutier.com/x86/vplzcntd:vplzcntq
Anyway, this still strongly smell like deferential cryptoanalysis, i remember read many of very close proposed statistically approaches/attacks, yet none of these papers had an implemented algorithm in computer compliable language, it is always mathematical theories and formulas, so will not waste time searching for them.
This approach by generating the permutation by simply increase an integer, is .. well ... firstly i like it.. secondly it will be very fast, i don't think there is faster than this, yet increasing an integer to large values above 32bit (we are here about only 32 balls) it is a big deal but doable on our consumer PCs, for 64bit it will be way slower, going up from there is border sanity to run and wait, so you might divide you problem into smaller and repeatable periodic parts, this might help, but too long for me to go on about it here for more than few lines, and yes i need to wrap my head around it first,... if only the bit counts is needed then, each 5 bits result will be repeated, like directly deduce the result of concatenation the result of two 5bits into one 10bit, or slicing the 10bits into two 5bits, of course if you need larger numbers then use numbers makes sense something is equal to 2^k, like 8 bit or 4 bit, now these 8bit or 4bit can also in a higher layer represented as bits, hence 32bit will be come 4bit representing a matrix of 8bit, the usage for the this higher layer, will be used for elimination only, but again i am suggesting elimination here because it is first what comes to my mind at this moment... this might need more than scratching head and beard.
I wrote too much and hope this still clear, if your needed algorithm only needed the count then it is simple assembly instruction (or few for longer), but if you need the smaller vectors, then it will be harder but doable, by divide and conquer.
About count the bits, i can't find my Hackers Delights book, i didn't see it for few years now, so might lost it or lend it to someone and forgot, but may someone here or even you might it, and there is nice algorithms as i remembers you can use without using assembly, slower though, and some of them might be very slow do the stupidity of Delphi compiler.
Hope i didn't mess the subject by more than few kilometers, and hope it help you with ideas.
And of course good luck.