Moderator: Andres Valverde
Tom Likens wrote:Hello Everyone,
Tord's question on bitboards made me decide to ask a question of my own. I'm wondering how compact I can get the rotated bitboard arrays for the ranks, files and diagonals. I already use the trick of masking off the end bits, so that the arrays decrease in size from
4 x [64][256] x 8 = 512k
to
4 x [64][64] x 8 = 128k
A nice size reduction of 3/4, at the cost of additional runtime calculations. Changing to this scheme gave me a nice speed up, although off-hand I don't remember how much, it was measurable. I haven't implemented, but am considering taking advantage of reducing the diagonals, since they don't need an index of 6 bits, except for the one long diagonal. In fact, if I've done my math right I'm estimating that I can reduce the combined memory requirements for both diagonals down to about 20k, much smaller than their current 64k requirement.
Anyway, this is similar to what Crafty does with its compact attacks stuff. But I don't really want to duplicate Bob's work. Instead I'd like to do something semi-original. I'm wondering if anyone else would be willing to share some of what they are doing to reduce these arrays, (or at least engage in a conversation on the subject)?
unsigned char OneDimensionalHorizontalAttacks[8][256];
unsigned char AttackIndex[8][256]
bitboard_t HorizontalAttacksArray[70][8];
#define HorizontalAttacks(sq) \
HorizontalAttacksArray[AttackIndex[file(sq)][(R00bb>>(rank(sq)/8))&0xFF]][rank(sq)]
bitboard_t HorizontalAttacksArray[70][8];
bitboard_t VerticalAttacksArray[70][8];
bitboard_t DiagonalAttacksArray[70][16];
bitboard_t AntiDiagonalAttacksArray[70][16];
Tom Likens wrote:Hello Everyone,
Tord's question on bitboards made me decide to ask a question of my own. I'm wondering how compact I can get the rotated bitboard arrays for the ranks, files and diagonals. I already use the trick of masking off the end bits, so that the arrays decrease in size from
4 x [64][256] x 8 = 512k
to
4 x [64][64] x 8 = 128k
A nice size reduction of 3/4, at the cost of additional runtime calculations. Changing to this scheme gave me a nice speed up, although off-hand I don't remember how much, it was measurable. I haven't implemented, but am considering taking advantage of reducing the diagonals, since they don't need an index of 6 bits, except for the one long diagonal. In fact, if I've done my math right I'm estimating that I can reduce the combined memory requirements for both diagonals down to about 20k, much smaller than their current 64k requirement.
Anyway, this is similar to what Crafty does with its compact attacks stuff. But I don't really want to duplicate Bob's work. Instead I'd like to do something semi-original. I'm wondering if anyone else would be willing to share some of what they are doing to reduce these arrays, (or at least engage in a conversation on the subject)?
regards,
--tom
Zach Wegner wrote:Hello Tord,
Wouldn't it be a bit faster if you changed AttackIndex to a bitboard_t*, and just indexed it by [file][rank_state][rank]?
And BTW, changing the 70 index to a power of two won't help in this code, because it's the leftmost dimension.
Tony van Roon-Werten wrote:I've got this idea from Gerd, I don't think he minds sharing it.
It's about sliders, the nonsliders are obvious.
For every square on the board, there are 4 directions. For each combination of square<->direction generate a mask and a magic number so that (for this square<->direction combination) :
1) the mask gives all reachable squares on an empty board ( minus the edges ie the 64 trick)
2) the magic number generates an unique index in the range [0..63] ( use deBruijn for this) for every possible outcome of 1)
this costs 64*4*2*8= 4KB
generate a lookuptable for [square][direction][index] wich contains the squares the piece can move to.
--------
------#-
-----#--
----#---
--------
--#-----
-#------
--------
--------
------#-
-----#--
--------
--------
--------
-#------
--------
Tord Romstad wrote:Tony van Roon-Werten wrote:I've got this idea from Gerd, I don't think he minds sharing it.
It's about sliders, the nonsliders are obvious.
For every square on the board, there are 4 directions. For each combination of square<->direction generate a mask and a magic number so that (for this square<->direction combination) :
1) the mask gives all reachable squares on an empty board ( minus the edges ie the 64 trick)
2) the magic number generates an unique index in the range [0..63] ( use deBruijn for this) for every possible outcome of 1)
this costs 64*4*2*8= 4KB
generate a lookuptable for [square][direction][index] wich contains the squares the piece can move to.
Hi Tony,
Evidently I am even slower than usual today. I didn't understand a word of Zach's suggestion, and I also struggle to understand the idea you describe. Let me see if I understand you correctly.
Assume that we want to compute a bitboard of all attacks in the a1-h8 direction for a bishop on d4. We begin by looking up the mask of all reachable squares on an empty board, minus the edge squares. This mask will look like this:
- Code: Select all
--------
------#-
-----#--
----#---
--------
--#-----
-#------
--------
We 'and' this mask with the bitboard of all occupied squares, and obtain a bitboard which could (for instance) look like this:
- Code: Select all
--------
------#-
-----#--
--------
--------
--------
-#------
--------
This bitboard is multiplied by some magic number (and perhaps shifted to the right?), and the result is a number in the range 0-63. This number, together with the square index and the direction, are used to look up a attack bitboard from a 128 kB array.
Assuming that my explanation above is correct, I still don't understand how to find the 'magic numbers', and your de Bruijn hint doesn't really help me. I understand how to use binary de Bruijn sequences to hash a single-1 bitboard to an integer in the range 0-63, but I don't see how they can be used to create perfect hash functions for bitboards with more than one nonzero bit.
Tord
bool FoundMagic(unsigned __int64 Mask,unsigned __int64 magicNumber)
{
unsigned __int64 subMask=0,check=0;
int index;
do
{
subMask = (subMask-Mask) & Mask;
index=(subMask *magicNumber)>>(64-6);
if (check & ((unsigned __int64) 1)<<index) return(false);
check|=((unsigned __int64) 1)<<index;
} while (subMask);
return(true);
}
Zach Wegner wrote:
And BTW, changing the 70 index to a power of two won't help in this code, because it's the leftmost dimension.
Zach
Tony van Roon-Werten wrote:It's probably possible to reduce the index range even further, but I haven't tested that yet. I was to happy to get this working.
Tony van Roon-Werten wrote:I've got this idea from Gerd, I don't think he minds sharing it.
It's about sliders, the nonsliders are obvious.
For every square on the board, there are 4 directions. For each combination of square<->direction generate a mask and a magic number so that (for this square<->direction combination) :
1) the mask gives all reachable squares on an empty board ( minus the edges ie the 64 trick)
2) the magic number generates an unique index in the range [0..63] ( use deBruijn for this) for every possible outcome of 1)
this costs 64*4*2*8= 4KB
generate a lookuptable for [square][direction][index] wich contains the squares the piece can move to.
this costs 64*64*4*8=128 KB
so together 132 KB.
Why is this nice ? You don't need rotated bitboards
Cheers,
Tony
Tom Likens wrote:Tony van Roon-Werten wrote:It's probably possible to reduce the index range even further, but I haven't tested that yet. I was to happy to get this working.
Hello Tony,
I hadn't realized that XiniX was bitboard based, for some reason I assumed it was an 0x88 / mailbox hybrid. What kind of a speed did this produce? The idea of getting rid of the rotated bitboards is attractive. It would be especially nice to get rid of the overhead of having to update the rotated maps at every make / unmake call.
BTW, do you have a decent link for information on deBruijn numbers and theory?
regards,
--tom
mathmoi wrote:Tony van Roon-Werten wrote:I've got this idea from Gerd, I don't think he minds sharing it.
It's about sliders, the nonsliders are obvious.
For every square on the board, there are 4 directions. For each combination of square<->direction generate a mask and a magic number so that (for this square<->direction combination) :
1) the mask gives all reachable squares on an empty board ( minus the edges ie the 64 trick)
2) the magic number generates an unique index in the range [0..63] ( use deBruijn for this) for every possible outcome of 1)
this costs 64*4*2*8= 4KB
generate a lookuptable for [square][direction][index] wich contains the squares the piece can move to.
this costs 64*64*4*8=128 KB
so together 132 KB.
Why is this nice ? You don't need rotated bitboards
Cheers,
Tony
Hi Tony,
This technique seems to be a simple substitute to the rotated bitmap technique.
However, the multiplication and shift used to compute the index can be an overhead compared to the classic bitboard technique.
Do you have any data about the efficiency of this technique compared to a classic one?
My bet is that it is significantly slower, but I can be wrong. Actualy I hope I am.
Gerd Isenberg wrote:This rotated index substitute requires a fast 64*64 = 64-bit multiplication.
AMD64 64-bit mode has 4 cycles direct path mul - and one cycle shift.
Guess future intels will be competetive as well.
Gerd
u32 bitScanFwd(BB b) {
b ^= (b - 1);
unsigned int fold = ((int) b) ^ ((int)(b>>32));
return lsz64_tbl[(fold * 0x78291ACF) >> (32-6)];
}
Tord Romstad wrote:Tony van Roon-Werten wrote:I've got this idea from Gerd, I don't think he minds sharing it.
It's about sliders, the nonsliders are obvious.
For every square on the board, there are 4 directions. For each combination of square<->direction generate a mask and a magic number so that (for this square<->direction combination) :
1) the mask gives all reachable squares on an empty board ( minus the edges ie the 64 trick)
2) the magic number generates an unique index in the range [0..63] ( use deBruijn for this) for every possible outcome of 1)
this costs 64*4*2*8= 4KB
generate a lookuptable for [square][direction][index] wich contains the squares the piece can move to.
Hi Tony,
Evidently I am even slower than usual today. I didn't understand a word of Zach's suggestion, and I also struggle to understand the idea you describe. Let me see if I understand you correctly.
Assume that we want to compute a bitboard of all attacks in the a1-h8 direction for a bishop on d4. We begin by looking up the mask of all reachable squares on an empty board, minus the edge squares. This mask will look like this:
- Code: Select all
--------
------#-
-----#--
----#---
--------
--#-----
-#------
--------
We 'and' this mask with the bitboard of all occupied squares, and obtain a bitboard which could (for instance) look like this:
- Code: Select all
--------
------#-
-----#--
--------
--------
--------
-#------
--------
This bitboard is multiplied by some magic number (and perhaps shifted to the right?), and the result is a number in the range 0-63. This number, together with the square index and the direction, are used to look up a attack bitboard from a 128 kB array.
Assuming that my explanation above is correct, I still don't understand how to find the 'magic numbers', and your de Bruijn hint doesn't really help me. I understand how to use binary de Bruijn sequences to hash a single-1 bitboard to an integer in the range 0-63, but I don't see how they can be used to create perfect hash functions for bitboards with more than one nonzero bit.
Tord
Return to Programming and Technical Discussions
Users browsing this forum: No registered users and 5 guests