Fastest Magic Move Bitboard Generator ready to use

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

Re: Fastest Magic Move Bitboard Generator ready to use

Postby bob » 09 Nov 2006, 18:27

Some data from Crafty...

I have been playing around with this stuff since you, Gerd, etc started discussing it. I have taken bits and pieces posted here, and slowly found the time to migrate this approach into Crafty. I completely removed the rotated bitboard stuff, including the updates to the rotated boards in make/unmake, to make this a "real magic implementation."

Here's the results:

log.001: time=1:48 mat=0 n=93585150 fh=90% nps=866K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

The second line is the magic version. Which appears to be about 1% faster overall, not very significant. I will add that I did not try any of the "smaller" implementations, this is the full-sized magic table approach with 64 x 512 entries for bishops and 64 x 4096 entries for rooks...

As I had thought, the extra size of the table seems to hurt my 32 bit version (running on my usual xeon). I then tested on our 64 bit xeon boxes to see what happens when the 64 bit multiply becomes less expensive. Here's the result there:

log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

So on the 64 bit processors, the advantage I thought might show up is not there at all and there appears to be no speed advantage whatsoever.

I have not tried the reduced-table approaches although I guess they would be easy enough to merge in. Does anyone have any data suggesting that the reduced-array versions performs any better? Otherwise the advantage over rotated bitboards seems to be pretty well zero...
User avatar
bob
 
Posts: 156
Joined: 10 May 2006, 17:59

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Zach Wegner » 09 Nov 2006, 18:50

Hello Bob,

Nice to see you are checking out the magic techniques! They are quite fun. I posted earlier in the thread some results for my move generator versus magic, repeated here:
Code: Select all
mine:
time1 in sec = 14.400 runs 7624512*64 ~ 2.951 ns
MINIMIZE_MAGIC:
time2 in sec = 14.190 runs 7624512*64 ~ 2.908 ns
PERFECT_HASH:
time1 in sec = 14.440 runs 7624512*64 ~ 2.959 ns
neither:
time2 in sec = 18.800 runs 7624512*64 ~ 3.853 ns

This is not quite a fair test, as it is a pure move generator loop with a "snoob" to determine the bitboards. But all results have shown pretty clearly that using the "pure" magic is slower than the compressed. Note that these tests were on a G4, which is 32 bit.

You might want to try out some of the anti-rotated approaches found at http://wbforum.vpittlik.org/viewtopic.php?t=4523. I have not tested them head-to-head with the magic approach, but by perft anti-rotated is faster than my old rotated.
User avatar
Zach Wegner
 
Posts: 182
Joined: 26 Sep 2004, 22:02
Location: Austin, Texas, USA

Re: Fastest Magic Move Bitboard Generator ready to use

Postby bob » 09 Nov 2006, 19:02

I thought the idea looked attractive, as it would eliminate some overhead in updating the rotated bitboards. Unfortunately it didn't eliminate enough to offset the cost... I assume your other "names" mean something I overlooked (minimize-magic, etc) which actually looked a bit better than (I guess) the full-sized table version??

I'm continuing to look, but at least I know my implementation is working since the node counts match exactly on every test, so the generation stuff is clean. I'll look at the anti-rotated stuff (I did not follow that thread at all, but now have some time to get caught back up I hope...)
User avatar
bob
 
Posts: 156
Joined: 10 May 2006, 17:59

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Gerd Isenberg » 09 Nov 2006, 19:33

bob wrote:Some data from Crafty...

I have been playing around with this stuff since you, Gerd, etc started discussing it. I have taken bits and pieces posted here, and slowly found the time to migrate this approach into Crafty. I completely removed the rotated bitboard stuff, including the updates to the rotated boards in make/unmake, to make this a "real magic implementation."

Here's the results:

log.001: time=1:48 mat=0 n=93585150 fh=90% nps=866K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

The second line is the magic version. Which appears to be about 1% faster overall, not very significant. I will add that I did not try any of the "smaller" implementations, this is the full-sized magic table approach with 64 x 512 entries for bishops and 64 x 4096 entries for rooks...

As I had thought, the extra size of the table seems to hurt my 32 bit version (running on my usual xeon). I then tested on our 64 bit xeon boxes to see what happens when the 64 bit multiply becomes less expensive. Here's the result there:

log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

So on the 64 bit processors, the advantage I thought might show up is not there at all and there appears to be no speed advantage whatsoever.

I have not tried the reduced-table approaches although I guess they would be easy enough to merge in. Does anyone have any data suggesting that the reduced-array versions performs any better? Otherwise the advantage over rotated bitboards seems to be pretty well zero...


The full sized table approach was already suspected not to be the fastest. Pradu's reduced MINIMIZE_MAGIC approach with Java like arrays with 800K for rooks and 41K for bishops seems favorable, see:
http://wbforum.vpittlik.org/viewtopic.p ... 991&t=5452

If you are in the mood, you may also try the 9,5K kindergarten approach as a reference, for instance to get straight attacks in conjunction with Pradu's Bishop attacks, so < 50K in total:
http://wbforum.vpittlik.org/viewtopic.p ... 400&t=4523
Gerd Isenberg
 
Posts: 285
Joined: 31 Jan 2005, 20:31
Location: Hattingen, Germany

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Pradu » 09 Nov 2006, 21:49

bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...


There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.
User avatar
Pradu
 
Posts: 343
Joined: 12 Jan 2005, 19:17
Location: Chandler, Arizona, USA

Re: Fastest Magic Move Bitboard Generator ready to use

Postby smcracraft » 10 Nov 2006, 06:18

Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...


There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.


How would I do that with this?

Code: Select all
  return (
          (GameBoard.p[side][pawn] & MoveArray[ptype[side^1]][tsq]) |
          (Knight[tsq] & GameBoard.p[side][knight]) |
          (Bmagic(tsq^63, GameBoard.Occupied) & (GameBoard.p[side][bishop]|GameBoard.p[side][queen])) |
          (Rmagic(tsq^63, GameBoard.Occupied) & (GameBoard.p[side][rook]|GameBoard.p[side][queen])) |
          (King[tsq] & GameBoard.p[side][king])
          );


This expression returns bits of all attackers on the board
for the given side.

I'd like to use your idea to obtain Pinned pieces
as well as X-ray for uses in other ways...

but have no idea how to pursue it.

Stuart
smcracraft
 
Posts: 65
Joined: 15 Jan 2006, 05:38

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Pradu » 10 Nov 2006, 07:58

smcracraft wrote:
Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...


There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.


How would I do that with this?

Stuart


Code: Select all
attacks=Rmoves(square,occupancy);
xrayattacks=Rmoves(square,(occupancy&~attacks)) & ~attacks


Code: Select all
attacks=Bmoves(square,occupancy);
batteryattacksB=Bmoves(square,(occupancy&~Queens[myside])) & ~attacks


Code: Select all
Finding diagonal pinners
   //Diagonal pins
   possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side];
   pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned)
      & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);
User avatar
Pradu
 
Posts: 343
Joined: 12 Jan 2005, 19:17
Location: Chandler, Arizona, USA

Re: Fastest Magic Move Bitboard Generator ready to use

Postby bob » 10 Nov 2006, 14:45

Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...


There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.


I'm not sure I follow. I've always done x-ray attacks using rotated bitboards with no effort. Ditto for pins, where I need that in GenerateCheckEvasions()... And even doing all that, I got a zero speed increase...
User avatar
bob
 
Posts: 156
Joined: 10 May 2006, 17:59

Re: Fastest Magic Move Bitboard Generator ready to use

Postby smcracraft » 10 Nov 2006, 15:46

Pradu wrote:
smcracraft wrote:
Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...


There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.


How would I do that with this?

Stuart


Code: Select all
attacks=Rmoves(square,occupancy);
xrayattacks=Rmoves(square,(occupancy&~attacks)) & ~attacks


Code: Select all
attacks=Bmoves(square,occupancy);
batteryattacksB=Bmoves(square,(occupancy&~Queens[myside])) & ~attacks


Code: Select all
Finding diagonal pinners
   //Diagonal pins
   possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side];
   pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned)
      & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);


Thanks much. This is one set of code I won't be seeking to
provide conditional compilation #ifdef's for with non-magic!

I will be spending some time this weekend to verify the
above does indeed find x-rays, diagonals, and batteries
and then thinking for quite a while on how best to use
them in the program.

Magic is *exactly* the kind of library-support I had hoped would
be provided some day when I read Frey/Slate/Atkin bemoan
the lack of tools for chess debugging.

Magic goes further than any tool could, by providing a
significant subset of the hardest part of the move generator
to do efficiently, and provide a method where simple table
lookup can be used across all pieces, not just non-sliders.

Stuart
smcracraft
 
Posts: 65
Joined: 15 Jan 2006, 05:38

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Pradu » 10 Nov 2006, 18:46

bob wrote:I'm not sure I follow. I've always done x-ray attacks using rotated bitboards with no effort. Ditto for pins, where I need that in GenerateCheckEvasions()... And even doing all that, I got a zero speed increase...


Well it is atleast as fast as rotated according to your tests. Can you post a quick profile of Crafty with magics here? It could be that generating move bitboards isn't done much timewise in Crafty. As for comparing versitality of rotated and magic approaches:

Let me guess that you have extra rotated xray-databases (RmovesXray[64][64], BmovesXray[64][64]) to do fast xrays? Even if one can do xrays with rotated, it isn't versitle. One can probably get away with it for checking pins but for anything complex it won't work because you can't xray selectively.

Some things you can do faster with magics are like finding pinners to your pieces (not opponents pieces as well).
Code: Select all
possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side];
pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned)
      & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);


Take the example of battery attacks which you can't do with rotated at all.
Code: Select all
//Battery attacks including regular attacks
batteryattacksB=Bmoves(square,occupancy&~diagonalSliders[myside])


Tell me how one would do battery with rotated bitboards without any expensive LastOne/FirstOne calls and without 6 additional rotated bitboards for bishops+queens and rooks+queens. (You'll need to add 3 extra rotated bitboards for every case to generate selective xray attacks).

Or how about multiple xrays which might be required in SEE?

The thing is, one can't avoid FirstOne/LastOne calls completely to do some basic things with rotated bitboards...unless you can show me otherwise.

In any case, anything you can do with rotated, you can do with magics -- and then some. :mrgreen:
User avatar
Pradu
 
Posts: 343
Joined: 12 Jan 2005, 19:17
Location: Chandler, Arizona, USA

Re: Fastest Magic Move Bitboard Generator ready to use

Postby bob » 10 Nov 2006, 21:10

Pradu wrote:
bob wrote:I'm not sure I follow. I've always done x-ray attacks using rotated bitboards with no effort. Ditto for pins, where I need that in GenerateCheckEvasions()... And even doing all that, I got a zero speed increase...


Well it is atleast as fast as rotated according to your tests. Can you post a quick profile of Crafty with magics here? It could be that generating move bitboards isn't done much timewise in Crafty. As for comparing versitality of rotated and magic approaches:

Let me guess that you have extra rotated xray-databases (RmovesXray[64][64], BmovesXray[64][64]) to do fast xrays? Even if one can do xrays with rotated, it isn't versitle. One can probably get away with it for checking pins but for anything complex it won't work because you can't xray selectively.



No, I just look at the diagonals to see if a sliding piece is beyond the piece in question (see swap.c and SwapXray() in it to see what I mean)...


Some things you can do faster with magics are like finding pinners to your pieces (not opponents pieces as well).
Code: Select all
possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side];
pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned)
      & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);


Take the example of battery attacks which you can't do with rotated at all.
Code: Select all
//Battery attacks including regular attacks
batteryattacksB=Bmoves(square,occupancy&~diagonalSliders[myside])


Tell me how one would do battery with rotated bitboards without any expensive LastOne/FirstOne calls and without 6 additional rotated bitboards for bishops+queens and rooks+queens. (You'll need to add 3 extra rotated bitboards for every case to generate selective xray attacks).



In Swap() for example, After I "use" a queen or bishop to capture on a diagonal, I just look the other direction from that queen/bishop to see if there is another one back there. Very cheap...




Or how about multiple xrays which might be required in SEE?



See above. Once I use a piece, I do a very quick (one line) check to see if there is another slider behind it. And once I use that one, I look again. I don't really need to find all battery pieces at once, although I suppose I might find a way to use that reasonably in the SEE code. However, SEE is not a major part of my total time anyway...


The thing is, one can't avoid FirstOne/LastOne calls completely to do some basic things with rotated bitboards...unless you can show me otherwise.


At least in my SEE stuff I don't need any MSB/LSB stuff to find batteries at all... I do use MSB/LSB to pick off each individual piece as it participates in the capture "party"...



In any case, anything you can do with rotated, you can do with magics -- and then some. :mrgreen:


There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...
User avatar
bob
 
Posts: 156
Joined: 10 May 2006, 17:59

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Zach Wegner » 11 Nov 2006, 03:31

bob wrote:At least in my SEE stuff I don't need any MSB/LSB stuff to find batteries at all... I do use MSB/LSB to pick off each individual piece as it participates in the capture "party"...
But the thing is, you don't need to do this, because you don't need to know the actual square of the attacker. You know the piece type, so you can just do an x ^= x & -x. Actually it is a little more complicated because you don't necessarily strip the LSB: y = x & piece_bb; x ^= y & -y. But this is still going to be faster than an LSB.

There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...
I agree, and this is one of the reasons I don't use magic. However anti-rotated is the best of both worlds, easily modified occupied sets and separated directions.

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...
That's rather strange. I've never seen results where the large tables are faster. Perhaps you might want to run the tests multiple times, as there is some variation, at least in the 32 bit (1:45 and 1:47 for non-magic).
User avatar
Zach Wegner
 
Posts: 182
Joined: 26 Sep 2004, 22:02
Location: Austin, Texas, USA

Re: Fastest Magic Move Bitboard Generator ready to use

Postby smcracraft » 11 Nov 2006, 03:32

Pradu wrote:
smcracraft wrote:
Pradu wrote:
bob wrote:Otherwise the advantage over rotated bitboards seems to be pretty well zero...


There is one advantage, even if it isn't in speed. Magic move bitboard generators are independant of the position and this allows you to do some really neat tricks such as creating xray-attacks and finding pins in all directions simultaneously.


How would I do that with this?

Stuart


Code: Select all
attacks=Rmoves(square,occupancy);
xrayattacks=Rmoves(square,(occupancy&~attacks)) & ~attacks


Code: Select all
attacks=Bmoves(square,occupancy);
batteryattacksB=Bmoves(square,(occupancy&~Queens[myside])) & ~attacks


Code: Select all
Finding diagonal pinners
   //Diagonal pins
   possiblePinned=Bmoves(pos->KingPos[side],pos->AllPieces)&pos->PiecesSide[side];
   pinners=Bmoves(pos->KingPos[side],pos->AllPieces^possiblePinned)
      & ((pos->Pieces[B]|pos->Pieces[Q])&pos->PiecesSide[!side]);


Hi,

I tried your code but am getting nowhere finding xrays, battery
and pins.

Please see:

Code: Select all
ShowXrayBatteryPins(void)
{
  int square, side;
  BitBoard attacks, xrayattacks,batteryattacksB,pinners,possiblePinned, x;

  for (side = white; side <= black; side++)

    {

      x = GameBoard.AllPieces[side];

      while (x)

        {

          square = NEXTBIT(x);

          CLEARBIT(x,square);

          if (square == c6 || square == c3 || square == f3 || square == f6)
            {
              printf("woops\n");
            }

          attacks=Rmagic(square^63,GameBoard.Occupied);
          xrayattacks=Rmagic(square^63,(GameBoard.Occupied&~attacks)) & ~attacks;

          attacks=Bmagic(square^63,GameBoard.Occupied);
          batteryattacksB=Bmagic(square^63,(GameBoard.Occupied&~GameBoard.p[side][queen])) & ~attacks;

          possiblePinned=Bmagic(GameBoard.kingloc[side],GameBoard.AllPieces[side]);
          pinners=Bmagic(GameBoard.kingloc[side],GameBoard.Occupied^possiblePinned)
            & ((GameBoard.p[side][bishop]|GameBoard.p[side][queen]|
                GameBoard.p[side^1][bishop]|GameBoard.p[side^1][queen])&GameBoard.AllPieces[side^1]);

            if (xrayattacks)
              {
                printf("xrayattacks:\n");
                ShowBitBoard(&xrayattacks);
              }

            if (batteryattacksB)
              {
                printf("batteryattacksB:\n");
                ShowBitBoard(&batteryattacksB);
              }

            if (pinners)
              {
                printf("pinners:\n");
                ShowBitBoard(&pinners);
              }
        }
    }
}
smcracraft
 
Posts: 65
Joined: 15 Jan 2006, 05:38

Re: Fastest Magic Move Bitboard Generator ready to use

Postby bob » 11 Nov 2006, 03:39

Zach Wegner wrote:
bob wrote:At least in my SEE stuff I don't need any MSB/LSB stuff to find batteries at all... I do use MSB/LSB to pick off each individual piece as it participates in the capture "party"...
But the thing is, you don't need to do this, because you don't need to know the actual square of the attacker. You know the piece type, so you can just do an x ^= x & -x. Actually it is a little more complicated because you don't necessarily strip the LSB: y = x & piece_bb; x ^= y & -y. But this is still going to be faster than an LSB.


Actually you do. Take the case of a bishop with a queen behind it and on another diagonal you have something like a pawn with a bishop behind it. As I use a piece in the Swap() calculation, I need to replace that piece with the piece that is behind it. Therefore I need to know _specifically_ which piece is in back. Knowing that there are lots of pieces in battery with other attackers is not important right now, I just need to know what is behind the piece I just used to make a capture...





There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...
I agree, and this is one of the reasons I don't use magic. However anti-rotated is the best of both worlds, easily modified occupied sets and separated directions.

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...
That's rather strange. I've never seen results where the large tables are faster. Perhaps you might want to run the tests multiple times, as there is some variation, at least in the 32 bit (1:45 and 1:47 for non-magic).


I did run it multiple times with multiple positions. It caught my eye too... Remember my test is not just move generation, it is a complete search, where those 64 bit multiplies might get pipelined OK, but the extra memory reference can be a killer since cache is getting romped on pretty hard by everything and an extra L2 miss can be expensive... And with random accessing into the tables, fetching 128 bytes at a chunk (my L2) that extra reference might hurt more than expected.
User avatar
bob
 
Posts: 156
Joined: 10 May 2006, 17:59

Re: Fastest Magic Move Bitboard Generator ready to use

Postby smcracraft » 11 Nov 2006, 04:24

Pradu,

Here's a rewrite that I think is more inline with your original
note, though it still doesn't find pinned pieces at c3, c6, f3, f6
after e4, e5, Nf3, Nf6, Nc3, Nc6, Bb5, Bb4, d3, d6, Bg5 Bg4.

Code: Select all
void
/*
 * Show x-ray, battery, and pins
 */
ShowXrayBatteryPins(void)
{
  int square, side;
  BitBoard attacks, xrayattacks,batteryattacksB,pinners,possiblePinned, x;

  for (side = white; side <= black; side++)
    {
      xrayattacks = batteryattacksB = pinners = NULLBITBOARD;
      x = GameBoard.p[side][bishop]|GameBoard.p[side][rook];
      while (x)
        {
          square = NEXTBIT(x);
          CLEARBIT(x,square);

          if (BitPosArray[square] & GameBoard.p[side][bishop])
            {
              attacks=Bmagic(square^63,GameBoard.Occupied);
              batteryattacksB=Bmagic(square^63,(GameBoard.Occupied&~GameBoard.p[side][queen])) & ~attacks;
            }

          else if (BitPosArray[square] & GameBoard.p[side][rook])
            {
              attacks=Rmagic(square^63,GameBoard.Occupied);
              xrayattacks=Rmagic(square^63,(GameBoard.Occupied&~attacks)) & ~attacks;
            }

          if (xrayattacks)
            {
              printf("xrayattacks:\n");
              ShowBitBoard(&xrayattacks);
            }

          if (batteryattacksB)
            {
              printf("batteryattacksB:\n");
              ShowBitBoard(&batteryattacksB);
            }
        }
      possiblePinned=Bmagic(GameBoard.kingloc[side]^63,GameBoard.AllPieces[side]);
      pinners=Bmagic(GameBoard.kingloc[side]^63,GameBoard.Occupied^possiblePinned)
        & ((GameBoard.p[side][bishop]|GameBoard.p[side][queen]|
            GameBoard.p[side^1][bishop]|GameBoard.p[side^1][queen])&GameBoard.AllPieces[side^1]);
      if (pinners)
        {
          printf("pinners:\n");
          ShowBitBoard(&pinners);
        }
    }
}
smcracraft
 
Posts: 65
Joined: 15 Jan 2006, 05:38

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Zach Wegner » 11 Nov 2006, 04:33

bob wrote:Actually you do. Take the case of a bishop with a queen behind it and on another diagonal you have something like a pawn with a bishop behind it. As I use a piece in the Swap() calculation, I need to replace that piece with the piece that is behind it. Therefore I need to know _specifically_ which piece is in back. Knowing that there are lots of pieces in battery with other attackers is not important right now, I just need to know what is behind the piece I just used to make a capture...
I don't follow. Looking at the Crafty Swap() function, you can replace square with a bitboard that is set to WhitePawns & attacks, for example, and set attacked_piece to pawn or whichever piece you find. Then later you can find the LSB of this bitboard and clear it from attacks. You keep a bitboard with these LSBs ored into it, to represent what pieces have been used. The SwapXray function then uses this bitboard to calculate new attackers: Bmagic(square, Occupied & ~used) & (BishopsQueens & ~used), and likewise for rook directions (not sure if these names are the same as in Crafty). You can pass in the piece type to find which directions need to be recalculated (pawns, bishops, queens=diagonals; rooks, queens=orthogonals).

Of course whether this is faster than using bitscans, I don't know. I currently don't use xrays and I've been meaning to get rid of SEE as we know it altogether (I will post details if and when I can work it out).
User avatar
Zach Wegner
 
Posts: 182
Joined: 26 Sep 2004, 22:02
Location: Austin, Texas, USA

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Pradu » 11 Nov 2006, 14:03

I did run it multiple times with multiple positions. It caught my eye too... Remember my test is not just move generation, it is a complete search, where those 64 bit multiplies might get pipelined OK, but the extra memory reference can be a killer since cache is getting romped on pretty hard by everything and an extra L2 miss can be expensive... And with random accessing into the tables, fetching 128 bytes at a chunk (my L2) that extra reference might hurt more than expected.


You are using MINIMIZE_MAGIC right? It only makes 1 L2 reference but the PERFECT_HASH thing uses 2 L2 references. I think most of us determined that PERFECT_HASH was a slower method, try MINIMIZE_MAGIC instead.
User avatar
Pradu
 
Posts: 343
Joined: 12 Jan 2005, 19:17
Location: Chandler, Arizona, USA

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Gerd Isenberg » 11 Nov 2006, 18:02

bob wrote:There are complications. Sometimes I only want a specific diagonal. With rotated I can look that up. With magic I have to compute both and then remove one, which is not so easy with diagonals. I also use rank attacks or file attacks for things like asking if two rooks are connected on the 7th... Again I have to and off 1/2 the data the magic stuff gives me...

Here's my latest testing going back and modifying things to use your smaller tables:

32bit:
log.001: time=1:45 mat=0 n=93585150 fh=90% nps=883K
log.002: time=1:47 mat=0 n=93585150 fh=90% nps=869K

so there magic is actually slower, probably those 64 bit multiples that are killers on a 32 bit machine...

64bit:
log.001: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M
log.002: time=1:19 mat=0 n=93585150 fh=90% nps=1.2M

No difference there at all. Looks like the 2mb version is actually faster in 32 bit mode and about equal on the 64 bit xeons we have. I'll try to run some opteron tests later...


May be due to L2/L3 size the intel chips are faster with huge arrays, saving two additional L1 accesses for the variable shift and the pointer to the square-array. What imul performance does those 64-bit xeons have? Guess a lot more than the 4 K8 cycles. Let's see how crafty will perform on an amd-box. For 32-bit mode - 64bit imul is a three 32-bit imul call, i already found your 32-bit result stunning.

For distinct rank-, file- and diagonal-attacks you should really consider the 9.5K approach. I like to call it kindergarten bitboards now, to distinguish it from Lasse's or Pradu's magic bitboards. The multiplication of a file-pattern with a diagonal-, antidiagonal- or rank-pattern, with bits a-h are either zero or one, is rather easy to understand - even for preschool children i guess ;-)
Code: Select all
+-               -+   +-               -+   +-               -+
| h 0 0 0 0 0 0 0 |   | 0 0 0 0 0 0 0 0 |   | 0 0 b c d e f g |
| g 0 0 0 0 0 0 0 |   | 0 0 1 0 0 0 0 0 |   | 0 0 a b c d e f |
| f 0 0 0 0 0 0 0 |   | 0 0 0 1 0 0 0 0 |   | 0 0 0 a b c d e |
| e 0 0 0 0 0 0 0 | * | 0 0 0 0 1 0 0 0 | = | 0 0 0 0 a b c d |
| d 0 0 0 0 0 0 0 |   | 0 0 0 0 0 1 0 0 |   | 0 0 0 0 0 a b c |
| c 0 0 0 0 0 0 0 |   | 0 0 0 0 0 0 1 0 |   | 0 0 0 0 0 0 a b |
| b 0 0 0 0 0 0 0 |   | 0 0 0 0 0 0 0 1 |   | 0 0 0 0 0 0 0 a |
| a 0 0 0 0 0 0 0 |   | 0 0 0 0 0 0 0 0 |   | 0 0 0 0 0 0 0 0 |
+-               -+   +-               -+   +-               -+

+-               -+   +-               -+   +-               -+
| 0 1 0 0 0 0 0 0 |   | 0 0 0 0 0 0 0 0 |   | 0 0 b c d e f g |
| 0 1 0 0 0 0 0 0 |   | 0 b 0 0 0 0 0 0 |   | 0 0 b c d e f g |
| 0 1 0 0 0 0 0 0 |   | 0 0 c 0 0 0 0 0 |   | 0 0 0 c d e f g |
| 0 1 0 0 0 0 0 0 | * | 0 0 0 d 0 0 0 0 | = | 0 0 0 0 d e f g |
| 0 1 0 0 0 0 0 0 |   | 0 0 0 0 e 0 0 0 |   | 0 0 0 0 0 e f g |
| 0 1 0 0 0 0 0 0 |   | 0 0 0 0 0 f 0 0 |   | 0 0 0 0 0 0 f g |
| 0 1 0 0 0 0 0 0 |   | 0 0 0 0 0 0 g 0 |   | 0 0 0 0 0 0 0 g |
| 0 1 0 0 0 0 0 0 |   | 0 0 0 0 0 0 0 0 |   | 0 0 0 0 0 0 0 0 |
+-               -+   +-               -+   +-               -+

Gerd Isenberg
 
Posts: 285
Joined: 31 Jan 2005, 20:31
Location: Hattingen, Germany

Re: Fastest Magic Move Bitboard Generator ready to use

Postby bob » 11 Nov 2006, 22:46

Pradu wrote:
I did run it multiple times with multiple positions. It caught my eye too... Remember my test is not just move generation, it is a complete search, where those 64 bit multiplies might get pipelined OK, but the extra memory reference can be a killer since cache is getting romped on pretty hard by everything and an extra L2 miss can be expensive... And with random accessing into the tables, fetching 128 bytes at a chunk (my L2) that extra reference might hurt more than expected.


You are using MINIMIZE_MAGIC right? It only makes 1 L2 reference but the PERFECT_HASH thing uses 2 L2 references. I think most of us determined that PERFECT_HASH was a slower method, try MINIMIZE_MAGIC instead.


I am not actually using either. I started playing with this by using code posted here... The "minimize" version is what I am testing, not what you later released as "perfect".... Here is my AttacksRook() macro that I am testing with. I see four memory accesses there, the indices array, the mask array, the magic array and the shift array. Any or all of those could be out of L1/L2...

#define AttacksRook(square) *(magic_rook_indices[square]+((((tree->pos.w_occupied|tree->pos.b_occupied)&magic_rook_mask[square])*magic_rook[square])>>magic_rook_shift[square]))


Is that the version you are thinking about???
User avatar
bob
 
Posts: 156
Joined: 10 May 2006, 17:59

Re: Fastest Magic Move Bitboard Generator ready to use

Postby Pradu » 12 Nov 2006, 10:15

bob wrote:#define AttacksRook(square) *(magic_rook_indices[square]+((((tree->pos.w_occupied|tree->pos.b_occupied)&magic_rook_mask[square])*magic_rook[square])>>magic_rook_shift[square]))
Is that the version you are thinking about???


Yes this is the correct one.
User avatar
Pradu
 
Posts: 343
Joined: 12 Jan 2005, 19:17
Location: Chandler, Arizona, USA

PreviousNext

Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 23 guests