Speedup by bitboards

Programming Topics (Computer Chess) and technical aspects as test techniques, book building, program tuning etc

Moderator: Andres Valverde

Speedup by bitboards

Postby Onno Garms » 13 Jul 2007, 02:25

Excuse me for introducing a statement from the Strelka discussion here.

Vas stated:

a) 'Osipov' claims that he changed the Fruit board representation from mailbox to bitboard and got a 2x speedup in performance. This is simply a clueless comment, there would be no speedup of anywhere near this magnitude.


Has anybody implented the same search code with bitboard and mailbox? Maybe somebody who switched in his engine from mailbox to bitboard? How large is the speedup with bitboards? This question is of interest independent of the clone debate.

Some time ago I did some comparison on the move generation only. (As I don't have the Fruit search algorithms over my bitboard move generator like the Strelka author claims to have.)

Normal pseudo-legal move generation is almost exactly the same speed in my bitboard code and in Fruit (not so surprising in my opinion). Tactical pseudo-legal move generation is only about 20% faster in my bitboard code. Is this normal or is something wrong with my bitboard implementation?

Checking pseudo-legal moves for legality is really expensive in Fruit, takes about 3x as much time as in my code (for move generation and legality checking together). However, Fruit works around this problem by delayed legality checking.

Evaluation of things like piece activity can also be done faster with bitboards. On the other hand we have more cache trashing in a bitboard engine. So what will be the overall result? Of course the answer depends on the search algorithms that are used. Nevertheless it would be nice to hear a number.
Last edited by Onno Garms on 20 Jul 2007, 20:30, edited 2 times in total.
User avatar
Onno Garms
 
Posts: 128
Joined: 17 Feb 2007, 11:17
Location: Bonn, Germany

Re: Speedup by bitboards

Postby Pradu » 13 Jul 2007, 06:15

Onno Garms wrote:Excuse me for introducing a statement from the Strelka discussion here.

Vas stated:

a) 'Osipov' claims that he changed the Fruit board representation from mailbox to bitboard and got a 2x speedup in performance. This is simply a clueless comment, there would be no speedup of anywhere near this magnitude.


Has anybody implented the same search code with bitboard and mailbox? Maybe somebody who switched in his engine from mailbox to bitboard? How large is the speedup with bitboards? This question is of interest independent of the clone debate.

Some time ago I did some comparison on the move generation only. (As I don't have the Fruit search algorithms over my bitboard move generator like the Strelka author claims to have.)

Normal pseudo-legal move generation is almost exactly the same speed in my bitboard code and in Fruit (not so surprising in my opinion). Tactical pseudo-legal move generation is only about 20% faster in my bitboard code. Is this normal or is something wrong with my bitboard implementation?

Checking pseudo-legal moves for legality is really expensive in Fruit, takes about 3x as much time as in my code (for move generation and legality checking together). However, Fruit works around this problem by delayed legality checking.

Evaluation of things like piece activity can also be done faster with bitboards. On the other hand we have more cache trashing in a bitboard engine. So what will be the overall result? Of course the answer depends on the search algorithms that are used. Nevertheless it would be nice to hear a number.
Right. I think 2x speedup is entirely possible if you have bad arrayboard code where you're scanning the board a lot. I used to scan the board a lot in my first 0x88 program. I think writing fast code comes naturally if you are using bitboards because you pretty much use sub-routines, like attack-getters, flood fills, or popcount to do everything and all the sub-routines are optimized like crazy by the chess community already. But I don't think the difference would be big if you had good arrayboard code as well. I think just straight out pseudo-legal move generation would be a bit faster in arrayboards and also the makmove. HG's perft would be a good example. Eval can be faster or slower with bitboards or arrayboards depending on what you want to do I guess.
User avatar
Pradu
 
Posts: 343
Joined: 12 Jan 2005, 19:17
Location: Chandler, Arizona, USA

Re: Speedup by bitboards

Postby bob » 13 Jul 2007, 06:25

the main advantage for bitboards comes from 64 bit architectures like core-2 and opterons... It becomes a matter of "data density". Traditional programs make little use of the extra 32 bits the new architectures offer, where bitboarders use 64 bit values where every bit has meaning (white pawns for example, 1's = white pawns, 0 = no white pawns. Since 64 bit processors move 64 bit chunks of data around, taking advantage of that is a win. Moving a typical mailbox to a 64 bit machine gains nothing except for the extra 8 registers available (the bitboard program also gets this advantage so that's a wash).

However, that said, not everything we do is 64 bits. maintaining alpha and beta, counters, etc, are no better in 64 bits than in 32. So the actual speedup is less than 2x in general, unless someone was doing a poor comparison choosing a bad mailbox implementation to compare against a good bitboard implementation.
User avatar
bob
 
Posts: 156
Joined: 10 May 2006, 17:59

Re: Speedup by bitboards

Postby Tord Romstad » 13 Jul 2007, 09:01

Onno Garms wrote:Has anybody implented the same search code with bitboard and mailbox?

Yes. For the first few months of its life, Glaurung 2 was able to use both bitboard and mailbox, I could switch between the two board representations (and a hybrid representation) with only minor changes to the source code. The mailbox version was about 20-30% faster on 32-bit CPUs, on 64-bit CPUs this difference would probably vanish (but I don't know, as I still don't have any computers with 64-bit CPUs).

How large is the speedup with bitboards?

There is no single answer to this question. It depends a lot on how the program works, and on what hardware is used.

Most chess programmers, especially the inexperienced ones, tend to overestimate the importance of the board representation. In the long run, it really doesn't matter much: All the common board representations are good enough, and the choice of which one to use is largely a matter of taste.

Tord
User avatar
Tord Romstad
 
Posts: 639
Joined: 09 Oct 2004, 12:49
Location: Oslo, Norway

Re: Speedup by bitboards

Postby Onno Garms » 13 Jul 2007, 15:57

I forgot to say that my comparison is for 64-bit CPUs, i.e. a (useless) 64 bit compile of fruit and a 64-bit compile of my engine. I did not yet bother to support 32 bit efficiently, so my engine is much slower on a 32-bit CPU (and currently even does not compile under 32-bit Linux for lack of a gcc 32-bit least-significant-bit implementation)
User avatar
Onno Garms
 
Posts: 128
Joined: 17 Feb 2007, 11:17
Location: Bonn, Germany

Re: Speedup by bitboards

Postby Michael Sherwin » 14 Jul 2007, 15:36

The 2x speedup was only in relation to the original fruit. Fabien himself admits that Fruit is slow. Fruit for example scans the board several times to generate all the moves. Rewriting Fruit to get a 2x speedup is certainly possible. Some of the things that Fruit does in the eval especially lends itself to optimization by bitboards. I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.
User avatar
Michael Sherwin
 
Posts: 168
Joined: 21 Jul 2005, 06:10
Location: USA

Re: Speedup by bitboards

Postby Onno Garms » 14 Jul 2007, 15:59

Michael Sherwin wrote:The 2x speedup was only in relation to the original fruit. Fabien himself admits that Fruit is slow. Fruit for example scans the board several times to generate all the moves.


Fruit scans the board twice, first for captures and than for quiet moves. Fruit does have a function to generate the moves in one pass, but it is not used in most cases. I did not test, but I thought that Fabien did, and scanning twice was more efficient than generating quiet moves that later turn out to be pruned away by a capture.

Rewriting Fruit to get a 2x speedup is certainly possible. Some of the things that Fruit does in the eval especially lends itself to optimization by bitboards.


Evaluating lazily like Toga might solve the problem with less efford.

I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.


Thank you for your opinion. I am very uncertain about the possible speedup and don't think I can make an adequate guess myself.
User avatar
Onno Garms
 
Posts: 128
Joined: 17 Feb 2007, 11:17
Location: Bonn, Germany

Re: Speedup by bitboards

Postby Onno Garms » 19 Jul 2007, 21:54

Michael Sherwin wrote:I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.


I thought over that and tend not to believe that.

Node per second on 64 bit Crafty is about 2x that of Fruit. And Crafty is fast; in deed it has the highest nps among all Engines I have tried so far. And Crafty's code is said to be so full of speed tricks that it is very difficult to understand. Fruit, on the other hand, is well structured. Exchanging the board representation would not change that.

I've been told that there are differences in counting nodes between Engines, but I don't see a way to count more nodes than in Toga.

If a strong engine with 3x Fruit's nps were possible, it would be surprising that no existing engine has more than 2x this nps.

Which are the engines with highest nps? Are there faster engines than Crafty?
User avatar
Onno Garms
 
Posts: 128
Joined: 17 Feb 2007, 11:17
Location: Bonn, Germany

Re: Speedup by bitboards

Postby Dann Corbit » 20 Jul 2007, 02:08

Onno Garms wrote:
Michael Sherwin wrote:I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.


I thought over that and tend not to believe that.

Node per second on 64 bit Crafty is about 2x that of Fruit. And Crafty is fast; in deed it has the highest nps among all Engines I have tried so far. And Crafty's code is said to be so full of speed tricks that it is very difficult to understand. Fruit, on the other hand, is well structured. Exchanging the board representation would not change that.

I've been told that there are differences in counting nodes between Engines, but I don't see a way to count more nodes than in Toga.

If a strong engine with 3x Fruit's nps were possible, it would be surprising that no existing engine has more than 2x this nps.

Which are the engines with highest nps? Are there faster engines than Crafty?


There are bitboard engines faster than crafty in NPS. To increase NPS, the best things to do are to use a super-simple eval, don't hash at all (or count hash nodes as search nodes), don't do a qsearch, and examine a wide, bushy tree.

Of course, these things make the engine play much weaker.

The current version of Glaurung is a bitboard engine.

Here are some figures for a few bitboard engines:
Code: Select all
Analysis from C:\epd\orangutan.epd   
7/19/2007 5:30:23 PM Level: 30 Seconds
Analyzing engine: Crafty 20.14 BH

1) -                   
    Avoid move:
    Best move (Crafty 20.14 BH): e7-e5
    Not found in: 00:30
      9   00:00        208.955   949.795   -0.42   e7e6 Bc1b2 Ng8f6 a2a3 d7d5 e2e3 Bc8d7 Bf1d3 Bf8d6
      9   00:00        249.531   998.124   -0.42   e7e6 Bc1b2 Ng8f6 a2a3 d7d5 e2e3 Bc8d7 Bf1d3 Bf8d6
     10   00:00        500.196   1.064.246   -0.41   e7e6 b4b5 Ng8f6 e2e3 Bf8e7 Bc1b2 00 Bf1d3 d7d6 Ng1f3
     10   00:00        775.899   1.108.427   -0.43   Ng8f6 Bc1b2 e7e6 b4b5 d7d5 e2e3 Bf8d6 Bf1d3 00
     10   00:00        868.082   1.098.837   -0.43   Ng8f6 Bc1b2 e7e6 b4b5 d7d5 e2e3 Bf8d6 Bf1d3 00
     11   00:01      1.614.441   1.076.294   -0.59   Ng8f6 d2d4 e7e6 Bc1d2 Nb8c6 c2c3 Bf8d6 e2e3 00 Bf1d3 e6e5
     11   00:01      1.742.293   1.075.489   -0.59   Ng8f6 d2d4 e7e6 Bc1d2 Nb8c6 c2c3 Bf8d6 e2e3 00 Bf1d3 e6e5
     12   00:02      2.643.385   1.106.018   -0.39   Ng8f6 d2d4 e7e6 Bc1d2 Nb8c6 c2c3 Bf8d6 e2e3 00 b4b5 Nc6e7 Bf1c4
     12   00:06      7.523.278   1.157.427   -0.43   e7e5 a2a3 Ng8f6 Bc1b2 Bf8d6 c2c4 Nb8c6 d2d4 Nc6xd4 c4c5 Bd6e7 Bb2xd4 e5xd4 Qd1xd4
     12   00:07      8.323.631   1.148.087   -0.43   e7e5 a2a3 Ng8f6 Bc1b2 Bf8d6 c2c4 Nb8c6 d2d4 Nc6xd4 c4c5 Bd6e7 Bb2xd4 e5xd4 Qd1xd4
     13   00:11     13.548.772   1.153.086   -0.59   e7e5 a2a3 Ng8f6 e2e4 Nf6xe4 Qd1e2 Ne4f6 Qe2xe5+ Bf8e7 Nb1c3 00 Bc1b2 d7d6 Qe5d4
     13   00:12     14.488.725   1.159.098   -0.59   e7e5 a2a3 Ng8f6 e2e4 Nf6xe4 Qd1e2 Ne4f6 Qe2xe5+ Bf8e7 Nb1c3 00 Bc1b2 d7d6 Qe5d4
     14   00:19     23.845.508   1.255.026   -0.49   e7e5 b4b5 Ng8f6 Bc1b2 d7d6 Nb1c3 Bf8e7 Ng1f3 00 e2e4 a7a6 b5xa6 Nb8xa6 Bf1d3
     14   00:24     30.019.956   1.200.798   -0.49   e7e5 b4b5 Ng8f6 Bc1b2 d7d6 Nb1c3 Bf8e7 Ng1f3 00 e2e4 a7a6 b5xa6 Nb8xa6 Bf1d3
   7/19/2007 5:30:57 PM, Time for this analysis: 00:00:30, Rated time: 00:30

0 of 1 matching moves
7/19/2007 5:30:58 PM, Total time: 12:00:35 AM
Rated time: 00:30 = 30 Seconds

--------------------------------------------------------------------------------

Analysis from C:\epd\orangutan.epd   
7/19/2007 5:34:23 PM Level: 5 Seconds
Analyzing engine: Glaurung 2 Epsilon 5

1) -                   
    Avoid move:
    Best move (Glaurung 2 Epsilon 5): e7-e5
    Not found in: 00:30
      2   00:00             50   103   +0.19   Ng8f6 Ng1f3
      2   00:00            110   226   +0.80   Nb8c6 Bc1a3
      3   00:00            323   646   +0.94   Nb8c6 b4b5 Nc6d4
      4   00:00            645   1.290   +0.54   Nb8c6 b4b5 Nc6d4 Nb1c3
      4   00:00          1.276   2.552   +0.76   Ng8f6 Nb1c3 Nb8c6 Bc1a3
      5   00:00          2.045   4.090   +0.60   Ng8f6 b4b5 d7d6 Ng1f3 Bc8f5
      5   00:00          3.826   7.652   +0.76   e7e6 a2a3 Ng8f6 Nb1c3 Nb8c6
      6   00:00          6.957   13.482   +0.21   e7e6 b4b5 Ng8f6 Ng1f3 Bf8c5 Nb1c3
      6   00:00         10.090   19.554   +0.33   Ng8f6 Ng1f3 Nb8c6 b4b5 Nc6b4 Nb1c3
      7   00:00         17.083   32.110   +0.37   Ng8f6 Ng1f3 Nb8c6 b4b5 Nc6b4 Nb1c3 d7d6
      8   00:00         25.020   45.740   +0.31   Ng8f6 Ng1f3 Nb8c6 b4b5 Nc6b4 Nb1c3 d7d6 Bc1b2
      8   00:00         47.309   81.849   +0.74   e7e6 a2a3 Ng8f6 Nb1c3 Bf8d6 Ng1f3 00 Bc1b2
      9   00:00         82.130   131.408   +0.78   e7e6 b4b5 Ng8f6 Ng1f3 Bf8d6 Nb1c3 00 Bc1b2 Nf6g4
     10   00:00        142.936   203.322   +0.84   e7e6 b4b5 Ng8f6 Ng1f3 Bf8e7 Nb1c3 00 e2e3 d7d5 Bf1d3
     11   00:00        226.726   278.875   +0.68   e7e6 b4b5 Ng8f6 Ng1f3 a7a6 e2e3 Bf8d6 Bf1d3 a6xb5 00 00 Bd3xb5
     12   00:01        485.470   419.593   +0.80   e7e6 b4b5 Ng8f6 Ng1f3 Bf8b4 a2a3 Bb4e7 Nb1c3 00 e2e3 d7d5 Bf1d3
     13   00:01        996.361   535.677   +0.74   e7e6 b4b5 Ng8f6 Ng1f3 Bf8b4 e2e3 00 Bc1b2 a7a6 a2a3 Bb4c5 Nb1c3 d7d5
     14   00:03      2.325.523   627.840   +0.70   e7e6 b4b5 Ng8f6 Ng1f3 Bf8b4 e2e3 00 Bc1b2 a7a6 a2a3 Bb4c5 Nb1c3 d7d5 Nf3e5 a6xb5 Bf1xb5
     15   00:07      5.245.988   670.071   +0.62   e7e6 b4b5 Ng8f6 Ng1f3 d7d5 e2e3 Bf8d6 Bf1e2 00 00 Nb8d7 Bc1a3 Nf6e4 Ba3xd6 c7xd6 Nb1a3
     15   00:17     12.371.917   689.051   +0.66   e7e5 b4b5 Ng8f6 Ng1f3 e5e4 Nf3d4 Bf8e7 Bc1a3 00 Ba3xe7 Qd8xe7 Nb1c3 d7d5 e2e3 Bc8e6 Bf1e2
     16   00:28     20.154.590   700.590   +0.68   e7e5 b4b5 Ng8f6 Ng1f3 e5e4 Nf3d4 Bf8e7 Nb1c3 00 e2e3 c7c5 Nd4f5 d7d5 Nf5xe7+ Qd8xe7 Bf1e2 d5d4 Nc3a4
   7/19/2007 5:35:47 PM, Time for this analysis: 00:00:30, Rated time: 00:30

0 of 1 matching moves
7/19/2007 5:35:48 PM, Total time: 12:01:25 AM
Rated time: 00:30 = 30 Seconds

--------------------------------------------------------------------------------

Analysis from C:\epd\orangutan.epd   
7/19/2007 5:34:23 PM Level: 5 Seconds
Analyzing engine: Olithink413

1) -                   
    Avoid move:
    Best move (Olithink413): e7-e5
    Not found in: 00:30
      2   00:00            177   177   +0.16   Nb8c6 a2a3
      3   00:00            794   794   +0.48   a7a5 b4b5 d7d5
      4   00:00          3.521   3.521   +0.20   e7e5 Bc1b2 Ng8f6 a2a3
      5   00:00          8.317   831.700   +0.55   e7e5 a2a3 d7d5 e2e3 Nb8c6
      6   00:00         42.545   607.785   +0.32   Ng8f6 d2d3 e7e5 a2a3 a7a5 b4b5
      7   00:00        144.648   628.904   +0.55   e7e5 a2a3 a7a5 Bc1b2 f7f6 c2c3 d7d5
      8   00:00        615.038   750.046   +0.33   e7e6 a2a3 Ng8f6 Ng1f3
      9   00:01      1.496.258   813.183   +0.56   e7e5 a2a3 a7a5 Bc1b2 f7f6 b4b5 d7d5 Ng1f3 Bc8f5
     10   00:04      4.112.646   820.887   +0.34   Ng8f6 a2a3 a7a5 b4b5 e7e5 Bc1b2
     11   00:12     10.761.748   860.251   +0.50   e7e5 a2a3 a7a5 Bc1b2 d7d5 b4b5 e5e4 Bb2d4 Ng8f6 Nb1c3 Bf8e7
   7/19/2007 5:36:32 PM, Time for this analysis: 00:00:30, Rated time: 00:30

0 of 1 matching moves
7/19/2007 5:36:34 PM, Total time: 12:02:11 AM
Rated time: 00:30 = 30 Seconds

--------------------------------------------------------------------------------

Analysis from C:\epd\orangutan.epd   
7/19/2007 5:39:41 PM Level: 30 Seconds
Analyzing engine: LGEvolution 1.0.0.8

1) -                   
    Avoid move:
    Best move (LGEvolution 1.0.0.8): e7-e6
    Not found in: 00:30
     8/16   00:00         32.402   1.620.100   +0.41   Nb8c6 Bc1a3 e7e6 b4b5 Qd8f6 c2c3 Bf8xa3 Nb1xa3
     9/19   00:00         71.456   1.786.400   +0.15   Nb8c6 b4b5 Nc6b4 Ng1f3 e7e6 a2a3 Nb4d5 Bc1b2 Ng8f6
     9/19   00:00         82.216   1.174.500   +0.52   Ng8f6 Bc1a3 Nb8c6 Ng1f3 e7e6 b4b5 Bf8xa3 Nb1xa3 Nc6b4
    10/19   00:00        103.608   1.295.100   +0.26   Ng8f6 Bc1a3 d7d6 Ng1f3 Bc8f5 Nb1c3 Nb8c6 b4b5 Nc6e5 Nf3xe5 d6xe5
    11/20   00:00        170.704   1.707.000   +0.42   Ng8f6 Bc1b2 e7e6 a2a3 a7a5 b4xa5 Ra8xa5 Ng1f3 Bf8c5 e2e3 00 Bf1c4
    12/23   00:00        395.562   1.883.600   +0.11   Ng8f6 Ng1f3 e7e6 a2a3 c7c5 b4b5 a7a6 b5xa6 Nb8xa6 Bc1b2 d7d5 e2e3
    12/23   00:00        553.968   1.351.100   +0.25   e7e6 a2a3 c7c5 Bc1b2 c5xb4 a3xb4 Ng8f6 c2c3 a7a5 b4xa5 Ra8xa5 Ra1xa5 Qd8xa5
    13/23   00:00        772.516   1.266.400   +0.50   e7e6 a2a3 c7c5 Bc1b2 Ng8f6 b4b5 Bf8d6 Ng1f3 00 Nb1c3 a7a5 e2e3 Nf6g4
    14/28   00:01      1.469.822   1.300.700   +0.35   e7e6 a2a3 c7c5 b4xc5 Bf8xc5 d2d4 Bc5e7 Ng1f3 Nb8c6 e2e4 Ng8f6 e4e5 Nf6e4 Bf1d3 Qd8a5+ Bc1d2 Ne4xd2 Qd1xd2
    15/32   00:03      4.415.850   1.388.600   +0.45   e7e6 b4b5 Ng8f6 Ng1f3 Bf8e7 Bc1a3 d7d6 e2e3 00 Nb1c3 Bc8d7 Bf1c4 d6d5 Ba3xe7 Qd8xe7
    16/33   00:05      7.575.094   1.395.000   +0.45   e7e6 b4b5 Ng8f6 Ng1f3 Bf8e7 Nb1c3 00 e2e3 a7a6 Bc1b2 a6xb5 Bf1xb5 c7c6 Bb5d3 Qd8b6 Ra1b1 Qb6c5
    17/35   00:09     14.017.692   1.481.700   +0.53   e7e6 a2a3 a7a5 b4b5 Ng8f6 Ng1f3 Bf8e7 Bc1b2 00 e2e3 d7d5 Bf1d3 Nb8d7 00 Nd7c5 Bd3e2 Bc8d7
    18/35   00:13     19.761.550   1.511.900   +0.43   e7e6 a2a3 Ng8f6 Ng1f3 c7c5 c2c3 a7a5 b4xc5 Bf8xc5 d2d4 Bc5e7 c3c4 00 Nb1c3 d7d5 c4c5 Nb8c6 Bc1f4 Nf6e4 Nc3xe4 d5xe4
    19/38   00:22     33.294.826   1.500.400   +0.43   e7e6 a2a3 Ng8f6 Ng1f3 c7c5 c2c3 a7a5 b4xc5 Bf8xc5 d2d4 Bc5e7 c3c4 00 Nb1c3 d7d5 c4c5 Nb8c6 Bc1f4 Nf6e4 Nc3xe4 d5xe4
   7/19/2007 5:40:14 PM, Time for this analysis: 00:00:30, Rated time: 00:30

0 of 1 matching moves
7/19/2007 5:40:15 PM, Total time: 12:00:33 AM
Rated time: 00:30 = 30 Seconds

--------------------------------------------------------------------------------

Analysis from C:\epd\orangutan.epd   
7/19/2007 6:04:37 PM Level: 30 Seconds
Analyzing engine: Pharaon 3.3

1) -                   
    Avoid move:
    Best move (Pharaon 3.3): d7-d5
    Not found in: 00:30
     6/25   00:00         18.474   595.935   +0.01   e7e5 Bc1b2 Nb8c6 b4b5 Nc6d4 e2e4
     6/25+   00:00         22.112   470.468   +0.01   d7d5
     6/25   00:00         24.686   398.161   +0.04   d7d5 e2e3 a7a5 b4b5 e7e5 Bc1b2
     7/25   00:00         67.996   485.685   +0.19   d7d5 e2e3 e7e6 Qd1g4 e6e5 Qg4h5 Bf8d6
     7/25+   00:00         92.442   455.379   +0.19   e7e6
     7/25   00:00         98.692   450.648   +0.22   e7e6 Bc1b2 Ng8f6 a2a3 d7d5 e2e3
     8/25   00:00        156.551   455.090   +0.12   e7e6 Bc1b2 Ng8f6 a2a3 a7a5 b4b5 d7d5 g2g3
     9/25   00:01        540.326   501.230   +0.23   e7e6 c2c3 Qd8f6 d2d4 a7a5 b4xa5 d7d5 Ng1f3 Ra8xa5
    10/25   00:03      1.735.990   536.793   +0.14   e7e6 Bc1b2 Ng8f6 b4b5 d7d5 e2e3 Bc8d7 Ng1f3 Nf6e4 Nf3e5
    10/25+   00:04      2.576.696   538.944   +0.14   e7e5
    10/25   00:05      3.134.926   542.280   +0.19   e7e5 Bc1b2 Ng8f6 Bb2xe5 Nb8c6 Be5b2 Bf8xb4 e2e3 d7d5 Nb1c3
    11/25   00:11      6.334.997   558.444   +0.26   e7e5 Bc1b2 Ng8f6 Bb2xe5 Nb8c6 Be5c3 Bf8xb4 e2e3 d7d5 Bc3xb4 Nc6xb4 Nb1c3
    11/25+   00:17     10.000.002   572.967   +0.26   d7d5
   7/19/2007 6:05:10 PM, Time for this analysis: 00:00:30, Rated time: 00:30

0 of 1 matching moves
7/19/2007 6:05:11 PM, Total time: 12:00:33 AM
Rated time: 00:30 = 30 Seconds

Dann Corbit
 

Re: Speedup by bitboards

Postby Uri Blass » 20 Jul 2007, 07:22

Michael Sherwin wrote:The 2x speedup was only in relation to the original fruit. Fabien himself admits that Fruit is slow. Fruit for example scans the board several times to generate all the moves. Rewriting Fruit to get a 2x speedup is certainly possible. Some of the things that Fruit does in the eval especially lends itself to optimization by bitboards. I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.


You may be right or wrong and I do not know but
Vasik expressed a different opinion.

http://rybkaforum.net/cgi-bin/rybkaforu ... 6#pid18939

By Vasik Rajlich Date 2007-07-12 04:07 Note that this explanation is itself bogus:

" a) 'Osipov' claims that he changed the Fruit board representation from mailbox to bitboard and got a 2x speedup in performance. This is simply a clueless comment, there would be no speedup of anywhere near this magnitude."

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: Speedup by bitboards

Postby Onno Garms » 20 Jul 2007, 20:30

Normal pseudo-legal move generation is almost exactly the same speed in my bitboard code and in Fruit (not so surprising in my opinion). Tactical pseudo-legal move generation is only about 20% faster in my bitboard code. Is this normal or is something wrong with my bitboard implementation?


Maybe I should better test against another bitboard engine to get a better impression how fast or slow my move generation is. Has anybody implemented a performance test with Crafty or to I have to implement it myself?
User avatar
Onno Garms
 
Posts: 128
Joined: 17 Feb 2007, 11:17
Location: Bonn, Germany

Re: Speedup by bitboards

Postby Pradu » 20 Jul 2007, 22:05

Onno Garms wrote:
Normal pseudo-legal move generation is almost exactly the same speed in my bitboard code and in Fruit (not so surprising in my opinion). Tactical pseudo-legal move generation is only about 20% faster in my bitboard code. Is this normal or is something wrong with my bitboard implementation?


Maybe I should better test against another bitboard engine to get a better impression how fast or slow my move generation is. Has anybody implemented a performance test with Crafty or to I have to implement it myself?
I believe Buzz's perft was a lot slower than Crafty's in 32-bits but a bit faster in 64-bits on an Opteron with Windows. I lost the numbers but my source for the perft/makemove/movegen is on the website and hasn't changed since.
User avatar
Pradu
 
Posts: 343
Joined: 12 Jan 2005, 19:17
Location: Chandler, Arizona, USA

Re: Speedup by bitboards

Postby Dann Corbit » 25 Jul 2007, 23:40

Uri Blass wrote:
Michael Sherwin wrote:The 2x speedup was only in relation to the original fruit. Fabien himself admits that Fruit is slow. Fruit for example scans the board several times to generate all the moves. Rewriting Fruit to get a 2x speedup is certainly possible. Some of the things that Fruit does in the eval especially lends itself to optimization by bitboards. I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.


You may be right or wrong and I do not know but
Vasik expressed a different opinion.

http://rybkaforum.net/cgi-bin/rybkaforu ... 6#pid18939

By Vasik Rajlich Date 2007-07-12 04:07 Note that this explanation is itself bogus:

" a) 'Osipov' claims that he changed the Fruit board representation from mailbox to bitboard and got a 2x speedup in performance. This is simply a clueless comment, there would be no speedup of anywhere near this magnitude."

Uri


I guess that the switch to bitboard, together with being table driven gives about a 2x speedup over array representation.
Dann Corbit
 

Re: Speedup by bitboards

Postby Onno Garms » 26 Jul 2007, 07:26

Dann Corbit wrote:I guess that the switch to bitboard, together with being table driven gives about a 2x speedup over array representation.


What do you mean by being table driven? Have tables like the bitboards of attacked squares for every square? Doesn't every bitboard engine have that?

Or do you think of some other kind of tables?
User avatar
Onno Garms
 
Posts: 128
Joined: 17 Feb 2007, 11:17
Location: Bonn, Germany

Re: Speedup by bitboards

Postby Michael Sherwin » 29 Jul 2007, 22:22

Onno Garms wrote:
Michael Sherwin wrote:I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.


I thought over that and tend not to believe that.

Node per second on 64 bit Crafty is about 2x that of Fruit. And Crafty is fast; in deed it has the highest nps among all Engines I have tried so far. And Crafty's code is said to be so full of speed tricks that it is very difficult to understand. Fruit, on the other hand, is well structured. Exchanging the board representation would not change that.

I've been told that there are differences in counting nodes between Engines, but I don't see a way to count more nodes than in Toga.

If a strong engine with 3x Fruit's nps were possible, it would be surprising that no existing engine has more than 2x this nps.

Which are the engines with highest nps? Are there faster engines than Crafty?


Fruit's (2.1) evaluation function is much simpler than Crafty's, therefore, Fruit should already run faster than Crafty. But it is slower. Gerbil which is more complicated than Fruit is almost twice as fast. And Wchess (32 bit) that came with the Borland compilers is even faster yet. Fruit is a slow program. Fruit can be rewritten in mailbox to be at least 2x faster. Then rewritting in bitboard would achive the same or better. Compiling for 64 bits could then speed the rewrite up by 50% for a total 3x speedup. I standby my original statement.

My program RomiChess is poorly written for speed, because, I learned to program in C by writing it and I made plenty of beginner mistakes. The sliding piece code is extreamly slow.

From the initial position:

(AMD 1.3 GHz 1500+)
Crafty 19.19 504k nodes per second
RomiChess 802k nodes per second

With what I have learned and now know, I am confident that the speed of RomiChess can be more than doubled. The trick is, is that the eval in RomiChess is very, very simple. Even more simple than Fruits.

It should not be all that much of a stretch to believe that Fruit can be made 2x faster by moving to bitboards on a 32 bit machine!
User avatar
Michael Sherwin
 
Posts: 168
Joined: 21 Jul 2005, 06:10
Location: USA

Re: Speedup by bitboards

Postby H.G.Muller » 30 Jul 2007, 09:14

Well, from what you say it sounds more like Fruit will speed up 2x from a decent rewrite. Doesn't matter so much if it is bitboard or mailbox.
User avatar
H.G.Muller
 
Posts: 3297
Joined: 16 Nov 2005, 12:02
Location: Diemen, NL

Re: Speedup by bitboards

Postby Uri Blass » 30 Jul 2007, 10:12

Michael Sherwin wrote:
Onno Garms wrote:
Michael Sherwin wrote:I would expect a 2x speedup just in 32 bits. In 64 bits it may be a 3x speedup over the original 32 bit Fruit. But, all this is contingent on the premis that Fruit is slow to begin with.


I thought over that and tend not to believe that.

Node per second on 64 bit Crafty is about 2x that of Fruit. And Crafty is fast; in deed it has the highest nps among all Engines I have tried so far. And Crafty's code is said to be so full of speed tricks that it is very difficult to understand. Fruit, on the other hand, is well structured. Exchanging the board representation would not change that.

I've been told that there are differences in counting nodes between Engines, but I don't see a way to count more nodes than in Toga.

If a strong engine with 3x Fruit's nps were possible, it would be surprising that no existing engine has more than 2x this nps.

Which are the engines with highest nps? Are there faster engines than Crafty?


Fruit's (2.1) evaluation function is much simpler than Crafty's, therefore, Fruit should already run faster than Crafty. But it is slower. Gerbil which is more complicated than Fruit is almost twice as fast. And Wchess (32 bit) that came with the Borland compilers is even faster yet. Fruit is a slow program. Fruit can be rewritten in mailbox to be at least 2x faster. Then rewritting in bitboard would achive the same or better. Compiling for 64 bits could then speed the rewrite up by 50% for a total 3x speedup. I standby my original statement.

My program RomiChess is poorly written for speed, because, I learned to program in C by writing it and I made plenty of beginner mistakes. The sliding piece code is extreamly slow.

From the initial position:

(AMD 1.3 GHz 1500+)
Crafty 19.19 504k nodes per second
RomiChess 802k nodes per second

With what I have learned and now know, I am confident that the speed of RomiChess can be more than doubled. The trick is, is that the eval in RomiChess is very, very simple. Even more simple than Fruits.

It should not be all that much of a stretch to believe that Fruit can be made 2x faster by moving to bitboards on a 32 bit machine!


You may be right or wrong and I do not know.
Vasik claims that he does not believe it so the only way to prove him wrong is to make fruit twice faster.

If somebody can write a program that gives the same analysis as fruit except doing it twice faster then this is going to be convincing.

If it is not going to be done it is only a speculation like my speculation that fruit is better than Crafty because of better search and not only because of better evaluation.

Uri
User avatar
Uri Blass
 
Posts: 727
Joined: 09 Oct 2004, 05:59
Location: Tel-Aviv

Re: Speedup by bitboards

Postby Michael Sherwin » 31 Jul 2007, 04:16

H.G.Muller wrote:Well, from what you say it sounds more like Fruit will speed up 2x from a decent rewrite. Doesn't matter so much if it is bitboard or mailbox.


Yes, that is correct. The Strelka author never claimed that rewriteing Fruit in bitboards is the reason that his code was 2x faster. It was 2x faster, because, he rewrote it.

But, Uri is right, someone needs to do an open source rewrite to demonstrate/prove what can be done. An expert in fast coding techniques would be a good choice. Looking into my crystal ball, I see some letters starting to form. Wait a minute, yes, the letters are becoming clear. HGM, is what I see! :D
User avatar
Michael Sherwin
 
Posts: 168
Joined: 21 Jul 2005, 06:10
Location: USA

Re: Speedup by bitboards

Postby Andreas Guettinger » 31 Jul 2007, 07:58

With a rewrite by HGM, Fruit will probably not only end up two times faster, but also two times smaller. :D

Still, I'm not convinced by this Fruit rewriting stuff. Why somebody should start from Fuit code, rewrite it to bitboards and replace the nice UCI interface by a xboard hack, and later reimplements an UCI interface using similar strings as Rybka is beyond me. The whole story smells fishy.
I would not be surprised if there is as much/little fruit code in Strelka than in Rybka.
regards,
Andy
Andreas Guettinger
 
Posts: 20
Joined: 26 Aug 2006, 23:37
Location: Bern, Switzerland

Re: Speedup by bitboards

Postby H.G.Muller » 31 Jul 2007, 08:13

Andreas Guettinger wrote:With a rewrite by HGM, Fruit will probably not only end up two times faster, but also two times smaller. :D

And people would probably prefer to disassemble it to know how it works, rather than looking at the source.... :D :D :D
User avatar
H.G.Muller
 
Posts: 3297
Joined: 16 Nov 2005, 12:02
Location: Diemen, NL

Next

Return to Programming and Technical Discussions

Who is online

Users browsing this forum: No registered users and 2 guests