How can you test which opening book is better?

Archive of the old Parsimony forum.

Post by Norm Pollock » 03 Mar 2004, 23:35

Posted by: Norm Pollock at 03 March 2004 23:35:18:

Suppose you have two opening books, X and Y, for your core engine E. How can you determine which is a better book for E?
Here is my test. The first part is to play 200 short blitz games between E/X and E/Y, at 1 minute + 1 second (so there are no "wins on time"). My feeling is that short blitz games depend more on the opening book than longer games do.
Then as the second part of the test, with the same type of blitz game, play E/X and E/Y 100 games each against the same 2 comparable engines.
After these 600 games, a reasonable decision can be made as to which is the better book.
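As a rough way to judge how "reasonable" a decision from one 200-game match can be, the sketch below turns win/draw/loss counts into a score fraction, an approximate 95% margin, and the Elo difference implied by the usual logistic formula. The counts are made up for illustration, and the function names `match_summary` and `elo_diff` are not from any chess tool.

```python
import math

def match_summary(wins, draws, losses, z=1.96):
    """Return (score fraction, approximate margin) for one match.

    The margin is z standard errors of the mean game result,
    using a normal approximation (an assumption, not an exact test).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    return score, z * math.sqrt(var / n)

def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Hypothetical result of E/X vs E/Y over 200 games.
score, margin = match_summary(wins=80, draws=50, losses=70)
print(f"score {score:.3f} +/- {margin:.3f}")
print(f"Elo {elo_diff(score):+.0f} "
      f"(from {elo_diff(score - margin):+.0f} to {elo_diff(score + margin):+.0f})")
```

If the interval around the score still includes 0.5, the match alone does not separate the two books.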
Any better tests?
Norm Pollock
 

Re: How can you test which opening book is better?

Post by Dann Corbit » 03 Mar 2004, 23:55

Posted by: Dann Corbit at 03 March 2004 23:55:24:
In reply to: How can you test which opening book is better? posted by Norm Pollock at 03 March 2004 23:35:18:
Suppose you have two opening books, X and Y, for your core engine E. How can you determine which is a better book for E?
Here is my test. The first part is to play 200 short blitz games between E/X and E/Y, at 1 minute + 1 second (so there are no "wins on time"). My feeling is that short blitz games depend more on the opening book than longer games do.
Then as the second part of the test, with the same type of blitz game, play E/X and E/Y 100 games each against the same 2 comparable engines.
After these 600 games, a reasonable decision can be made as to which is the better book.
Any better tests?
At that high speed, you will introduce a lot of randomness. The only reason I play games at a speed that high is to test for problems like crashes, misreported results, etc. I would recommend 5 minutes as the bare minimum base. One second increment should be OK.
Also, be sure that you reflect the machines and the programs (swap which program runs on which machine) so that you can eliminate that variable.



Dann Corbit
 

Re: How can you test which opening book is better?

Post by Sune Fischer » 04 Mar 2004, 00:34

Posted by: Sune Fischer at 04 March 2004 00:34:32:
In reply to: How can you test which opening book is better? posted by Norm Pollock at 03 March 2004 23:35:18:
Suppose you have two opening books, X and Y, for your core engine E. How can you determine which is a better book for E?
Here is my test. The first part is to play 200 short blitz games between E/X and E/Y, at 1 minute + 1 second (so there are no "wins on time").
My feeling is that short blitz games depend more on the opening book than longer games do.
Then as the second part of the test, with the same type of blitz game, play E/X and E/Y 100 games each against the same 2 comparable engines.
After these 600 games, a reasonable decision can be made as to which is the better book.
Any better tests?
That is also how I would do it.
I think it is more or less the same as saying it is easier to win
from a good position in blitz than at longer time controls.
I have no idea if this is really the case.
It is said that blitz games tend to have fewer draws, so perhaps it's true.
I take it these two engines would be using their own book, a different one?
If that is the case, then that is a big additional noise factor.
The experiment is not necessarily invalid, but IMO it is inefficient as it would
require many more games to reach the same level of significance.
Better would be, IMO, if you could run the same experiment with a new engine
in selfplay.
If book X beats book Y in several selfplay matches with different engines, then
I think there is a solid indication that X objectively is the better book.
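To put a rough number on "many more games", here is a minimal sketch of how the required game count grows with noise. It assumes a normal approximation and treats any extra noise (for example from the opponents' own books) simply as a larger per-game standard deviation; the default of 0.4 and the name `games_needed` are illustrative assumptions, not measured values.

```python
import math

def games_needed(edge, per_game_sd=0.4, z=1.96):
    """Games needed before a true score edge exceeds z standard errors.

    edge: true score advantage over 50% (0.05 means a 55% score).
    per_game_sd: assumed standard deviation of a single game result.
    """
    return math.ceil((z * per_game_sd / edge) ** 2)

for edge in (0.10, 0.05, 0.02):
    print(f"edge {edge:.2f}: about {games_needed(edge)} games")

# The game count scales with the square of the noise level.
print(games_needed(0.05, per_game_sd=0.5))
```

Because the count scales with the square of the standard deviation, even a moderate increase in noise raises the number of games needed noticeably.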
-S.
Sune Fischer
 

Re: How can you test which opening book is better?

Post by Norm Pollock » 04 Mar 2004, 00:45

Posted by: Norm Pollock at 04 March 2004 00:45:13:
In reply to: Re: How can you test which opening book is better? posted by Sune Fischer at 04 March 2004 00:34:32:
Then as the second part of the test, with the same type of blitz game, play E/X and E/Y 100 games each against the same 2 comparable engines.
After these 600 games, a reasonable decision can be made as to which is the better book.
I take it these two engines would be using their own book, a different one?
If that is the case, then that is a big additional noise factor.
Let me clarify the second part of my test. Suppose I have 2 books X and Y for Crafty. Then I choose 2 comparable engines, each with their own book that is not related to either book X or book Y. For example: ktulu and aristarch. Then Crafty/X plays 100 against ktulu and 100 against aristarch. Same for Crafty/Y. So all together there are 600 games: 200 with crafty/X vs crafty/Y, 100 with crafty/X vs ktulu, 100 with crafty/X vs aristarch, 100 with crafty/Y vs ktulu, and 100 with crafty/Y vs aristarch.
Also all 600 matches are on the same machine.
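Written out as data, the schedule described above looks like this. It is only a bookkeeping sketch: the labels are plain strings, and nothing here launches an engine.

```python
# Pairings and game counts as listed in the post.
schedule = [
    ("Crafty/X", "Crafty/Y", 200),
    ("Crafty/X", "Ktulu", 100),
    ("Crafty/X", "Aristarch", 100),
    ("Crafty/Y", "Ktulu", 100),
    ("Crafty/Y", "Aristarch", 100),
]

total = sum(games for _, _, games in schedule)

# Count how many games each Crafty book takes part in.
per_book = {"X": 0, "Y": 0}
for first, second, games in schedule:
    for player in (first, second):
        if player.startswith("Crafty/"):
            per_book[player.split("/")[1]] += games

print(total)     # total number of games in the test
print(per_book)  # games involving book X and book Y
```

Each book is involved in the same number of games, so neither gets more exposure than the other.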
Norm Pollock
 

Re: How can you test which opening book is better?

Post by Sune Fischer » 04 Mar 2004, 01:16

Posted by: Sune Fischer at 04 March 2004 01:16:01:
In reply to: Re: How can you test which opening book is better? posted by Norm Pollock at 04 March 2004 00:45:13:

Let me clarify the second part of my test. Suppose I have 2 books X and Y for Crafty. Then I choose 2 comparable engines, each with their own book that is not related to either book X or book Y. For example: ktulu and aristarch. Then Crafty/X plays 100 against ktulu and 100 against aristarch. Same for Crafty/Y. So all together there are 600 games: 200 with crafty/X vs crafty/Y, 100 with crafty/X vs ktulu, 100 with crafty/X vs aristarch, 100 with crafty/Y vs ktulu, and 100 with crafty/Y vs aristarch.
Also all 600 matches are on the same machine.
Ok yes, that's how I understood it.
But you see, in this test you first play:
*) Crafty/X plays 100 against ktulu and 100 against aristarch.
then you do the "Same for Crafty/Y".
So basically you are only testing one book at a time.
By playing them directly against each other you get to test them both at the
same time, two birds with one stone :)
Then there are some things to consider, like will playing against third party
books provide better testing of the books due to higher variety?
Imagine e.g. if book X was twice as big as Y, that means book Y would only be
able to test half of book X, so no matter the amount of games we play we could
never detect if X is half full of junk lines.
I don't know how likely that is, actually I think it is a good assumption
to take a small sample of a book, say 1 percent, and use this as a representative
set. That is how statistics and polls are usually done AFAIK, so why shouldn't it
work here.
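A minimal sketch of drawing such a 1 percent sample, assuming the book has been exported to a plain list of opening lines (real book files are usually binary, so that export step is assumed here, and `sample_lines` is a made-up helper name):

```python
import random

def sample_lines(book_lines, fraction=0.01, seed=0):
    """Pick a uniform random subset of book lines without replacement."""
    if not book_lines:
        return []
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    k = min(len(book_lines), max(1, round(len(book_lines) * fraction)))
    return rng.sample(book_lines, k)

# Placeholder book: 5000 dummy entries standing in for exported lines.
book = [f"line-{i}" for i in range(5000)]
subset = sample_lines(book)
print(len(subset), subset[:3])
```

A uniform sample like this only represents the book as stored. It does not reflect how often each line would actually be reached in play.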
Something I'm more concerned with is what happens when you take a known
distribution and pit it against an unknown distribution?
Each book has some percent of good and bad lines, now we want to know the
distribution in X and Y, but we couldn't care less for the distributions of
Z and W, yet they must be just as significant factors as X and Y.
-S.
Sune Fischer
 

Re: How can you test which opening book is better?

Post by Norm Pollock » 04 Mar 2004, 02:07

Posted by: Norm Pollock at 04 March 2004 02:07:36:
In reply to: Re: How can you test which opening book is better? posted by Sune Fischer at 04 March 2004 01:16:01:
By playing them directly against each other you get to test them both at the
same time, two birds with one stone :)
Yes. The first part of the test has them playing each other 200 times.
Norm Pollock
 

