More testers welcome and some thoughts

Archive of the old Parsimony forum. Some messages couldn't be restored.

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 16 Jul 2004, 17:36

Posted by: Bryan Hofmann at 16 July 2004 18:36:54
In reply to: Re: More testers welcome and some thoughts, posted by Heinz van Kempen at 16 July 2004 16:12:36
Hello Olivier,
Personally, I don't mind testing weaker engines, so you can put me in group B or both groups if you want.
- 64mb hash is fine for me and for most people I think, but it seems that Igor cannot afford this amount of RAM.
- 8mb for EGTB. Some testers will have only 4 men, some other incomplete or complete 5 men, but I don't think this will be a big problem.
- Of course learning should be off.
I use the Fritz GUI for my tournament (Swiss system), but if you run a round robin on this GUI, you will face the 1 MB bug. In fact, running more than 1 game per engine with the Fritz GUI is not safe in my opinion. Note that the bug will harm only engines running in UCI mode, and only if they take hashtable settings from the GUI. It does not affect those running with wb2uci, or UCI engines that take hashtable settings from the ini file (BigLion, Gothmog).
If you can add two computers like me, maybe the best would be to add one for group A and the other one to group B. If new testers are added, there might be the problem, that many are mainly interested in the strongest, so we can be flexible here.
64 mb is standard, but some like Crafty or Gothmog for instance support 24, 48, 96...
3 and 4 men would be also fine. So engines have to demonstrate that they still understand endgames, without being prompted.
learning and ponder should be off, agreed
I have not had this 1 MB problem since Shredder8 came with a new and decent UCI.dll. Some authors, like Peter Fendrich and Tord Romstad, sent workarounds in the beginning to be sure that the problem does not exist anymore. It should be no problem anymore. I ran gauntlets over hundreds of games with the same engine, constantly checking the hash, and it was reported correctly by Task Manager or Task Info. I really think that this is now an old problem, solved by using the new UCI.dll. In former times those files were sometimes completely messed up, making good tournaments impossible.
Best Regards
Heinz

I disagree with turning learning off. One example: if you are going to use the default Crafty book, it is full of bad lines because it is not hand-tuned. The method Hyatt chose was to have the program tune the book. With learning off, Crafty will repeat the same bad lines. This places Crafty, and any other program that uses this method, at a disadvantage.
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Joachim Rang » 16 Jul 2004, 17:44

Posted by: Joachim Rang at 16 July 2004 18:44:16
In reply to: Re: More testers welcome and some thoughts, posted by Igor Gorelikov at 16 July 2004 17:07:50
- 64mb hash is fine for me and for most people I think, but it seems that Igor cannot afford this amount of RAM.
For 64mb hash you need 512MB of RAM.
For 38Mb hash you need 256MB of RAM.
I have 256MB...
Igor
You should be able to give 64 MB hash even with only 256 MB RAM. But that means you cannot do anything else on the computer while the tournament is running.
regards Joachim
Not at all. Run FreeMem and watch how RAM usage changes during a long computer chess event. You'll see interesting things (I use Windows 98, so I'm not sure about other OSes).
Igor
Okay, I was assuming you use Win XP. For Win XP it's possible; for Win 98 I don't know.
regards Joachim
Joachim Rang
 

Re: More testers welcome and some thoughts

Postby Heinz van Kempen » 16 Jul 2004, 17:44

Posted by: Heinz van Kempen at 16 July 2004 18:44:53
In reply to: Re: More testers welcome and some thoughts, posted by Bryan Hofmann at 16 July 2004 18:36:54

I disagree with turning learning off. One example: if you are going to use the default Crafty book, it is full of bad lines because it is not hand-tuned. The method Hyatt chose was to have the program tune the book. With learning off, Crafty will repeat the same bad lines. This places Crafty, and any other program that uses this method, at a disadvantage.
Hello Bryan,
okay, those who use own books will have to take a decision about that. I do not know how often lines are repeated when most are using a big own book. Probably not very much. For my Nunn positions learning=off and deleting learning files is a must of course. It will also be interesting if we will get different results for Nunn positions and own books.
By the way: thanks again for your Crafty compile. It is the best and fastest I ever tested. Do you think that Crafty 19.15 is again improved? I already downloaded your 19.15 compile, but did not have time to test because of tournaments I want to finish.
Best Regards
Heinz
Heinz van Kempen
 

Re: More testers welcome and some thoughts

Postby Slobodan R. Stojanovic » 16 Jul 2004, 18:15

Posted by: Slobodan R. Stojanovic at 16 July 2004 19:15:56
In reply to: Re: More testers welcome and some thoughts, posted by Heinz van Kempen at 16 July 2004 16:01:15

One pattern could be:

2'+12" time control on an AMD 2200, which corresponds (let's imagine) to 20+3 on an AMD 750, and so on.
SL.
Hi Heinz,
I think that the solution should be found by reconciling all individual efforts with the collective effort. It is better when each tester has his own experiment, as a whole. The common parameters should be found after the tournament patterns are standardized.
How to do that exactly, I don't know.
Second: my hardware is probably the worst, and I don't want to run games of 4 or 6 hours to be within a pattern of 30 min/game on an AMD 2200.
But I could be within some other pattern. So we need, in the first place, to define many different tournament patterns: 3, 5, 7 or 9.
SL.
Hi Slobodan,
maybe I do not understand all correctly concerning those patterns.
Can you give 3, 5, 7 or 9 examples, please :-).
Best Regards
Heinz
Slobodan R. Stojanovic
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 16 Jul 2004, 18:27

Posted by: Bryan Hofmann at 16 July 2004 19:27:07
In reply to: Re: More testers welcome and some thoughts, posted by Heinz van Kempen at 16 July 2004 18:44:53
I disagree with turning learning off. One example: if you are going to use the default Crafty book, it is full of bad lines because it is not hand-tuned. The method Hyatt chose was to have the program tune the book. With learning off, Crafty will repeat the same bad lines. This places Crafty, and any other program that uses this method, at a disadvantage.
Hello Bryan,
okay, those who use own books will have to take a decision about that. I do not know how often lines are repeated when most are using a big own book. Probably not very much. For my Nunn positions learning=off and deleting learning files is a must of course. It will also be interesting if we will get different results for Nunn positions and own books.
By the way: thanks again for your Crafty compile. It is the best and fastest I ever tested. Do you think that Crafty 19.15 is again improved? I already downloaded your 19.15 compile, but did not have time to test because of tournaments I want to finish.
Best Regards
Heinz
Book learning is but one issue. If learning is off, Crafty cannot use what I think is a great feature: position learning. Now, I have heard all of the excuses stating that learning is turned off to make things more fair. Well, let's look at it another way. Some engines have a separate pawn hash; if done correctly, it makes for a more efficient engine. So do you turn it off, or not give memory to those engines that have a pawn hash, simply because other engines do not have it? The same goes for EGTBs: not all engines use them. The list is endless, and removing something that is part of the engine is not a true representation of the engine's capabilities, which is the purpose of engine vs. engine matches. There is only one reason that Hyatt added the ability to turn off learning, and that was so that in a Crafty vs. Crafty match the same book could be used without corrupting it for testing. Using it in any other fashion limits its true strength.

Yes, 19.15 is faster, and several bugs were fixed, which improved its game.
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Thomas Mayer » 16 Jul 2004, 18:41

Posted by: Thomas Mayer at 16 July 2004 19:41:24
In reply to: More testers welcome and some thoughts, posted by Heinz van Kempen at 16 July 2004 14:52:19

Hi Heinz,
if you need some more helpers for Group B you might add me -> The question is only how many games you need... For business reasons I have only very irregular time for such things right now... (e.g. Quark has been asleep since the IPCCC 2004)
Anyway, usually I have one machine ready to do whatever I want; the question is how fast it should be. Usually I need my Dual for work, so I have two Athlon XP1800+ and from time to time an XP2700+ free at my disposal, so why not let them run some chess games...
By the way, I would prefer 40/40 for such a list. Or even 120/40 (but I know that a list with such a time control exists only in dreams; not enough people would support it)
Also, do you have some information on what conditions you plan to use for it...
Greets, Thomas
Thomas Mayer
 

Re: More testers welcome and some thoughts

Postby Olivier Deville » 16 Jul 2004, 19:19

Posted by: Olivier Deville at 16 July 2004 20:19:20
In reply to: Re: More testers welcome and some thoughts, posted by Bryan Hofmann at 16 July 2004 19:27:07
I disagree with turning learning off. One example: if you are going to use the default Crafty book, it is full of bad lines because it is not hand-tuned. The method Hyatt chose was to have the program tune the book. With learning off, Crafty will repeat the same bad lines. This places Crafty, and any other program that uses this method, at a disadvantage.
Hello Bryan,
okay, those who use own books will have to take a decision about that. I do not know how often lines are repeated when most are using a big own book. Probably not very much. For my Nunn positions learning=off and deleting learning files is a must of course. It will also be interesting if we will get different results for Nunn positions and own books.
By the way: thanks again for your Crafty compile. It is the best and fastest I ever tested. Do you think that Crafty 19.15 is again improved? I already downloaded your 19.15 compile, but did not have time to test because of tournaments I want to finish.
Best Regards
Heinz
Book learning is but one issue. If learning is off, Crafty cannot use what I think is a great feature: position learning. Now, I have heard all of the excuses stating that learning is turned off to make things more fair. Well, let's look at it another way. Some engines have a separate pawn hash; if done correctly, it makes for a more efficient engine. So do you turn it off, or not give memory to those engines that have a pawn hash, simply because other engines do not have it? The same goes for EGTBs: not all engines use them. The list is endless, and removing something that is part of the engine is not a true representation of the engine's capabilities, which is the purpose of engine vs. engine matches. There is only one reason that Hyatt added the ability to turn off learning, and that was so that in a Crafty vs. Crafty match the same book could be used without corrupting it for testing. Using it in any other fashion limits its true strength.

Yes, 19.15 is faster, and several bugs were fixed, which improved its game.
Very interesting, this may change my opinion about learning... Let's wait and see !
Olivier


ChessWar
Olivier Deville
 

Re: More testers welcome and some thoughts

Postby Uri Blass » 16 Jul 2004, 19:55

Posted by: Uri Blass at 16 July 2004 20:55:27
In reply to: Re: More testers welcome and some thoughts, posted by Bryan Hofmann at 16 July 2004 19:27:07
I disagree with turning learning off. One example: if you are going to use the default Crafty book, it is full of bad lines because it is not hand-tuned. The method Hyatt chose was to have the program tune the book. With learning off, Crafty will repeat the same bad lines. This places Crafty, and any other program that uses this method, at a disadvantage.
Hello Bryan,
okay, those who use own books will have to take a decision about that. I do not know how often lines are repeated when most are using a big own book. Probably not very much. For my Nunn positions learning=off and deleting learning files is a must of course. It will also be interesting if we will get different results for Nunn positions and own books.
Book learning is but one issue. If learning is off, Crafty cannot use what I think is a great feature: position learning. Now, I have heard all of the excuses stating that learning is turned off to make things more fair. Well, let's look at it another way. Some engines have a separate pawn hash; if done correctly, it makes for a more efficient engine. So do you turn it off, or not give memory to those engines that have a pawn hash, simply because other engines do not have it? The same goes for EGTBs: not all engines use them. The list is endless, and removing something that is part of the engine is not a true representation of the engine's capabilities, which is the purpose of engine vs. engine matches.
No
The purpose of an engine-engine match is not always to represent the engines' capabilities.
There is only one reason that Hyatt added the ability to turn off learning, and that was so that in a Crafty vs. Crafty match the same book could be used without corrupting it for testing. Using it in any other fashion limits its true strength.
The problem is that not limiting Crafty's strength prevents correct testing of what many programmers would like to get from testers.

The problem is that opponents that are lucky enough to play Crafty first will get an advantage, because Crafty had not yet learned when it played against them.
The statistical error we already have in estimating the ratings of programs is enough, and adding more sources of error is not productive.
Suppose that there is an engine that has no learning but is stronger than Crafty when Crafty's learning is disabled.
Is it stronger or weaker than Crafty?
If we let engines learn, the result may depend on the number of games in matches between engines: in short matches it may be stronger, while in a long match it may be weaker.
We want conditions in which the only error in what we measure is the statistical error from not playing enough games, with no further problems.
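To put a number on the purely statistical part of that error, here is a minimal sketch. It assumes independent games and the standard logistic Elo formula; the function names are only illustrative, and it does not model learning at all, which is exactly the extra, non-statistical effect described above.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_stats(wins, draws, losses):
    """Return (mean score, standard error of the mean) for one match.

    Each game counts 1, 0.5 or 0 points and is treated as independent.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    mean = (wins + 0.5 * draws) / games
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / games
    return mean, math.sqrt(variance / games)


if __name__ == "__main__":
    # Same score fraction, different match lengths.
    for wins, draws, losses in ((6, 0, 4), (60, 0, 40)):
        mean, err = score_stats(wins, draws, losses)
        print(f"{wins + draws + losses} games: score {mean:.2f} +/- {err:.3f}, "
              f"Elo estimate {elo_diff(mean):+.0f}")
```

With the same score fraction, the standard error shrinks roughly with the square root of the number of games. That shrinking only helps if the engines themselves do not change between games.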
Uri
Uri
Uri Blass
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 16 Jul 2004, 20:09

Posted by: Bryan Hofmann at 16 July 2004 21:09:06
In reply to: Re: More testers welcome and some thoughts, posted by Uri Blass at 16 July 2004 20:55:27
I disagree with turning learning off. One example: if you are going to use the default Crafty book, it is full of bad lines because it is not hand-tuned. The method Hyatt chose was to have the program tune the book. With learning off, Crafty will repeat the same bad lines. This places Crafty, and any other program that uses this method, at a disadvantage.
Hello Bryan,
okay, those who use own books will have to take a decision about that. I do not know how often lines are repeated when most are using a big own book. Probably not very much. For my Nunn positions learning=off and deleting learning files is a must of course. It will also be interesting if we will get different results for Nunn positions and own books.
Book learning is but one issue. If learning is off, Crafty cannot use what I think is a great feature: position learning. Now, I have heard all of the excuses stating that learning is turned off to make things more fair. Well, let's look at it another way. Some engines have a separate pawn hash; if done correctly, it makes for a more efficient engine. So do you turn it off, or not give memory to those engines that have a pawn hash, simply because other engines do not have it? The same goes for EGTBs: not all engines use them. The list is endless, and removing something that is part of the engine is not a true representation of the engine's capabilities, which is the purpose of engine vs. engine matches.
No
The purpose of an engine-engine match is not always to represent the engines' capabilities.
There is only one reason that Hyatt added the ability to turn off learning, and that was so that in a Crafty vs. Crafty match the same book could be used without corrupting it for testing. Using it in any other fashion limits its true strength.
The problem is that not limiting Crafty's strength prevents correct testing of what many programmers would like to get from testers.

The problem is that opponents that are lucky enough to play Crafty first will get an advantage, because Crafty had not yet learned when it played against them.
The statistical error we already have in estimating the ratings of programs is enough, and adding more sources of error is not productive.
Suppose that there is an engine that has no learning but is stronger than Crafty when Crafty's learning is disabled.
Is it stronger or weaker than Crafty?
If we let engines learn, the result may depend on the number of games in matches between engines: in short matches it may be stronger, while in a long match it may be weaker.
We want conditions in which the only error in what we measure is the statistical error from not playing enough games, with no further problems.
Uri
Uri

For the sake of this tourney, we are not talking about testing the engines.

Again, this is not for testing; it is a tourney to represent the strongest engine.


This is why you need a great number of games; that is the whole purpose of this. It simply does not make any sense to restrict an engine just because other engines do not have that feature, as I have already pointed out. The program was designed to use learning in order to play its best. If you take that away, you are not using the program as it was intended, and the results mean nothing for that program.
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 16 Jul 2004, 20:17

Posted by: Bryan Hofmann at 16 July 2004 21:17:20
In reply to: Re: More testers welcome and some thoughts, posted by Olivier Deville at 16 July 2004 20:19:20
I disagree with turning learning off. One example: if you are going to use the default Crafty book, it is full of bad lines because it is not hand-tuned. The method Hyatt chose was to have the program tune the book. With learning off, Crafty will repeat the same bad lines. This places Crafty, and any other program that uses this method, at a disadvantage.
Hello Bryan,
okay, those who use own books will have to take a decision about that. I do not know how often lines are repeated when most are using a big own book. Probably not very much. For my Nunn positions learning=off and deleting learning files is a must of course. It will also be interesting if we will get different results for Nunn positions and own books.
By the way: thanks again for your Crafty compile. It is the best and fastest I ever tested. Do you think that Crafty 19.15 is again improved? I already downloaded your 19.15 compile, but did not have time to test because of tournaments I want to finish.
Best Regards
Heinz
Book learning is but one issue. If learning is off, Crafty cannot use what I think is a great feature: position learning. Now, I have heard all of the excuses stating that learning is turned off to make things more fair. Well, let's look at it another way. Some engines have a separate pawn hash; if done correctly, it makes for a more efficient engine. So do you turn it off, or not give memory to those engines that have a pawn hash, simply because other engines do not have it? The same goes for EGTBs: not all engines use them. The list is endless, and removing something that is part of the engine is not a true representation of the engine's capabilities, which is the purpose of engine vs. engine matches. There is only one reason that Hyatt added the ability to turn off learning, and that was so that in a Crafty vs. Crafty match the same book could be used without corrupting it for testing. Using it in any other fashion limits its true strength.

Yes, 19.15 is faster, and several bugs were fixed, which improved its game.
Very interesting, this may change my opinion about learning... Let's wait and see !
Olivier
Another way to look at it: in the WCCC and CCT6, do you think that Hyatt turned off learning so that all of the opponents played the exact same Crafty?
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Heinz van Kempen » 16 Jul 2004, 20:48

Posted by: Heinz van Kempen at 16 July 2004 21:48:30
In reply to: Re: More testers welcome and some thoughts, posted by Slobodan R. Stojanovic at 16 July 2004 19:15:56
One pattern could be:

2'+12" time control on an AMD 2200, which corresponds (let's imagine) to 20+3 on an AMD 750, and so on.
SL.
Hi Heinz,
I think that the solution should be found by reconciling all individual efforts with the collective effort. It is better when each tester has his own experiment, as a whole. The common parameters should be found after the tournament patterns are standardized.
How to do that exactly, I don't know.
Second: my hardware is probably the worst, and I don't want to run games of 4 or 6 hours to be within a pattern of 30 min/game on an AMD 2200.
But I could be within some other pattern. So we need, in the first place, to define many different tournament patterns: 3, 5, 7 or 9.
SL.
Hi Slobodan,
maybe I do not understand all correctly concerning those patterns.
Can you give 3, 5, 7 or 9 examples, please :-).
Best Regards
Heinz
Hello Slobodan,
I do not know what the others think, but 30+3, for example, for each engine on a 2 GHz PC should be the absolute minimum for games with more time, as we already have a lot of testers, at least for group A.
Best Regards
Heinz
Heinz van Kempen
 

Re: More testers welcome and some thoughts

Postby Heinz van Kempen » 16 Jul 2004, 21:01

Posted by: Heinz van Kempen at 16 July 2004 22:01:24
In reply to: Re: More testers welcome and some thoughts, posted by Thomas Mayer at 16 July 2004 19:41:24
Hi Heinz,
if you need some more helpers for Group B you might add me -> The question is only how many games you need... For business reasons I have only very irregular time for such things right now... (e.g. Quark has been asleep since the IPCCC 2004)
Anyway, usually I have one machine ready to do whatever I want; the question is how fast it should be. Usually I need my Dual for work, so I have two Athlon XP1800+ and from time to time an XP2700+ free at my disposal, so why not let them run some chess games...
By the way, I would prefer 40/40 for such a list. Or even 120/40 (but I know that a list with such a time control exists only in dreams; not enough people would support it)
Also, do you have some information on what conditions you plan to use for it...
Greets, Thomas
Hello Thomas,
it is great that we also get offers of help from the authors, and it is appreciated a lot. Thanks a lot. We can use all computers, and we plan to adapt faster ones like your machines in such a way that, for example, an XP 2000+ computer is simulated by giving less time to an XP 2700+ and more time to an XP 1800+.
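As a rough sketch of how such an adaptation could be computed: the helper below assumes that playing speed scales linearly with the processor's rating number, which is only a crude proxy (a short benchmark on each machine would be more reliable). The function name and the 30+3 base control are illustrative assumptions, not settings that have been agreed on.

```python
def scale_time_control(base_minutes, increment_seconds,
                       reference_rating, machine_rating):
    """Scale a base+increment time control from a reference machine to another one.

    Assumption: speed is proportional to the rating number (e.g. 2000 for an
    "XP 2000+"), so a faster machine gets proportionally less time.
    """
    if reference_rating <= 0 or machine_rating <= 0:
        raise ValueError("ratings must be positive")
    factor = reference_rating / machine_rating
    return base_minutes * factor, increment_seconds * factor


if __name__ == "__main__":
    # Simulate an XP 2000+ playing 30 minutes + 3 seconds on three machines.
    for rating in (1800, 2000, 2700):
        minutes, increment = scale_time_control(30, 3, 2000, rating)
        print(f"XP {rating}+: {minutes:.1f} min + {increment:.1f} s per move")
```

Under this assumption the XP 2700+ gets about three quarters of the reference time and the XP 1800+ about ten percent more.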
There are no conditions like delivering a certain number of games every week or anything that might put pressure on someone. You give as much as you can and want, depending on your resources and the times you want to do that. There only has to be some coordination until a step is completed, to have an identical number of games for all, for example before adding more engines via gauntlets.
There will initially be a gap between the participants of groups A and B; in this gap there are still engines that can hopefully be added to group A later, but were not among the strongest ones opted for at first.
I would play games at 120/40 if we could get at least 200 testers :-). You can of course play this without problems in a tournament with at most eight participants or in a Swiss tournament, but this is not what we want; such a time control has to stay in dreams when wanting to test many engines.
I will send you more information on Sunday in a mail to all, where the voting will be done in an Excel sheet.
Best Regards
Heinz
Heinz van Kempen
 

Re: More testers welcome and some thoughts

Postby Uri Blass » 16 Jul 2004, 21:05

Posted by: Uri Blass at 16 July 2004 22:05:22
In reply to: Re: More testers welcome and some thoughts, posted by Bryan Hofmann at 16 July 2004 21:09:06

The problem is that not limiting Crafty's strength prevents correct testing of what many programmers would like to get from testers.
Again, this is not for testing; it is a tourney to represent the strongest engine.
If you do not restrict the engines, you cannot find the strongest engine, because it is possible that A will be significantly stronger than B if every pair of engines plays a match of 10 games, while B will be significantly stronger than A if every pair plays a match of 100 games.
The whole reason for disabling learning is to prevent this situation.
Uri
Uri Blass
 

Re: More testers welcome and some thoughts

Postby Graham Banks » 16 Jul 2004, 21:35

Posted by: Graham Banks at 16 July 2004 22:35:20
In reply to: More testers welcome and some thoughts, posted by Heinz van Kempen at 16 July 2004 14:52:19

Sorry that I'm unable to assist due to running my own tournaments, but your whole concept is a great idea and I'll be following it with interest.
Graham.
Graham Banks
 

Re: More testers welcome and some thoughts

Postby Heinz van Kempen » 16 Jul 2004, 21:48

Posted by: Heinz van Kempen at 16 July 2004 22:48:08
In reply to: Re: More testers welcome and some thoughts, posted by Graham Banks at 16 July 2004 22:35:20
Sorry that I'm unable to assist due to running my own tournaments, but your whole concept is a great idea and I'll be following it with interest.
Graham.
Hello Graham,
I am following your tournament with a lot of interest, and you can be sure that a lot of others read everything. Do not think that it is ignored; getting little feedback on posted results is something all testers get accustomed to after a while. Regrettably, there is a bit of passivity in this respect in all fora.
Maybe when you have finished your super long tournament we will still add games for newer versions and then you can help at any time.
Thank you for your tournament and kind words.
Best Regards
Heinz
Heinz van Kempen
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 16 Jul 2004, 22:55

Posted by: Bryan Hofmann at 16 July 2004 23:55:20
In reply to: Re: More testers welcome and some thoughts, posted by Uri Blass at 16 July 2004 22:05:22
The problem is that not limiting Crafty's strength prevents correct testing of what many programmers would like to get from testers.
Again, this is not for testing; it is a tourney to represent the strongest engine.
If you do not restrict the engines, you cannot find the strongest engine, because it is possible that A will be significantly stronger than B if every pair of engines plays a match of 10 games, while B will be significantly stronger than A if every pair plays a match of 100 games.
The whole reason for disabling learning is to prevent this situation.
This makes no sense whatsoever. You are saying that if you do not remove a key element from an engine, you cannot find the strongest engine. Code and papers are out there on how to add learning to an engine. If the author does not do it, then that is their choice. Let me ask you this: what if Hyatt removes the ability to turn off learning (which I hope he does)? Will it be banned from playing in these silly basement tourneys? Also, in the WCCC and CCT tourneys, do you think learning was off? Do you think that Hyatt even thought that he must turn off learning in order to ensure that every opponent played the same Crafty?
If your answer is no to the last two questions, then why would it be done here? It is a tourney, and it is about time that people learn that when they disable anything, it is not the way the author created the program to run, and any results for the program are completely invalid!
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Uri Blass » 16 Jul 2004, 23:54

Posted by: Uri Blass at 17 July 2004 00:54:00
In reply to: Re: More testers welcome and some thoughts, posted by Bryan Hofmann at 16 July 2004 23:55:20
The problem is that not limiting Crafty's strength prevents correct testing of what many programmers would like to get from testers.
Again, this is not for testing; it is a tourney to represent the strongest engine.
If you do not restrict the engines, you cannot find the strongest engine, because it is possible that A will be significantly stronger than B if every pair of engines plays a match of 10 games, while B will be significantly stronger than A if every pair plays a match of 100 games.
The whole reason for disabling learning is to prevent this situation.
This makes no sense whatsoever. You are saying that if you do not remove a key element from an engine, you cannot find the strongest engine. Code and papers are out there on how to add learning to an engine. If the author does not do it, then that is their choice. Let me ask you this: what if Hyatt removes the ability to turn off learning (which I hope he does)? Will it be banned from playing in these silly basement tourneys?
If your answer is no to the last two questions, then why would it be done here? It is a tourney, and it is about time that people learn that when they disable anything, it is not the way the author created the program to run, and any results for the program are completely invalid!
It is impossible to do it.
You can save a copy of the engine before the game so after the game you use another copy of the engine and delete everything.

Also, in the WCCC and CCT tourneys, do you think learning was off?
No, but these tournaments are not about getting ratings for engines.
They are not meant to help authors see whether they made improvements in their non-learning stuff.
Do you think that Hyatt even thought that he must turn off learning in order to ensure that every opponent played the same Crafty?

No, but in that environment the opponents also do not turn off learning, and we are talking about a tournament in which all turn off learning.
I have already implemented some book learning in Movei (no positional learning).
Suppose that MoveiA learns, and later I provide MoveiB, which also learns.
MoveiB may be a better engine than MoveiA, but the first results may give the wrong impression, not because of statistical error but because MoveiA has already learned which lines are good for it while MoveiB has not, so the lines that are going to be played will favour MoveiA.
The only solution to this problem is to play something like the Nunn match with no learning.
I want to eliminate the learning factor in the tests that I do unless I change the learning algorithm, and I believe that most of programmers' work is not about changing the learning algorithm.
The programs already run in enough tournaments with learning on, and I believe that most programmers will agree that a tournament without learning, from fixed positions like the Nunn match, may be productive for them.
I am not against additional tournaments with learning on (not to be combined for rating), but I prefer Heinz's data with learning off in the Nunn match, and if unifying efforts into one tournament means that programs will start to use their own books in Heinz's tests, then I prefer not to see unified efforts.
We have enough tournaments with original books and learning on, like Leo's tournament or the infinite loop, and I think that it is better to continue to have tournaments with learning off. If I need to decide about ratings, I will trust a tournament with learning off more, because a tournament with learning on has more noise that is not statistical error.
Uri
Uri Blass
 

Re: More testers welcome and some thoughts

Postby Heinz van Kempen » 17 Jul 2004, 02:06

Geschrieben von:/Posted by: Heinz van Kempen at 17 July 2004 03:06:09:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Uri Blass at 17 July 2004 00:54:00:
Hello Bryan and Uri,
Both your opinions are convincing and make sense. It proves contradictory to test Crafty properly with all its inherent features, like position learning and ponder=on, and at the same time not to give a horrific disadvantage to those who will have to play a "wise" Crafty later on when those gauntlets are added. So if we opt for ponder=off and learning disabled, it would make sense to give a 20-point bonus to Crafty's rating for not being able to use features Robert has worked a lot on. I know this is somewhat artificial. After voting and before starting, we will of course post all our common decisions here for another discussion.
As soon as it is decided who will be in the first bunch of top engines and who in group B, authors will have the opportunity to give us hints on how to optimize their respective engines, and of course anyone can still release a version. We will only use versions that are publicly available.
I still do not know what to call this tournament. For the two groups maybe "Master Class" (because those engines are already as strong as a human GM or at least IM at this time control) and "Higher Class" (because there will remain a lot of engines that are new and much weaker than Hermann, for example).
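For scale, the standard Elo expected-score formula shows what a 20-point difference amounts to per game. This is only a sketch: the function name is made up for illustration, and only the formula itself is standard.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score per game for the side that is rating_diff Elo points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


if __name__ == "__main__":
    # Compare an even match with the proposed 20-point bonus and a larger gap.
    for diff in (0, 20, 100):
        print(f"{diff:+4d} Elo -> expected score {expected_score(diff):.3f}")
```

Under that formula a 20-point edge corresponds to an expected score of roughly 53%, so such a bonus is a small correction compared with the gaps between the engine groups being discussed.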

Best Regards
Heinz
Heinz van Kempen
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 17 Jul 2004, 12:45

Geschrieben von:/Posted by: Bryan Hofmann at 17 July 2004 13:45:08:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Uri Blass at 17 July 2004 00:54:00:
Saving a copy of the engine does nothing, as book.bin and position.bin are the files that contain the learning information. And yes, deleting these files between matches will remove the information. The point is, it makes it damn difficult.
Your whole rebuttal is nothing more than turning this into a test, not a tournament, for all engines except those that have the ability to learn. As Hyatt has said before, there are 100 more ways to make Crafty weaker; why not use them as well?
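As a rough illustration of the reset being argued about here, a tester could snapshot the learning files once and restore them before every match. This is only a sketch: the file names book.bin and position.bin come from the post above, while the directory layout and both function names are assumptions made for the example and are not part of Crafty or any GUI.

```python
import shutil
from pathlib import Path

# File names as mentioned in the post; adjust for the engine actually used.
LEARNING_FILES = ("book.bin", "position.bin")


def snapshot_learning_files(engine_dir: Path, pristine_dir: Path) -> None:
    """Copy the current learning files into pristine_dir (run once, before any game)."""
    pristine_dir.mkdir(parents=True, exist_ok=True)
    for name in LEARNING_FILES:
        src = engine_dir / name
        if src.exists():
            shutil.copy2(src, pristine_dir / name)


def reset_learning_files(engine_dir: Path, pristine_dir: Path) -> None:
    """Restore the snapshot so the next match starts without anything learned since."""
    for name in LEARNING_FILES:
        target = engine_dir / name
        saved = pristine_dir / name
        if saved.exists():
            shutil.copy2(saved, target)
        elif target.exists():
            # The file did not exist at snapshot time, so learning created it.
            target.unlink()
```

Calling `reset_learning_files` between matches automates the step, though it does not settle whether an engine reset this way is still being tested as its author intended.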
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 17 Jul 2004, 12:58

Geschrieben von:/Posted by: Bryan Hofmann at 17 July 2004 13:58:36:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Heinz van Kempen at 17 July 2004 03:06:09:
I was thinking about offering up my system for this tournament until I saw the disabling of engine features. As you have seen, I am strongly opposed to disabling any default settings of engines. One must decide what one is measuring in these tourneys; if anything is disabled it may give a better picture of other engines, but the engines that have their features disabled are nothing but whipping boys for the others.
I wish you luck with this as I'm placing myself on the sidelines.
Bryan Hofmann
 
