More testers welcome and some thoughts

Archive of the old Parsimony forum. Some messages couldn't be restored. Limitations: Search for authors does not work, Parsimony specific formats do not work, threaded view does not work properly. Posting is disabled.

Re: More testers welcome and some thoughts

Postby Uri Blass » 17 Jul 2004, 13:17

Geschrieben von:/Posted by: Uri Blass at 17 July 2004 14:17:29:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Bryan Hofmann at 17 July 2004 13:45:08:
The problem is that not limiting Crafty's strength prevents correct testing
of what many programmers like to get from testers.
Again, this is not for testing; it is a tourney to represent the strongest engine.
If you do not restrict the engines, you cannot find the strongest engine, because A may be significantly stronger than B when every pair of engines plays a 10-game match, while B is significantly stronger than A when every pair plays a 100-game match.
The whole reason for disabling learning is to prevent this situation.
This makes no sense whatsoever. You are saying that if you do not remove a key element from an engine, you cannot find the strongest engine. Code and papers are out there on how to add learning to an engine. If the author does not do it, then that is their choice. Let me ask you this: what if Hyatt removes the ability to turn off learning (which I hope he does)? Will it be banned from playing in these silly basement tourneys?
It is impossible to do it.
You can save a copy of the engine before the game, then after the game delete everything and continue with a fresh copy of the engine.
Saving a copy of the engine does nothing as the book.bin and position.bin are the files that contain the learning information.
Your whole rebuttal below does nothing more than turn this into a test, not a tournament, for all engines except those that have the ability to learn. As Hyatt has said before, there are 100 more ways to make Crafty weaker; why not use them as well?
I consider them as part of the engine for this discussion.
You are right that you do not need to delete the .exe file, but when I said "delete everything" I was allowing for the possibility that the program changes the .exe file in some way, even though I do not know of any program that does this; the latest Movei changes only other files for learning.
I do not know whether it is possible to learn by changing the .exe file, but in any case it is enough to delete every file that the engine changed or generated after it was downloaded.

And yes, deleting these files between matches will remove the information. The point is, it makes it damn difficult.
I think that it is possible to write a program that deletes them automatically between matches.
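A minimal sketch of such a program, in Python. It follows the idea above of keeping an untouched copy of the engine and starting every match from it, so that any file the engine changed or generated is discarded. The function name and the directory layout are assumptions for illustration, not part of any existing tool.

```python
import shutil
from pathlib import Path


def reset_engine_dir(pristine: Path, working: Path) -> None:
    """Replace the working engine directory with a fresh copy of the pristine one.

    Everything the engine wrote during the previous match (learning files,
    logs, modified books) is thrown away, because the whole directory is
    rebuilt from the untouched snapshot.
    """
    if working.exists():
        shutil.rmtree(working)          # drop all files the engine changed or created
    shutil.copytree(pristine, working)  # start again from the downloaded state
```

A tester would call `reset_engine_dir(...)` once before each match, pointing `pristine` at the directory as it was unpacked after download and `working` at the directory the GUI actually launches the engine from.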

Because the target is not to get a weaker Crafty but to get data that other people can reproduce.
If you use the Nunn match and no learning, then other people can reproduce the same data later by using the same positions. It also makes it easier to test for bugs, because the first step in fixing a bug is reproducing it.
There are enough normal tournaments in which Crafty has all its learning.
I am not against them, but if we want to get ratings for engines then it is better to use something that other people can reproduce, even if it means that the rating does not represent the full ability of the engines.
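To make the reproducibility argument concrete, here is a small Python sketch of a Nunn-style schedule. It assumes the usual arrangement in which every fixed starting position is played twice with colours reversed; the `Game` type and the function name are made up for this illustration, and the positions are passed in as opaque strings.

```python
from typing import List, NamedTuple, Sequence


class Game(NamedTuple):
    position: str  # identifier or FEN of the fixed starting position
    white: str
    black: str


def nunn_style_schedule(engine_a: str, engine_b: str,
                        positions: Sequence[str]) -> List[Game]:
    """Build a schedule in which each position is played once per colour."""
    games: List[Game] = []
    for pos in positions:
        games.append(Game(pos, engine_a, engine_b))  # A has White
        games.append(Game(pos, engine_b, engine_a))  # colours reversed
    return games
```

Because the schedule depends only on the engine names and the position list, two testers who use the same list get the same set of games to play, which is the property being asked for here.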
Uri Blass
 

Re: More testers welcome and some thoughts

Postby Uri Blass » 17 Jul 2004, 13:32

Geschrieben von:/Posted by: Uri Blass at 17 July 2004 14:32:22:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Bryan Hofmann at 17 July 2004 13:58:36:
The problem is that not limiting Crafty's strength prevents correct testing
of what many programmers like to get from testers.
Again, this is not for testing; it is a tourney to represent the strongest engine.
If you do not restrict the engines, you cannot find the strongest engine, because A may be significantly stronger than B when every pair of engines plays a 10-game match, while B is significantly stronger than A when every pair plays a 100-game match.
The whole reason for disabling learning is to prevent this situation.
This makes no sense whatsoever. You are saying that if you do not remove a key element from an engine, you cannot find the strongest engine. Code and papers are out there on how to add learning to an engine. If the author does not do it, then that is their choice. Let me ask you this: what if Hyatt removes the ability to turn off learning (which I hope he does)? Will it be banned from playing in these silly basement tourneys?
If your answer is no to the last two questions, then why would it be done here? It is a tourney, and it is about time that people learn that when they disable anything, it is not the way the author created the program to run, and any results for the program are completely invalid!
It is impossible to do it.
You can save a copy of the engine before the game, then after the game delete everything and continue with a fresh copy of the engine.

Also, in the WCCC and CCT tourneys, do you think learning was off?
No, but these tournaments are not about getting ratings for engines.
They are not meant to help the authors see whether they improved the non-learning parts of their engines.
Do you think that Hyatt even thought that he must turn off learning in order to ensure that every opponent played the same Crafty?

No, but in that environment the opponents do not turn off learning either, and we are talking about a tournament in which everyone turns off learning.
I have already implemented some book learning in Movei (no positional learning).
Suppose that MoveiA learns and later I release MoveiB, which also learns.
MoveiB may be a better engine than MoveiA, but the first results may give the wrong impression, not because of statistical error but because MoveiA has already learned which lines are good for it and MoveiB has not, so the lines that get played will favour MoveiA.
The only solution to this problem is to play something like the Nunn match
with no learning.
I want to disable the learning factor in the tests that I do unless I change the learning algorithm, and I believe that most of the work programmers do is not about changing the learning algorithm.
The programs already run in enough tournaments with learning on, and I believe that most programmers will agree that a tournament without learning, played from fixed positions like the Nunn match, may be productive for them.
I am not against additional tournaments with learning on (not to be combined for rating), but I prefer Heinz's data with learning off in the Nunn match, and if unifying efforts into one tournament means that programs will start to use their own books in Heinz's tests, then I prefer not to see the efforts unified.
We have enough tournaments with original books and learning on, like Leo's tournament or the infinite loop, and I think it is better to continue to have tournaments with learning off. If I need to decide about ratings, I will trust a tournament with learning off more, because a tournament with learning on has more noise that is not statistical error.
Uri
Hello Bryan and Uri,
both your opinions are convincing and make sense. It proves contradictory to test Crafty properly with all its inherent features, like position learning and ponder=on, and at the same time not to give a horrific disadvantage to those who will have to play a "wise" Crafty later on when those gauntlets are added. So if we opt for ponder=off and learning disabled, it would make sense to give a 20 point bonus to Crafty's rating for not being able to use features Robert has worked a lot on. I know this is somewhat artificial. After voting and before starting, we will of course post all our common decisions here for another discussion.
As soon as it is decided who will be in the first bunch of top engines and who in group B, authors will have the opportunity to give us hints on how to optimize their respective engines, and of course anyone can still release a new version. We will only use versions that are publicly available.
I still do not know what to call this tournament. For the two groups maybe "Master Class" (because those engines are already as strong as a human GM, or at least IM, at this time control) and "Higher Class" (because there will remain a lot of engines that are new and much weaker than Hermann, for example).

Best Regards
Heinz
I was thinking about offering up my system in this tournament until I saw the disabling of engines. As you have seen, I am strongly opposed to disabling any default settings of engines. One must decide what one is measuring in these tourneys, and if anything is disabled it may give a better picture of other engines, but those engines that have their features disabled are nothing but whipping boys for the other engines.
I wish you luck with this as I'm placing myself on the sidelines.
More data is always productive, so if you want to test without disabling the default settings, that is better than not testing at all, and if the other testers agree to test only in that way, then you can also have a unified effort.
I think that the best solution is to have two rating lists, one without disabling learning and another with learning disabled, and every tester will test what he wants.
I prefer to see more games with learning and book disabled rather than more games with the default options, but I also prefer more games with the default options if the alternative is no more games.
Uri
Uri Blass
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 17 Jul 2004, 14:46

Geschrieben von:/Posted by: Bryan Hofmann at 17 July 2004 15:46:39:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Uri Blass at 17 July 2004 14:17:29:
You continue to argue over nonsense. When I talked about Hyatt removing the option to turn off learning, my point was that it would prevent the vast majority of people from circumventing it. Yes, there are a hundred ways to get around this, including using a hex editor on the program file.


You just don't get it... You keep saying TEST; it is not a TEST, it is a tournament. And no, you will not always be able to reproduce the same results. Take Ruffian, for example: when it gets into trouble it starts to think longer on moves, and this varies even when playing the same opening. I have seen this time and time again in my beta testing with Tao, and I use the Nunn I & II openings.
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Uri Blass » 17 Jul 2004, 15:39

Geschrieben von:/Posted by: Uri Blass at 17 July 2004 16:39:42:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Bryan Hofmann at 17 July 2004 15:46:39:
You continue to argue over nonsense. When I talked about Hyatt removing the option to turn off learning, my point was that it would prevent the vast majority of people from circumventing it. Yes, there are a hundred ways to get around this, including using a hex editor on the program file.


You just don't get it... You keep saying TEST; it is not a TEST, it is a tournament.
People do it for a reason.
It is a free world and everyone can do what he wants.
You may not be interested in results without learning, but that is no reason to try to keep other people from getting them.


I think that most programmers are more interested in testing.
If you do not want to do testing but want to run a tournament, there is no problem with that, but I prefer to see ratings based on testing and not on tournaments.
It is always possible that you will not be able to get the same results again, because engines do not have to be deterministic either, but with testing from specific positions without learning there is a better chance that people will be able to get the same results again.
Uri
Uri Blass
 

Re: More testers welcome and some thoughts

Postby Heinz van Kempen » 17 Jul 2004, 16:13

Geschrieben von:/Posted by: Heinz van Kempen at 17 July 2004 17:13:13:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Bryan Hofmann at 17 July 2004 13:58:36:
I was thinking about offering up my system in this tournament until I saw the disabling of engines. As you have seen, I am strongly opposed to disabling any default settings of engines. One must decide what one is measuring in these tourneys, and if anything is disabled it may give a better picture of other engines, but those engines that have their features disabled are nothing but whipping boys for the other engines.
I wish you luck with this as I'm placing myself on the sidelines.
Hello Bryan,
justice for all is a complicated thing here, and I think you see that too. A lot of people will of course follow this tournament, and they should know whether a somewhat "castrated" Crafty is participating or not, because Crafty is surely the most popular free engine, I think, and was already at a high level before others had even started.
For me there are more questions. How much will position learning in particular give? No one knows for sure, but it could be tested by running hundreds or thousands of games between one Crafty that has learning on and another where it is disabled. But here things are a bit different. Even with learning on, you do not have one Crafty on one machine and hard disk that has already played more than a hundred games and keeps getting better. We have six, seven or more different Crafties on the hard disks of the respective testers. What if we used version 19.14, which might already have played some hundred games for one tester on a certain machine and "knows" a lot? That one might be even stronger than a "virginal" new Crafty 19.15 with no games behind it.
As I said, here we will probably have six or seven different Crafty 19.15 installations, all of which will have learnt a bit after a while if learning is on.
What if three of us turn learning on and three others turn it off, giving around 100 games from each group? Whether or not the group with learning on turns out stronger, we still do not have statistically valid proof with relatively few games.
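The point about relatively few games can be illustrated with a rough calculation. The Python sketch below converts a match result into an Elo difference with the standard logistic formula and attaches an approximate 95% interval from a normal approximation of the score. The function names are invented for this example, and the normal approximation is an assumption that is only reasonable when the score is not close to 0% or 100%.

```python
import math
from typing import Tuple


def elo_diff(score: float) -> float:
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins: int, draws: int, losses: int,
                 z: float = 1.96) -> Tuple[float, float, float]:
    """Return (low, estimate, high) Elo difference for a match result."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score

    def clamp(s: float) -> float:
        # Keep the score inside (0, 1) so the logarithm stays defined.
        return min(max(s, 1e-6), 1.0 - 1e-6)

    return (elo_diff(clamp(score - z * se)),
            elo_diff(clamp(score)),
            elo_diff(clamp(score + z * se)))
```

For a drawless 50-50 result over 100 games, `elo_interval(50, 0, 50)` gives an interval of roughly 70 Elo on either side of zero, so a learning effect smaller than that would not be distinguishable from noise at this sample size.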
Best Regards
Heinz
Heinz van Kempen
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 17 Jul 2004, 18:23

Geschrieben von:/Posted by: Bryan Hofmann at 17 July 2004 19:23:31:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Uri Blass at 17 July 2004 16:39:42:
The glass is half full.
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 17 Jul 2004, 18:35

Geschrieben von:/Posted by: Bryan Hofmann at 17 July 2004 19:35:20:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Heinz van Kempen at 17 July 2004 17:13:13:
It would appear that the direction is for learning off. What I would recommend at the very least is for someone to download the Crafty books from Hyatt's site and run Crafty vs Crafty, both using the same book, with one Crafty having learning on and the other off. Run, say, 10 min matches for a 24 hour period and use the resulting book in all of the Crafty matches with learning off. This would at least let Crafty have somewhat of a book to use and should hopefully eliminate some of the bad openings. Also, I recommend that COMPUTER be added to the .rc file, since these are engine vs engine matches.
Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Uri Blass » 17 Jul 2004, 18:39

Geschrieben von:/Posted by: Uri Blass at 17 July 2004 19:39:26:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Bryan Hofmann at 17 July 2004 19:35:20:
It would appear that the direction is for learning off. What I would recommend at the very least is for someone to download the Crafty books from Hyatt's site and run Crafty vs Crafty, both using the same book, with one Crafty having learning on and the other off. Run, say, 10 min matches for a 24 hour period and use the resulting book in all of the Crafty matches with learning off. This would at least let Crafty have somewhat of a book to use and should hopefully eliminate some of the bad openings. Also, I recommend that COMPUTER be added to the .rc file, since these are engine vs engine matches.
In the test by Heinz no opening book is used and the engines simply play from the Nunn positions, so Crafty has no problem with getting bad positions out of book.
In some of the other tests the engines use the same external book, which is not optimized for Crafty but is also not optimized for the other engines.
Uri
Uri Blass
 

Re: More testers welcome and some thoughts

Postby Heinz van Kempen » 17 Jul 2004, 18:43

Geschrieben von:/Posted by: Heinz van Kempen at 17 July 2004 19:43:24:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Bryan Hofmann at 17 July 2004 19:35:20:

Hello Bryan,
Heinz van Kempen
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 17 Jul 2004, 19:16

Geschrieben von:/Posted by: Bryan Hofmann at 17 July 2004 20:16:18:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Heinz van Kempen at 17 July 2004 19:43:24:
Hello Bryan,

Bryan Hofmann
 

Re: More testers welcome and some thoughts

Postby Bryan Hofmann » 17 Jul 2004, 19:17

Geschrieben von:/Posted by: Bryan Hofmann at 17 July 2004 20:17:39:
Als Antwort auf:/In reply to: Re: More testers welcome and some thoughts geschrieben von:/posted by: Uri Blass at 17 July 2004 19:39:26:
The glass is half empty.
Bryan Hofmann
 
