Jonny 2.62 gauntlet, part 1

Archive of the old Parsimony forum.

Post by Heinz van Kempen » 14 Apr 2004, 12:37

Posted by: Heinz van Kempen at 14 April 2004 13:37:14:

Hi :-),
The first half of the Jonny 2.62 gauntlet, using Nunn2 positions 1 to 10, is finished: 240 games played.
Conditions:
Athlon 2600+
64 MB Hash
ponder=off
5 men EGTB
Nunn2 positions 1 to 10
Time control: 4m + 2s increment


Jonny 2.62   - Fruit 1.0           12.0 - 8.0
Jonny 2.62   - El Chinito 3.25     5.0 - 15.0
Jonny 2.62   - Delfi 4.4           5.0 - 15.0
Jonny 2.62   - Amyan 1.593b        9.5 - 10.5
Jonny 2.62   - Gothmog 0.4.7s.6    9.0 - 11.0
Jonny 2.62   - Movei 00_8_178      12.0 - 8.0
Jonny 2.62   - Abrok 5.0           11.5 - 8.5
Jonny 2.62   - Dragon 4.5 CF       10.5 - 9.5
Jonny 2.62   - Ufim 4.04           14.5 - 5.5
Jonny 2.62   - Naga Skaki 3.01     17.5 - 2.5
Jonny 2.62   - IceSpell 0_022      12.0 - 8.0
Jonny 2.62   - Comet B68           11.5 - 8.5
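
For anyone who wants to boil the table above down to a single number, here is a minimal Python sketch. It assumes the usual logistic Elo formula and treats the twelve opponents as one field with an average rating; both are assumptions for illustration and not necessarily how the rating list itself is calculated. The names `results` and `elo_diff` are made up for this example.

```python
import math

# Jonny's points and the opponent's points per 20-game match, copied from the table above.
results = {
    "Fruit 1.0": (12.0, 8.0),
    "El Chinito 3.25": (5.0, 15.0),
    "Delfi 4.4": (5.0, 15.0),
    "Amyan 1.593b": (9.5, 10.5),
    "Gothmog 0.4.7s.6": (9.0, 11.0),
    "Movei 00_8_178": (12.0, 8.0),
    "Abrok 5.0": (11.5, 8.5),
    "Dragon 4.5 CF": (10.5, 9.5),
    "Ufim 4.04": (14.5, 5.5),
    "Naga Skaki 3.01": (17.5, 2.5),
    "IceSpell 0_022": (12.0, 8.0),
    "Comet B68": (11.5, 8.5),
}

def elo_diff(score_fraction):
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

points = sum(own for own, _ in results.values())
games = sum(own + opp for own, opp in results.values())
fraction = points / games

print(f"{points} / {games:.0f} = {fraction:.1%}")
print(f"about {elo_diff(fraction):+.0f} Elo relative to the average of this field")
```

Summed over the twelve matches this is 130 points from 240 games, a modest plus score against this particular field.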

The first rating is 2568, so we have another top engine :-).
A good result here for IceSpell, which was just promoted to Nunn Active League C and is starting well in Blitz Tournament E. Standings in Blitz Tournament E after three positions:
1 Arasan 7.4               52.5 / 72   
2 IceSpell 0_022           46.5 / 72   
3 Phalanx XXII             41.0 / 72   
4 Queen 2.43               40.5 / 72   
5 Resp 0.19                39.5 / 72  
6 Horizon 4.1 b9           39.5 / 72 
7 Cerebro 1.25b            38.5 / 72   
8 Beowulf 2.3              36.0 / 72   
9 Esc 1.16                 33.0 / 72   
10 Booot 3.2               30.0 / 72   
11 Tytan 3.6               26.5 / 72   
12 CyberPagno 2.0.1        23.5 / 72   
13 The Butcher 1.42c       21.0 / 72   
Nunn Blitz F has also begun, mainly for testing some new and updated engines like Cheetah, Delphil, Bruja, NagaSkaki and Tinker.
Nunn Elite was updated on Monday with position 14. Ktulu won.
Nunn Top will be updated tomorrow evening.
Nunn Active A and B are in progress.
The rating list is updated. Once at least six positions are done in all Nunn Active tournaments, there will be a separate rating list for 20 minutes + 10 seconds for comparison.


http://www.husvankempen.de/nunn/
Best Regards
Heinz
Heinz van Kempen
 

Re: Jonny 2.62 gauntlet, part 1

Post by Tord Romstad » 14 Apr 2004, 13:26

Posted by: Tord Romstad at 14 April 2004 14:26:32:
In reply to: Jonny 2.62 gauntlet, part 1 posted by: Heinz van Kempen at 14 April 2004 13:37:14:
>Hi :-),
>The first half of the Jonny 2.62 gauntlet, using Nunn2 positions 1 to 10, is finished: 240 games played.
>Conditions:
>Athlon 2600+
>64 MB Hash
>ponder=off
>5 men EGTB
>Nunn2 positions 1 to 10
>Time control: 4m + 2s increment

>Jonny 2.62   - Fruit 1.0           12.0 - 8.0
>Jonny 2.62   - El Chinito 3.25     5.0 - 15.0
>Jonny 2.62   - Delfi 4.4           5.0 - 15.0
>Jonny 2.62   - Amyan 1.593b        9.5 - 10.5
>Jonny 2.62   - Gothmog 0.4.7s.6    9.0 - 11.0
>Jonny 2.62   - Movei 00_8_178      12.0 - 8.0
>Jonny 2.62   - Abrok 5.0           11.5 - 8.5
>Jonny 2.62   - Dragon 4.5 CF       10.5 - 9.5
>Jonny 2.62   - Ufim 4.04           14.5 - 5.5
>Jonny 2.62   - Naga Skaki 3.01     17.5 - 2.5
>Jonny 2.62   - IceSpell 0_022      12.0 - 8.0
>Jonny 2.62   - Comet B68           11.5 - 8.5
>First rating is 2568 and so we have another top engine :-).


Is this really correct? Based on what I have seen, I expected Jonny to
be at least on the level of Ruffian and List, probably even stronger. A
Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
77-23 victory for Jonny.
I'm repeating the experiment now in order to make sure I didn't make some
mistake.
Tord
Tord Romstad
 

Re: Jonny 2.62 gauntlet, part 1

Post by Fabien Letouzey » 14 Apr 2004, 13:32

Posted by: Fabien Letouzey at 14 April 2004 14:32:24:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Tord Romstad at 14 April 2004 14:26:32:

>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
Jonny performed like ponder-off Crafty in my experiments.
... Which is already astonishing for an engine started last year!!!
But I don't think it's a league ahead of Crafty/Yace.
Maybe a style "incompatibility".
Fabien.
Fabien Letouzey
 

Re: Jonny 2.62 gauntlet, part 1

Post by Heinz van Kempen » 14 Apr 2004, 13:43

Posted by: Heinz van Kempen at 14 April 2004 14:43:59:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Tord Romstad at 14 April 2004 14:26:32:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
>Tord
Hello Tord,
there are a lot of possible reasons for our different results. Just to mention a few:
-neither your 100 games nor my 240 games are enough so far
-Jonny 2.61 might be stronger, because of a bug that was introduced in the change to 2.62
-different opponents...
I will ask Johannes what results he has got, or maybe he wants to post his opinion here.
I also ask other testers to report their results for Jonny versions here.
Anyway, you can be sure that many more games will follow, with different versions of Jonny and different time controls.
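
To put a rough number on "not enough games", here is a minimal Python sketch. It assumes independent games, a normal approximation for the match score and the usual logistic Elo formula; none of this is taken from the rating list software. The function names are made up for illustration, and the 130 points are simply the sum of the twelve gauntlet match scores.

```python
import math

def elo_diff(p):
    """Logistic Elo difference implied by a score fraction p, with 0 < p < 1."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_interval(points, games, z=1.96):
    """Approximate 95% interval for the Elo difference behind a match score.

    Assumption: games are independent. p * (1 - p) is the variance of a
    win/loss-only game; draws make the real variance smaller, so this
    interval errs on the wide side.
    """
    p = points / games
    margin = z * math.sqrt(p * (1.0 - p) / games)
    # Clamp so the logarithm stays defined for very lopsided scores.
    low = max(p - margin, 1e-6)
    high = min(p + margin, 1.0 - 1e-6)
    return elo_diff(low), elo_diff(high)

matches = [
    ("100-game match, 77-23", 77.0, 100),
    ("240-game gauntlet, 130-110", 130.0, 240),
]
for label, points, games in matches:
    low, high = elo_interval(points, games)
    print(f"{label}: {low:+.0f} to {high:+.0f} Elo")
```

Under these assumptions even 240 games leave an uncertainty of several dozen Elo points in either direction, and a single 100-game match considerably more.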

Best Regards
Heinz
Heinz van Kempen
 

Re: Jonny 2.62 gauntlet, part 1

Post by Gábor Szőts » 14 Apr 2004, 13:49

Posted by: Gábor Szőts at 14 April 2004 14:49:43:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Heinz van Kempen at 14 April 2004 14:43:59:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
>Tord
>Hello Tord,
>there are a lot of possible reasons for our different results. Just to mention a few:
>-neither your 100 games nor my 240 games are enough so far
>-Jonny 2.61 might be stronger, because of a bug that was introduced in the change to 2.62
>-different opponents...
>I will ask Johannes what results he has got, or maybe he wants to post his opinion here.
>I also ask other testers to report their results for Jonny versions here.
>Anyway, you can be sure that many more games will follow, with different versions of Jonny and different time controls.

>Best Regards
>Heinz
Yes, based on LCT2 I also share the view that 2.61 was a bit stronger. Still, I don't think it comes close to the top level yet. I have the feeling that it gets stronger with longer time controls; in other words, it is a weak blitz player.
Gábor
Gábor Szőts
 

Re: Jonny 2.62 gauntlet, part 1

Post by Manfred Meiler » 14 Apr 2004, 14:36

Posted by: Manfred Meiler at 14 April 2004 15:36:42:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Heinz van Kempen at 14 April 2004 14:43:59:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
>Tord
>Hello Tord,
>there are a lot of possible reasons for our different results. Just to mention a few:
>-neither your 100 games nor my 240 games are enough so far
>-Jonny 2.61 might be stronger, because of a bug that was introduced in the change to 2.62
>-different opponents...
>I will ask Johannes what results he has got, or maybe he wants to post his opinion here.
>I also ask other testers to report their results for Jonny versions here.
>Anyway, you can be sure that many more games will follow, with different versions of Jonny and different time controls.

>Best Regards
>Heinz
Hello Heinz,
here are my results for Jonny 2.61 in the test suite "Weltmeister-Test" (WM-Test), 100 test positions (38 in king attack, 36 in positional play, 26 in endgame):
   WM-Test                Jonny 2.61
AMD Athlon 1400         132/60 MB Hash
.
solved K-pos. (38)           21
solve rate                   55%
rating king attack           2654
.
solved P-pos. (36)           20
solve rate                   56%
rating positional play       2652
.
solved E-pos. (26)           9
solve rate                   35%
rating endgame               2592
.
Ø solve time min. (total)    5.14
solved pos. (total)          50
rating WM-Test (total)       2637
To compare these results with the 230 other engines I have tested (same conditions: 20 minutes per test position in analysis mode, hardware AMD Athlon Thunderbird 1400 MHz), please have a look at my detailed results (Excel sheet for download) at http://www.computerschach.de/test/index.htm.
Best regards,
Manfred
Manfred Meiler
 

Re: Jonny 2.62 gauntlet, part 1

Post by Heinz van Kempen » 14 Apr 2004, 14:37

Posted by: Heinz van Kempen at 14 April 2004 15:37:31:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Tord Romstad at 14 April 2004 14:26:32:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
>Tord
Hi Tord,
I only found one other result from a short gauntlet with partly inadequate opponents:
http://www.f22.parsimony.net/forum41668 ... /33299.htm
I can't believe that there are not more results by now. Come on, guys :-).
Best Regards
Heinz
Heinz van Kempen
 

Re: Jonny 2.62 gauntlet, part 1

Post by Heinz van Kempen » 14 Apr 2004, 14:46

Posted by: Heinz van Kempen at 14 April 2004 15:46:35:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Manfred Meiler at 14 April 2004 15:36:42:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
>Tord
>Hello Tord,
>there are a lot of possible reasons for our different results. Just to mention a few:
>-neither your 100 games nor my 240 games are enough so far
>-Jonny 2.61 might be stronger, because of a bug that was introduced in the change to 2.62
>-different opponents...
>I will ask Johannes what results he has got, or maybe he wants to post his opinion here.
>I also ask other testers to report their results for Jonny versions here.
>Hello Heinz,
>here are my results for Jonny 2.61 in the test suite "Weltmeister-Test" (WM-Test), 100 test positions (38 in king attack, 36 in positional play, 26 in endgame):
>   WM-Test                Jonny 2.61
>AMD Athlon 1400         132/60 MB Hash
>.
>solved K-pos. (38)           21
>solve rate                   55%
>rating king attack           2654
>.
>solved P-pos. (36)           20
>solve rate                   56%
>rating positional play       2652
>.
>solved E-pos. (26)           9
>solve rate                   35%
>rating endgame               2592
>.
>Ø solve time min. (total)    5.14
>solved pos. (total)          50
>rating WM-Test (total)       2637
>To compare these results with the 230 other engines I have tested (same conditions: 20 minutes per test position in analysis mode, hardware AMD Athlon Thunderbird 1400 MHz), please have a look at my detailed results (Excel sheet for download) at http://www.computerschach.de/test/index.htm.
>Best regards,
>Manfred
Hello Manfred,
thanks, I like this test a lot. I suppose it will take some weeks until we have a better estimate of how Jonny performs at all levels.
Best Regards
Heinz
Heinz van Kempen
 

Jonny 2.61 and Gothmog 0.4.5 in WM-Test

Post by Manfred Meiler » 14 Apr 2004, 14:47

Posted by: Manfred Meiler at 14 April 2004 15:47:51:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Manfred Meiler at 14 April 2004 15:36:42:

... just to add my test results for Gothmog 0.4.5 in the test suite "WM-Test", compared with Jonny 2.61:
   WM-Test                Jonny 2.61     Gothmog 0.4.5
AMD Athlon 1400         132/60 MB Hash    256 MB Hash
.
solved K-pos. (38)           21               27
solve rate                   55%              71%
rating king attack           2654             2694
.
solved P-pos. (36)           20               17
solve rate                   56%              47%
rating positional play       2652             2632
.
solved E-pos. (26)           9                10
solve rate                   35%              38%
rating endgame               2592             2610
.
Ø solve time min. (total)    5.14             4.58
solved pos. (total)          50               54
rating WM-Test (total)       2637             2650
http://www.computerschach.de/test/index.htm
Best,
Manfred
Manfred Meiler
 

Re: Jonny 2.62 gauntlet, part 1

Post by Tord Romstad » 14 Apr 2004, 15:28

Posted by: Tord Romstad at 14 April 2004 16:28:25:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Tord Romstad at 14 April 2004 14:26:32:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
After 24 games of the new match, I am already beginning to suspect that
something was seriously wrong in the first match. Gothmog is leading by
16-8. Perhaps Heinz' results give a realistic picture after all.
Tord
Tord Romstad
 

Re: Jonny 2.62 gauntlet, part 1

Post by Heinz van Kempen » 14 Apr 2004, 15:45

Posted by: Heinz van Kempen at 14 April 2004 16:45:25:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Tord Romstad at 14 April 2004 16:28:25:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
>After 24 games of the new match, I am already beginning to suspect that
>something was seriously wrong in the first match. Gothmog is leading by
>16-8. Perhaps Heinz' results give a realistic picture after all.
>Tord
Hello Tord,
no, I do not know. It is not that easy. More games are needed, and I will post results after 480 games next week.
Just a story about when I first got confused by statistics. Twenty years ago (you know I am almost a grandfather :-)) I had two dedicated computers. One was the Fidelity Elite A/S with a Spracklen program, the other the Novag Superconstellation. I ran 100 games at tournament time control and the Fidelity won by a huge margin, 75:25 or something like that, and I was convinced that it was much better. I played another match under the same conditions and the result was just the opposite. At that time I thought that the computer that had won the first match had developed a defect or something. The third series then ended with 50 points each. We still have to learn to distrust the results of "shorter" matches.
Best Regards
Heinz
Heinz van Kempen
 

Re: Jonny 2.62 gauntlet, part 1

Post by Tord Romstad » 14 Apr 2004, 16:57

Posted by: Tord Romstad at 14 April 2004 17:57:25:
In reply to: Re: Jonny 2.62 gauntlet, part 1 posted by: Heinz van Kempen at 14 April 2004 16:45:25:
>Is this really correct? Based on what I have seen, I expected Jonny to
>be at least on the level of Ruffian and List, probably even stronger. A
>Noomen match between Gothmog 0.4.7 and Jonny 2.61 ended with a crushing
>77-23 victory for Jonny.
>I'm repeating the experiment now in order to make sure I didn't make some
>mistake.
>After 24 games of the new match, I am already beginning to suspect that
>something was seriously wrong in the first match. Gothmog is leading by
>16-8. Perhaps Heinz' results give a realistic picture after all.
>Tord
>Hello Tord,
>no, I do not know. It is not that easy. More games are needed, and I will post results after 480 games next week.
>Just a story about when I first got confused by statistics. Twenty years ago (you know I am almost a grandfather :-)) I had two dedicated computers. One was the Fidelity Elite A/S with a Spracklen program, the other the Novag Superconstellation. I ran 100 games at tournament time control and the Fidelity won by a huge margin, 75:25 or something like that, and I was convinced that it was much better. I played another match under the same conditions and the result was just the opposite. At that time I thought that the computer that had won the first match had developed a defect or something. The third series then ended with 50 points each. We still have to learn to distrust the results of "shorter" matches.
Yes, but remember that in my case, both matches are Noomen matches, played
without book from the same starting positions. The first match finished 77-23
in favor of Jonny, while Gothmog is leading by 27-15 in the current match. I
am almost sure something must have been very wrong in the first match (but
unfortunately I didn't keep the games).
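
A rough check of how far apart the two results are can be sketched in Python as below. It is a plain two-proportion z-test on Jonny's score (77 of 100, then 15 of 42 so far), assuming both matches used the same engines and conditions and that every game is an independent trial. The function name is made up for illustration.

```python
import math

def compare_matches(points_a, games_a, points_b, games_b):
    """Two-proportion z-test on the score fractions of two matches.

    Assumptions: every game is an independent trial, and the pooled score
    is used for the variance. Draws would shrink the real variance, so z is
    if anything understated. Returns (z, two_sided_p_value).
    """
    p_a = points_a / games_a
    p_b = points_b / games_b
    pooled = (points_a + points_b) / (games_a + games_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / games_a + 1.0 / games_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Jonny's score in each match: 77 of 100, then 15 of 42 so far.
z, p_value = compare_matches(77.0, 100, 15.0, 42)
print(f"z = {z:.2f}, two-sided p = {p_value:.1e}")
```

With these numbers z comes out well above 4, so under the stated assumptions pure chance is a very unlikely explanation for the gap between the two matches.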
Tord
Tord Romstad
 

