Posted by: Robert Allgeuer on 05 August 2004 at 20:46:03:
I have run a test with Fruit 1.5 to determine which of its parameters have a positive impact on Fruit's playing strength and which have a negative one.
Method:
=======
The test consisted of a round robin tournament between several configurations of Fruit 1.5 and a set of reference engines. I chose this approach rather than a pure self-play test of the different Fruit configurations, because self-play results may not be representative of playing strength against other opponents.
The Nunn 1 starting positions were used; for each pairing each engine had to play both sides, resulting in 20 games for each pairing and 3800 games overall.
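The game counts above can be double-checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes 20 participants (the 7 Fruit configurations plus 13 other engines) and 10 Nunn starting positions, each played with both colours.

```python
import math

def round_robin_games(num_players, num_positions):
    """Return (games per pairing, total games, games per player)."""
    # Each starting position is played twice, once with each colour.
    games_per_pairing = 2 * num_positions
    # Number of distinct pairings in a round robin: n choose 2.
    pairings = math.comb(num_players, 2)
    total_games = pairings * games_per_pairing
    # Every player meets each of the other players once as a pairing.
    games_per_player = (num_players - 1) * games_per_pairing
    return games_per_pairing, total_games, games_per_player

per_pairing, total, per_player = round_robin_games(20, 10)
print(per_pairing, total, per_player)  # 20 3800 380
```

The 380 games per engine agree with the Games column of the rating table.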
The tournament results have been analysed with Elostat and a corresponding rating table has been calculated.
Platform, Tools and Settings:
=============================
Athlon XP 2400+
1.1 GB RAM
Windows XP
Elostat 1.1b
Arena 1.08
Time Control: 5min + 2sec
Ponder off
EGTBs enabled when supported
64MB Hash
Participants:
=============
Seven configurations of Fruit 1.5: the default settings and six variants, each with exactly one UCI parameter changed from its default:
Fruit v1.5def: Fruit 1.5 with the default parameter setting
Fruit v1.5nmalways: nullmove search is always tried (instead of only in the fail-high case)
Fruit v1.5noetc: ETC disabled
Fruit v1.5ppushext: pawn push extension (7th rank) enabled
Fruit v1.5nosinglerep: single reply extension disabled
Fruit v1.5noqchecks: quiescence search does not include checking moves
Fruit v1.5nmR2: nullmove reduction set to 2 instead of the default 3
plus 13 other engines.
Results:
========
    Program                  Elo    +   -  Games   Score  Av.Op.   Draws
 01 Ruffian v1.01         : 2695   26  42    380  70.7 %    2543  21.3 %
 02 List v5.12            : 2664   28  37    380  66.6 %    2544  23.7 %
 03 El Chinito v3.25      : 2643   29  35    380  63.7 %    2545  23.7 %
 04 Gothmog v0.4.8        : 2604   31  33    380  58.2 %    2547  19.5 %
 05 Fruit v1.5nmalways    : 2596   32  31    380  57.0 %    2548  23.9 %
 06 Fruit v1.5noetc       : 2572   34  30    380  53.3 %    2549  20.8 %
 07 Fruit v1.5ppushext    : 2571   34  29    380  53.2 %    2549  24.2 %
 08 Fruit v1.5def         : 2568   34  29    380  52.8 %    2549  22.9 %
 09 Fruit v1.5nosinglerep : 2560   35  27    380  51.4 %    2550  28.2 %
 10 Fruit v1.5noqchecks   : 2554   35  30    380  50.5 %    2550  19.5 %
 11 Ktulu v5.0            : 2554   35  27    380  50.5 %    2550  29.5 %
 12 AnMon v5.21           : 2552   36  28    380  50.3 %    2550  24.7 %
 13 SoS4                  : 2547   28  35    380  49.5 %    2550  24.2 %
 14 Amyan v1.592          : 2537   29  35    380  48.0 %    2551  22.4 %
 15 Fruit v1.5nmR2        : 2534   29  34    380  47.5 %    2551  24.5 %
 16 Yace Paderborn        : 2509   33  32    380  43.8 %    2552  18.2 %
 17 Ufim v5.00            : 2460   35  29    380  36.7 %    2555  22.4 %
 18 Frenzee v1.59         : 2439   39  28    380  33.8 %    2556  18.7 %
 19 Patzer v3.61          : 2424   40  27    380  31.7 %    2557  19.7 %
 20 Sjeng v12.13          : 2417   42  27    380  30.9 %    2557  19.2 %
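As a rough plausibility check of the table, the Elo column can be approximated from the Score and Av.Op. columns with the standard logistic Elo formula. This is a sketch only: the function name is hypothetical, and Elostat's own calculation may differ in detail from this simple approximation.

```python
import math

def performance_rating(score_fraction, avg_opponent_elo):
    """Approximate rating from a score fraction (0 < s < 1) and average opposition."""
    # Logistic Elo model: s = 1 / (1 + 10 ** (-d / 400)), solved for d.
    diff = 400.0 * math.log10(score_fraction / (1.0 - score_fraction))
    return avg_opponent_elo + diff

# (name, score fraction, Av.Op., Elo as listed in the table)
rows = [
    ("Ruffian v1.01", 0.707, 2543, 2695),
    ("Fruit v1.5def", 0.528, 2549, 2568),
    ("Sjeng v12.13", 0.309, 2557, 2417),
]
for name, score, av_op, listed in rows:
    estimate = performance_rating(score, av_op)
    print(f"{name}: estimated {estimate:.0f}, listed {listed}")
```

For these three rows the estimates land within a couple of points of the listed ratings, which suggests the table is internally consistent.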
Not surprisingly, the differences in playing strength between the parameter settings are not statistically significant, even after 3800 games. Nevertheless, I would venture the following interpretation:
Parameter settings that probably increase Fruit's playing strength:
- Always trying nullmoves; it seems that the fail-high condition is a bit too aggressive and skips nullmove searches that in fact would have failed high
Parameter settings that probably are performance neutral:
- Disabling ETC (although I reckon that at longer time controls and deeper search depths ETC should give a better return and could yield an increase in playing strength)
- Enabling pawn push extensions
- Disabling single reply extensions
Parameter settings that probably decrease playing strength slightly:
- Disabling checks in quiescence search
Parameter settings that probably decrease playing strength:
- Reducing the nullmove reduction to 2
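The remark that the differences are not statistically significant can be illustrated with a simple interval comparison. The sketch below assumes that the + and - columns of the rating table are the upper and lower error margins around each Elo value; the helper names are made up for illustration. Overlapping intervals are only a rough indication, not a formal significance test.

```python
def interval(elo, plus, minus):
    """Return the (low, high) rating interval implied by the error margins."""
    return (elo - minus, elo + plus)

def overlaps(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Values copied from the rating table: (Elo, +, -)
default = interval(2568, 34, 29)
variants = {
    "nmalways": interval(2596, 32, 31),
    "noetc": interval(2572, 34, 30),
    "ppushext": interval(2571, 34, 29),
    "nosinglerep": interval(2560, 35, 27),
    "noqchecks": interval(2554, 35, 30),
    "nmR2": interval(2534, 29, 34),
}
for name, iv in variants.items():
    print(f"{name}: {iv} overlaps default {default}: {overlaps(iv, default)}")
```

Under this assumption every variant's interval overlaps that of the default configuration, including the best (nmalways) and the worst (nmR2).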
Conclusion:
===========
Generally the impact of the different parameter settings on Fruit's playing strength is comparatively small.
Personally, I am a bit surprised that enabling or disabling the extensions makes pretty much no difference, and I would be interested in views as to why this is the case.
I would also have expected that not searching checking moves in the quiescence search would have a bigger (negative) impact than measured here.
I am currently extending this test by testing two further parameter settings:
- checks in quiescence search only after a nullmove
- the alternative material piece values as proposed by J. Rang
Eventually I plan to also test the combination of the best parameters in order to see whether improvements add up or not.
Robert