Kirill Kryukov wrote:
I got curious about one thing. As I understand it, BayesELO is currently tuned for WBEC. As far as I know, WBEC is a very sparse tournament: it has big round-robins for the leagues, and the leagues are connected by smaller promotion round-robins. Do you think that tuning BayesELO to such sparse data will also give good ratings for a more concentrated tournament, like a single large round-robin?
The sparsity of the tournament is not really what determines the best value of the prior. The prior indicates how close in strength we expect players to be. In a tournament where very weak players may play against very strong players, it might be better to use a smaller prior. In tournaments where most of the games are between players that are close in strength, a larger prior might be better.
Also, changing the prior should not change the order of players much. The effect of increasing the prior is mainly to reduce the scale of rating differences, as you have already noticed.
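This shrinkage effect is easy to see in a two-player case. As I understand it, bayeselo expresses the prior as a number of virtual draws added between players; the sketch below (function names are mine, and it ignores real draws for simplicity) inverts the standard logistic Elo formula to show how adding virtual draws pulls the rating difference toward zero:

```python
import math

def elo_diff(score):
    """Invert the logistic Elo model: expected score -> rating difference."""
    return 400.0 * math.log10(score / (1.0 - score))

def diff_with_prior(wins, losses, prior_draws):
    """Rating difference between two players after adding 'prior_draws'
    virtual draws to their head-to-head record (a draw = half a point)."""
    games = wins + losses + prior_draws
    points = wins + 0.5 * prior_draws
    return elo_diff(points / games)

# A beats B 8-2 in real games; watch the difference shrink as the prior grows.
for prior in (0, 2, 10):
    print(prior, round(diff_with_prior(8, 2, prior), 1))
# 0  -> 240.8
# 2  -> 190.8
# 10 -> 107.5
```

Note that the ordering of A and B never changes; only the scale of the difference does, which matches what you observed.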
Kirill Kryukov wrote:
Another thing: for a single round-robin, BayesELO and ELOstat are not very different. But for sparse data like WBEC's, they give very different ratings. I tried it for WBEC games, and the ratings are sometimes very different (one table is here
, just hit Esc after the rating table has opened, so that the huge pairwise tables below it do not load). So I wonder, how can they be so different? I could understand a few percent difference, but 158 vs -342 (Kiwi 0.5a), or 283 vs -132 (Delphil 1.5b), is quite large...
This is a very interesting example. The big rating differences you noticed revolve around "Promo D" of WBEC 10. Let us take the striking example of Natwarlal 0.12 and NullMover 0.25. Natwarlal 0.12 finished at the top of division 4 and won the promotion tournament. NullMover 0.25 finished at the bottom of division 3 and performed poorly in the promotion tournament. Here are the ratings we get:
- Natwarlal 0.12: 210 (bayeselo) and -267 (elostat)
- NullMover 0.25: 74 (bayeselo) and -148 (elostat)
I have a strong feeling that the ratings produced by bayeselo are much better than those produced by elostat in this situation. A fundamental problem of elostat is its assumption that a given winning percentage against a variety of opponents is equivalent to the same winning percentage against one single opponent whose rating is the average of the opponents' ratings. This assumption is very wrong, and it fails badly in this particular situation.
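The problem is that the expected score is a nonlinear (logistic) function of the rating difference, so the average score against several opponents is not the score against the average opponent. A minimal numerical sketch with hypothetical ratings (the logistic formula is the standard Elo model; the "shortcut" mimics elostat's averaging):

```python
def expected_score(diff):
    """Logistic Elo model: expected score for a player 'diff' points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# A player rated 0 faces two opponents, rated -800 and 0 (hypothetical values).
opponents = [-800, 0]
true_avg = sum(expected_score(0 - r) for r in opponents) / len(opponents)

# Elostat-style shortcut: pretend all games were against one opponent
# at the average rating, here -400.
avg_opponent = sum(opponents) / len(opponents)
shortcut = expected_score(0 - avg_opponent)

print(round(true_avg, 3))   # about 0.745
print(round(shortcut, 3))   # about 0.909
```

The two numbers disagree badly: a 74.5% score against this mixed field is treated as if it were a 74.5% score against a single -400 opponent, which would imply a far weaker player than the truth. The more spread out the opponents' strengths are, as in WBEC's promotion groups, the worse this approximation gets.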