CCRL Update 18 August

Post by Ray »

The August 18th update of the CCRL Rating Lists and Statistics is now available for viewing at:
http://www.computerchess.org.uk/ccrl/4040/

The links to the various rating lists can be found just beneath the default Best Versions list.
For example there is a 32-bit Single CPU list.

Our standard testing is at 40 moves in 40 minutes repeating, while our current blitz testing is at both 40 moves in 4 minutes repeating and 40 moves in 12 minutes repeating, all adjusted to the AMD64 X2 4600+ (2.4GHz).

Currently active testers in our team are:
Graham Banks, Ray Banks, Shaun Brewer, Kirill Kryukov, Dom Leste, Tom Logan, Andreas Schwartmann, Charles Smith, George Speight, Chris Taylor, Chuck Wilson, Gabor Szots and Martin Thoresen.

A big thanks to all testers as usual for their efforts this week.


40/40 Notes

There are currently 69,999 games in our 40/40 database.

Many engines on our list have few games and in many cases their ratings are likely to fluctuate (markedly for some) until a lot more games are played. Therefore no conclusions should be drawn about their strength yet.
To illustrate this point, when an engine has 200 games played, the error margin is still approximately +/-40 ELO, after 500 games +/-25 ELO, after 1000 games +/-17 ELO, and even after 2000 games there is a +/-13 ELO error margin!
This of course highlights the importance of looking at other rating lists that are also available in order to draw comparisons and get a more accurate overall picture.
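
As a rough illustration of where margins of this size come from, here is a minimal sketch in Python. It is not how the CCRL lists are actually calculated. It assumes a plain normal approximation, two evenly matched opponents, a 95% interval and a draw rate of 35% (an assumed figure, not taken from our data). The function name and its parameters are made up for this example.

```python
import math

def elo_error_margin(games, draw_rate=0.35, z=1.96):
    """Approximate +/- ELO margin after `games` games.

    Assumptions (illustrative only): evenly matched opponents, independent
    games, normal approximation, draw_rate is a guess rather than real data.
    """
    # Per-game variance of the score (1, 0.5 or 0) around an expected 0.5:
    # wins and losses each contribute 0.25, draws contribute nothing.
    variance = (1.0 - draw_rate) * 0.25
    score_se = math.sqrt(variance / games)
    # Slope of the logistic ELO curve at a 50% score, in ELO per unit of score.
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    return z * score_se * elo_per_score

for n in (200, 500, 1000, 2000):
    print(n, round(elo_error_margin(n)))
```

Under these assumptions the margin shrinks with the square root of the number of games, so four times as many games only halves it, which is broadly in line with the figures quoted above.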


Multi CPU Engines

Rybka 2.3.2a 64-bit 4CPU is a small improvement over Rybka 2.2 64-bit 4CPU.
Interestingly, the improvement is greater on 2CPU.

Zap!Chess Zanzibar 64-bit 4CPU is clearly number 2, ahead of Hiarcs 11.1 4CPU and Naum 2.2 64-bit 4CPU.
Hiarcs 11.2 4CPU is still in the early stages of testing.

The current ratings for Loop M1-T 64-bit 4CPU and 2CPU suggest that there is little gain from the two extra CPUs.

Deep Shredder 10 64-bit 4CPU, Deep Fritz 10 4CPU and Deep Junior 10 4CPU are off the pace.


Single CPU Engines

Rybka 2.3.2a leads the ratings here as well, this time by a slightly larger margin.
It also looks like the 64-bit version could make more difference to strength than with previous versions.

Newly released Toga II 1.3.1 is battling it out for second spot with Zap!Chess Zanzibar!
Loop M1-T could well be a threat to both as it gets more games under its belt.

Hiarcs 11.1, Strelka 1.8 and Fritz 10 are the next three in the ranking order, followed by Fruit 051103, Shredder 10 and Toga II 1.3.4.
We are still in the early stages of testing Hiarcs 11.2, and a lot of the more recently released engines also need more games before their ratings stabilise.

Spike 1.2 Turin, Naum 2.2, Junior 10 and Deep Sjeng 2.5 are the next group of engines and seem to be very even in strength.
Junior 10.1 is weaker than Junior 10 according to our testing.

SmarThink 1.00, Ktulu 8.0, Glaurung 2 epsilon/5 and Chess Tiger 2007.1 are further adrift.


Free Engines

Although Rybka 1.0 remains the top free engine, the gap is slowly closing.
Either of the newly released Toga II 1.3.1 and Toga II 1.3.4 could yet snatch away Rybka's crown!

Strelka 1.8, Fruit 051103, Fruit 2.3.1 and Naum 2.1 all appear to be stronger than Spike 1.2 Turin.

Glaurung 2 epsilon/5 and Alaric 707 are next, ahead of Scorpio 1.91, Delfi 5.1 and SlowChess Blitz WV2.1.

WildCat 7 and Pro Deo 1.2 are further back.

As we make our way down the list, it should be noted that the most recent versions of Booot, DanaSah, Delphil, Hermann, Alfil, Natwarlal and Feuerstein seem to have made good gains over previous versions.
Others to keep an eye on as they get more games are the latest versions of Hamsters, BugChess2, Popochin, NanoSzachy and GreKo.

We test a very extensive range of amateur engines through our Amateur Championship divisions (32-bit 1CPU) plus other tournaments, all of which can be followed in our public forum.

Our aim is of course to ensure that all engines lower on our lists get at least 200 games.


Blitz Notes

There are currently 161,045 games in our 40/4 database.

The 40/4 update is usually done separately to our 40/40 update. The most recent update can always be viewed here:
http://computerchess.org.uk/ccrl/404.live/


FRC Notes

Ray tests only those engines that can play FRC through the Shredder Classic GUI.
If engine authors have a new and stable version of their engine that will run under this GUI, they should contact Ray if they wish to see it tested.

Ray has recently tested Naum 2.2, Fruit 2.3, Fruit 051103, Hamsters 0.4, Hermann 2.0 and Movei 0.08.438. All are now included in the ratings.
He is hoping to test Rybka next, and it looks likely that Hiarcs 11.1 will finally be dethroned (Hiarcs 11.2 has not been tested yet).
The improvement in strength of Hamsters 0.4 and Movei 0.08.438 is particularly noteworthy.

For FRC, Ray recommends the pure list:
http://www.computerchess.org.uk/ccrl/404FRC/


Stats/Presentation Notes

The LOS stats on the right-hand side of each rating list are "likelihood of superiority" stats. They tell you the likelihood, in percentage terms, of each engine being superior to the engine directly below it.
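
To give a feel for what such a figure means, here is a minimal sketch in Python that estimates an LOS value from two ratings and their error margins. It is only an approximation under stated assumptions: the published margins are treated as 95% intervals, and the two rating errors are treated as independent and normally distributed. The LOS column on the lists comes from the rating software itself and will not match this exactly. The function name is hypothetical and the ratings in the example call are invented for illustration.

```python
import math

def likelihood_of_superiority(elo_a, margin_a, elo_b, margin_b, z=1.96):
    """Approximate probability that engine A is really stronger than engine B.

    Assumes each +/- margin is a z-sigma interval (z=1.96 for 95%) and that
    the two rating errors are independent and normal. Illustrative only.
    """
    sd_a = margin_a / z
    sd_b = margin_b / z
    # Standard deviation of the rating difference for independent errors.
    sd_diff = math.hypot(sd_a, sd_b)
    # Normal CDF of the rating difference measured in standard deviations.
    return 0.5 * (1.0 + math.erf((elo_a - elo_b) / (sd_diff * math.sqrt(2.0))))

# Invented example: 2900 +/-15 against 2880 +/-20.
print(f"{100 * likelihood_of_superiority(2900, 15, 2880, 20):.1f}%")
```

The point of the sketch is that a small rating gap combined with wide margins gives an LOS not far from 50%, which is why engines with few games should not be ranked against each other too confidently.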

A list of games played this week per engine can be found in the update thread in the CCRL public forum, accessible through the link given at the top of this post.
Please note that our forum has been moved and is now much quicker to load and more readily accessible. The link given will redirect you automatically.

All games are available for download through the link given at the top of this post. They can be downloaded by engine or by month.
ELO ratings are now saved in all game databases for those engines that have 200 games or more.

Clicking on an engine name will give details as to opponents played plus homepage links where applicable.

Custom lists of engines can be selected for comparison.

An openings report page (link at bottom of index page) lists the number of games played by ECO codes with draw percentage and White win percentage. Clicking on a column heading will sort the list by that column.
Games can now be downloaded by ECO code.
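
For anyone who wants to produce a similar openings summary from a downloaded game file, here is a minimal sketch in Python. It is not the code behind our report page. It assumes the file is standard PGN in which every game starts with an Event tag and carries ECO and Result tags; the function name is made up for this example.

```python
import re
from collections import defaultdict

# Matches a PGN tag pair such as [ECO "B90"].
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def eco_report(pgn_text):
    """Return {eco: {"games", "draw_pct", "white_win_pct"}} from PGN text."""
    counts = defaultdict(lambda: {"games": 0, "white_wins": 0, "draws": 0})
    tags = {}

    def flush():
        # Count the game whose tags were just collected, if it is usable.
        eco, result = tags.get("ECO"), tags.get("Result")
        if eco and result in ("1-0", "0-1", "1/2-1/2"):
            row = counts[eco]
            row["games"] += 1
            row["white_wins"] += result == "1-0"
            row["draws"] += result == "1/2-1/2"

    for line in pgn_text.splitlines():
        match = TAG_RE.match(line.strip())
        if not match:
            continue  # move text or blank line
        name, value = match.groups()
        if name == "Event" and tags:
            # Assumption: an Event tag marks the start of the next game.
            flush()
            tags = {}
        tags[name] = value
    if tags:
        flush()

    report = {}
    for eco, row in sorted(counts.items()):
        games = row["games"]
        report[eco] = {
            "games": games,
            "draw_pct": 100.0 * row["draws"] / games,
            "white_win_pct": 100.0 * row["white_wins"] / games,
        }
    return report
```

Unfinished games (result "*") and games without an ECO tag are simply skipped in this sketch.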
Harvey Williamson
Site Admin
Posts: 6079
Joined: Sun Jul 29, 2007 6:57 am
Location: Media City, UK

Post by Harvey Williamson »

Hi Ray,

Thanks for the update. Do you plan to test 11.2 at FRC? It seems to be showing a nice increase on some ratings lists!?

Best Wishes,

Harvey
Ray

Post by Ray »

I haven't looked at any ratings yet for 11.2, so I have no idea of its strength. Also depends how close 12 is....
Ray

Post by Ray »

Actually, regardless of how far away Hiarcs 12 is, there's no reason not to test 11.2.
I've just started the first match vs Shredder 10.

Post by Harvey Williamson »

Ray wrote: Actually regardless of how far away Hiarcs 12 is, no reason not to test 11.2
I've just started the first match vs Shredder 10.
Thanks - Look forward to your updates!
Ray

Post by Ray »

Hi Harvey,

After doing some investigation, I'm not that hopeful for 11.2

CCRL Blitz single CPU:
We see Hiarcs 11.2 just behind 11.1

CCRL 40/4 Rating List - Custom engine selection
 Rank                 Engine                  ELO   +    -   Score  AvOp  Games
    1 Hiarcs 11.1                            2915  +13  -13  61.3%  -78.3  2152
      Hiarcs 11.2                            2909  +24  -24  61.5%  -80.4   611

CCRL Blitz 2 CPU:
We see Hiarcs 11.2 just behind 11.1

CCRL 40/4 Rating List - Custom engine selection

 Rank                 Engine                  ELO   +    -   Score  AvOp  Games
    1 Hiarcs 11.1 2CPU                       2960  +14  -14  58.4%  -59.2  1841
      Hiarcs 11.2 2CPU                       2940  +27  -26  60.1%  -70.7   477

CEGT 40/20

Hiarcs 11.1 2CPU - 2876
Hiarcs 11.2 2CPU - 2873
(about the same)

Hiarcs 11.1 1CPU - 2834
Hiarcs 11.2 1CPU - 2818
(so far 11.2 not as good)

Maybe 11.2 comes alive at long time controls, but the FRC list is blitz so I'm slightly worried. Even more so because I left home this morning and saw results to date for 11.2, which did not look good at all. We'll know more in a couple of days as the existing matches finish and new ones are underway.