spacious_mind wrote:
By equivalent I mean 60/30 or 40/20. I try to use average times as well. But for some computers whose clock keeps counting forwards when you take back moves, I sometimes use a fixed time instead, whichever setting most reliably gives about 30 seconds of thinking per move.
But preferably you take the 30s/move adjustment, whenever it's possible?
spacious_mind wrote:I also rated the human game so that the computer can be compared against the human players.
Does this work in the same way?
spacious_mind wrote:You must play all 5 test games in order to get a final rating that can be compared to other computers that were tested.
Sure - I just made it quick and dirty, I know!
spacious_mind wrote:
There is so much I can do with these tests that no other tests can do.
How did you develop this test - I mean, how did you find the relationship between move "quality" and Elo?
blaubaer wrote:Hi Nick,
But preferably you take the 30s/move adjustment, whenever it's possible
Yes, you can see the level I used: the second tab shows the tests I have done and the settings used.
Yes, you can test humans the same way.
How I do the tests is my little secret for now, but I will tell you that it is a lot of work just to do one test. Let's just say that I am quite good with spreadsheets and numbers.
Don't be amazed; it has been said that it is a nice plaything of no scientific (wissenschaftlich) value.
First of all, many thanks for the test game spreadsheets. I have only just enough spreadsheet knowledge to realise the huge amount of time, work and skill it has taken you to compile them.
I am currently having fun with them. So far I have tested the Excellence 6080 version, which scored 1820, compared to the 1871 of your EP12 variant. It's interesting how both scored so low on test game 4! (Would you say that the EP12 is supposed to be stronger than the 6080 version?)
Anyway, talking of test game 4, I have a problem entering e5 on the first calculated move, i.e. game move 13. e5 is available from the drop-down box, but when selected it doesn't show as e5 with a score of 30. What shows instead is 1.30E+06 in "move played" and #N/A in "score". All other possible moves behave normally. Could you replicate this on your sheet to verify it for me? (I have already re-downloaded the sheet from your link in this thread in case of a download corruption, but it behaves just the same.)
See below:
GAME 4: 18TH CENTURY MASTERS: BOWDLER - PHILIDOR
1783 MFR NAME
LONDON, ENGLAND PROGRAM NAME
THOMAS BOWDLER FRANCOIS-ANDRE DANICAN PHILIDOR PROGRAMMER NAME
1. e4 c5 2. Bc4 e6 3. Qe2 Nc6 4. c3 a6 5. a4 b6 6. f4 d6 7. Nf3 Nge7 8. Ba2 g6
9. d3 Bg7 10. Be3 d5 11. Nbd2 O-O 12. O-O f5 (START) LEVEL SETTING
r1bq1rk1/4n1bp/ppn1p1p1/2pp1p2/P3PP2/2PPBN2/BP1NQ1PP/R4RK1 w - - 0 13 HARDWARE DESCRIPTION
WHITE                  BLACK                  WHITE                  BLACK
MOVE PLAYED  SCORE     MOVE PLAYED  SCORE     MOVE PLAYED  SCORE     MOVE PLAYED  SCORE
START TEST                                    START TEST
13.e5        30.00     13. ... h6   28.20     1.30E+06     #N/A      -            -
14.d4        30.00     14. ... c4   30.00     -            -         -            -
15.b4        30.00     15. ... b5   24.00     -            -         -            -
16.Bb1        0.00     16. ... Bd7   0.00     -            -         -            -
17.Bc2        0.00     17. ... Qc7   0.00     -            -         -            -
I can't copy and paste the columns and colours accurately, but I'm sure you can see what is going on.
Odd regards
Bill
Hi Bill,
I believe the EP12 runs at 4 MHz whereas the 6080 runs at 3 MHz, so it is plausible that the EP12 scores a little better.
That is strange regarding game 4. 13. e5 is a standard move that gets played all the time. I just downloaded the spreadsheet to make sure, and 13. e5 works well. Had you closed all the spreadsheets before you downloaded again?
@Michael, could you please verify on your spreadsheet that 13. e5 scores correctly in game 4?
spacious_mind wrote:
@Michael, could you please verify on your spreadsheet that 13. e5 scores correctly in game 4?
Game 4, 13.e5 scores 30 points!
Hi Michael,
Thanks!
@Bill, I created these tests with MS Office Pro 2010, so it is possible that earlier versions have problems with the formulas and functionality. What version of Excel are you using?
P.S. Another question: are you downloading from the link above and not using an earlier version from some older page?
I just downloaded it onto my laptop, and all works well using Excel.
FYI,
previously I had been using my Android tablet with the Kingsoft Office app (WPS Office).
Really odd, because everything had worked flawlessly apart from that single instance of e5. All other entries on that move would have worked perfectly too; just e5 doesn't. How typical.
Thanks for the info on the Excellence. I find it very difficult to find information on these things.
My EP12 has "go faster stripes" in the lower right corner and on the box, so maybe that was a clue?
Best wishes
Bill
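The 1.30E+06 that Bill saw may be an auto-conversion artefact rather than a fault in the sheet: 13e5, read as a number in scientific notation, is 13 × 10^5 = 1,300,000, which a spreadsheet displays as 1.30E+06. If the app coerces the selected text to a number, a text lookup for the move would no longer match and could return #N/A. This is only a guess at what WPS Office did, and it assumes the drop-down entry is text along the lines of "13.e5". The Python sketch below merely reproduces the arithmetic; `looks_like_number` is a made-up helper name and the other move strings are illustrative.

```python
def looks_like_number(cell_text):
    """Return the float a number parser would read from cell_text, or None.

    Hypothetical helper: it mimics a spreadsheet that coerces text to a
    number whenever the text parses as one. It is not the actual WPS
    Office logic, which the thread does not describe.
    """
    try:
        return float(cell_text)
    except ValueError:
        return None


# Candidate entries for move 13; the exact drop-down strings are assumed.
for move in ["13.e5", "13.d4", "13.exf5", "13.Rae1"]:
    value = looks_like_number(move)
    if value is None:
        print(f"{move!r} stays text")
    else:
        # ".2E" is the two-decimal scientific format spreadsheets display.
        print(f"{move!r} is read as {value:.2E}")
```

Only the e5 entry parses as a number here, which would fit Bill's observation that every other move on that turn behaved normally.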
Well, that is good news. I was at a loss as to why it wouldn't work for you. You can see from the second tab how many games I played to test all the spreadsheets, so I would have been really mystified if they had contained errors.
I just ran your test no. 4 with Stockfish 6 64 popcnt (4 cores), and it came up with a perfect score of 3400 for the white side. The black side trailed a bit behind with 3048, making a combined score of 3224. I will try the same with hyperthreading activated and 8 cores, although the makers of the strong engines don't recommend it. I am curious about the outcome...
Greetings,
Gerhard
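Gerhard's figures are consistent with the combined score being the plain average of the two sides: (3400 + 3048) / 2 = 3224. The spreadsheet's actual formula is not described in the thread, so the sketch below is an assumption that happens to match these numbers, and `combined_score` is a hypothetical name.

```python
def combined_score(white, black):
    """Average the white-side and black-side ratings (assumed formula)."""
    return (white + black) / 2


# Gerhard's Stockfish 6 result on test game 4.
print(combined_score(3400, 3048))
```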
Hi Gerhard
Great result! What time setting did you use?
Hopefully it confirms to you, from all the analysis you have done, that my ratings should be very accurate, and that the relative performances of different programs are good guidelines to their strength, especially since they are all being tested in the exact same universe.
These tests should be interesting not only for computers but also for humans who want to compare themselves. Also, with this method, if I were to do enough tests of grandmaster games I could accurately establish their strength and compare them to computers. That means I could accurately compare the strength of Capablanca against Fischer against Carlsen against Stockfish against Revelation Hiarcs, and so on.
I might have to go even deeper in future tests to stop you achieving the maximum score. But it is all just a perspective at a moment in time, as there is always something bigger and better tomorrow.
Hopefully you agree that these tests do work and have a lot of potential.
I ran the test at 30 seconds per move. The more of your tests I perform, the more convinced I am of their high quality! You are right, though, that once a program hits the ceiling of 3400 points, even if only for the white side, you might want to consider replacing test no. 4 with a tougher one.
Best wishes,
Gerhard
Hi Gerhard,
Yes, you are right of course; that is why I asked about your time setting. I want these tests to be accurate at tournament level as well, so you hitting the ceiling with only 30 seconds means that the next test suite will have to go deeper still, and the ceiling will have to be raised as well.
These modern programmers and hardware are just getting too good too fast.
Time to buy a nuclear powered PC.
Best regards
Last edited by spacious_mind on Mon Nov 23, 2015 10:48 pm, edited 1 time in total.