SPACIOUS-MIND RATING TEST DOWNLOAD AND INSTRUCTIONS

blaubaer
Full Member
Posts: 935
Joined: Thu Jul 28, 2011 12:53 pm
Location: Bavaria, the centre of Mysticum

Post by blaubaer »

Hi Nick,
spacious_mind wrote: By equivalent I mean 60/30 or 40/20. I try to use average times as well. But for some computers whose clock keeps counting forwards when you take back moves, I sometimes use a fixed time instead, whichever gives 30 seconds of thinking most of the time.
But preferably you take the 30s/move adjustment, whenever it's possible?
spacious_mind wrote: I also rated the human game so that the computer can be compared against the human players.
Does this work in the same way?
spacious_mind wrote: You must play all 5 test games in order to get a final rating that can be compared to other computers that were tested.
Sure - I just made it quick and dirty, I know! :P
spacious_mind wrote: There is so much I can do with these tests that no other tests can do :)
How did you develop this test - I mean, how did you find the relationship between move "quality" and Elo?

Amazed Regards,
Michael
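
A quick sanity check on the time controls quoted at the top of this post, assuming "60/30" means 60 moves in 30 minutes and "40/20" means 40 moves in 20 minutes (the thread does not spell the notation out). The helper name `seconds_per_move` is made up for this sketch:

```python
def seconds_per_move(moves, minutes):
    """Average thinking time per move for a 'moves in minutes' time control."""
    return minutes * 60 / moves

# Assumed reading of the two controls mentioned in the thread.
for moves, minutes in [(60, 30), (40, 20)]:
    print(f"{moves}/{minutes}: {seconds_per_move(moves, minutes)} s/move")
```

Under that reading both controls average 30 seconds per move, which matches the 30s/move setting discussed here.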

Post by blaubaer »

Hi Nick and Alain,
spacious_mind wrote: Alain is waiting on me for some final input on this piece; quite soon thereafter it will be ready to give everyone access.
I'm curious about it!

Regards, Michael
spacious_mind
Senior Member
Posts: 4001
Joined: Wed Aug 01, 2007 10:20 pm
Location: Alabama

Post by spacious_mind »

blaubaer wrote:Hi Nick,
But preferably you take the 30s/move adjustment, whenever it's possible
Yes, you can see the level I used; the second tab shows the tests that I have done and the settings used.

Yes you can test humans the same way.

How I do the tests is my little secret for now, but I will tell you that it is a lot of work just to do one test :) Let's just say that I am quite good with spreadsheets and numbers :)

Don't be amazed; it has been said that it is a nice plaything and of no scientific (wissenschaftlich) value. :P

Best regards
Nick

Post by blaubaer »

Hi Nick,
spacious_mind wrote: How I do the tests is my little secret for now,
If you one day disclose this secret, I would like to be the first to get to know it!
spacious_mind wrote: but I will tell you that it is a lot of work just to do one test :)
I can imagine that....
spacious_mind wrote: Let's just say that I am quite good with spreadsheets and numbers :)
That's not new to me... :wink:
spacious_mind wrote: Don't be amazed, it has been said that it is a nice plaything and of no scientific (wissenschaftlich) value. :P
sure... :wink:

Appreciative Regards,
Michael

Post by spacious_mind »

blaubaer wrote:Hi Nick,
If you one day disclose this secret, I would like to be the first to get to know it!

Appreciative Regards,
Michael
Hi Michael,

I'll make sure you are the first to know.

Best regards
Nick
BillT
Member
Posts: 132
Joined: Sat Jun 09, 2012 9:31 pm
Location: Norfolk UK

Test game four

Post by BillT »

Hi Nick

First of all, many thanks for the test game spreadsheets. I have only just enough spreadsheet knowledge to realise the huge amount of time, work and skill it has taken for you to compile them.

I am currently having fun with them. So far I have tested the Excellence 6080 version, which scored 1820, compared to the 1871 of your EP12 variant. It's interesting how both have scored so low on test game 4! (Would you say that the EP12 is supposed to be stronger than the 6080 version?)
Anyway, talking of test game 4, I have a problem entering e5 on the first calculated move, i.e. game move 13. e5 is available from the drop-down box, but when selected it doesn't show as e5 with a score of 30; what shows is 1.30E+06 in "move played" and #N/A in "score". All other possible moves behave normally. Could you replicate this on your sheet to verify this for me? (I have already re-downloaded the sheet from your link in this thread in case of a download corruption, but it behaves just the same.)

see below

GAME 4: 18TH CENTURY MASTERS: BOWDLER - PHILIDOR
1783, LONDON, ENGLAND
THOMAS BOWDLER - FRANCOIS-ANDRE DANICAN PHILIDOR
Sheet fields: MFR NAME, PROGRAM NAME, PROGRAMMER NAME, LEVEL SETTING, HARDWARE DESCRIPTION

1. e4 c5 2. Bc4 e6 3. Qe2 Nc6 4. c3 a6 5. a4 b6 6. f4 d6 7. Nf3 Nge7 8. Ba2 g6
9. d3 Bg7 10. Be3 d5 11. Nbd2 O-O 12. O-O f5 (START)
r1bq1rk1/4n1bp/ppn1p1p1/2pp1p2/P3PP2/2PPBN2/BP1NQ1PP/R4RK1 w - - 0 13

WHITE                 BLACK                 WHITE                 BLACK
MOVE PLAYED   SCORE   MOVE PLAYED   SCORE   MOVE PLAYED   SCORE   MOVE PLAYED   SCORE
START TEST                                  START TEST
13.e5         30.00   13. ... h6    28.20   1.30E+06      #N/A    -             -
14.d4         30.00   14. ... c4    30.00   -             -       -             -
15.b4         30.00   15. ... b5    24.00   -             -       -             -
16.Bb1         0.00   16. ... Bd7    0.00   -             -       -             -
17.Bc2         0.00   17. ... Qc7    0.00   -             -       -             -

I can't copy and paste the columns, colours etc. accurately, but I'm sure you can see what is going on.

Odd regards

Bill

Re: Test game four

Post by spacious_mind »

Hi Bill,

I believe EP12 is 4 MHz whereas 6080 is 3 MHz, therefore it is possible that EP12 scores a little better.

That is strange regarding game 4. 13. e5 is a standard move that gets played all the time. I just downloaded the spreadsheet to make sure, and 13. e5 works well. Did you close all the spreadsheets before you downloaded again?

@Michael, could you please verify on your game 4 spreadsheet that 13. e5 scores correctly?

Thanks
Nick

Re: Test game four

Post by blaubaer »

Hi Nick,
spacious_mind wrote: @Michael, could you please verify on your game 4 spreadsheet that 13. e5 scores correctly?
Game 4, 13.e5 scores 30 points!

Regards, Michael
Last edited by blaubaer on Mon Nov 16, 2015 11:06 am, edited 1 time in total.

Re: Test game four

Post by spacious_mind »

blaubaer wrote:Hi Nick,
spacious_mind wrote: @Michael, could you please verify on your game 4 spreadsheet that 13. e5 scores correctly?
Game 4, 13.e5 scores 30 points!
Hi Michael,

Thanks!

@Bill, I created these tests with MS Office Pro 2010; it is possible that earlier versions have problems with the formulas and functionality. What version of Excel are you using?

PS: another question: are you downloading from the link above and not using some earlier version from some older pages?

Best regards
Nick

Post by BillT »

Hi Nick

I just downloaded onto my laptop, and all works well using Excel.

FYI
Previously I had been using my Android tablet with the Kingsoft Office app (WPS Office).
Really odd, because everything had worked flawlessly apart from that single instance of e5. All other entries on that move would have worked perfectly too; just e5 doesn't. How typical.

Thanks for the info on the Excellence. It's very difficult to find info on these things, I find.
My EP12 has "go faster stripes" in the lower right corner and on the box, so maybe that was a clue???

Best wishes
Bill
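
One plausible explanation for the 1.30E+06 seen on the tablet, offered as an assumption rather than a confirmed diagnosis: if the drop-down entry text is 13.e5 (as it appears in the pasted sheet), an application that auto-converts number-like text can read it as scientific notation, 13 x 10^5 = 1,300,000, which displays as 1.30E+06. The score lookup then no longer finds the move text and returns #N/A. Entries such as 13.d4 are not number-like, which would explain why only e5 fails. The Python sketch below mimics that coercion; `coerce_like_spreadsheet`, `lookup_score` and the one-entry score table are hypothetical illustrations, not the formulas in Nick's workbook.

```python
def coerce_like_spreadsheet(text):
    """Return a float if the text looks like a number, otherwise the text itself.

    This imitates (as an assumption) what an auto-converting cell might do.
    """
    try:
        return float(text)
    except ValueError:
        return text


# Hypothetical score table keyed by move text; only the e5 score (30)
# comes from the thread.
SCORES = {"13.e5": 30.0}


def lookup_score(cell_value):
    """Look the cell value up by exact match; '#N/A' stands in for a failed lookup."""
    return SCORES.get(cell_value, "#N/A")


as_typed = "13.e5"
as_stored = coerce_like_spreadsheet(as_typed)

print(f"{as_stored:.2E}")                 # how the coerced number is displayed
print(lookup_score(as_stored))            # lookup fails once the text became a number
print(lookup_score(as_typed))             # lookup succeeds when the text is kept as text
print(coerce_like_spreadsheet("13.d4"))   # not number-like, so it stays text
```

If this is indeed the cause, keeping the drop-down cells formatted as text in the affected application would be the usual workaround.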

Post by spacious_mind »

Well, that is good news. I was at a loss as to why it wouldn't work for you. You can see from the 2nd tab how many games I had played to test all the spreadsheets. Therefore I would have been really mystified if they had some errors :)

Best regards
Nick
kgvetter
Member
Posts: 239
Joined: Sat May 12, 2012 5:22 pm

test Nr.4

Post by kgvetter »

Hi Nick,

I just ran your test nr.4 with Stockfish 6 64 popcnt (4 cores) and it came up with a perfect score of 3400 for the white side. The black side trailed a bit behind with 3048, making a combined score of 3224. I will try the same with hyperthreading activated and 8 cores, although the makers of the strong engines don't recommend it. I am curious about the outcome...

Greetings,

Gerhard
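
The combined figure above is consistent with a plain average of the two side scores. Assuming that is how the sheet combines them (the thread does not state the formula), the arithmetic can be checked with a hypothetical helper:

```python
def combined_score(white, black):
    """Mean of the white-side and black-side ratings (assumed combination rule)."""
    return (white + black) / 2

print(combined_score(3400, 3048))  # 3224.0
```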

Re: test Nr.4

Post by spacious_mind »

Hi Gerhard

Great result! What time setting did you use?

Hopefully it confirms to you, from all the analysis you have done, that my ratings should be very accurate :) And that the relative performances of different programs are good guidelines of their strength, especially since they are all being tested in the exact same universe.

These tests should be interesting not only for computers but also for humans who want to compare themselves. Also, with this method, if I were to do enough tests of Grandmaster games I could accurately establish their strength and compare them to computers :) Meaning I could accurately compare the strength of Capablanca against Fischer against Carlsen against Stockfish against Revelation Hiarcs and so on.

I might have to go even deeper in future tests to stop you achieving maximum score :) But it is all just a perspective in a moment of time as there is always something bigger and better tomorrow.

Hopefully you agree that these tests do work and have a lot of potential.

Best regards
Nick

Post by kgvetter »

Hello Nick,

I ran the test with 30 sec/move. The more of your tests I perform, the more convinced I am of their high quality! You are right, though, that once a program hits the ceiling of 3400 points, even if only for the white side, you might want to consider replacing test Nr.4 with a tougher one.
:twisted:

Best wishes,

Gerhard

Post by spacious_mind »

Hi Gerhard,

Yes, you are right of course; that is why I asked about the speed you tested at. I want these tests to be accurate at least at Tournament level as well, so you hitting the ceiling with only 30 seconds means that the next test suite will have to go deeper still, with the ceiling raised as well :)

These modern programmers and hardware are just getting too good too fast :)

Time to buy a nuclear powered PC.

Best regards
Nick
Last edited by spacious_mind on Mon Nov 23, 2015 10:48 pm, edited 1 time in total.