Your Country Needs You!

spacious_mind
Senior Member
Posts: 4000
Joined: Wed Aug 01, 2007 10:20 pm
Location: Alabama

Post by spacious_mind »

Brian B wrote:
spacious_mind wrote:
Brian B wrote:First off, I really enjoy reading about these tests and the relative performance of these computers against each other and the occasional human. It is an incredible amount of work and I do appreciate the effort.

I was wondering if there should be some way to weight certain moves at a higher level? Granted, if a computer makes a bonehead move, it will pay the price with a very low rating. It seems to me that a rating of zero wouldn't penalize a computer enough if the move came at a critical point in the game. How many games are decided not by great moves, but by one costly mistake? The results for these computers are relative to each other, and other than having all these computers play games against each other, I wish there were a way to identify a key move and grade it accordingly.

Regards,
Brian B
Hi Brian,

Each move is weighted from very bad to very good based on a consistent evaluation formula that is not changed by a human. The problem with a human adding extra weight for certain moves, beyond a consistent formula, is that you are then being biased again, because a human's opinion differs and does not treat everything equally. A consistent formula, however, does.

My rating really does not care who is being evaluated, and my opinion on what is good or bad is irrelevant, since it is not a factor in the evaluation.

Best regards
Hi Nick, thanks for your thoughts.

I wasn't thinking of a human evaluation; I was thinking of a computer evaluation of the resulting position after a given move. So, not an evaluation of the move, but of the position after the move. I am not sure how it could be done, but if there were a way to rate the actual position after a move as winning or losing, it would be interesting. This is being done now to some extent by rating the move itself, yet I don't see a true order of magnitude. How bad is a really bad move? I understand that this idea isn't very practical; just food for thought.

Regards,
Brian B
Hi Brian,

Scaling the bad moves I think does not achieve much, since bad = bad. What you do instead is scale the reward for the better moves. The magnitude grows through the total number of good moves compared to the number of bad moves, so I think the overall achievement is reflected in the combined score of all the evaluated moves.

If you are grading too few moves, then your universe is too small for overall accuracy. For example, 40 hand-picked test positions are too small a universe, and to some degree biased because of the hand-picking of the positions. Even if the selection is done by consensus, it is still opinionated.
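
Nick's exact formula isn't spelled out in the thread, but a minimal sketch of this kind of consistent per-move scoring could look like the following. The centipawn thresholds and the 0-10 scale below are illustrative assumptions, not his actual values:

Code: Select all

# Hypothetical sketch of a consistent per-move scoring formula (not the
# actual formula used in these tests). Every move is graded by the same
# rule, so no human opinion enters the evaluation.

def score_move(cp_loss):
    """Grade one move by its centipawn loss vs. a reference engine's best move."""
    if cp_loss <= 10:
        return 10      # very good
    if cp_loss <= 30:
        return 8
    if cp_loss <= 60:
        return 6
    if cp_loss <= 100:
        return 4
    if cp_loss <= 200:
        return 2
    return 0           # very bad

def game_score(cp_losses):
    """Combine all graded moves; the total reflects the overall achievement."""
    return sum(score_move(x) for x in cp_losses) / len(cp_losses)

# Example over a full game rather than a handful of hand-picked positions:
losses = [5, 12, 0, 45, 8, 150, 20, 3, 70, 9, 0, 25, 4, 90, 15, 6, 0, 33, 7, 11]
print(round(game_score(losses), 2))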

Anyway those are my thoughts on this.

best regards
Nick

Post by spacious_mind »

I just finished the 5 tests with Gavon JFresh v. 0.1a:

[image: JFresh test results table]

After 5 games JFresh finished with a score of 1854, which ended up 18 points less than CXG Advanced Star Chess, so I quickly played a game between the two.

Test Game: CXG Advanced Star Chess - Gavon JFresh v. 0.1a

[Event "Computer Test Match"]
[Site "Alabama"]
[Date "2015.02.25"]
[Round "?"]
[White "CXG ADVANCED STAR CHESS, LV A7 30S."]
[Black "GAVON JFRESH V.0.1A, LV AT30."]
[Result "1/2-1/2"]
[ECO "E32"]
[PlyCount "113"]
[EventDate "2015.02.25"]
[EventCountry "USA"]

1. d4 Nf6 2. c4 e6 3. Nc3 Bb4 4. Qc2 O-O {CXG ADVANCED STAR CHESS OUT OF BOOK}
5. e4 d5 {GAVON JFRESH V.0.1A OUT OF BOOK} 6. cxd5 exd5 7. e5 Ne4 8. Bd3 Nc6 9.
Nf3 Bg4 10. Bxe4 dxe4 11. Qxe4 Bxf3 12. gxf3 Qxd4 13. Qxd4 Nxd4 14. Kf1 Bxc3
15. bxc3 Nxf3 16. Rb1 b6 17. Bf4 Rae8 18. Rd1 Nxe5 19. Rg1 f6 20. Bh6 Re7 21.
Rg3 c5 22. f4 Nc4 23. f5 Rfe8 24. Rdd3 Kh8 25. Bf4 Re1+ 26. Kf2 R8e2+ 27. Kf3
Re8 28. Rg2 Ne5+ 29. Bxe5 Rf1+ 30. Kg4 fxe5 31. Rd7 a5 32. Rb2 Rf4+ 33. Kg3
Rxf5 34. Rxb6 Ref8 35. Re6 a4 36. Rde7 Rf3+ 37. Kg4 R8f4+ 38. Kg5 Rf5+ 39. Kg4
R3f4+ 40. Kg3 Rg5+ 41. Kh3 Rf8 42. Rxe5 Rxe5 43. Rxe5 Rc8 44. Kg4 Kg8 45. Kf5
Rf8+ 46. Ke4 Rf2 47. Rxc5 Rxh2 48. Rc8+ Kf7 49. a3 Re2+ 50. Kd3 Ra2 51. Kc4
Rxa3 52. Kb4 Ra1 53. Ra8 Rb1+ 54. Kc4 Ra1 55. Kb4 Rb1+ 56. Kc4 Ra1 57. Kb4 {
CXG ADVANCED STAR CHESS ANNOUNCES DRAW BY 3X REPETITION} 1/2-1/2

JFresh came out of the opening with a good advantage, but once again weakened in the endgame; it has problems in endgames. Overall, however, I would say that these two programs are quite evenly matched.

Chess.com has started an interesting comparison between FIDE and USCF ratings. It looks as below, and the link is here:

http://www.chess.com/article/view/chess ... omparisons

Code: Select all

USCF			FIDE
944			1207
992			1240
1039			1273
1087			1306
1134			1338
1182			1371
1229			1404
1277			1437
1324			1469
1372			1502
1419			1535
1467			1568
1514			1600
1562			1633
1609			1666
1657			1699
1704			1731
1752			1764
1799			1797
1847			1830
1894			1863
1942			1895
1989			1928
2037			1961
2084			1994
2132			2026
2179			2059
2227			2092
2274			2125
2322			2157
2369			2190
2417			2223
I always thought that on the lower end of the scale USCF was rated higher than FIDE, but apparently not. USCF only becomes the higher rating above about 1800 ELO, according to this interesting comparison.

It makes you wonder, when we look at a list that rates a computer at 1100, whether it is not really 1350.
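
The table is, in fact, almost exactly linear. As a rough sketch, fitting its first and last rows gives approximately FIDE = 0.69 × USCF + 556 (approximate coefficients read off the table above, not an official chess.com formula):

Code: Select all

# Approximate linear fit of the chess.com table above:
# slope = (2223 - 1207) / (2417 - 944) ~= 0.69, intercept ~= 556.

def uscf_to_fide(uscf):
    return 0.69 * uscf + 556

def fide_to_uscf(fide):
    return (fide - 556) / 0.69

print(round(uscf_to_fide(944)))   # ~1207, the first row of the table
print(round(uscf_to_fide(1799)))  # ~1797, the crossover point near 1800
print(round(uscf_to_fide(1100)))  # ~1315, close to the 1100-vs-1350 question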

Best regards
Nick

Post by spacious_mind »

I played another game between Gavon JFresh v. 0.1a and CXG Advanced Star Chess.

[Event "Computer Test Match"]
[Site "Alabama"]
[Date "2015.02.26"]
[Round "?"]
[White "GAVON JFRESH V.0.1A, LV AT30."]
[Black "CXG ADVANCED STAR CHESS, LV A7 30S."]
[Result "1/2-1/2"]
[ECO "D27"]
[PlyCount "81"]
[EventDate "2015.02.26"]
[EventCountry "USA"]

1. d4 d5 2. c4 dxc4 3. Nf3 Nf6 4. e3 e6 5. Bxc4 a6 {GAVON JFRESH V.0.1A OUT OF
BOOK} 6. O-O c5 7. Nc3 {CXG ADVANCED STAR CHESS OUT OF BOOK} b5 8. Bd3 cxd4 9.
Nxd4 e5 10. Nf3 Nc6 11. Be4 Nxe4 12. Qxd8+ Nxd8 13. Nxe4 Bf5 14. Ng3 Bd3 15.
Rd1 Bc2 16. Rd2 Rc8 17. Nxe5 Bb4 18. Rd4 a5 19. Rd5 Bc5 20. Bd2 b4 21. Rc1 Be7
22. f3 a4 23. Nh5 O-O 24. Nf4 Re8 25. Ned3 b3 26. axb3 axb3 27. Rb5 Nc6 28. Ne1
Nb8 29. Nd5 Bg5 30. f4 Bh4 31. g3 Bf6 32. Nxf6+ gxf6 33. Nxc2 bxc2 34. Bc3 Rxe3
35. Rxc2 Nd7 36. Rd2 Rc7 37. Kf2 Re7 38. Kg2 Re3 39. Kf2 Re7 40. Kg2 Re3 41.
Kf2 {CXG ADVANCED STAR CHESS ANNOUNCES DRAW BY 3 X REPETITION} 1/2-1/2

Same as in the first game, JFresh had the advantage but then loses all ideas on how to proceed and win. Another draw by repetition. A pity, because JFresh is fun to play right up to the point where it reaches an endgame.

Best regards
Nick

Post by spacious_mind »

Next I wanted to test Gavon's next worst rated chess program after JFresh v. 0.1a, which is Rocinante v. 1.01, written by Antonio Torrecillas.

After playing the first three test games the score was as follows:

Game 1 = 1990
Game 2 = 1851
Game 3 = 2158

This amounted to an average rating of 1995 after these 3 tests. CXG Advanced Star Chess had an average rating of 1975 in the same 3 tests, so I decided to play another test match, Star Chess against Rocinante v. 1.01. Star Chess, of course, is written by Kaare Danielsen.

[Event "COMPUTER CHESS MATCH"]
[Site "Alabama"]
[Date "2015.02.27"]
[Round "?"]
[White "CXG ADVANCED STAR CHESS, LV A7 30S."]
[Black "GAVON ROCINANTE V. 1.01, LV AT30."]
[Result "1-0"]
[PlyCount "105"]
[EventDate "2015.02.27"]
[EventCountry "USA"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 a6 6. Bd3 {GAVON ROCINANTE V. 1.01 OUT OF BOOK} e5 7. Nde2 Nbd7 {CXG ADVANCED STAR CHESS OUT OF BOOK} 8. O-O Be7 9. Nd5 Nxd5 10. exd5 O-O 11. Be3 Qa5 12. f4 Nf6 13. Nc3 Bg4 14. Qd2 Bd7 15. Rae1 Ng4 16. b4 Qxb4 17. Rb1 Qa3 18. Rxb7 Nxe3 19. Qxe3 Rad8 20. Rb3 Qa5 21. fxe5 dxe5 22. Qxe5 Bc5+ 23. Kh1 Bb5 24. Nxb5 axb5 25. Qe4 f5 26. Qe6+ Kh8 27. Rxb5 Qa7 28. Bxf5 Bd4 29. Qe4 Rde8 30. Qg4 Qxa2 31. d6 Qc4 32. Rbb1 Qc3 33. d7 Rd8

[fen]3r1r1k/3P2pp/8/5B2/3b2Q1/2q5/2P3PP/1R3R1K w - - 0 34[/fen]

CXG Star Chess pretty much dominated this game, but struggled to find the strongest continuation in a few earlier positions, which kept Rocinante hanging on to a lifeline. I found the position above interesting, however, because CXG Star Chess decided to sacrifice its bishop for the king's rook pawn on h7. It may not have been the strongest choice available, but the fact that it played a sound sacrifice, followed by a pretty combination that decisively forced the win, I found really interesting and exciting. The rest of the game continued as below:

34. Bxh7 Kxh7 35. Qh4+ Kg8 36. Rxf8+ Rxf8 37. d8=Q Rxd8 38. Qxd8+ Kh7 39. Rd1 Bf6 40. Qd3+ Qxd3 41. cxd3 Kg6 42. d4 Bh4 43. d5 Kf5 44. d6 Ke6 45. d7 Bd8 46. Kg1 Kf5 47. Kf2 Ke4 48. Re1+ Kf5 49. Re8 Bc7 50. Kf3 g5 51. g4+ Kg6 52. d8=Q Bxd8 53. Rxd8 1-0

Overall this was a really nice and exciting game. It really is fun to finally have some new opponents for the dedicated computers without having to resort to trickery like slowing the program down or artificially reducing its strength.

Best regards,
Nick

Post by spacious_mind »

The final rating result for Gavon Rocinante v. 1.01 is ELO 1865

[image: Rocinante final rating table]

Rocinante finishes 11 points better than Gavon JFresh v. 0.1a and 7 points worse than CXG Advanced Star Chess.

Here is a second game between Gavon Rocinante v. 1.01 and CXG Advanced Star Chess.

[Event "COMPUTER CHESS MATCH"]
[Site "Alabama"]
[Date "2015.02.28"]
[Round "?"]
[White "GAVON ROCINANTE V. 1.01, LV AT30."]
[Black "CXG ADVANCED STAR CHESS, LV A7 30S."]
[Result "1/2-1/2"]
[ECO "E11"]
[PlyCount "68"]
[EventDate "2015.02.28"]
[EventCountry "USA"]

1. d4 Nf6 2. c4 e6 3. Nf3 Bb4+ 4. Bd2 Qe7 {GAVON ROCINANTE V. 1.01 OUT OF BOOK}
5. g3 Nc6 6. c5 {CXG ADVANCED STAR CHESS OUT OF BOOK} O-O 7. a3 Bxd2+ 8. Nbxd2
e5 9. Nb3 e4 10. Nfd2 d5 11. e3 Bg4 12. Qc2 Rfe8 13. Bb5 Qe6 14. Bxc6 bxc6 15.
O-O Be2 16. Rfe1 Bd3 17. Qc3 Nd7 18. Nc1 Bb5 19. a4 Bc4 20. b3 Ba6 21. Qa5 Bb7
22. Qxc7 Ba6 23. Qa5 Nb8 24. Na2 Qg4 25. Rad1 f6 26. Qc3 Nd7 27. a5 Be2 28. h3
Qh5 29. g4 Qxh3 30. Rxe2 Qxg4+ 31. Kf1 Qh3+ 32. Kg1 Qg4+ 33. Kf1 Qh3+ 34. Kg1
Qg4+ {CXG ADVANCED STAR CHESS ANNOUNCES DRAW BY REPETITION.} 1/2-1/2


[fen]r3r1k1/p1p2ppp/2p1qn2/2Pp4/3Pp3/PNQbP1P1/1P1N1P1P/R3R1K1 w - - 0 17[/fen]

CXG Advanced Star Chess missed a good opportunity in the above position. The move played, 17... Nd7?, was not good; 17... Qh3 would have caused all sorts of problems for White.

Best regards
Nick

Post by spacious_mind »

I added another Gavon chess program to the tests. This time it is RedQueen v. 0.4 written by Ben-Hur Carlos Vieira Langoni Junior.

[image: RedQueen test results table]

RedQueen v. 0.4's final score was 1972. Next I will play it against CXG Legend to see how it performs.

For fun I created the chart below, which shows the Old Masters ratings, the manufacturer ratings, and ELO ratings taken from Schachcomputer.Info and CCRL 40/4.

[image: chart of Old Masters, manufacturer and ELO ratings]

Some of the manufacturer ratings I had to estimate, as I did not have the official figures available. If someone can fill in the gaps, that would be great.

For Gavon I used Gavon's published rating.

Anyway sometimes it is fun for me just to see something graphically.

Best regards
Nick

Post by spacious_mind »

That was a tough game!

[Event "COMPUTER TEST MATCH"]
[Site "Alabama"]
[Date "2015.03.01"]
[Round "?"]
[White "CXG SPHINX LEGEND, LV 53."]
[Black "GAVON REDQUEEN V. 0.4, LV AT30."]
[Result "1-0"]
[WhiteElo "1792"]
[BlackElo "1689"]
[PlyCount "173"]
[EventCountry "USA"]

1. e4 c5 2. Nc3 Nc6 3. g3 g6 4. Bg2 Bg7 5. d3 e6 {CXG SPHINX LEGEND OUT OF BOOK
} 6. Nf3 {GAVON REDQUEEN V. 0.4 OUT OF BOOK} d5 7. exd5 exd5 8. Bg5 Nge7 9. O-O
Qb6 10. Qe2 Be6 11. Bxe7 Nxe7 12. Rfb1 Bf6 13. Ne5 d4 14. Na4 Qb5 15. b3 Rb8
16. c4 Qa5 17. Re1 O-O 18. Qf3 Nf5 19. Re2 Rfe8 20. Qf4 g5 21. Qf3 Qc7 22. Rae1
h6 23. Ng4 Bg7 24. Nb2 Qa5 25. Ra1 Qc3 26. Rb1 Qa5 27. a4 Qb6 28. Nd1 Qd6 29.
h3 h5 30. Nh2 h4 31. Nf1 Be5 32. Qg4 Bf6 33. Qf3 Be5 34. g4 Ne7 35. Nd2 Bh2+
36. Kh1 Bf4 37. Ne4 Qe5 38. Nb2 Nc6 39. Ra1 f5 40. gxf5 Qxf5 41. Ree1 Re7 42.
Qe2 Rd8 43. Qd1 a6 44. Qe2 Bf7 45. Ra3 Bg6 46. Qd1 Rc7 47. Ra1 Ne5 48. Qe2 Rc6
49. Rg1 Re8 50. Rae1 Qe6 51. Nxg5 Qf5 52. Bd5+ Kh8 53. Be4 Qd7 54. Bxc6 Qxc6+
55. Ne4 Nxd3 56. Rxg6 Qxe4+ 57. Qxe4 Nxf2+ 58. Kg2 Rxe4 59. Rxe4 Nxe4 60. Rg4
Nd2 61. Rxh4+ Kg7 62. Rg4+ Kh8 63. Rxf4 Nxb3 64. Rf3 Na5 65. Rf5 Nb3 66. Nd3 b6
67. Rf6 b5 68. cxb5 axb5 69. axb5 c4 70. Ne5 c3 71. b6 Nc5 72. Rd6 c2 73. Rd8+
Kg7 74. b7 Nxb7 75. Rc8 Kf6 76. Nd3 Nd6 77. Rxc2 Kf5 78. Kf3 Ne4 79. Nc5 Ng5+
80. Kg2 Ne4 81. h4 Nxc5 82. Rxc5+ Ke6 83. h5 d3 84. Rc3 d2 85. Rd3 Kf5 86. Rxd2
Kg5 87. Rd5+ 1-0

Best regards
Nick

Post by spacious_mind »

Here is the second game between Gavon RedQueen v. 0.4 and CXG Sphinx Legend.

[Event "COMPUTER TEST MATCH"]
[Site "Alabama"]
[Date "2015.03.01"]
[Round "?"]
[White "GAVON REDQUEEN V. 0.4, LV AT30."]
[Black "CXG SPHINX LEGEND, LV 53."]
[Result "1/2-1/2"]
[ECO "B24"]
[WhiteElo "1792"]
[BlackElo "1689"]
[PlyCount "114"]
[EventCountry "USA"]

1. d4 Nf6 2. c4 e6 3. Nc3 d5 4. Nf3 c6 5. e3 Bd6 {GAVON REDQUEEN V. 0.4 OUT OF BOOK} 6. c5 {CXG SPHINX LEGEND OUT OF BOOK} Be7 7. Bd3 b6 8. cxb6 axb6 9. e4 dxe4 10. Nxe4 Nxe4 11. Bxe4 O-O 12. Ne5 Bb4+ 13. Kf1 Bb7 14. Be3 f6 15. Qb3 Qe7 16. Nd3 Ba5 17. Bf5 Re8 18. Nf4 Bc8 19. Kg1 Bd7 20. Be4 Qd6 21. Rd1 Na6 22. Ne2 Nb4 23. Bf4 Qe7 24. Qg3 f5 25. Bd6 Qf7 26. a3 Nd5 27. Bf3 b5 28. Nf4 Kh8 29. Nd3 Qf6 30. Ne5 Red8 31. Nxd7 Rxd7 32. Be5 Qh6 33. h4 Bd8 34. h5 Be7 35. Rh3 b4
36. axb4 Bxb4 37. Be2 Rda7 38. Qb3 Rc8 39. Bc4 Be7 40. Qc2 Rd8 41. Bb3 Ra6 42. Bc4 Rb6 43. Rb3 Rxb3 44. Bxb3 Qxh5 45. Qxc6 Qg6 46. Ra1 Kg8 47. Ra8 Rxa8 48. Qxa8+ Bf8 49. Bxd5 exd5 50. Qxd5+ Kh8 51. Qc4 Qh5 52. Qb3 Qh6 53. Qf7 Qc1+ 54. Kh2 Qh6+ 55. Kg1 Qc1+ 56. Kh2 Qh6+ 57. Kg1 Qc1+ {CXG SPHINX LEGEND ANNOUNCES DRAW BY REPETITION} 1/2-1/2

This was a really well played game by both programs. RedQueen v. 0.4 slightly outplayed Legend right up to the end, where unfortunately RedQueen showed a lack of rules knowledge by not seeing the draw by repetition. Kg3 should have won this game for RedQueen.

So far, unfortunately, the last 3 low-end Gavon programs (JFresh, Rocinante and RedQueen) all have the same problem with recognizing draws by repetition, which is really a pity because they play some interesting and exciting chess.
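
For what it's worth, the missing rules knowledge is not hard to add: an engine only needs to count how often the identical position (side to move, castling and en passant rights included) has occurred. A minimal sketch using the python-chess library, with a placeholder PGN (paste in any of the drawn games above):

Code: Select all

import io
import chess.pgn

# Placeholder game; substitute the moves of one of the drawn games above.
pgn = io.StringIO("1. Nf3 Nf6 2. Ng1 Ng8 3. Nf3 Nf6 4. Ng1 Ng8 *")
game = chess.pgn.read_game(pgn)

board = game.board()
for ply, move in enumerate(game.mainline_moves(), start=1):
    board.push(move)
    # True once the same position (same side to move, same rights) has
    # occurred, or can be forced to occur, three times.
    if board.can_claim_threefold_repetition():
        print("Threefold repetition claimable after ply", ply)
        break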

Best regards
Nick
paulwise3
Senior Member
Posts: 1505
Joined: Tue Jan 06, 2015 10:56 am
Location: Eindhoven, Netherlands

Post by paulwise3 »

Hi Nick,
1. The H8-bug strikes again, with my Saitek Cougar. Finally reading the manual more carefully, I found out that it has two search algorithm settings: Selective and Brute Force. So I decided to start testing the BF mode. In the first test game it produced the same odd move as in the selective mode (18... Rg8), but this time it came as an immediate response, so a light went on: this must be another example of the famous Frans Morsch H8-bug ;-).

2. Today I got a second Sphinx Comet (bought for a friend) and decided to also test it with test game 2. There were again variations in the moves, and it was weaker than the first Comet. So I tried it a second time after totally resetting it, and this time the result was even worse...
I am starting to wonder whether the 30-second level is the right level to test this machine.
In one of the other test games it found a mate in two only after I raised the level to 60 seconds. And in a free game I played today, a situation came along where it made a move to prevent material loss but gave me the opportunity for a mate in two, which is probably 5 ply it has to search. Only after I raised the level to 10 minutes (B4) did it play a move to prevent the quick mate.

To round it off, I will test my first machine one more time...

Regards, Paul.

Post by spacious_mind »

paulwise3 wrote: …

Hi Paul,

Yes, the H8-bug, although it doesn't happen too often, can sometimes ruin a good game for Cosmos and the other Morsch computers that share it. It is unfortunate.

Regarding the Comet, you could try other level settings to see how it performs. I usually only play 30 seconds per move or 3 minutes per move; perhaps at 3 minutes per move it will get a chance to perform better. I have noticed, though, that this is not always the case: often at a higher level the computer will throw away a previously good move and replace it with a worse one that it just happens to prefer at the next ply.

Alain Zanchetta did some tests on this with the Fidelity Par Excellence, running the same rating test. The levels he played, if I recall, were something like 15 seconds per move, 30 seconds, 1 minute, 3 minutes and 15 minutes per move.

If I remember correctly, the Par Excellence scored best in that test game at either 30 seconds or a minute per move. You would have thought 15 minutes per move would have scored best, but it didn't; the sweet spot for the Par Excellence was 30-60 seconds per move. Besides, who plays at settings longer than 3 minutes per move? I don't know of anyone who does :P

I personally don't think scoring a computer's strength at, for example, an hour a move does a lot for me, since neither I nor anyone else plays at that setting, so the rating becomes irrelevant when you actually play. But that's just my opinion.

Best regards
Nick

Post by spacious_mind »

Below is another chart that I created to take a different look at the performance of the top 4 dedicated computers tested so far.

[image: chart of the top 4 dedicated computers, game by game]

I find this chart interesting for many reasons:

1) You can, I think, see the different programmers quite clearly.
2) The two King programs follow almost the same pattern, with CM being the stronger.
3) The most consistent performances for all the programs are recorded in games 1, 3 and 5.
4) Except for Morsch, every program scores worst in game 4.
5) Lang shows by far the worst ELO difference from best score to worst score, which means you can probably expect the biggest up-and-down swing when you play a Lang (or at least when you play a Mephisto Berlin :P ).
6) But Morsch is just as surprising: when everyone else plays a good game, as in game 2, Morsch might play extra badly.
7) The opposite applies in game 4: Morsch plays OK, whereas everyone else shows a much steeper performance drop.

Don't you just love to look at charts! :P
Nick

Post by spacious_mind »

I have added 7 new programs to the rating list, making a total of 20 that have completed all 5 tests.

The new additions are:

TASC CM 512K-32 GIDEON 3.1 NORMAL
DX2-66MHZ SARGON V
SAITEK ANALYST D - 6 MHZ
SAITEK SPARC
SAITEK MAESTRO D - 12 MHZ
SAITEK MAESTRO B - 6 MHZ
SAITEK RENAISSANCE

[image: updated rating list]

There are a couple of surprises. Saitek Analyst D 6 MHz, for example, performed better than both Saitek Sparc and Saitek Maestro D 12 MHz.

I also added the Tasc CM 512K-32 MHz with Gideon 3.1 Madrid on the Normal setting. This is actually the fastest CM card, and Gideon 3.1 performed well, finishing ahead of King.

I also played Sargon V under DOS on a 486 DX2-66 MHz computer, in order to compare it with Saitek Sparc. The chart below shows quite clearly the strong relationship between Saitek Sparc and Sargon V:

[image: chart comparing Saitek Sparc and Sargon V]

If you follow Saitek Sparc and Sargon V you will see that the pattern is almost identical; the main difference, I think, is that the DOS computer runs a lot faster at 66 MHz. But you can also see how atrociously both played in game 4. That game was their nemesis, just as it is for Lang's Mephisto Berlin. Both Lang and Spracklen are absolutely clueless in positions similar to those in game 4.

In contrast, take a look at Gideon 3.1: it scored an unbelievable 2476 in game 4!

Had Sargon V played game 4 better, it might well have ended up as the top-rated program. In game 3, for example, it scored top with 2768 ELO, then dropped to 1630 ELO in game 4. That is a drop of 1138 points! Amazing inconsistency. Sparc was no better: it dropped from a game 3 high of 2611 down to 1656 in game 4!

I think the chart and test games really help to get a better understanding of strengths and weaknesses.

The most well-balanced programmer in these tests, by the look of it, is Ed Schroeder.

Best regards
Nick
Brian B
Member
Posts: 74
Joined: Mon Jun 09, 2014 10:37 pm

Post by Brian B »

spacious_mind wrote: …

This is very interesting, Nick; thanks for doing these tests with the additional computers, plus the charts. Do the Sparc and Sargon V pass the clone test, or are they too different for that? I wonder whether, if you slowed Sargon V down, it would mimic the Sparc even more?

Thanks,
Brian B

Post by spacious_mind »

Brian B wrote: …

Below is test game 3, which shows the similarities between Sparc and Sargon V. It is hard to say whether they are exact clones because of the speed difference.

[image: move-by-move comparison of test game 3]

2 move variations as White = 89.5% the same.
4 move variations as Black = 78.9% the same.

6 move variations for the whole game = 84.2% the same.

Considering the big speed difference, the closeness is quite impressive. I do have a 386 laptop, which I will try sometime to see whether it brings them even closer.
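
For anyone who wants to run the same clone-test arithmetic, here is a small sketch; the move lists are placeholders, so fill in the moves both machines actually played:

Code: Select all

# Clone-test arithmetic: percentage of positions in which both engines
# chose the identical move. 17 of 19 matching white moves gives the
# 89.5% figure above; 32 of 38 over the whole game gives 84.2%.

def match_rate(moves_a, moves_b):
    pairs = list(zip(moves_a, moves_b))
    same = sum(1 for a, b in pairs if a == b)
    return 100.0 * same / len(pairs)

# Placeholder move lists; use the real moves from test game 3.
sparc  = ["e4", "Nf3", "d4", "Nxd4", "Nc3"]
sargon = ["e4", "Nf3", "d4", "Nxd4", "Bd3"]
print(f"{match_rate(sparc, sargon):.1f}% the same")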

Best regards
Nick

Post by spacious_mind »

I played Game 1 tonight with Sargon V on:

LAPTOP - DELL 325NC 386SL-25 MHZ
and
DESKTOP - MAGNAVOX MAXSTATION 386SX-16 MHZ

The table below shows the results compared with the Saitek Sparc and the 486 DX2-66 MHz computer.

[image: results table]

Well, it looks as if the Sparc processor is faster than a 386 computer. Perhaps a 486-25 MHz or 486-33 MHz machine is needed for a closer comparison of Sargon V to Saitek Sparc.

The 386-25 MHz laptop and the 386-16 MHz desktop played every move exactly the same in the above test.

Best regards
Nick