Another test game for Nick

Reinfeld
Member
Posts: 486
Joined: Thu Feb 17, 2011 3:54 am
Location: Tacoma, WA

Post by Reinfeld »

I'd like to propose a new game for the tests in this exercise Nick started, under similar conditions. Anyone can try it, and perhaps Nick can use his rating system/algorithm to devise measurements (that's the part I don't know how to do). Other wrinkles may add some spice. I'll start the ball rolling with results from some machines we've been discussing. Those are at the end.

The game is GM Lobron v Deep Thought 2, played in 1991. It appears in Daniel King's "How Good is Your Chess" (1993, Cadogan), a rate-your-play book with 20 games. The format is familiar: guess the move, get points.

It's an interesting game for the computer v human factor, and Lobron won, so that's another plus. But testing could answer a different question - how strong was Deep Thought, really? Historical ratings put it between 2550 and 2600. Can any of our machines crush it like a bug?

The errors in this game are subtler than those in the other test games (15th-century masters and the Turk in 1770). The mistakes are not glaring - no obvious crushers. Playing the game through gives the impression of Lobron outrunning Deep Thought.

Here's the game. The test in King's book starts at White's MOVE 11:

[Event "IBM Cup"]
[Site "Hannover 1991"]
[Date "2013.06.13"]
[Round ""]
[White "Lobron"]
[Black "Deep Thought 2"]
[ECO "A11"]
[Result "1-0"]

1.Nf3 d5 2.g3 c6 3.Bg2 Bg4 4.c4 e6 5.b3 dxc4 6.bxc4 Nd7 7.Bb2
Qb6 8.Qc2 Ngf6 9.O-O Bd6 10.d3 O-O 11.Nbd2 e5 12.Rab1 Qa6 13.h3
Be6 14.Ng5 (14.d4) 14...Bf5 15.Bc3 (15.Ba1) (15.e4) 15...Nc5
16.e4 Bg6 17.f4 exf4 18.gxf4 Na4 19.Ba1 (19.f5) (19.e5) 19...Nd7
20.e5 Bc5+ 21.Kh2 Be3 22.Nge4 Bxd2 23.Nxd2 Bf5 24.Be4 Bxe4 25.Nxe4
Nac5 26.Nd6 b6 27.Rg1 g6 28.f5 Nb7 29.Ne4 Qa3 30.Qd2 Nbc5 31.Qh6
Qxa2+ 32.Rg2 1-0

I ran three machines through this test tonight at an average of 60 seconds per move:

- Systema Challenge (Level 58)
- Excalibur Igor (Level 58)
- Fidelity Designer 2000 (no display) (Level 9)

I added the Fidelity because I want to see the minor strength differences among the D2000 and D2100 models (display vs. no display). I've got three of them; the only one missing is the D2100 without display (model 6103).

Also another chance to compare Challenge and Igor. I know, I know - Nick says it's all about tuning, can't judge the programmer. But I couldn't help noticing the two machines disagreed on 65 percent of the moves for White and Black.

Here are the moves for all three machines. Rather than adding complexity, I'm starting all of them at MOVE 11:

IGOR

WHITE
11 Nd2
12 Bc3
13 Bc3
14 Ng5
15 Be4
16 Ne4
17 Nb3
18 gxf4
19 f5
20 f5
21 Kh2
22 Nde4
23 Qxd2
24 Be4
25 Nxe4
26 Nd6
27 Rg1
28 Bc3
29 Qb2
30 e6
31 Rg3
32 Rb2

BLACK
11 Bxf3
12 Qa6
13 Bf5
14 Rfe8
15 Nc5
16 Bd7
17 Nh5
18 h6
19 h6
20 Bc5+
21 Be3
22 Nac5
23 Rfe8
24 Bxe4
25 Nac5
26 b6
27 Qa3
28 Rad8
29 Nbc5
30 Nbc5
31 Nxe4

SYSTEMA CHALLENGE

WHITE
11 Nd2
12 e3
13 Bc3
14 a4
15 e4
16 e4
17 Ngf3
18 gxf4
19 Be5
20 f5
21 Kh2
22 Rf3
23 Qxd2
24 Ne4
25 Nxe4
26 Nxc5
27 d4
28 d4
29 fxg6
30 e6
31 Nf2
32 Rb2

BLACK

11 Rfd8
12 Qc7
13 Be6
14 Bf5
15 h6
16 Bg6
17 exf4
18 Be7
19 Be7
20 Bc5+
21 Be7
22 Nac5
23 Bf5
24 Bxe4
25 Nab6
26 Rab8
27 g6
28 Nb7
29 Nbc5
30 Nbc5
31 Nxe4

FIDELITY D2000

WHITE

11 Nc3
12 Rab1
13 Bc3
14 Ng5
15 Bc3
16 e4
17 d4
18 gxf4
19 Be5
20 f5
21 Kh2
22 e6
23 Qxd2
24 Rg1
25 Nxe4
26 Nd6
27 Qg2
28 Bc3
29 Nxb7
30 Bc3
31 Bc3
32 Rb2


BLACK

11 e5
12 Qa6
13 Bf5
14 Bf5
15 h6
16 Bd7
17 exf4
18 Ncd7
19 Bc5
20 Bc5+
21 Bf5
22 Nac5
23 Bf5
24 Bxe4
25 Nac5
26 b6
27 g6
28 Rab8
29 Qa3
30 Nbc5
31 Nxe4
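
For anyone who wants to recheck the 65 percent disagreement figure between Igor and Challenge, here is a minimal Python sketch. It assumes the move lists above are transcribed exactly and compares them as literal text, move number by move number, so it makes no attempt to recognize the same move written in a different notation. The function and variable names are made up for illustration.

```python
# Moves transcribed from the Igor and Systema Challenge lists above,
# starting at move 11. Comparison is by literal move text only.
IGOR_WHITE = (
    "Nd2 Bc3 Bc3 Ng5 Be4 Ne4 Nb3 gxf4 f5 f5 Kh2 "
    "Nde4 Qxd2 Be4 Nxe4 Nd6 Rg1 Bc3 Qb2 e6 Rg3 Rb2"
).split()
CHALLENGE_WHITE = (
    "Nd2 e3 Bc3 a4 e4 e4 Ngf3 gxf4 Be5 f5 Kh2 "
    "Rf3 Qxd2 Ne4 Nxe4 Nxc5 d4 d4 fxg6 e6 Nf2 Rb2"
).split()
IGOR_BLACK = (
    "Bxf3 Qa6 Bf5 Rfe8 Nc5 Bd7 Nh5 h6 h6 Bc5+ Be3 "
    "Nac5 Rfe8 Bxe4 Nac5 b6 Qa3 Rad8 Nbc5 Nbc5 Nxe4"
).split()
CHALLENGE_BLACK = (
    "Rfd8 Qc7 Be6 Bf5 h6 Bg6 exf4 Be7 Be7 Bc5+ Be7 "
    "Nac5 Bf5 Bxe4 Nab6 Rab8 g6 Nb7 Nbc5 Nbc5 Nxe4"
).split()


def count_disagreements(moves_a, moves_b):
    """Count positions where two equal-length move lists differ."""
    if len(moves_a) != len(moves_b):
        raise ValueError("move lists must be the same length")
    return sum(1 for a, b in zip(moves_a, moves_b) if a != b)


white_diff = count_disagreements(IGOR_WHITE, CHALLENGE_WHITE)
black_diff = count_disagreements(IGOR_BLACK, CHALLENGE_BLACK)
total_moves = len(IGOR_WHITE) + len(IGOR_BLACK)
disagreement_pct = 100 * (white_diff + black_diff) / total_moves

print(f"White: {white_diff} of {len(IGOR_WHITE)} moves differ")
print(f"Black: {black_diff} of {len(IGOR_BLACK)} moves differ")
print(f"Overall disagreement: {disagreement_pct:.1f}%")
```

With the lists as posted, this gives 13 of 22 differing moves for White and 15 of 21 for Black, or 28 of 43 overall (65.1%).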

- R.
"You have, let us say, a promising politician, a rising artist that you wish to destroy. Dagger or bomb are archaic and unreliable - but teach him, inoculate him with chess."
– H.G. Wells
spacious_mind
Senior Member
Posts: 4015
Joined: Wed Aug 01, 2007 10:20 pm
Location: Alabama

Post by spacious_mind »

Hi Reinfeld,

Daniel King's book is a good one. It takes quite a lot of work to create one of my rating games, so bear with me; it might take me a little while to get it done.

Out of curiosity, how did the programs score with Daniel King's rating?

Best regards
Nick

Post by Reinfeld »

Hi Nick,

Note that King's book only scores White's moves. Here's what I've got so far (note that I ran several Fidelitys through the test):

KING SCORES (70 points possible)

(King categorizes scores from 21-30 as befitting a "club player.")

Excalibur Igor - 29
Fidelity Designer 2100 Display - 28
Fidelity Chesster Challenger - 27
Fidelity Designer 2000 - 27
Fidelity Designer 2000 Display - 27
Systema Challenge - 24

- R.