Search or Evaluation?

xenophon · Post by **xenophon** » Mon Dec 03, 2007 5:13 pm

bob wrote:
xenophon wrote:
bob wrote:
ricard60 wrote:In computer chess tournaments I have seen the following:

1) A computer with an 8-bit processor beat a computer with a 16-bit processor (better software).

2) But I have never seen a computer with better hardware and software lose to a computer with weaker hardware and software.

Best
Ricardo
I've seen that. Fritz vs Deep Blue prototype
Bad example. IIRC, Deep Blue was having a severe hardware problem in that game.

bob wrote: , Hong Kong, 1995 comes to mind. But there have been others. If you play a weak program/computer against a strong program/machine, the weak one will _still_ win some games...
That was the point. There are many different reasons why this will happen, from hardware/software failures to just random luck.

Most people don't know there was a severely limiting circumstance for IBM in that game. Every time it is cited, the impression is given that Fritz beat a fully functioning Deep Blue Project, which is a false impression.

bob wrote:I would _never_ bet on a computer to beat another, no matter how big the rating difference, if the potential loss of bet is important.

bob · Post by **bob** » Mon Dec 03, 2007 6:48 pm

xenophon wrote:
bob wrote:
xenophon wrote:
bob wrote:
ricard60 wrote:In computer chess tournaments I have seen the following:

1) A computer with an 8-bit processor beat a computer with a 16-bit processor (better software).

2) But I have never seen a computer with better hardware and software lose to a computer with weaker hardware and software.

Best
Ricardo
I've seen that. Fritz vs Deep Blue prototype
Bad example. IIRC, Deep Blue was having a severe hardware problem in that game.

bob wrote: , Hong Kong, 1995 comes to mind. But there have been others. If you play a weak program/computer against a strong program/machine, the weak one will _still_ win some games...
That was the point. There are many different reasons why this will happen, from hardware/software failures to just random luck.
Most people don't know there was a severely limiting circumstance for IBM in that game. Every time it is cited, the impression is given that Fritz beat a fully functioning Deep Blue Project, which is a false impression.

bob wrote:I would _never_ bet on a computer to beat another, no matter how big the rating difference, if the potential loss of bet is important.

Two key points to make this correct...

1. It was not Deep Blue, it was "Deep Blue prototype", which was defined as "Deep Blue software running on Deep Thought hardware". It was not anywhere near the machine that lost to Kasparov in 1996, and that machine was not anywhere near the machine that beat Kasparov in 1997.

2. The machine was running fine. The problem was a communication failure that required restarting Deep Blue prototype, and it lost a significant amount of time in the process. Instead of having the usual time saved by correctly predicting (and pondering) the opponent's move, it had to start from scratch, having lost the correct pondering information and some clock time as well. It quickly had a panic attack and moved too quickly, losing the game...

But, as I mentioned, that was my point. DB prototype was clearly far stronger than Fritz, but it lost the game. Never bet on a computer...

ricard60 · Post by **ricard60** » Mon Dec 03, 2007 8:01 pm

The match I saw, in which a computer with an 8-bit processor beat a computer with a 16-bit processor, was a 6-game match: the 8-bit computer won 4 and the other 2 were drawn. I don't think this is random luck.

Best
Ricardo

bob · Post by **bob** » Tue Dec 04, 2007 5:50 pm

ricard60 wrote:The match I saw, in which a computer with an 8-bit processor beat a computer with a 16-bit processor, was a 6-game match: the 8-bit computer won 4 and the other 2 were drawn. I don't think this is random luck.

Best
Ricardo

Possibly not. If program A is far better than program B, then A should win far more games. But not _every_ game. Luck is still there. But the central limit theorem still applies since the standard deviation is not zero.

I've had versions of my chess program that _anyone_ could easily beat. But the good programs can still lose to worse programs. 6 games don't say anything, as the probability of flipping 6 heads is 1/64, and 6 heads does not guarantee that the 7th flip will also be a head any more than it guarantees that the 7th flip will be a tail. Over 6 games anything _can_ happen. Over hundreds of games, anything _will_ happen. That was my point. DT was far stronger than Fritz in 1995; anybody who believes otherwise is fooling themselves. But it lost a game due to non-chess-related circumstances, which is still a factor in playing games between computers...
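The arithmetic behind this argument can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the post: games are treated as independent, draws are ignored in the upset calculation, and the function names and the per-game upset probability `p_upset` are made up for the example. `expected_score` uses the standard Elo formula, which gives an expected score (wins plus half of the draws), not a pure win probability.

```python
def prob_all_heads(flips):
    """Probability that a fair coin lands heads on every one of `flips` tosses."""
    return 0.5 ** flips


def prob_at_least_one_upset(p_upset, games):
    """Probability that the weaker side wins at least one of `games` games.

    p_upset is a hypothetical per-game chance that the weaker side wins;
    games are assumed independent.
    """
    return 1.0 - (1.0 - p_upset) ** games


def expected_score(rating_diff):
    """Elo expected score for the side that is `rating_diff` points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


if __name__ == "__main__":
    print(prob_all_heads(6))                    # 0.015625, i.e. 1/64
    print(prob_at_least_one_upset(0.05, 100))   # upset chance over 100 games
    print(expected_score(400))                  # 400-point favourite
```

Even with a small assumed upset chance per game, the chance of at least one upset grows quickly with the number of games, which is the "over hundreds of games" point above.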

m0nkee1 · Post by **m0nkee1** » Sat Dec 29, 2007 12:09 pm

Hi

I think there's a bit of luck in chess... I won with this line once:
1. e4 e5 2. Bc4 Nc6 3. d3 Na5 4. Bxf7+ ... I think it forces a win, but there are thousands of similar positions where an early Bxf7 is silly and loses.
There's an element of luck in stumbling onto a winning or losing line.

Also, when a computer searches 8 moves ahead, it only looks at about 1.5% of the total number of moves. A sorting system rules out weak or illogical moves, and this can be turned to your advantage.

If a piece is en prise, the computer's tree of thought will be based around moving or protecting it in some way.
If you ignore the attacked piece and move another piece closer to the computer's king, it'll carry on thinking about the exchange and not see the danger. I think the quality of the search matters as much as its maximum depth.

best regards

Simon
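As a rough illustration of how a search can skip most of the tree, here is a minimal Python sketch comparing a full minimax search with alpha-beta pruning. It runs on a synthetic tree with random leaf scores, not on chess positions, and it does not try to reproduce the 1.5% figure quoted above; the tree shape, seed and function names are assumptions made for the example. Plain alpha-beta always returns the same value as the full search. It is the more aggressive selective pruning some programs add on top that can overlook a threat in the way described.

```python
import random


def build_tree(depth, branching, rng):
    """Build a uniform game tree as nested lists; leaves are integer scores."""
    if depth == 0:
        return rng.randint(-100, 100)
    return [build_tree(depth - 1, branching, rng) for _ in range(branching)]


def minimax(node, maximizing, counter):
    """Full-width search; counter[0] counts the leaves that were evaluated."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, alpha, beta, counter):
    """Alpha-beta search; stops examining siblings once a cutoff is proven."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, counter))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent already has a better option elsewhere
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best


def compare(depth=6, branching=4, seed=1):
    """Return (minimax value, alpha-beta value, minimax leaves, alpha-beta leaves)."""
    tree = build_tree(depth, branching, random.Random(seed))
    full, pruned = [0], [0]
    v_full = minimax(tree, True, full)
    v_pruned = alphabeta(tree, True, float("-inf"), float("inf"), pruned)
    return v_full, v_pruned, full[0], pruned[0]


if __name__ == "__main__":
    print(compare())
```

How much gets pruned depends heavily on move ordering: the better the first move tried at each node, the earlier the cutoffs come.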

Uri Blass · Post by **Uri Blass** » Fri Mar 07, 2008 7:25 am

ed wrote:
Mark Uniacke wrote:I agree they are the two general purpose major advances although there are other ones that also offer extra Elo, like futility or even eliminating losing captures from the quiescence search. There are also many other search improvements which overlap with 1 & 2 and hence are much less effective, but without 1&2 existing then these other search improvements would be effective.

Also although not a general technique I have found search extensions to be extremely effective and I can see how implementation of search extensions could have another big impact on the strength (or otherwise) of a chess program.

I don't think it is possible to rely on the displayed search depths because some programs don't display their true depths. Additionally as you well know there are many factors, sometimes it is not the iteration depth but the importance of not pruning a critical new line of play that is key.

15 years ago a number of us were guilty of tuning against position test sets which were usually tactical. This was compounded because important publications like CSS used to run features on new programs performing against the BT test or the LCT2 test or ...

So it also became even commercially important to do well in these test sets. Of course there is a very loose relationship between test positions and chess strength in games and I think understanding the reasons for that is very important in understanding where chess strength in games comes from.

It is your last statement where we disagree. It is clear to me that the big + and - in Elo comes from search and the gradual accumulation of rating points comes from eval. I believe the two mentioned search breakthroughs above can be implemented in various different ways and the range of strength improvement is still quite large.

So I think there is plenty of room still to improve chess programs in both search and eval but I believe ultimately more strength comes from the search than it does from the eval.
Hi Mark and Uri,

Our discussion is as old as the birth of computer chess and we should keep it going and redo at times until we are in agreement.

My take, let's evaluate some history:

Richard Lang ruled the mid 80's till about 1992. Richard has admitted that his domination mainly came from his advantage of the better hardware he got from Mephisto. IOW: search dominance.

1991/92: Tasc came with the ChessMachine hardware 2 times faster than his hardware, Richard lost his world-title. Search dominance again.

1993/1994: Intel enters the scene, the Pentium, the end of the dedicated industry. You win the 1993 WCCC in Munich.

1996: Rebel8 tops the SSDF with a +60 elo gap. What was Rebel8 about? Search improvements mainly. Search rules.

1998: Fritz5, the introduction of a new concept, Nullmove. Frans rules the computer chess world for a couple years. Search, search, search...

Eventually other programs catch up, Fritz loses its superiority.

2000/2001: Shredder discovers LMR and tops all rating lists for years to come. Search did the trick again...

But, but, but... and this is crucial, eventually other programs catch up, ending the Shredder hegemony.

We are now living in the Rybka era, nobody yet knows its secret. Search only? The 2 of you seem to suggest it.

I disagree.

Why?

Although (I think) it's a proven fact that search has ruled the computer chess area (see the history above) from its early existence till now, that tendency can be quite misleading for the future; it certainly can't be automatically assumed, as the 2 of you do.

My thesis: while it is certainly true that implementation matters it's wrong to assume this can explain the big difference between Rybka and the other tops. It's like saying Vas is a search genius, the rest is incompetent. I don't buy that, even if it is true eventually others will catch up.

Why claim the success of Rybka is search? There is no proof of that. Maybe your search (and others) is better than Rybka, who can tell?

Nowadays with ease the current tops are searching at 14-16 plies covering about all possible tactics, so what is left? IMHO, asking the question is answering the question.

Food for thought?

Ed

I think that today it is possible to test this, because Strelka2's code is free.

Change your evaluation to a piece-square-table evaluation.
Change Strelka to the same piece-square-table evaluation.

If your program loses against Strelka, then it is clear that search is the reason.

Uri
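For readers unfamiliar with the term, a piece-square-table evaluation scores a position as material plus a fixed bonus for the square each piece stands on, with no other chess knowledge. The Python sketch below shows roughly what such an evaluation looks like. All values, the `centre_bonus` table and the position format are hypothetical and chosen for illustration; none of it is taken from Strelka or any other engine.

```python
# Hypothetical material values in centipawns (illustration only).
PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}


def centre_bonus(square):
    """Toy piece-square term: 30 points in the centre, falling to 0 at the edge.

    Squares are numbered 0 (a1) to 63 (h8).
    """
    file, rank = square % 8, square // 8
    file_dist = max(3 - file, file - 4)  # 0 on the d/e files, 3 on the a/h files
    rank_dist = max(3 - rank, rank - 4)  # 0 on ranks 4/5, 3 on ranks 1/8
    return 10 * (3 - max(file_dist, rank_dist))


# One table per piece type; a real engine would use a different table for each.
PST = {kind: [centre_bonus(sq) for sq in range(64)] for kind in PIECE_VALUE}


def evaluate(position):
    """Score from White's point of view.

    `position` maps a square index to a piece letter:
    uppercase for White, lowercase for Black.
    """
    score = 0
    for square, piece in position.items():
        kind = piece.upper()
        if piece.isupper():
            score += PIECE_VALUE[kind] + PST[kind][square]
        else:
            # Mirror the rank so Black reads the table from its own side.
            score -= PIECE_VALUE[kind] + PST[kind][square ^ 56]
    return score


if __name__ == "__main__":
    # White knight on e4 (square 28) against a black knight on a8 (square 56).
    print(evaluate({28: "N", 56: "n"}))  # 30
```

If two programs share an evaluation like this, the evaluation contributes the same knowledge to both sides, so a lopsided match result would point to the search, which is the idea behind the proposed test.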