Hi Brian,

Brian B wrote:
Hi Nick, thanks for your thoughts.

spacious_mind wrote:
Hi Brian,

Brian B wrote:
First off, I really enjoy reading about these tests and the relative performance of these computers against each other and the occasional human. It is an incredible amount of work and I do appreciate the effort.
I was wondering if there should be some way to weight certain moves at a higher level? Granted, if a computer makes a bonehead move, it will pay the price with a very low rating. It seems to me, though, that a rating of zero wouldn't penalize a computer enough if the move came at a critical point in the game. How many games are decided not by great moves, but by one costly mistake? The results for these computers are relative to each other, and short of having all these computers play games against one another, I wish there were a way to identify a key move and grade it accordingly.
Regards,
Brian B
Each move is weighted from very bad to very good by a consistent evaluation formula that is not changed by a human. The problem with a human adding extra weight to certain moves, beyond what the consistent formula gives, is that you reintroduce bias: human opinions differ and do not treat everything equally. A consistent formula does.
My rating really does not care who is being evaluated, and my opinion on what is good or bad is irrelevant, since my opinion is not a factor in the evaluation.
Best regards
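The thread doesn't give the actual formula used in these tests, but the idea of a fixed, human-independent weighting could be sketched like this. Everything here is hypothetical for illustration: the function name, the 0–10 scale, and the loss thresholds are invented, not the real formula.

```python
# Hypothetical sketch: score each played move against a reference
# engine evaluation, applying one fixed formula to every computer.
# Scale and thresholds are invented for illustration only.

def score_move(eval_played: float, eval_best: float) -> int:
    """Map a move to a 0-10 score from its loss versus the best move.

    eval_played / eval_best are evaluations in pawns from the side
    to move's point of view, so the loss is >= 0 in normal cases.
    """
    loss = eval_best - eval_played
    if loss <= 0.05:   # essentially the best move
        return 10
    if loss <= 0.25:   # small inaccuracy
        return 7
    if loss <= 0.90:   # clear mistake
        return 3
    return 0           # blunder: flat zero, no human adjustment

# The same formula is applied to every move of every machine,
# so no human opinion enters the weighting.
moves = [(0.30, 0.32), (0.10, 0.85), (-2.0, 0.10)]
print([score_move(p, b) for p, b in moves])  # [10, 3, 0]
```

Whether a human or a committee picks the thresholds once up front, the point in the post stands: after that, every move of every machine is graded by the same rule.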
I wasn't thinking of a human evaluation; I was thinking of a computer evaluation of the resulting position after a given move. So, not an evaluation of the move itself, but of the position after the move. I'm not sure how it could be done, but if there were a way to rate the actual position after a move as either winning or losing, it would be interesting. Something like this is being done now by rating the move itself, yet I don't see a true order of magnitude. How bad is a really bad move? I understand that this idea isn't very practical; just food for thought.
Regards,
Brian B
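The "how bad is a really bad move?" question is essentially what centipawn loss measures: evaluate the position after the played move and after the engine's best move, and take the difference. A minimal sketch of that idea, with made-up evaluation numbers rather than real engine output:

```python
# Hypothetical sketch of grading by the resulting position rather
# than the move itself: the magnitude of a mistake is the drop in
# evaluation relative to the best available move (centipawn loss).

def centipawn_loss(eval_after_best: int, eval_after_played: int) -> int:
    """Both evaluations in centipawns, from the mover's point of view."""
    return max(0, eval_after_best - eval_after_played)

# Made-up example game: each tuple is (best, played) in centipawns.
game = [(20, 20), (35, 10), (50, -320), (-300, -310)]
losses = [centipawn_loss(b, p) for b, p in game]
print(losses)       # [0, 25, 370, 10]
print(max(losses))  # 370 -> the one costly mistake that decided the game
```

This gives exactly the order of magnitude asked about: a 370-centipawn drop is visibly worse than a 25-centipawn inaccuracy, instead of both just scoring "bad".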
Scaling the bad moves, I think, does not add much, since bad is bad. What you do instead is reward the better moves on a scale. The magnitude grows out of the total number of good moves compared to the number of bad moves, so the overall achievement is reflected in the combined score of all the evaluated moves.
If you grade too few moves, your universe is too small for overall accuracy; e.g. 40 hand-picked test positions is too small a universe, and to some degree biased because of the hand-picking of the positions. Even if the picking is done by consensus, it is still opinionated.
Anyway, those are my thoughts on this.
Best regards