Reinfeld wrote: So R.. your running a test with 1 Kaplan and 3 Morsch's?
seems a bit stacked to me
That's a perplexing comment, considering the title of the thread. RadioSmall insists that 2250XL is a Kaplan. If he's right, the test I ran compares *two* Kaplans against three Morschs. (If you concede Nick's use of the Corona, that's *three* Kaplans).
Apart from that, the stacking is the point. Using several Morsch machines establishes a baseline of behavior: more clones should duplicate that behavior (do you hear the sound of inevitability, Mr. Anderson?). If the Morsch machines behave similarly, that's worth noting - the pattern provides a basis for comparison. If RadioSmall is right, the 2250XL should behave less like a Morsch and more like a Kaplan - but it doesn't. Just the opposite.
why compare computers with wildly different ratings?
Kaplan produced engines rated up to 2000 Elo and Morsch released computers as low as mid 1800's
For one thing, I own only one Kaplan - the 2150L. Thus, my means are limited.
However, the 2150L was the machine specifically proposed by RadioSmall earlier in the thread after I volunteered. See below:
Reinfeld:
Obviously, the best direct test is 2250XL and another Morsch machine vs the best *verified* Kaplan machine (chosen by RadioSmall). Gentlemen, start your engines.
RadioSmall:
A good test machine is the RS 2150L, which is a verified Kaplan program.
Earlier in the thread, we discussed the strongest known Kaplan machines, sans modules. I suggested Turbo King II and Simultano - alas, I don't own them. For my money, they would provide a better comparison.
there is still something that bothers me about handpicking some test positions, or even the BT test suites... which were never meant for clone identification, but rather meant for determining the relative strength of a computer compared to other computers... and now using them as iron clad clone detecting barometers
Again, some parsing - these are not test positions or test suites invented to measure computers. These are master games, published in a 1957 rate-yourself book by Leonard Barden. We all grew up with these types of books. Ditto for the solitaire chess feature that appeared for decades in Chess Life.
I don't know about anyone else, but I've always gotten a kick out of comparing tabletops and master games. The Barden book does the same thing - it just adds a point value (admittedly the scoring scale is a bit arbitrary). Like Nick, I'm interested in the validity of testing a machine in this manner. It certainly seems to be a fair measure of strength. The question is whether it can also be used as a clone detector. My instinct is it can.
I am not suggesting that these tests are "iron clad clone barometers." I am suggesting that they provide one more way (among many) to measure machine behavior. Is anyone disputing that machines by the same programmer tend to play a higher percentage of similar moves? If that's true, it seems fair to measure a disputed machine by the same yardstick. If that machine behaves more like one programmer than another, it's reasonable to infer that the machine is related to said programmer. That's not iron clad, by any means - but it seems *reasonable.*
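The "similarity score" yardstick described above boils down to simple arithmetic: give each machine the same set of test positions, record the move each one chooses, and score the percentage of positions where two machines agree. A minimal sketch of that calculation, assuming the machine names and moves below are illustrative placeholders rather than the thread's actual test data:

```python
def similarity(moves_a, moves_b):
    """Percentage of positions where both machines chose the same move."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both machines must be run on the same positions")
    matches = sum(a == b for a, b in zip(moves_a, moves_b))
    return round(100 * matches / len(moves_a))

# Hypothetical example: two machines agree on 3 of 4 test positions.
machine_x = ["e4", "Nf3", "Bb5", "O-O"]
machine_y = ["e4", "Nf3", "Bc4", "O-O"]
print(similarity(machine_x, machine_y))  # 75
```

A higher score means the disputed machine's move choices track one programmer's known machines more closely than the other's - exactly the comparison the scores below report.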
I understand the criticism that differently rated machines will play differently, and it would be better to have closer approximations of strength. That's the power of the forum, of course - anybody can add a machine to the test and build a bigger dataset.
At the same time, I see wisdom in this earlier point from Nick, because I've seen it so many times:
A different program will do exactly the same thing at different speeds. The results remain constant.
With that in mind, look at his results with the Corona, since that's (purportedly) a stronger machine, much closer to the known Morsch group. Recall that the original contention is that the 2250XL is a Kaplan.
Similarity scores:
2250XL vs 2150L – 40%
2250XL vs Corona – 45%
2250XL vs 2200X – 75%
2250XL vs GK 2100 – 85%
2250XL vs Explorer Pro – 90%
2250XL vs TC 2100 – 100%
- R.