Human game Appleshampogal vs Novag Obsidian

Monsieur Plastique
Senior Member
Posts: 1014
Joined: Thu Jul 03, 2008 9:53 am
Location: On top of a hill in eastern Australia

Post by Monsieur Plastique »

I am now also told the build date should actually appear on the bottom of the computer housing itself. However, not possessing an Obsidian, I cannot verify this. It is possible, however, that Steve's Obsidian was built in week 82 of the Martian year. That would work out mathematically as it probably would have taken from 1916 until now to fly to Earth from Mars given the ancient technology they had in those days.
Chess is like painting the Mona Lisa whilst walking through a minefield.
Fernando
Admiral of the Fleet
Posts: 3059
Joined: Tue Jul 31, 2007 4:35 pm
Location: Santiago de Chile

Post by Fernando »

Monsieur Plastique wrote:I am now also told the build date should actually appear on the bottom of the computer housing itself. However, not possessing an Obsidian, I cannot verify this. It is possible, however, that Steve's Obsidian was built in week 82 of the Martian year. That would work out mathematically as it probably would have taken from 1916 until now to fly to Earth from Mars given the ancient technology they had in those days.
After reading all this long technical thread I feel myself prepared to present a job application to NASA.

Thank you all

Fern
Festina Lente
fourthirty
Full Member
Posts: 763
Joined: Fri Dec 06, 2013 8:46 pm
Location: San Francisco

Post by fourthirty »

appleshampogal wrote:Just for giggles, and other things... I conducted another few repetitions.

1. d4, d5
2. d4, d5
3. d4, d5
4. d4, c5!!

There it is. It isn't common, but as far as my Obsidian goes the c5 flank move is DEFINITELY present in Obsidian's repertoire.
Okay, now that the SF Giants have defeated the Pirates in the Wildcard game I can relax & pull out Obsidian #2 to conduct another opening experiment.

I purchased Obsidian #2 in a quest to find a unit with a working NEXT BEST function. Unfortunately, even though the instruction manual does not include the addendum, I can confirm the NEXT BEST feature does not work. Sigh.

Obsidian #2 manual is coded 85-661-004 (does NOT include addendum regarding the Obsidian not having the NEXT BEST function)
manual copyright-2002
Date Code = 08W38 (on bottom of Obsidian and on the box)
Serial number 212636

Results of Openings Test (Level was set to Maximum Average Time Level AT16 to match Kat's game above):

RANDOM = 0
1.e4
1...e5 (19/20 95%)
1...c5 (1/20 5%)


What? Something other than e5? I decided to continue on to test fifty (50) King's Pawn e4 openings!

1.e4
1...e5 (48/50 96%)
1...c5 (2/50 4%)


Also, played a Queen's pawn opening:
1.d4
1...d5 (20/20 100%)

Never received the c5 response like Kat's unit.

RANDOM = 1
1.e4
1...e5 (20/20 100%)

With Queen's pawn opening:
1.d4
1...d5 (19/20 95%)
1...Nf6 (1/20 5%)



RANDOM = 2
1.e4
1...c5 (19/20 95%)
1...c6 (1/20 5%)


Again, 1...e5 is never played???

RANDOM = 3
1.e4
1...c5 (20/20 100%)


Not so random...

Well, Obsidian #2 appears to behave similarly to unit #1.
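
For anyone repeating the openings test on their own unit, the tallies above can be reproduced with a few lines of Python. This is only a sketch: the function name `reply_shares` is made up for illustration, and the input is simply the list of first replies written down by hand after each new game.

```python
from collections import Counter


def reply_shares(replies):
    """Return {move: (count, percent)} for a list of observed first replies."""
    counts = Counter(replies)
    total = len(replies)
    # most_common() orders the moves from most to least frequent
    return {move: (n, round(100 * n / total)) for move, n in counts.most_common()}


# RANDOM = 0, fifty games after 1.e4 (counts taken from the results above)
observed = ["e5"] * 48 + ["c5"] * 2
print(reply_shares(observed))
```

Logging each RANDOM setting as its own list makes it easy to compare units side by side.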

Monty Python and the Quest for the Class A Obsidian regards,
Greg
Monsieur Plastique
Senior Member
Posts: 1014
Joined: Thu Jul 03, 2008 9:53 am
Location: On top of a hill in eastern Australia

Post by Monsieur Plastique »

Thanks Greg. Well, at least your contributions confirm my hypothesis that serial number sequence is neither proof of relative build date nor does it have a relationship to good and bad Obsidians. Larry's machine at the very least has a much lower serial number than yours yet appears to be a "good" one. Kat's also has a much lower serial (similar range to Larry's) and is obviously a "good" one. As I noted in the other thread, I suspect anything from mid / late 2010 onwards should likely be a good one. But if there is no build date info on the matching box or bottom of the unit, no way to tell unless you have the unit in front of you to physically test.
Chess is like painting the Mona Lisa whilst walking through a minefield.
fourthirty
Full Member
Posts: 763
Joined: Fri Dec 06, 2013 8:46 pm
Location: San Francisco

Post by fourthirty »

Monsieur Plastique wrote:Thanks Greg. Well, at least your contributions confirm my hypothesis that serial number sequence is neither proof of relative build date nor does it have a relationship to good and bad Obsidians. Larry's machine at the very least has a much lower serial number than yours yet appears to be a "good" one. Kat's also has a much lower serial (similar range to Larry's) and is obviously a "good" one. As I noted in the other thread, I suspect anything from mid / late 2010 onwards should likely be a good one. But if there is no build date info on the matching box or bottom of the unit, no way to tell unless you have the unit in front of you to physically test.
You're welcome Monsieur Plastique. It has been a very interesting (albeit frustrating) experiment.

Is the thought that the 200,000 series serial numbers were produced prior to the 100,000 series serial numbers?

If I'm reading the date codes correctly for my units:

S/N 202541 was built in 2007 [Date Code 07W43] (no NEXT BEST and limited openings)
S/N 212636 was built in 2008 [Date Code 08W38] (no NEXT BEST and limited openings)

Other members have reported the following S/Ns:
JMark's S/N 216670 (no NEXT BEST and limited openings)
Steve's S/N 133883 (NEXT BEST works)
Larry's S/N 143887 (NEXT BEST works)
Kat's S/N 120910 (NEXT BEST works)

Kat - Did you ever happen to find a date code on your box (or bottom of your Obsidian)?

Could we get some other members here to post Serial Numbers and date codes (excluding Steve's unit which was built on Mars)?

Greg
Steve B
Site Admin
Posts: 10140
Joined: Sun Jul 29, 2007 10:02 am
Location: New York City USofA

Post by Steve B »

fourthirty wrote:
If I'm reading the date codes correctly for my units:

S/N 202541 was built in 2007 [Date Code 07W43] (no NEXT BEST and limited openings)
S/N 212636 was built in 2008 [Date Code 08W38] (no NEXT BEST and limited openings)
Greg...

are you reading those date codes from the box or from the unit itself?
Larry's O had no date code on the unit ..nor does mine

i think we may have stumbled upon yet another Interesting discovery...

it seems that the Inferior Obsidians have date codes (possibly on the unit itself) that can be deciphered
07W43=2007 43rd Week
08W38=2008 38th week

while the Superior Obsidians have indecipherable date codes shown only on the box

Image
although so far ..my Obsidian is the only Superior O that has reported in with a date code ..so.. we are still working in the blind
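
The YYWww reading of the sticker codes can be written down as a small Python sketch. Two things here are assumptions, not anything Novag has confirmed: that the two-digit year belongs to the 2000s, and that the week number follows ISO week numbering. The function names are hypothetical.

```python
import re
from datetime import date


def parse_date_code(code):
    """Split a sticker code such as '07W43' into (year, week)."""
    match = re.fullmatch(r"(\d{2})W(\d{2})", code.strip().upper())
    if match is None:
        raise ValueError(f"not a YYWww date code: {code!r}")
    year = 2000 + int(match.group(1))  # assumption: all units were built in 20xx
    week = int(match.group(2))
    if not 1 <= week <= 53:
        raise ValueError(f"week out of range in {code!r}")
    return year, week


def week_start(year, week):
    """Monday of that week, assuming ISO week numbering (Python 3.8+)."""
    return date.fromisocalendar(year, week, 1)


for code in ("07W43", "08W38"):
    year, week = parse_date_code(code)
    print(code, "->", year, "week", week, "starting", week_start(year, week))
```

A code that does not fit the pattern raises an error instead of being guessed at, which suits the box-only codes that nobody has managed to read so far.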

Blind Date Regards
Steve
Steve B
Site Admin
Posts: 10140
Joined: Sun Jul 29, 2007 10:02 am
Location: New York City USofA

Post by Steve B »

The more i think about this the more absurd it all seems to me

why would Novag disengage a popular ..documented.. feature and at the same time limit the already small opening book ?
Perhaps as Jon theorizes..it was to eradicate some bug

still ..i wonder if this voluntary crippling of the Obsidian was actually an effort to ENHANCE some other aspect of the program?
perhaps...just perhaps...the Inferior Obsidians actually play a stronger overall game?
now that would be something

may i suggest that Larry and/or Pogal play a 10 game match Vs. Fourthirty and/or JMark to determine if one model is stronger than the other?

Obsurdian Regards
Steve
Monsieur Plastique
Senior Member
Posts: 1014
Joined: Thu Jul 03, 2008 9:53 am
Location: On top of a hill in eastern Australia

Post by Monsieur Plastique »

Steve B wrote:perhaps...just perhaps...the Inferior Obsidians actually play a stronger overall game?
now that would be something
I strongly doubt that would be the case, given that by the time the Obsidian came along, playing strength was the last thing chess computer manufacturers cared about and the program was already playing really high quality chess as far back as the Zircon II incarnation. Well, you said it yourself, it really would be something.

However, further research today reveals a theory that our very own Fernando was actually a silent Director at Novag at this critical juncture in the company's history. Already despondent at having lost so many games to their product, he vetoed any notions to improve the program and instead implemented a very cunning plan to cripple the opening play in such a way that after only a few minutes study of a wrinkled copy of Batsford Chess Openings, he could have the machine begging for mercy by move 10.

However I feel we can dismiss the above theory since I do not believe Fernando possesses a copy of Batsford.

Case Still Open Regards
Chess is like painting the Mona Lisa whilst walking through a minefield.
Monsieur Plastique
Senior Member
Posts: 1014
Joined: Thu Jul 03, 2008 9:53 am
Location: On top of a hill in eastern Australia

Post by Monsieur Plastique »

Steve B wrote:may i suggest that Larry and/or Pogal play a 10 game match Vs. Fourthirty and/or JMark to determine if one model is stronger than the other?
I think maybe a more streamlined idea might be to select just two or three test positions then set the machines to a fixed ply level (say 6 for a complicated middlegame or 9 for an endgame with just minor pieces and pawns) and see how long they take to come up with a reply and whether those replies are identical. That way the tests are not time consuming yet the results would be more definitive than a match where you'd need probably 40 games minimum just to eliminate the "random" factor due to major statistical variances.

And the fixed ply Novag levels are the same as the fixed time levels in that they use all the proper "tournament play" heuristics, etc, but of course we don't want to use fixed time levels because we want to see if the two versions take differing amounts of time to go through the same computations.

If the moves are different then that would be a strong indication the program has changed. Timings might vary by a very small percentage but this is typical of batch to batch / machine to machine variances (say up to a second or three).
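
A rough sketch of how the test-position results could be compared once they are written down. Everything named here is hypothetical: the `Result` record, the position labels and the sample figures are placeholders, not measurements from any real Obsidian. The three-second tolerance is taken from the "second or three" of normal machine-to-machine variance mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Result:
    move: str       # reply played at the fixed ply level
    seconds: float  # time taken to produce that reply


def compare_units(unit_a, unit_b, tolerance_s=3.0):
    """Classify each position that both units were tested on."""
    report = {}
    for position in sorted(unit_a.keys() & unit_b.keys()):
        a, b = unit_a[position], unit_b[position]
        if a.move != b.move:
            report[position] = "different move"
        elif abs(a.seconds - b.seconds) > tolerance_s:
            report[position] = "same move, timing differs"
        else:
            report[position] = "match"
    return report


# Placeholder data only, to show the shape of the input
good_unit = {"middlegame-1": Result("Nf3", 41.0), "endgame-1": Result("Kf2", 95.0)}
bad_unit = {"middlegame-1": Result("Nf3", 42.5), "endgame-1": Result("Rd1", 95.0)}
print(compare_units(good_unit, bad_unit))
```

A "different move" verdict points to a changed program, while "same move, timing differs" is more consistent with a hardware change.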

It is plausible that even just changing hardware created "the bug", as there might be an obscure instruction that causes the program to go into some sort of "soft" abnormal termination procedure, for example. We saw hardware changes over the Opal series (started off as something like a 6301Y and ended life as one of those chip-on-board RISC type processors, I think). The latter might explain why the later serial numbers seem to be affected, since it could well be that a final hardware "improvement" (that might otherwise be transparent to the user) caused the issue. Novag might, for example, have needed to source different hardware of equivalent performance due to supplier issues.
Chess is like painting the Mona Lisa whilst walking through a minefield.
fourthirty
Full Member
Posts: 763
Joined: Fri Dec 06, 2013 8:46 pm
Location: San Francisco

Post by fourthirty »

Steve B wrote: Greg...

are you reading those date codes from the box or from the unit itself?
Larry's O had no date code on the unit ..nor does mine

...Steve
Steve,

212636 has the small black date code sticker on the box and unit.
202541 has the small black date code sticker on the unit (unit was purchased without the box)

My inferior Obsidian may actually play a stronger overall game?

Cue the Rocky music regards...
Greg
appleshampogal
Member
Posts: 126
Joined: Sun Jan 12, 2014 8:53 am

Post by appleshampogal »

Monsieur Plastique wrote:
Steve B wrote:may i suggest that Larry and/or Pogal play a 10 game match Vs. Fourthirty and/or JMark to determine if one model is stronger than the other?
I think maybe a more streamlined idea might be to select just two or three test positions then set the machines to a fixed ply level (say 6 for a complicated middlegame or 9 for an endgame with just minor pieces and pawns) and see how long they take to come up with a reply and whether those replies are identical. That way the tests are not time consuming yet the results would be more definitive than a match where you'd need probably 40 games minimum just to eliminate the "random" factor due to major statistical variances.

And the fixed ply Novag levels are the same as the fixed time levels in that they use all the proper "tournament play" heuristics, etc, but of course we don't want to use fixed time levels because we want to see if the two versions take differing amounts of time to go through the same computations.

If the moves are different then that would be a strong indication the program has changed. Timings might vary by a very small percentage but this is typical of batch to batch / machine to machine variances (say up to a second or three).

It is plausible that even just changing hardware created "the bug", as there might be an obscure instruction that causes the program to go into some sort of "soft" abnormal termination procedure, for example. We saw hardware changes over the Opal series (started off as something like a 6301Y and ended life as one of those chip-on-board RISC type processors, I think). The latter might explain why the later serial numbers seem to be affected, since it could well be that a final hardware "improvement" (that might otherwise be transparent to the user) caused the issue. Novag might, for example, have needed to source different hardware of equivalent performance due to supplier issues.


I like this idea. Of course, now there is the task of figuring out what the test positions should be!

Chesscomputeruk has a magazine with 20 chess problems for dedicated chess computers with instructions for evaluating the results which *may* be helpful. I definitely want input. Otherwise, I might suggest using a couple dozen positions from past games if that seems advisable.
appleshampogal
Member
Posts: 126
Joined: Sun Jan 12, 2014 8:53 am

Post by appleshampogal »

fourthirty wrote:
Monsieur Plastique wrote:Thanks Greg. Well, at least your contributions confirm my hypothesis that serial number sequence is neither proof of relative build date nor does it have a relationship to good and bad Obsidians. Larry's machine at the very least has a much lower serial number than yours yet appears to be a "good" one. Kat's also has a much lower serial (similar range to Larry's) and is obviously a "good" one. As I noted in the other thread, I suspect anything from mid / late 2010 onwards should likely be a good one. But if there is no build date info on the matching box or bottom of the unit, no way to tell unless you have the unit in front of you to physically test.
You're welcome Monsieur Plastique. It has been a very interesting (albeit frustrating) experiment.

Is the thought that the 200,000 series serial numbers were produced prior to the 100,000 series serial numbers?

If I'm reading the date codes correctly for my units:

S/N 202541 was built in 2007 [Date Code 07W43] (no NEXT BEST and limited openings)
S/N 212636 was built in 2008 [Date Code 08W38] (no NEXT BEST and limited openings)

Other members have reported the following S/Ns:
JMark's S/N 216670 (no NEXT BEST and limited openings)
Steve's S/N 133883 (NEXT BEST works)
Larry's S/N 143887 (NEXT BEST works)
Kat's S/N 120910 (NEXT BEST works)

Kat - Did you ever happen to find a date code on your box (or bottom of your Obsidian)?

Could we get some other members here to post Serial Numbers and date codes (excluding Steve's unit which was built on Mars)?

Greg


Hi Greg,

Unfortunately, the gentleman I bought it from sold it to me with just the carrying case. He DID however indicate that he bought his "new" and that the only reason he was selling it was because he didn't use it enough. So I can't imagine this unit is very old at all.
Steve B
Site Admin
Posts: 10140
Joined: Sun Jul 29, 2007 10:02 am
Location: New York City USofA

Post by Steve B »

fourthirty wrote:
Steve B wrote: Greg...

are you reading those date codes from the box or from the unit itself?
Larry's O had no date code on the unit ..nor does mine

...Steve
Steve,

212636 has the small black date code sticker on the box and unit.
202541 has the small black date code sticker on the unit (unit was purchased without the box)

My inferior Obsidian may actually play a stronger overall game?

Cue the Rocky music regards...
Greg
Thanks for Info
interesting that your second unit has the sticker on it
seems the original owner knew enough to remove the sticker from the box and place it on the unit

it might play stronger but it seems instead of a match the consensus is to try out a few test positions which will only tell us if the program is different
not if it's better or stronger

so sadly..we will never know the full Obsidian story

Sigh Regards
Steve
Steve B
Site Admin
Posts: 10140
Joined: Sun Jul 29, 2007 10:02 am
Location: New York City USofA

Post by Steve B »

Monsieur Plastique wrote:

That way the tests are not time consuming yet the results would be more definitive than a match where you'd need probably 40 games minimum just to eliminate the "random" factor due to major statistical variances.


If the moves are different then that would be a strong indication the program has changed. Timings might vary by a very small percentage but this is typical of batch to batch / machine to machine variances (say up to a second or three).
Well not exactly
testing a few random positions will tell us if the program changed but not if that change made it play stronger

I realize that a 10 game match is not statistically significant for a rating but it would be a strong indication if one model overwhelms the other in a 10 game match
a score of 7-3 or 8-2 or better would clearly show this

you and I played several 2-3 game matches in the past
(most recent...Scisys Mark VI Vs. Gameboy)
where one side crushed the other and it was clear which computer was stronger
we did not need 40 games

perhaps the best way to proceed would be to try out the positions first
determine if there is a strong indication that there is a difference in the program and then go for a match

Writing Chess Computer History Regards
Steve
Monsieur Plastique
Senior Member
Posts: 1014
Joined: Thu Jul 03, 2008 9:53 am
Location: On top of a hill in eastern Australia

Post by Monsieur Plastique »

Steve B wrote:I realize that a 10 game match is not statistically significant for a rating but it would be a strong indication if one model overwhelms the other in a 10 game match
a score of 7-3 or 8-2 or better would clearly show this

you and I played several 2-3 game matches in the past
(most recent...Scisys Mark VI Vs. Gameboy)
where one side crushed the other and it was clear which computer was stronger
we did not need 40 games
Hi Steve,

The only conclusion I really drew from that match is that the Mark VI defeated the Gameboy. I did not draw a conclusion as to which computer was actually stronger even though others might have. I have plenty of matches in my database where a weaker machine soundly defeats a stronger one and the SSDF databases have many examples as well. I also have examples in various 10 game encounters where the same machine can go down 8 to 2 against itself and then the next time win 7 - 3. It is my experience that even the lowest level of statistical reliability is only achieved after around 40 games against at least 5 different opponents whose own ratings are as close as possible to the test machine's estimated rating. My concern is thus that given there is a very good chance the score will not be 5-all, a faulty conclusion will be reached.
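
To put a rough number on how often a lopsided 10 game score turns up between two equally strong machines, here is a small Python sketch. It makes simplifying assumptions that real matches do not satisfy: every game is decisive (no draws), games are independent, and each side wins any given game with probability 0.5. The function name is made up for illustration.

```python
from math import comb


def prob_at_least(wins, games, p=0.5):
    """Probability of scoring at least `wins` out of `games` decisive games."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


one_side = prob_at_least(7, 10)  # one named machine wins 7-3 or better
either_side = 2 * one_side       # either machine wins 7-3 or better
print(f"7-3 or better for one side: {one_side:.3f}")
print(f"7-3 or better for either side: {either_side:.3f}")
```

Under those assumptions one named machine reaches 7-3 or better in 176 of 1024 equally likely outcomes (about 17%), so roughly a third of such matches end 7-3 or worse for one side or the other even though neither machine is stronger.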

Even after 100 games I consider a computer rating to be questionable though in general things tend to stabilise around the 40 game mark unless unsuitable opponents are used.

Perhaps the "worst" example I have ever encountered of isolated match "irrelevance" was where my Nintendo Fritz defeated the Mephisto MM IV by 10.5 to 1.5. This gave the Fritz a performance rating far in excess of its actual rating of 1910, which is actually only around 110 points higher than the MM IV. And that was a 12 game match, not a 10 game one.
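
The size of that gap can be sketched with the standard logistic Elo formula. The MM IV figure of 1800 is an assumption derived from "around 110 points" below 1910, and the logistic formula is an approximation. Rating lists such as the SSDF may compute performance differently. The function names are hypothetical.

```python
from math import log10


def expected_score(rating, opponent):
    """Expected score per game under the logistic Elo model."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def performance_rating(opponent, score, games):
    """Rating at which `score` out of `games` would be the expected result."""
    share = score / games  # must be strictly between 0 and 1
    return opponent + 400 * log10(share / (1 - share))


fritz, mm_iv = 1910, 1800  # 1800 is assumed: roughly 110 points below 1910
print(f"expected score over 12 games: {12 * expected_score(fritz, mm_iv):.1f}")
print(f"performance from 10.5/12: {performance_rating(mm_iv, 10.5, 12):.0f}")
```

On these assumptions the ratings predict a little under 8 points out of 12 for the Fritz, while the actual 10.5 corresponds to a performance of roughly 2140, more than 200 points above its 1910.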

However since I don't own an Obsidian I do not need to spend time playing the matches but again, I just cannot see it being possible to draw a conclusion on such a small sample. It might be of interest, but I can't see it providing anything definitive. In the end, I will actually be surprised if in the test positions there is any significant variation between the old and new machines unless there has actually been a significant hardware change that Novag did not make public.

I don't see any necessity to go beyond a few test positions because in my view the only alternative that will provide quality information is to simply rate both machines from scratch playing at least 100 games each against a number of different opponents. However others are of course free to test how they wish.
Chess is like painting the Mona Lisa whilst walking through a minefield.