e4-homie wrote: ↑Fri Dec 15, 2023 3:16 pm
Analyzing the same middlegame position, there is a difference between the Brew installation vs. building from source. Stockfish 16 with Brew results in 4.4 M nps whereas Stockfish 16 built from source, per HiarcsApple's instructions above, results in 9.1 M nps.
Merry Christmas
Your results are fine.
Results will often differ slightly, for reasons such as:
Hardware: M1 vs. M1 Pro vs. M1 Max vs. M2 vs. M3, and so on.
Memory: 32 vs. 64 vs. 128 GB of RAM, and RAM speed.
Test conditions: different positions (for example, different middlegame positions) and different amounts of time used.
Architecture support: Stockfish supports ARMv8, ARMv8.5, ARMv9, ARMv9.2, and later ARMv9.5 and beyond.
Note that the net size Stockfish uses has also changed, and it will keep changing often in the future (as hardware gets stronger, the net gets bigger).
I hope you tested the Brew Stockfish 16, which really is Stockfish 16, against a correct Stockfish 16 build from source, and not against the newest Stockfish master, which is the so-called Stockfish 17 development version.
The Brew installation was the Stockfish developers' first attempt at an easy and fast way to run Stockfish on the new Apple M1 devices.
You can also run a benchmark on your hardware with the Stockfish executable (double-click the icon, type one of the bench commands below, and hit Enter):
In both commands, 10 is the number of CPU cores, so change it to the number of cores you have (8, 10, 12, 16, and so on), or to 1 if you want single-core results.
Stockfish Developer: bench 16 10 13 default depth
Ipman chess Stockfish 14.1: bench 1024 10 26 default depth nnue
https://ipmanchess.yolasite.com/amd--in ... ckfish.php
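If you want to repeat the benchmark and collect the numbers automatically, here is a minimal Python sketch. It builds a bench command in the same form as the ones quoted above and pulls the three summary figures out of the text Stockfish prints. The function names and the binary path passed to run_bench are my own assumptions for illustration, not something from the Stockfish project.

```python
import re
import subprocess


def bench_command(threads: int, hash_mb: int = 16, depth: int = 13) -> str:
    """Build a bench command in the form quoted above: hash, threads, depth."""
    return f"bench {hash_mb} {threads} {depth} default depth"


def parse_bench_summary(output: str) -> dict:
    """Extract total time, nodes searched and nodes/second from bench output."""
    patterns = {
        "total_time_ms": r"Total time \(ms\)\s*:\s*(\d+)",
        "nodes_searched": r"Nodes searched\s*:\s*(\d+)",
        "nodes_per_second": r"Nodes/second\s*:\s*(\d+)",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, output)
        if match is None:
            raise ValueError(f"bench output is missing the field: {key}")
        summary[key] = int(match.group(1))
    return summary


def run_bench(binary: str, threads: int) -> dict:
    """Run the bench and parse the summary.

    `binary` is a hypothetical path to your own Stockfish executable.
    stdout and stderr are merged so the summary is captured either way.
    """
    args = [binary] + bench_command(threads).split()
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=True,
    )
    return parse_bench_summary(result.stdout)


# Parsing a summary pasted as text (figures taken from this post).
sample = """Total time (ms) : 1067
Nodes searched : 1350831
Nodes/second : 1266008
"""
print(bench_command(threads=10))
print(parse_bench_summary(sample))
```

To run it against a real engine you would call something like run_bench("./stockfish", threads=10), with the path adjusted to wherever your binary lives.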
Note that it doesn't make much sense to compare Apple results with Intel and AMD results from the old Stockfish 14.1, for reasons such as:
The Stockfish net size has since doubled, so in practice you will see roughly half the speed.
Apple uses ARM cores; Intel and AMD do not.
There have been hardly any Stockfish speed improvements on Intel and AMD devices in recent years, because the developers have probably found nearly all the possible speed-ups since Stockfish 1 and finished that work some years ago. Early in the search for speed improvements, you often saw a rain of speed-ups every day. Naturally, the gains became smaller and smaller over the years.
Now the Stockfish developers could try to find speed improvements for the different Apple ARM CPUs. (Unfortunately, at the moment only one of them has an Apple MacBook.)
It wouldn't surprise me if they had improved Stockfish's speed on Intel and AMD CPUs by a total of 500% or even 1000% over the last 15 years.
So now you know what we can expect from ARM CPUs.
...
Example:
Implement AffineTransformSparseInput for armv8
Implements AffineTransformSparseInput layer for the NNUE evaluation
for the armv8 and armv8-dotprod architectures. We measured some nice
speed improvements via 10 runs of our benchmark:
armv8, Cortex-X1 : 18.5% speed-up
armv8, Cortex-A76 : 13.2% speed-up
armv8-dotprod, Cortex-X1 : 27.1% speed-up
armv8-dotprod, Cortex-A76 : 12.1% speed-up
armv8, Cortex-A72, Raspberry Pi 4 : 8.2% speed-up (thanks Torom!)
https://github.com/Joachim26/StockfishN ... 8b571223bf
https://github.com/official-stockfish/S ... /pull/4719
Feel free to compare:
Author: AndrovT
Date: Sun Aug 6 21:22:37 2023 +0200
Timestamp: 1691349757
Implement AffineTransformSparseInput for armv8
closes
https://github.com/official-stockfish/S ... /pull/4719
No functional change
see source
https://abrok.eu/stockfish/?page=6
and
Author: ppigazzini
Date: Sun Aug 6 21:17:33 2023 +0200
Timestamp: 1691349453
Add new CPU archs in CI Tests workflow
Add CPU archs: armv8-dotprod, riscv64 and ppc64le.
The last two archs are built using QEMU multiarch docker container.
References:
https://docs.docker.com/build/building/multi-platform/
https://github.com/docker/setup-buildx-action
https://github.com/docker/setup-qemu-action
https://github.com/tonistiigi/binfmt
https://stackoverflow.com/questions/724 ... a-containe
closes
https://github.com/official-stockfish/S ... /pull/4718
No functional change
see source
Results:
NEW:
Total time (ms) : 1067
Nodes searched : 1350831
Nodes/second : 1266008
vs
OLD:
Total time (ms) : 1874
Nodes searched : 1350831
Nodes/second : 720827
https://github.com/official-stockfish/S ... /pull/4719
That's a speed up of 75.63%.
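For anyone who wants to check that figure, here is a small Python sketch of the arithmetic. The function name is just my own choice; the numbers are the NEW and OLD results quoted above. Because both runs searched the same number of nodes, the ratio of the total times gives the same answer as the ratio of nodes/second.

```python
def speedup_percent(old_nps: float, new_nps: float) -> float:
    """Relative gain of new over old, in percent."""
    return (new_nps / old_nps - 1.0) * 100.0


# Figures from the results quoted above.
old_nps, new_nps = 720827, 1266008
old_ms, new_ms = 1874, 1067

# Speed-up from nodes/second.
print(f"{speedup_percent(old_nps, new_nps):.2f}%")

# Same node count in both runs, so old time / new time is the same ratio.
print(f"{(old_ms / new_ms - 1.0) * 100.0:.2f}%")
```

Both lines print 75.63%, which matches the number above.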
They had also gained a 5% speed improvement before that.
That is a much larger gain than on the non-Apple ARM CPUs listed above.
It happened some time after they doubled the net size.
Now you know how difficult it is to compare Apple with Intel and AMD CPUs at ipmanchess, when a very old Stockfish is set against the newest Stockfish and the newest hardware.