# Comparing the quality of backwards analysis strategies

Lichess aims to provide *deterministic*, *strong*, and *fast* chess game
analysis. There is some
tension between these goals.

Analysis is provided by volunteers running the *fishnet* client
(as of version 2.7.0, running Stockfish 16)
on a wide variety of hardware. Determinism ensures that users
can expect consistent quality, and that bugs or manipulation can be reliably
identified.

When optimizing for quality at fixed computing power, backwards analysis is known to be efficient. When analysing the game backwards, move by move, the chess engine's hash table (also known as the transposition table) will often contain highly relevant information about what happened later in the game.

However, only single-threaded analysis is deterministic. At one extreme, if we want to fully utilize the hash table, no other threads can contribute to finishing that particular game quickly. At the other extreme, fishnet 2.7.0 analyses individual positions in parallel, each with a pristine hash table, finishing games very quickly.

More balanced strategies might be worth considering. To get some sharing and some parallelism, the game could be split into (possibly overlapping) chunks of consecutive positions.

What are we missing out on today, and how do all of these approaches stack up in terms of quality?

## Method

For forward play, the strength of players can be evaluated directly by looking at the outcome of their games, and we expect strong engines to provide high quality analysis. For a tweak that weakens the engine, like clearing the hash table after each move, an "equivalent" tweak is also expected to lower the quality of backwards analysis.

However, it is not immediately clear whether the hash table is more
or less important in backwards analysis, or what the equivalent of overlap
in backwards analysis would even be.
So we look for a more direct way to
evaluate the quality of backwards analysis. Thanks to *MoistvonLipwig*
for the hints.

Here, 1000 games are randomly selected from the rated games played in June 2023 (taken from database.lichess.org) for which engine analysis had been requested.

The games are then re-analysed with Stockfish 16 at very high node limits, 100 meganodes for each position, using 1 GiB hash in single-threaded backwards analysis.

`analysis-1024-100000000-inf-0.pgn.zst`
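To make "single-threaded backwards analysis" concrete, here is a minimal sketch of the UCI commands such a run might send for one game. The helper name `backwards_uci_commands` and the use of `position startpos moves ...` are assumptions for illustration; the actual tooling may differ. The key point is that the hash table is not cleared between positions, so later positions inform earlier ones.

```python
from typing import List


def backwards_uci_commands(moves: List[str], nodes: int, hash_mib: int) -> List[str]:
    """Build the UCI commands for analysing one game from the last position to the first.

    `moves` are the game's moves in UCI notation (e.g. "e2e4").
    This only builds the command strings; it does not talk to an engine.
    """
    commands = [
        "setoption name Threads value 1",          # single-threaded, hence deterministic
        f"setoption name Hash value {hash_mib}",   # hash table size in MiB
        "ucinewgame",                              # start from a pristine hash table
    ]
    # Walk from the final position back to the starting position.
    for ply in range(len(moves), -1, -1):
        prefix = moves[:ply]
        if prefix:
            commands.append("position startpos moves " + " ".join(prefix))
        else:
            commands.append("position startpos")
        commands.append(f"go nodes {nodes}")       # fixed node limit per position
    return commands
```

In a real driver, each `go nodes` would have to be followed by waiting for the engine's `bestmove` line before sending the next position.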

The first 10 plies of each game are ignored for the following analysis. As a result, for each non-opening move, we now have an expected game theoretical outcome \(E_i\) between \(0\) and \(1\), based on Stockfish's WDL model, and the recommended primary variation.
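As a minimal sketch of how such an expectation could be derived from win/draw/loss statistics, assume the engine reports WDL counts (Stockfish prints them per mille when `UCI_ShowWDL` is enabled) and that a draw counts as half a point. The function name `expected_outcome` and this exact mapping are assumptions for illustration, not necessarily the computation used for the data here.

```python
def expected_outcome(wins: int, draws: int, losses: int) -> float:
    """Map WDL counts (for example per mille) to an expected score in [0, 1]."""
    total = wins + draws + losses
    if total <= 0:
        raise ValueError("WDL counts must sum to a positive number")
    # A win scores 1, a draw 1/2, a loss 0.
    return (wins + 0.5 * draws) / total
```

For a consistent \(E_i\) across the game, all values would also need to be expressed from one fixed perspective (for example White's), which this sketch leaves to the caller.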

The node limits used for the strategies we want to evaluate are about two orders of magnitude lower. Hopefully, from that perspective it's reasonable to accept these evaluations as near-objective truth, as close to the true evaluation of each position (\(0\), \(\frac{1}{2}\), or \(1\)) and optimal play as we can expect to get.

To evaluate a strategy, it is used to analyse the games to obtain evaluations \(\hat{E_i}\) and primary variations.

We score strategies by:

- The mean squared error of the evaluations: $$ \textrm{MSE} = \frac{1}{n} \sum^{n}_{i=1}(E_i - \hat{E_i})^2 $$
- The rate of mispredicted primary moves.
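Both scores are simple to compute once reference and candidate results are aligned position by position. The following is a minimal sketch; the function names are hypothetical, and it assumes that a "mispredicted" move means the first move of the candidate's variation differs from the first move of the reference variation.

```python
from typing import Sequence


def mean_squared_error(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """MSE between reference evaluations E_i and strategy evaluations."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((e - h) ** 2 for e, h in zip(reference, estimate)) / len(reference)


def pv_miss_rate(reference_moves: Sequence[str], estimate_moves: Sequence[str]) -> float:
    """Fraction of positions where the recommended first move differs."""
    if len(reference_moves) != len(estimate_moves) or not reference_moves:
        raise ValueError("inputs must be non-empty and of equal length")
    misses = sum(1 for r, m in zip(reference_moves, estimate_moves) if r != m)
    return misses / len(reference_moves)
```

For example, `pv_miss_rate(["e2e4", "g1f3"], ["e2e4", "b1c3"])` gives `0.5`, since one of the two recommendations differs.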

## Experiments

All of the following experiments are performed with Stockfish 16 in
single-threaded backwards analysis, and are thus exactly reproducible.
`UCI_AnalyseMode` is always on.
We vary the following dimensions:

- **Hash table size.** The size of the transposition table (`setoption name Hash`).
- **Node limits.** Analysis uses a fixed node limit per position (`go nodes` in the UCI protocol).
- **Chunk size.** For a chunk size of \( n \), the last chunk of \( n \) positions is analysed backwards, followed by clearing the hash table, followed by analysing the next chunk of \( n \) preceding positions, and so on.
- **Overlap.** For overlap size of \( n \), before analysing a chunk, the hash table is primed by analysing the following \( n \) positions first. Analysis of overlap positions is performed at the same node limit. This means more total resources spent. Additional experiments with adjusted node limits were performed.
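The chunk and overlap definitions can be sketched as a schedule of positions to analyse. This is an illustrative reading of the description, not the code used for the experiments: the name `backwards_schedule` is hypothetical, positions are indexed from `0` (first) to `num_positions - 1` (last), `chunk_size=None` stands for an infinite chunk size, and priming positions are assumed to be walked backwards as well.

```python
from typing import List, Optional, Tuple


def backwards_schedule(
    num_positions: int, chunk_size: Optional[int], overlap: int = 0
) -> List[List[Tuple[int, bool]]]:
    """Return chunks in analysis order; each step is (position_index, is_priming).

    The hash table is assumed to be cleared before each chunk.
    """
    if num_positions < 0 or overlap < 0 or (chunk_size is not None and chunk_size < 1):
        raise ValueError("invalid parameters")
    size = num_positions if chunk_size is None else chunk_size
    chunks = []
    end = num_positions  # exclusive upper bound of the current chunk
    while end > 0:
        start = max(0, end - size)
        prime_end = min(num_positions, end + overlap)
        # Priming: the positions that follow the chunk, results discarded.
        steps = [(i, True) for i in range(prime_end - 1, end - 1, -1)]
        # The chunk itself, from its last position back to its first.
        steps += [(i, False) for i in range(end - 1, start - 1, -1)]
        chunks.append(steps)
        end = start
    return chunks
```

With `chunk_size=1` and `overlap=0` every position is its own chunk, which corresponds to the fishnet 2.7.0 behaviour described earlier. Because each chunk starts from a cleared hash table, chunks are independent of each other and could be analysed in parallel without losing determinism.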

### Node limits

### Hash table size

### Chunk size

## Discussion

Unsurprisingly, the ability to predict evaluations is basically the same as the ability to predict moves. The latter metric appears to be slightly less noisy.

The primary threat to validity is the choice of those metrics. If we go along with it, we see that:

- fishnet 2.7.0 leaves a lot of quality on the table.
- Provided the hash table is used at all, the size is basically irrelevant at fishnet's relatively shallow node limit.
- At equal resources used, chunking, even with overlap, can close most of the gap between fishnet 2.7.0 and sequential backwards analysis.
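The "equal resources" comparison can be made concrete. With chunk size \( c \) and overlap \( o \), each chunk analyses up to \( c + o \) positions, so keeping the total node count close to a baseline of \( N \) nodes per position suggests a per-position limit of \( N \cdot c / (c + o) \). This formula is an inference from the adjusted node limits listed in the raw data (for example 1,333,333 nodes for chunk size 8 with overlap 1, against a baseline of 1,500,000); it is not stated explicitly, and the function name is hypothetical.

```python
def adjusted_node_limit(base_nodes: int, chunk_size: int, overlap: int) -> int:
    """Per-position node limit that keeps total nodes roughly at the baseline."""
    if base_nodes < 0 or chunk_size < 1 or overlap < 0:
        raise ValueError("invalid parameters")
    # Each chunk of `chunk_size` scored positions costs `chunk_size + overlap` searches.
    return base_nodes * chunk_size // (chunk_size + overlap)
```

The result slightly underestimates the affordable limit, since the last chunk of a game has no following positions to prime with.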

## Raw data

PGN | Hash | Nodes | Chunk size | Overlap | MSE | PV miss |
---|---|---|---|---|---|---|
`analysis-1024-100000000-inf-0.pgn.zst` | 1024 MiB | 100,000,000 | ∞ | 0 | 0.00000 | 0.00 % |
`analysis-1024-10000000-inf-0.pgn.zst` | 1024 MiB | 10,000,000 | ∞ | 0 | 0.00065 | 16.29 % |
`analysis-1024-2800000-inf-0.pgn.zst` | 1024 MiB | 2,800,000 | ∞ | 0 | 0.00092 | 19.13 % |
`analysis-1024-2500000-inf-0.pgn.zst` | 1024 MiB | 2,500,000 | ∞ | 0 | 0.00095 | 19.30 % |
`analysis-1024-2900000-inf-0.pgn.zst` | 1024 MiB | 2,900,000 | ∞ | 0 | 0.00093 | 19.31 % |
`analysis-1024-3000000-inf-0.pgn.zst` | 1024 MiB | 3,000,000 | ∞ | 0 | 0.00095 | 19.34 % |
`analysis-1024-2700000-inf-0.pgn.zst` | 1024 MiB | 2,700,000 | ∞ | 0 | 0.00098 | 19.34 % |
`analysis-1024-2400000-inf-0.pgn.zst` | 1024 MiB | 2,400,000 | ∞ | 0 | 0.00097 | 19.47 % |
`analysis-1024-2600000-inf-0.pgn.zst` | 1024 MiB | 2,600,000 | ∞ | 0 | 0.00096 | 19.48 % |
`analysis-1024-2300000-inf-0.pgn.zst` | 1024 MiB | 2,300,000 | ∞ | 0 | 0.00099 | 19.57 % |
`analysis-1024-2200000-inf-0.pgn.zst` | 1024 MiB | 2,200,000 | ∞ | 0 | 0.00102 | 19.57 % |
`analysis-1024-2100000-inf-0.pgn.zst` | 1024 MiB | 2,100,000 | ∞ | 0 | 0.00103 | 19.95 % |
`analysis-1024-2000000-inf-0.pgn.zst` | 1024 MiB | 2,000,000 | ∞ | 0 | 0.00103 | 19.96 % |
`analysis-1024-1900000-inf-0.pgn.zst` | 1024 MiB | 1,900,000 | ∞ | 0 | 0.00105 | 20.22 % |
`analysis-1024-1800000-inf-0.pgn.zst` | 1024 MiB | 1,800,000 | ∞ | 0 | 0.00108 | 20.23 % |
`analysis-1024-1700000-inf-0.pgn.zst` | 1024 MiB | 1,700,000 | ∞ | 0 | 0.00108 | 20.39 % |
`analysis-1024-1600000-inf-0.pgn.zst` | 1024 MiB | 1,600,000 | ∞ | 0 | 0.00108 | 20.50 % |
`analysis-256-1500000-inf-0.pgn.zst` | 256 MiB | 1,500,000 | ∞ | 0 | 0.00109 | 20.60 % |
`analysis-1024-1500000-inf-0.pgn.zst` | 1024 MiB | 1,500,000 | ∞ | 0 | 0.00108 | 20.69 % |
`analysis-64-1500000-inf-0.pgn.zst` | 64 MiB | 1,500,000 | ∞ | 0 | 0.00111 | 20.69 % |
`analysis-32-1500000-inf-0.pgn.zst` | 32 MiB | 1,500,000 | ∞ | 0 | 0.00111 | 20.71 % |
`analysis-128-1500000-inf-0.pgn.zst` | 128 MiB | 1,500,000 | ∞ | 0 | 0.00112 | 20.73 % |
`analysis-8-1500000-inf-0.pgn.zst` | 8 MiB | 1,500,000 | ∞ | 0 | 0.00110 | 20.73 % |
`analysis-512-1500000-inf-0.pgn.zst` | 512 MiB | 1,500,000 | ∞ | 0 | 0.00110 | 20.77 % |
`analysis-128-1500000-10-1.pgn.zst` | 128 MiB | 1,500,000 | 10 | 1 | 0.00113 | 20.83 % |
`analysis-1024-1400000-inf-0.pgn.zst` | 1024 MiB | 1,400,000 | ∞ | 0 | 0.00111 | 20.83 % |
`analysis-4-1500000-inf-0.pgn.zst` | 4 MiB | 1,500,000 | ∞ | 0 | 0.00114 | 20.90 % |
`analysis-16-1500000-inf-0.pgn.zst` | 16 MiB | 1,500,000 | ∞ | 0 | 0.00110 | 20.96 % |
`analysis-2-1500000-inf-0.pgn.zst` | 2 MiB | 1,500,000 | ∞ | 0 | 0.00112 | 21.02 % |
`analysis-128-1500000-8-1.pgn.zst` | 128 MiB | 1,500,000 | 8 | 1 | 0.00111 | 21.02 % |
`analysis-128-1500000-7-1.pgn.zst` | 128 MiB | 1,500,000 | 7 | 1 | 0.00112 | 21.03 % |
`analysis-128-1500000-9-1.pgn.zst` | 128 MiB | 1,500,000 | 9 | 1 | 0.00110 | 21.04 % |
`analysis-1-1500000-inf-0.pgn.zst` | 1 MiB | 1,500,000 | ∞ | 0 | 0.00112 | 21.05 % |
`analysis-1024-1300000-inf-0.pgn.zst` | 1024 MiB | 1,300,000 | ∞ | 0 | 0.00114 | 21.07 % |
`analysis-128-1500000-10-0.pgn.zst` | 128 MiB | 1,500,000 | 10 | 0 | 0.00110 | 21.12 % |
`analysis-128-1500000-6-1.pgn.zst` | 128 MiB | 1,500,000 | 6 | 1 | 0.00110 | 21.21 % |
`analysis-1024-1200000-inf-0.pgn.zst` | 1024 MiB | 1,200,000 | ∞ | 0 | 0.00116 | 21.21 % |
`analysis-128-1333333-8-1.pgn.zst` | 128 MiB | 1,333,333 | 8 | 1 | 0.00116 | 21.23 % |
`analysis-128-1500000-4-1.pgn.zst` | 128 MiB | 1,500,000 | 4 | 1 | 0.00112 | 21.30 % |
`analysis-128-1500000-5-1.pgn.zst` | 128 MiB | 1,500,000 | 5 | 1 | 0.00111 | 21.35 % |
`analysis-128-1500000-8-0.pgn.zst` | 128 MiB | 1,500,000 | 8 | 0 | 0.00112 | 21.38 % |
`analysis-128-1350000-9-1.pgn.zst` | 128 MiB | 1,350,000 | 9 | 1 | 0.00115 | 21.38 % |
`analysis-128-1312500-7-1.pgn.zst` | 128 MiB | 1,312,500 | 7 | 1 | 0.00113 | 21.42 % |
`analysis-128-1285714-6-1.pgn.zst` | 128 MiB | 1,285,714 | 6 | 1 | 0.00114 | 21.46 % |
`analysis-128-1363636-10-1.pgn.zst` | 128 MiB | 1,363,636 | 10 | 1 | 0.00112 | 21.51 % |
`analysis-1024-1100000-inf-0.pgn.zst` | 1024 MiB | 1,100,000 | ∞ | 0 | 0.00118 | 21.51 % |
`analysis-128-1500000-9-0.pgn.zst` | 128 MiB | 1,500,000 | 9 | 0 | 0.00114 | 21.55 % |
`analysis-128-1200000-4-1.pgn.zst` | 128 MiB | 1,200,000 | 4 | 1 | 0.00117 | 21.64 % |
`analysis-128-1500000-3-1.pgn.zst` | 128 MiB | 1,500,000 | 3 | 1 | 0.00110 | 21.65 % |
`analysis-128-1500000-2-1.pgn.zst` | 128 MiB | 1,500,000 | 2 | 1 | 0.00113 | 21.65 % |
`analysis-128-1250000-5-1.pgn.zst` | 128 MiB | 1,250,000 | 5 | 1 | 0.00117 | 21.71 % |
`analysis-128-1500000-6-0.pgn.zst` | 128 MiB | 1,500,000 | 6 | 0 | 0.00113 | 21.74 % |
`analysis-1024-1000000-inf-0.pgn.zst` | 1024 MiB | 1,000,000 | ∞ | 0 | 0.00120 | 21.79 % |
`analysis-128-1500000-7-0.pgn.zst` | 128 MiB | 1,500,000 | 7 | 0 | 0.00107 | 21.79 % |
`analysis-1024-900000-inf-0.pgn.zst` | 1024 MiB | 900,000 | ∞ | 0 | 0.00119 | 22.03 % |
`analysis-128-1500000-1-1.pgn.zst` | 128 MiB | 1,500,000 | 1 | 1 | 0.00113 | 22.13 % |
`analysis-128-1500000-5-0.pgn.zst` | 128 MiB | 1,500,000 | 5 | 0 | 0.00112 | 22.18 % |
`analysis-128-1125000-3-1.pgn.zst` | 128 MiB | 1,125,000 | 3 | 1 | 0.00117 | 22.26 % |
`analysis-1024-800000-inf-0.pgn.zst` | 1024 MiB | 800,000 | ∞ | 0 | 0.00127 | 22.31 % |
`analysis-128-1500000-4-0.pgn.zst` | 128 MiB | 1,500,000 | 4 | 0 | 0.00114 | 22.34 % |
`analysis-128-1000000-2-1.pgn.zst` | 128 MiB | 1,000,000 | 2 | 1 | 0.00124 | 22.57 % |
`analysis-1024-700000-inf-0.pgn.zst` | 1024 MiB | 700,000 | ∞ | 0 | 0.00133 | 22.59 % |
`analysis-1024-600000-inf-0.pgn.zst` | 1024 MiB | 600,000 | ∞ | 0 | 0.00144 | 22.91 % |
`analysis-128-1500000-3-0.pgn.zst` | 128 MiB | 1,500,000 | 3 | 0 | 0.00117 | 22.95 % |
`analysis-128-1500000-2-0.pgn.zst` | 128 MiB | 1,500,000 | 2 | 0 | 0.00120 | 23.33 % |
`analysis-1024-500000-inf-0.pgn.zst` | 1024 MiB | 500,000 | ∞ | 0 | 0.00146 | 23.51 % |
`analysis-128-750000-1-1.pgn.zst` | 128 MiB | 750,000 | 1 | 1 | 0.00135 | 23.53 % |
`analysis-1024-400000-inf-0.pgn.zst` | 1024 MiB | 400,000 | ∞ | 0 | 0.00159 | 23.90 % |
`analysis-1024-300000-inf-0.pgn.zst` | 1024 MiB | 300,000 | ∞ | 0 | 0.00172 | 24.69 % |
`analysis-64-1500000-1-0.pgn.zst` | 64 MiB | 1,500,000 | 1 | 0 | 0.00123 | 25.56 % |
`analysis-128-1500000-1-0.pgn.zst` | 128 MiB | 1,500,000 | 1 | 0 | 0.00125 | 25.57 % |
`analysis-32-1500000-1-0.pgn.zst` | 32 MiB | 1,500,000 | 1 | 0 | 0.00126 | 25.58 % |
`analysis-4-1500000-1-0.pgn.zst` | 4 MiB | 1,500,000 | 1 | 0 | 0.00125 | 25.59 % |
`analysis-256-1500000-1-0.pgn.zst` | 256 MiB | 1,500,000 | 1 | 0 | 0.00123 | 25.60 % |
`analysis-512-1500000-1-0.pgn.zst` | 512 MiB | 1,500,000 | 1 | 0 | 0.00123 | 25.62 % |
`analysis-16-1500000-1-0.pgn.zst` | 16 MiB | 1,500,000 | 1 | 0 | 0.00122 | 25.63 % |
`analysis-1024-1500000-1-0.pgn.zst` | 1024 MiB | 1,500,000 | 1 | 0 | 0.00124 | 25.68 % |
`analysis-8-1500000-1-0.pgn.zst` | 8 MiB | 1,500,000 | 1 | 0 | 0.00122 | 25.72 % |
`analysis-2-1500000-1-0.pgn.zst` | 2 MiB | 1,500,000 | 1 | 0 | 0.00123 | 25.75 % |
`analysis-1-1500000-1-0.pgn.zst` | 1 MiB | 1,500,000 | 1 | 0 | 0.00125 | 25.87 % |
`analysis-1024-200000-inf-0.pgn.zst` | 1024 MiB | 200,000 | ∞ | 0 | 0.00192 | 25.95 % |
`analysis-1024-100000-inf-0.pgn.zst` | 1024 MiB | 100,000 | ∞ | 0 | 0.00236 | 27.75 % |
`analysis-1024-10000-inf-0.pgn.zst` | 1024 MiB | 10,000 | ∞ | 0 | 0.00430 | 34.64 % |
`analysis-1024-1000-inf-0.pgn.zst` | 1024 MiB | 1,000 | ∞ | 0 | 0.00870 | 43.37 % |
`analysis-1024-100-inf-0.pgn.zst` | 1024 MiB | 100 | ∞ | 0 | 0.01714 | 50.66 % |
`analysis-1024-10-inf-0.pgn.zst` | 1024 MiB | 10 | ∞ | 0 | 0.02129 | 51.35 % |
`analysis-1024-1-inf-0.pgn.zst` | 1024 MiB | 1 | ∞ | 0 | 0.02160 | 51.59 % |

niklasf, 31st July 2023.