Chess in Rust: Vocabulary

Recently I've been using Rust to build a server for the new 7-piece Syzygy endgame tablebases. It is hosted as a public API by lichess.org, a popular free/libre open-source chess server. The API serves as a backend for the lichess.org analysis board as well as syzygy-tables.info.

Using Rust was quite enjoyable and I plan to use it for many future projects. This series is intended to order and share my thoughts, and as a primer to discuss some open questions I have.

The first step was to build a crate with common data types and basic move generation. These are the most basic types and almost any chess crate in the ecosystem has its own version of these:

It would be great to establish a common vocabulary crate, similar to http, but it looks difficult with all the possible tradeoffs. Please share your opinion on the following.

Should Copy be derived or not?

All the basic types could potentially be Copy, but should they be?

The standard library documentation says:

Generally speaking, if your type can implement Copy, it should. Keep in mind, though, that implementing Copy is part of the public API of your type. If the type might become non-Copy in the future, it could be prudent to omit the Copy implementation now, to avoid a breaking API change.

In my (quite limited) experience it is a good idea to take the opposite approach. Even trivially copyable types should not be Copy, unless there is a good reason to make them Copy. Forcing the programmer to explicitly acknowledge clones can prevent some kinds of logic bugs.

Obviously that's only the case when intentional cloning is rare. Therefore File, Rank, Square, Bitboard, Color and Role should be Copy.

Board should not be Copy. And anyway it might be too large, or internally use non-Copy data structures now or in the future.

Piece is more intresting, because it represents a physical object. The borrow-checker could ensure that pieces we take from one place are only put into one other place and not accidentally duplicated. However, working with non-Copy pieces turned out to be quite inconvenient in practice. I now believe that Piece should be Copy.

Move is similar, because it represents a physical action that a player can take (and only once). I very rarely needed to clone moves. If others find the same, I believe they should not be Copy.

Enum or newtype?

Squares are often referred to by name, but in chess programming they are also frequently used like integers (to compute or apply offsets, or index other data structures).

Is the following definiton idiomatic?

#[derive(Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
#[repr(i8)] // required for transmute, choice of type discussed below
pub enum Square {
    A1, B1, C1, D1, E1, F1, G1, H1,
    A2, B2, C2, D2, E2, F2, G2, H2,
    A3, B3, C3, D3, E3, F3, G3, H3,
    A4, B4, C4, D4, E4, F4, G4, H4,
    A5, B5, C5, D5, E5, F5, G5, H5,
    A6, B6, C6, D6, E6, F6, G6, H6,
    A7, B7, C7, D7, E7, F7, G7, H7,
    A8, B8, C8, D8, E8, F8, G8, H8,
}

impl Square {
    /// Gets a `Square` from an integer index.
    ///
    /// # Panics
    ///
    /// Panics if the index is not in the range `0..=63`.
    ///
    /// # Examples
    ///
    /// ```
    /// use shakmaty::Square;
    ///
    /// assert_eq!(Square::new(0), Square::A1);
    /// assert_eq!(Square::new(63), Square::H8);
    /// ```
    #[inline]
    pub fn new(index: i8) -> Square {
        assert!(0 <= index && index < 64);
        unsafe { Square::from_index_unchecked(index) }
    }

    /// Tries to get a `Square` from an integer index in the range `0..=63`.
    ///
    /// # Examples
    ///
    /// ```
    /// use shakmaty::Square;
    ///
    /// assert_eq!(Square::from_index(0), Some(Square::A1));
    /// assert_eq!(Square::from_index(63), Some(Square::H8));
    ///
    /// assert_eq!(Square::from_index(64), None);
    /// ```
    #[inline]
    pub fn from_index(index: i8) -> Option<Square> {
        if 0 <= index && index < 64 {
            Some(Square::new(index))
        } else {
            None
        }
    }

    /// Gets a `Square` from an integer index.
    ///
    /// # Unsafety
    ///
    /// It is the callers responsibility to ensure it is in the range `0..=63`.
    #[inline]
    pub unsafe fn from_index_unchecked(index: i8) -> Square {
        debug_assert!(0 <= index && index < 64);
        ::std::mem::transmute(index)
    }
}

Enum or custom packing?

Stockfish, a very strong chess engine written in C++, encodes moves as follows:

/// A move needs 16 bits to be stored
///
/// bit  0- 5: destination square (from 0 to 63)
/// bit  6-11: origin square (from 0 to 63)
/// bit 12-13: promotion piece type - 2 (from KNIGHT-2 to QUEEN-2)
/// bit 14-15: special move flag: promotion (1), en passant (2), castling (3)
/// NOTE: EN-PASSANT bit is set only when a pawn can be captured
///
/// Special cases are MOVE_NONE and MOVE_NULL. We can sneak these in because in
/// any normal move destination square is always different from origin square
/// while MOVE_NONE and MOVE_NULL have the same origin and destination square.

enum Move : int {
  MOVE_NONE,
  MOVE_NULL = 65
};

enum MoveType {
  NORMAL,
  PROMOTION = 1 << 14,
  ENPASSANT = 2 << 14,
  CASTLING  = 3 << 14,
};

Of course Rust would allow us to do something similar. On the other hand pattern matching on enums is very powerful and I have not forgotten to handle a special case since.

#[derive(Clone, Eq, PartialEq, Debug)]
#[cfg_attr(nightly, repr(align(4)))]
pub enum Move {
    Normal {
        role: Role,
        from: Square,
        capture: Option<Role>,
        to: Square,
        promotion: Option<Role>,
    },
    EnPassant { from: Square, to: Square },
    Castle { king: Square, rook: Square },
}

Unfortunately this convenience is not free:

Is there any way to have both?

Next

Next up is double checked locking.


niklasf, 14th April 2019, discussion on reddit

Update: Comparing sizes of Stockfish's and shakmaty's Move is not fair, because the Rust enum stores more information. Thanks Kaligule.

Update 2: shakmaty 0.15 now implements a compromise: Calculations are performed with the natural u32 and i32 integer types. Narrower types, but not bit packing, are used for storage.

As for double checked locking, meanwhile once_cell has emerged with a nicely consistent API.