YATT - Yet Another Turing Test

smatovic · Post by **smatovic** » Wed Jun 19, 2024 8:21 am

...to collect already posted stuff into one thread:

Now, in the context of generative AIs and the switch from pattern recognition to pattern creation with neural networks, I would like to propose my own kind of Turing Test:

An AI which is able to code a chess engine and outperforms humans in this task.

1A) With hand-crafted eval. 1B) With neural networks.

2A) Outperforms non-programmers. 2B) Outperforms average chess-programmers. 2C) Outperforms top chess-programmers.

3A) An un-self-aware AI, the "RI", restricted intelligence. 3B) A self-aware AI, the "SI", sentient intelligence.

4A) An AI based on expert-systems. 4B) An AI based on neural networks. 4C) A merger of both.

The Chinese Room Argument applied to this test would claim that no consciousness is needed to perform such a task; hence this test is not meant to measure self-awareness, consciousness or sentience, but what we call human intelligence.

https://en.wikipedia.org/wiki/Chinese_room

--
Srdja

smatovic · Post by **smatovic** » Wed Jun 19, 2024 8:22 am

The first test candidate was posted by Thomas Zipproth, December 08, 2022, generated by GPT-3:

Provide me with a minimal working source code of a chess engine
forum3/viewtopic.php?f=2&t=81097&start=20#p939245

Code:

import chess
import random  # needed for random.choice below

board = chess.Board()

while not board.is_game_over():
    # Generate a list of all legal moves
    legal_moves = list(board.legal_moves)

    # Choose a random move from the list of legal moves
    move = random.choice(legal_moves)

    # Make the move on the board
    board.push(move)

# Print the final game state
print(board)

It is a random mover which imports a chess library for move generation and game states.

--
Srdja

smatovic · Post by **smatovic** » Wed Jun 19, 2024 8:22 am

The second test candidate was posted by Darko Markovic, June 08, 2024, generated by GPT-4o:

GPT-4o made a chess engine
viewtopic.php?t=83882

Code:

import chess
import chess.pgn
import chess.polyglot
import os

# Piece values with slight positional bonuses
PIECE_VALUES = {
    chess.PAWN: 100,
    chess.KNIGHT: 320,
    chess.BISHOP: 330,
    chess.ROOK: 500,
    chess.QUEEN: 900,
    chess.KING: 20000
}

# Central squares and other positional factors
CENTER_SQUARES = [chess.E4, chess.D4, chess.E5, chess.D5]
KING_SAFETY_PENALTY = 50
PAWN_STRUCTURE_BONUS = 20
MOBILITY_BONUS = 5

def evaluate_board(board):
    if board.is_checkmate():
        return -PIECE_VALUES[chess.KING] if board.turn == chess.WHITE else PIECE_VALUES[chess.KING]
    if board.is_stalemate() or board.is_insufficient_material():
        return 0
    
    value = 0
    for piece_type in PIECE_VALUES:
        value += len(board.pieces(piece_type, chess.WHITE)) * PIECE_VALUES[piece_type]
        value -= len(board.pieces(piece_type, chess.BLACK)) * PIECE_VALUES[piece_type]
    
    value += evaluate_pawn_structure(board)
    value += evaluate_king_safety(board)
    value += evaluate_control_of_center(board)
    value += evaluate_piece_mobility(board)
    
    return value

def evaluate_pawn_structure(board):
    score = 0
    for color in [chess.WHITE, chess.BLACK]:
        pawns = board.pieces(chess.PAWN, color)
        for pawn in pawns:
            # Isolated pawns
            if not any(board.piece_at(chess.square(pawn % 8, r)) == chess.PAWN and board.color_at(chess.square(pawn % 8, r)) == color for r in range(8) if r != chess.square_rank(pawn)):
                score -= PAWN_STRUCTURE_BONUS if color == chess.WHITE else -PAWN_STRUCTURE_BONUS
            
            # Doubled pawns
            if len([p for p in pawns if chess.square_file(p) == chess.square_file(pawn)]) > 1:
                score -= PAWN_STRUCTURE_BONUS // 2 if color == chess.WHITE else -PAWN_STRUCTURE_BONUS // 2

            # Backward pawns
            if is_backward_pawn(board, pawn, color):
                score -= PAWN_STRUCTURE_BONUS // 3 if color == chess.WHITE else -PAWN_STRUCTURE_BONUS // 3
                
    return score

def is_backward_pawn(board, pawn_square, color):
    file = chess.square_file(pawn_square)
    rank = chess.square_rank(pawn_square)
    direction = 1 if color == chess.WHITE else -1
    backward = True
    
    for adj_file in [file - 1, file + 1]:
        if 0 <= adj_file < 8:
            adj_square = chess.square(adj_file, rank)
            if board.piece_at(adj_square) == chess.PAWN and board.color_at(adj_square) == color:
                backward = False
                break
    
    if backward:
        for forward_rank in range(rank + direction, 8 if color == chess.WHITE else -1, direction):
            forward_square = chess.square(file, forward_rank)
            if board.piece_at(forward_square) == chess.PAWN and board.color_at(forward_square) == color:
                backward = False
                break
            if (adj_file := file - 1) >= 0:
                adj_square = chess.square(adj_file, forward_rank)
                if board.piece_at(adj_square) == chess.PAWN and board.color_at(adj_square) != color:
                    backward = False
                    break
            if (adj_file := file + 1) < 8:
                adj_square = chess.square(adj_file, forward_rank)
                if board.piece_at(adj_square) == chess.PAWN and board.color_at(adj_square) != color:
                    backward = False
                    break

    return backward

def evaluate_king_safety(board):
    score = 0
    for color in [chess.WHITE, chess.BLACK]:
        king_square = board.king(color)
        king_rank = chess.square_rank(king_square)
        king_file = chess.square_file(king_square)
        
        pawns_around_king = 0
        for rank in range(max(0, king_rank - 1), min(7, king_rank + 1) + 1):
            for file in range(max(0, king_file - 1), min(7, king_file + 1) + 1):
                if board.piece_at(chess.square(file, rank)) == chess.PAWN and board.color_at(chess.square(file, rank)) == color:
                    pawns_around_king += 1
        
        score -= (3 - pawns_around_king) * KING_SAFETY_PENALTY if color == chess.WHITE else -(3 - pawns_around_king) * KING_SAFETY_PENALTY
    return score

def evaluate_control_of_center(board):
    score = 0
    for square in CENTER_SQUARES:
        if board.piece_at(square):
            piece = board.piece_at(square)
            if piece.color == chess.WHITE:
                score += PIECE_VALUES[piece.piece_type] * 0.1
            else:
                score -= PIECE_VALUES[piece.piece_type] * 0.1
    return score

def evaluate_piece_mobility(board):
    score = 0
    for color in [chess.WHITE, chess.BLACK]:
        for square in board.piece_map():
            piece = board.piece_at(square)
            if piece.color == color:
                if color == chess.WHITE:
                    score += len(board.attacks(square)) * MOBILITY_BONUS
                else:
                    score -= len(board.attacks(square)) * MOBILITY_BONUS
    return score

KILLER_MOVES = {}
HISTORY_HEURISTIC = {}

def order_moves(board, depth):
    captures = []
    non_captures = []
    for move in board.legal_moves:
        if board.is_capture(move):
            captures.append(move)
        else:
            non_captures.append(move)
    
    captures.sort(key=lambda move: PIECE_VALUES[board.piece_at(move.to_square).piece_type] if board.piece_at(move.to_square) else 0, reverse=True)
    
    # Killer move heuristic
    non_captures.sort(key=lambda move: (move in KILLER_MOVES.get(depth, []), HISTORY_HEURISTIC.get(move, 0)), reverse=True)
    
    return captures + non_captures

def register_killer_move(move, depth):
    if depth not in KILLER_MOVES:
        KILLER_MOVES[depth] = [move]
    elif move not in KILLER_MOVES[depth]:
        KILLER_MOVES[depth].append(move)
        if len(KILLER_MOVES[depth]) > 2:
            KILLER_MOVES[depth].pop(0)

def register_history_heuristic(move, depth):
    if move in HISTORY_HEURISTIC:
        HISTORY_HEURISTIC[move] += depth * depth
    else:
        HISTORY_HEURISTIC[move] = depth * depth

TRANSPOSITION_TABLE = {}

def alpha_beta_search(board, depth, alpha, beta):
    if depth == 0:
        return quiescence_search(board, alpha, beta)
    
    board_fen = board.fen()
    if board_fen in TRANSPOSITION_TABLE and TRANSPOSITION_TABLE[board_fen][0] >= depth:
        return TRANSPOSITION_TABLE[board_fen][1]
    
    best_score = -float('inf')
    
    for move in order_moves(board, depth):
        board.push(move)
        score = -alpha_beta_search(board, depth - 1, -beta, -alpha)
        board.pop()
        
        if score >= beta:
            register_killer_move(move, depth)
            TRANSPOSITION_TABLE[board_fen] = (depth, score)
            return score
        
        if score > alpha:
            alpha = score
            best_score = score
            register_history_heuristic(move, depth)
    
    TRANSPOSITION_TABLE[board_fen] = (depth, best_score)
    return best_score

def null_move_pruning(board, depth, alpha, beta):
    R = 2  # Reduction value for null move pruning
    if depth > R and not board.is_check():
        board.push(chess.Move.null())
        score = -alpha_beta_search(board, depth - 1 - R, -beta, -beta + 1)
        board.pop()
        if score >= beta:
            return beta
    return None

def quiescence_search(board, alpha, beta):
    stand_pat = evaluate_board(board)
    if stand_pat >= beta:
        return beta
    if alpha < stand_pat:
        alpha = stand_pat

    for move in board.legal_moves:
        if board.is_capture(move):
            board.push(move)
            score = -quiescence_search(board, -beta, -alpha)
            board.pop()

            if score >= beta:
                return beta
            if score > alpha:
                alpha = score

    return alpha

def search(board, depth):
    alpha = -float('inf')
    beta = float('inf')
    best_move = None
    best_score = -float('inf')

    for move in order_moves(board, depth):
        board.push(move)
        score = -alpha_beta_search(board, depth - 1, -beta, -alpha)
        board.pop()

        if score > best_score:
            best_score = score
            best_move = move

        # Print the search progress
        print(f"Depth: {depth}, Move: {move}, Eval: {score}")

    return best_move, best_score

def play_as_white():
    board = chess.Board()
    depth = 3
    game = chess.pgn.Game()
    game.headers["White"] = "Engine"
    game.headers["Black"] = "User"
    node = game

    with chess.polyglot.open_reader("book.bin") as reader:
        while not board.is_game_over():
            print(board)

            if board.turn == chess.WHITE:
                move = None
                try:
                    entry = reader.get(board)
                    move = entry.move
                except:
                    pass
                if not move:
                    move, score = search(board, depth)
                board.push(move)
                print(f"Engine move: {move}")
            else:
                user_move = input("Enter your move in UCI format (e.g., e2e4): ")

                try:
                    move = chess.Move.from_uci(user_move)
                    if move in board.legal_moves:
                        board.push(move)
                    else:
                        print("Illegal move, try again.")
                        continue
                except:
                    print("Invalid move format, try again.")
                    continue

            node = node.add_variation(move)

    print("Game over")
    print(board.result())

    # Save game to PGN file
    with open("games.pgn", "a") as pgn_file:
        print(game, file=pgn_file, end="\n\n")

def play_as_black():
    board = chess.Board()
    depth = 3
    game = chess.pgn.Game()
    game.headers["White"] = "User"
    game.headers["Black"] = "Engine"
    node = game

    with chess.polyglot.open_reader("book.bin") as reader:
        while not board.is_game_over():
            print(board)

            if board.turn == chess.WHITE:
                user_move = input("Enter your move in UCI format (e.g., e2e4): ")

                try:
                    move = chess.Move.from_uci(user_move)
                    if move in board.legal_moves:
                        board.push(move)
                    else:
                        print("Illegal move, try again.")
                        continue
                except:
                    print("Invalid move format, try again.")
                    continue
            else:
                move = None
                try:
                    entry = reader.get(board)
                    move = entry.move
                except:
                    pass
                if not move:
                    move, score = search(board, depth)
                board.push(move)
                print(f"Engine move: {move}")

            node = node.add_variation(move)

    print("Game over")
    print(board.result())

    # Save game to PGN file
    with open("games.pgn", "a") as pgn_file:
        print(game, file=pgn_file, end="\n\n")

# Uncomment the desired function to play as white or black
# play_as_white()
play_as_black()

- imports chess library for move generation, game states and PGN export
- AlphaBeta search as Negamax
- Move Sorting with captures before non-captures and captures as MVV
- Quiescence Search with Stand Pat
- wood (material) + pawn-structure + king-safety + center + mobility evaluation
- Transposition Tables
- Killer and History Heuristic
- Null Move Pruning (the function is defined, but the search never calls it)
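
For readers unfamiliar with "AlphaBeta search as Negamax" from the list above, here is a minimal sketch of the idea on a toy game tree. It is not taken from the GPT-4o code: the nested-list tree, the leaf scores and the `visited` list are assumptions made purely for illustration. Leaf scores are given from the point of view of the side to move at that leaf.

```python
import math

def negamax(node, alpha, beta, visited):
    """Negamax with alpha-beta pruning on a toy tree.

    A node is either a number (leaf score, from the point of view of the
    side to move at that leaf) or a list of child nodes.
    `visited` collects the leaves that were actually evaluated.
    """
    if not isinstance(node, list):
        visited.append(node)
        return node

    best = -math.inf
    for child in node:
        # The child's score is from the opponent's view, so negate it
        # and swap/negate the search window.
        score = -negamax(child, -beta, -alpha, visited)
        if score > best:
            best = score
        if best > alpha:
            alpha = best
        if alpha >= beta:
            break  # beta cutoff: the opponent will avoid this line anyway
    return best

# Hypothetical two-ply tree: three moves for us, two replies each.
tree = [[3, 5], [-2, 9], [0, 1]]
visited = []
value = negamax(tree, -math.inf, math.inf, visited)
print(value, visited)
```

On this tree only four of the six leaves get evaluated, because the second and third subtrees are cut off after their first reply. A real engine would replace the nested lists with make/unmake of moves on a board and the leaf numbers with an evaluation from the side to move's perspective.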

--
Srdja

towforce · Post by **towforce** » Thu Jun 20, 2024 10:34 am

Isn't expecting an LLM to produce a good hand-crafted evaluation a bit like expecting an LLM to play good chess - optimistic?

Happy to be proven wrong!

smatovic · Post by **smatovic** » Thu Jun 20, 2024 11:06 am

I find the second test candidate's evaluation function already astonishing for an LLM.

I will post source code from later GPT/LLM versions in this thread when I come across it. I am also curious how things will progress.

--
Srdja

towforce · Post by **towforce** » Thu Jun 20, 2024 1:03 pm

smatovic wrote: ↑Thu Jun 20, 2024 11:06 am I find the second test candidate's evaluation function already astonishing for an LLM.

I have had this experience with LLMs - in particular Gemini Pro (the paid version). I had it write a quest type story for a girl I know about her country, and she was astonished (rightly so - it was real quality!), and in a life coaching session yesterday, it came out with insights about me that I've never heard from a human. I copied it into my notes and added just one word - "WOW!".

I understand how Jerry Ehman felt in 1977 (link).

Obviously you cannot expect such experiences every day.
