Lazy sorting algorithm - Sorting on steroids

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
Ronald
Posts: 161
Joined: Tue Jan 23, 2018 10:18 am
Location: Rotterdam
Full name: Ronald Friederich

Re: Lazy sorting algorithm - Sorting on steroids

Post by Ronald »

If you do presorting of the moves, you calculate the scores of the moves before one of the moves has been searched and the order of the moves doesn't change anymore. If you sort on some kind of history score the history score of remaining moves might/will change during the search of a move and you can use the changed information for the order of the remaining moves.

If you want to use this new history info you will have to do what some people call "Lazy sorting" and recalc the movescores every time before you pick a new move.
JohnWoe
Posts: 529
Joined: Sat Mar 02, 2013 11:31 pm

Re: Lazy sorting algorithm - Sorting on steroids

Post by JohnWoe »

I updated the paper. But here are the same stuff.

I measured overhead stuff in python: Native .sort() is the fastest
10 x 10000 randomized array. Array cloned.

Code: Select all

# Speed ( LazySort )
N:        10 x 10000
Time(s):  26.378

# Speed ( SelectionSort )
N:        10 x 10000
Time(s):  28.241

# Speed ( PythonSort )
N:        10 x 10000
Time(s):  0.010

# Speed ( BubbleSort )
N:        10 x 10000
Time(s):  55.155
std::max_element was suggested. Unfortunately much slower:
std::max_element:

Code: Select all

Nodes:    242247069
Time(ms): 75007
NPS:      3229659
LazySort()

Code: Select all

Nodes:    266161266
Time(ms): 75006
NPS:      3548532
Updated benchmarks: std::sort() is the slowest. Because I don't score every move

Code: Select all

LazySort()
Nodes 544,437,237
Time(s) 150.008
Nodes / second 3,629,388

SelectionSort()
Nodes 535,690,617
Time(s) 150.010
Nodes / second 3,571,032

std::sort()
Nodes 490,725,631
Time(s) 150.005
Nodes / second 3,271,395
tcusr
Posts: 325
Joined: Tue Aug 31, 2021 10:32 pm
Full name: tcusr

Re: Lazy sorting algorithm - Sorting on steroids

Post by tcusr »

it's better to swap once when the loop is finished instead of each time you upgrade the largest score.
abulmo2
Posts: 467
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Lazy sorting algorithm - Sorting on steroids

Post by abulmo2 »

I tested a few sorting algorithms on Dumb 1.9.

Selection sort:

Code: Select all

void selectionSort(MoveItem [] item) {
	foreach (i; 0 .. item.length - 1) {
		size_t k = i;
		foreach (j; i + 1 .. item.length) if (item[j].value > item[k].value) k = j;
		if (k > i) swap(item[i], item[k]);
	}
}
Insertion sort:

Code: Select all

void insertionSort(MoveItem [] items) {
	foreach (i; 1 .. items.length) {
		size_t j;
		const tmp = items[i];
	    for (j = i ; j > 0 && tmp.value > items[j - 1].value; j--) items[j] = items[j - 1];
		items[j] = tmp;
	}
}
For both algorithms, all legal moves are generated, scored (hash move - mvv/lva for captures & promotions - killer - history) and the move is picked up through the simple function during the search:

Code: Select all

	ref MoveItem next() return { return item[index++]; }
A lazy implementation of both algorithms is done by moving the sorting function to the function where a move is picked. Both lazy and non lazy sorting visit the same nodes, but selection and insertion sorting lead to different trees.

Measuring their raw speed through the bench option of Dumb I got:

Code: Select all

insertion sort      : 4763890 nps.
lazy insertion sort : 4588487 nps.
selection sort      : 3910556 nps.
lazy selection sort : 4775616 nps.
Doing a gauntlet against 16 various opponent, I got the following results:

Code: Select all

ALGO                     RATING  ERROR   POINTS  PLAYED    (%)
insertion sort        :    0.0   ----   1017.5    1600   63.6%
lazy selection sort   :  -11.3   17.7    996.0    1600   62.2%
lazy insertion sort   :  -14.5   17.9    990.0    1600   61.9%
selection sort        :  -41.0   17.4    938.5    1600   58.7%
As the original poster, I found that lazy (selection) sorting is better than selection sorting. However, I also found that insertion sort is much better than selection sort and better or on par with their lazy versions.

Maybe the insertion sort is worth a trial.
Richard Delorme
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Lazy sorting algorithm - Sorting on steroids

Post by dangi12012 »

abulmo2 wrote: Sun Mar 13, 2022 11:28 pm Maybe the insertion sort is worth a trial.
Can add sorting networks to your tests?

Source: https://stackoverflow.com/questions/197 ... r-networks

Code: Select all

/**
 * A Functor class to create a sort for fixed sized arrays/containers with a
 * compile time generated Bose-Nelson sorting network.
 * \tparam NumElements  The number of elements in the array or container to sort.
 * \tparam T            The element type.
 * \tparam Compare      A comparator functor class that returns true if lhs < rhs.
 */
template <unsigned NumElements, class Compare = void> class StaticSort
{
    template <class A, class C> struct Swap
    {
        template <class T> inline void s(T &v0, T &v1)
        {
            T t = Compare()(v0, v1) ? v0 : v1; // Min
            v1 = Compare()(v0, v1) ? v1 : v0; // Max
            v0 = t;
        }

        inline Swap(A &a, const int &i0, const int &i1) { s(a[i0], a[i1]); }
    };

    template <class A> struct Swap <A, void>
    {
        template <class T> inline void s(T &v0, T &v1)
        {
            // Explicitly code out the Min and Max to nudge the compiler
            // to generate branchless code.
            T t = v0 < v1 ? v0 : v1; // Min
            v1 = v0 < v1 ? v1 : v0; // Max
            v0 = t;
        }

        inline Swap(A &a, const int &i0, const int &i1) { s(a[i0], a[i1]); }
    };

    template <class A, class C, int I, int J, int X, int Y> struct PB
    {
        inline PB(A &a)
        {
            enum { L = X >> 1, M = (X & 1 ? Y : Y + 1) >> 1, IAddL = I + L, XSubL = X - L };
            PB<A, C, I, J, L, M> p0(a);
            PB<A, C, IAddL, J + M, XSubL, Y - M> p1(a);
            PB<A, C, IAddL, J, XSubL, M> p2(a);
        }
    };

    template <class A, class C, int I, int J> struct PB <A, C, I, J, 1, 1>
    {
        inline PB(A &a) { Swap<A, C> s(a, I - 1, J - 1); }
    };

    template <class A, class C, int I, int J> struct PB <A, C, I, J, 1, 2>
    {
        inline PB(A &a) { Swap<A, C> s0(a, I - 1, J); Swap<A, C> s1(a, I - 1, J - 1); }
    };

    template <class A, class C, int I, int J> struct PB <A, C, I, J, 2, 1>
    {
        inline PB(A &a) { Swap<A, C> s0(a, I - 1, J - 1); Swap<A, C> s1(a, I, J - 1); }
    };

    template <class A, class C, int I, int M, bool Stop = false> struct PS
    {
        inline PS(A &a)
        {
            enum { L = M >> 1, IAddL = I + L, MSubL = M - L};
            PS<A, C, I, L, (L <= 1)> ps0(a);
            PS<A, C, IAddL, MSubL, (MSubL <= 1)> ps1(a);
            PB<A, C, I, IAddL, L, MSubL> pb(a);
        }
    };

    template <class A, class C, int I, int M> struct PS <A, C, I, M, true>
    {
        inline PS(A &a) {}
    };

public:
    /**
     * Sorts the array/container arr.
     * \param  arr  The array/container to be sorted.
     */
    template <class Container> inline void operator() (Container &arr) const
    {
        PS<Container, Compare, 1, NumElements, (NumElements <= 1)> ps(arr);
    };

    /**
     * Sorts the array arr.
     * \param  arr  The array to be sorted.
     */
    template <class T> inline void operator() (T *arr) const
    {
        PS<T*, Compare, 1, NumElements, (NumElements <= 1)> ps(arr);
    };
};

#include <iostream>
#include <vector>

int main(int argc, const char * argv[])
{
    enum { NumValues = 32 };

    // Arrays
    {
        int rands[NumValues];
        for (int i = 0; i < NumValues; ++i) rands[i] = rand() % 100;
        std::cout << "Before Sort: \t";
        for (int i = 0; i < NumValues; ++i) std::cout << rands[i] << " ";
        std::cout << "\n";
        StaticSort<NumValues> staticSort;
        staticSort(rands);
        std::cout << "After Sort: \t";
        for (int i = 0; i < NumValues; ++i) std::cout << rands[i] << " ";
        std::cout << "\n";
    }

    std::cout << "\n";

    // STL Vector
    {
        std::vector<int> rands(NumValues);
        for (int i = 0; i < NumValues; ++i) rands[i] = rand() % 100;
        std::cout << "Before Sort: \t";
        for (int i = 0; i < NumValues; ++i) std::cout << rands[i] << " ";
        std::cout << "\n";
        StaticSort<NumValues> staticSort;
        staticSort(rands);
        std::cout << "After Sort: \t";
        for (int i = 0; i < NumValues; ++i) std::cout << rands[i] << " ";
        std::cout << "\n";
    }

    return 0;
}
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
User avatar
lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Lazy sorting algorithm - Sorting on steroids

Post by lithander »

With selection sort one iteration of the outer loop will put the best remaining move in the next slot. So I can just run one iteration play the move and leave the rest unsorted until I need to play another move. Lazy sorting.

But how does lazy sorting work with the insertion sort you showed? You can only be sure to have seen and inserted the best move when the whole outer loop is through. And then every move is already sorted correctly. How can you do that in a lazy fashion?
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
abulmo2
Posts: 467
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Lazy sorting algorithm - Sorting on steroids

Post by abulmo2 »

lithander wrote: Mon Mar 14, 2022 11:58 am With selection sort one iteration of the outer loop will put the best remaining move in the next slot. So I can just run one iteration play the move and leave the rest unsorted until I need to play another move. Lazy sorting.

But how does lazy sorting work with the insertion sort you showed? You can only be sure to have seen and inserted the best move when the whole outer loop is through. And then every move is already sorted correctly. How can you do that in a lazy fashion?
the lazy selection sort:

Code: Select all

ref MoveItem next() return {
	size_t best = index;
	foreach(i; index + 1 .. n) if (item[i].value  > item[best].value) best = i;

	if (best > index) swap(item[best], item[index]);

	return item[index++];
}
the lazy insertion sort (less efficient):

Code: Select all

ref MoveItem next(const bool doSort = true) return {
	size_t best = index;
	foreach(i; index + 1 .. n) if (item[i].value  > item[best].value) best = i;

	const MoveItem tmp = item[best];
	foreach_reverse (i; index .. best) item[i + 1] = item[i];
	item[index] = tmp;

	return item[index++];
}
Richard Delorme
JohnWoe
Posts: 529
Joined: Sat Mar 02, 2013 11:31 pm

Re: Lazy sorting algorithm - Sorting on steroids

Post by JohnWoe »

Tested insertion sort:

My python implementation is propably correct and gives a nice boost. 3.4s vs 10.4s

Code: Select all

# Speed ( LazySort )
N:        100 x 1000
Time(s):  10.418

# Speed ( SelectionSort )
N:        100 x 1000
Time(s):  10.383

# Speed ( PythonSort )
N:        100 x 1000
Time(s):  0.111

# Speed ( BubbleSort )
N:        100 x 1000
Time(s):  20.797

# Speed ( InsertionSort )
N:        100 x 1000
Time(s):  3.454
Bench is pretty much the same. There's potential in InsertionSort(). However there's +4 SLOC. More complex algo.

LazySort():

Code: Select all

Nodes:    87356209
Time(ms): 75026
NPS:      1164345
InsertionSort:

Code: Select all

Nodes:    86965422
Time(ms): 75041
NPS:      1158905
My InsertionSort() implementation in C++. If you can optimize I'll run again.

Code: Select all

inline void LazySort(const int ply, const int nth, const int total_moves) {
  for (auto i = nth + 1; i < total_moves; ++i)
    if (g_boards[ply][i].score > g_boards[ply][nth].score) {
      const auto p = g_boards[ply][i];
      auto j       = i - 1;
      for (; j >= 0 && g_boards[ply][j].score < p.score; --j) g_boards[ply][j + 1] = g_boards[ply][j];
      g_boards[ply][j + 1] = p;
    }
}