Gemini

Werewolf
Posts: 2006
Joined: Thu Sep 18, 2008 10:24 pm

Re: Gemini

Post by Werewolf »

towforce wrote: Tue Aug 05, 2025 9:44 pm
towforce wrote: Sun Jul 27, 2025 11:32 am
Werewolf wrote: Sat Jul 26, 2025 7:54 pm Wait 2 weeks... ChatGPT-5 should be out.

Better than 50% chance it will be out by 15-Aug-25 - link.

You have to pay to use Grok 4: I suspect the same will be true of GPT5.

Markets are confident it will be out in the next 5 days - link.

Will be interesting to see how good it is, and if it is the best, how long it will take competitors to catch it.
Yes, I’m waiting in anticipation. I don’t think the competition will catch it soon, because current ChatGPT (4o + o3) is about equal to the latest Grok / Gemini / Claude overall, and ChatGPT-5 is apparently a big step up.

First thing I want to do is write a chess engine with it.
Pedro
Posts: 30
Joined: Mon Oct 26, 2020 3:05 pm
Full name: Pedro

Re: Gemini

Post by Pedro »

Grok 4 was summarily defeated by o3 in a best-of-three match. Carlsen and Howell provided commentary on the duel. Grok 4 had previously defeated Gemini 2.5 Pro by a score of 3 to 2, while o3 beat o4 mini 4–0 in the semifinals.


towforce
Posts: 12445
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: Gemini

Post by towforce »

Werewolf wrote: Tue Aug 05, 2025 11:22 pm
Yes, I’m waiting in anticipation. I don’t think the competition will catch it soon, because current ChatGPT (4o + o3) is about equal to the latest Grok / Gemini / Claude overall, and ChatGPT-5 is apparently a big step up.

First thing I want to do is write a chess engine with it.

It looks as though the release of GPT-5 is in progress: it has gone straight to the top of the leaderboard - link.
Human chess is partly about tactics and strategy, but mostly about memory
Werewolf
Posts: 2006
Joined: Thu Sep 18, 2008 10:24 pm

Re: Gemini

Post by Werewolf »

I am very unconvinced by ChatGPT 5's reasoning ability.
We played 6 games of chess - it made illegal moves in 5 of them. All my attempts to get it to think (invoke inference) failed. It seems like OpenAI are trying to save money by taking inference out of the hands of the user.

Then I tested it on a 1990s IQ puzzle book that's not on the net. It scored a dismal 50%.
Grok 4 got 95%.

ChatGPT-O3 got 70%.
Werewolf
Posts: 2006
Joined: Thu Sep 18, 2008 10:24 pm

Re: Gemini

Post by Werewolf »

OpenAI have just added a "Thinking" button which forces longer thinks.

On my IQ test its score went from 50% to 70% but it's still not amazing.
towforce
Posts: 12445
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: Gemini

Post by towforce »

Werewolf wrote: Fri Aug 08, 2025 9:47 am I am very unconvinced by ChatGPT 5's reasoning ability.
We played 6 games of chess - it made illegal moves in 5 of them. All my attempts to get it to think (invoke inference) failed. It seems like OpenAI are trying to save money by taking inference out of the hands of the user.

Then I tested it on a 1990s IQ puzzle book that's not on the net. It scored a dismal 50%.
Grok 4 got 95%.

ChatGPT-O3 got 70%.
We are still in a world in which different chatbots are suited to different tasks. Another example: GPT-5 has an input window of 272,000 tokens, whereas Gemini has a million.

OpenAI have claimed that GPT-5 is better at medical diagnosis than the other chatbots: I have no reason to disbelieve this claim. If you suspect you have a mental health issue, go to Grok - it will actually give you an answer, while other chatbots will refuse to do so.

And so on and so on...
towforce
Posts: 12445
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: Gemini

Post by towforce »

Werewolf wrote: Fri Aug 08, 2025 9:47 am I am very unconvinced by ChatGPT 5's reasoning ability.

Many of OpenAI's customers are not happy with GPT-5 - link.
Werewolf
Posts: 2006
Joined: Thu Sep 18, 2008 10:24 pm

Re: Gemini

Post by Werewolf »

My feelings about ChatGPT-5 remain mixed; however, after 24 hours of work I have a functioning chess engine - my first.

It’s mainly a brute-force engine, written in C, with a primitive evaluation function. I decided to make the thread count dynamic so that it equals the number of legal moves available (the idea was to do multithreading at the root).

It works within Fritz and plays at around 1800 FIDE Elo.
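
For anyone curious what "multithreading at the root" can look like, here is a minimal sketch in C with POSIX threads. It is not the engine described above: to stay short it assumes a toy game (a pile of stones, each move removes 1 to 3, whoever takes the last stone wins) in place of chess, and every name in it (negamax, RootTask, search_root_move, best_root_move, MAX_TAKE) is made up for illustration. The idea it shows is the same, though: one thread per legal root move, each searching its own subtree, with the main thread picking the best score after joining.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_TAKE 3   /* toy rule (assumption): a move removes 1..3 stones */

/* Plain negamax for the toy game: whoever takes the last stone wins.
   Returns +1 if the side to move wins with best play, -1 otherwise. */
static int negamax(int stones)
{
    if (stones == 0)
        return -1;   /* the opponent just took the last stone */
    int best = -1;
    for (int take = 1; take <= MAX_TAKE && take <= stones; take++) {
        int score = -negamax(stones - take);
        if (score > best)
            best = score;
    }
    return best;
}

/* One unit of work: a single root move and the score its thread finds. */
typedef struct {
    int take;          /* the root move: how many stones to remove */
    int stones_after;  /* position reached after the root move */
    int score;         /* result, from the root player's point of view */
} RootTask;

/* Thread entry point: search the subtree below one root move. */
static void *search_root_move(void *arg)
{
    RootTask *task = arg;
    task->score = -negamax(task->stones_after);
    return NULL;
}

/* Root split: start one thread per legal root move, join them all,
   and return the move with the highest score (0 if there is no move). */
int best_root_move(int stones)
{
    pthread_t threads[MAX_TAKE];
    RootTask tasks[MAX_TAKE];
    int n = 0;

    for (int take = 1; take <= MAX_TAKE && take <= stones; take++) {
        tasks[n].take = take;
        tasks[n].stones_after = stones - take;
        tasks[n].score = 0;
        int rc = pthread_create(&threads[n], NULL, search_root_move, &tasks[n]);
        assert(rc == 0);   /* sketch only: no recovery if a thread fails */
        (void)rc;
        n++;
    }

    int best = -1;
    for (int i = 0; i < n; i++) {
        pthread_join(threads[i], NULL);   /* wait before reading the score */
        if (best < 0 || tasks[i].score > tasks[best].score)
            best = i;
    }
    return best < 0 ? 0 : tasks[best].take;
}
```

Compile with -pthread and call, for example, best_root_move(5), which returns 1 (leaving the opponent a pile of 4). Each thread writes only to its own RootTask, so no locking is needed in this sketch. In a real chess position there are often 30 or more root moves, so one thread per move can mean more threads than cores, and the threads cannot share alpha-beta bounds unless you add that yourself.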

Off to sleep now!
towforce
Posts: 12445
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: Gemini

Post by towforce »

Werewolf wrote: Sat Aug 09, 2025 6:31 pm My feelings about ChatGPT-5 remain mixed; however, after 24 hours of work I have a functioning chess engine - my first.

Good work - well done! 8-)

You might have to do it again in a few weeks - the market is expecting GPT-5 to lose the throne very soon - link. :shock:
Werewolf
Posts: 2006
Joined: Thu Sep 18, 2008 10:24 pm

Re: Gemini

Post by Werewolf »

towforce wrote: Sun Aug 10, 2025 10:56 am
Werewolf wrote: Sat Aug 09, 2025 6:31 pm My feelings about ChatGPT-5 remain mixed; however, after 24 hours of work I have a functioning chess engine - my first.

Good work - well done! 8-)

You might have to do it again in a few weeks - the market is expecting GPT-5 to lose the throne very soon - link. :shock:
Gemini 3?

Writing a chess engine is less fun than I imagined: I am literally spending 95% of my time debugging :(