Gemini

Werewolf · Post by **Werewolf** » Tue Aug 05, 2025 11:22 pm

towforce wrote: ↑Tue Aug 05, 2025 9:44 pm
towforce wrote: ↑Sun Jul 27, 2025 11:32 am
Werewolf wrote: ↑Sat Jul 26, 2025 7:54 pm Wait 2 weeks...ChatGPT-5 should be out.

Better than 50% chance it will be out by 15-Aug-25 - link.

You have to pay to use Grok 4: I suspect the same will be true of GPT5.

Markets are confident it will be out in the next 5 days - link.

Will be interesting to see how good it is, and if it is the best, how long it will take competitors to catch it.

Yes I’m waiting in anticipation. I don’t think the competition will catch it soon, because current ChatGPT (4O + O3) is about equal to latest Grok / Gemini / Claude overall, and ChatGPT-5 is apparently a big step up.

First thing I want to do is write a chess engine with it.

Pedro · Post by **Pedro** » Thu Aug 07, 2025 9:43 pm

Grok 4 was summarily defeated by o3 in a best-of-three match. Carlsen and Howell provided commentary on the duel. Grok 4 had previously defeated Gemini 2.5 Pro by a score of 3 to 2, while o3 beat o4 mini 4–0 in the semifinals.

towforce · Post by **towforce** » Thu Aug 07, 2025 10:11 pm

Werewolf wrote: ↑Tue Aug 05, 2025 11:22 pm
towforce wrote: ↑Tue Aug 05, 2025 9:44 pm
towforce wrote: ↑Sun Jul 27, 2025 11:32 am
Werewolf wrote: ↑Sat Jul 26, 2025 7:54 pm Wait 2 weeks...ChatGPT-5 should be out.

Better than 50% chance it will be out by 15-Aug-25 - link.

You have to pay to use Grok 4: I suspect the same will be true of GPT5.

Markets are confident it will be out in the next 5 days - link.

Will be interesting to see how good it is, and if it is the best, how long it will take competitors to catch it.
Yes I’m waiting in anticipation. I don’t think the competition will catch it soon, because current ChatGPT (4O + O3) is about equal to latest Grok / Gemini / Claude overall, and ChatGPT-5 is apparently a big step up.

First thing I want to do is write a chess engine with it.

It looks as though the release of GPT-5 is in progress: it has gone straight to the top of the leaderboard - link.

Werewolf · Post by **Werewolf** » Fri Aug 08, 2025 9:47 am

I am very unconvinced by ChatGPT 5's reasoning ability.
We played 6 games of chess - it made illegal moves in 5 of them. All my attempts to get it to think (invoke inference) failed. It seems like OpenAI are trying to save money by taking inference out of the hands of the user.

Then I tested it on a 1990s IQ puzzle book that's not on the net. It scored a dismal 50%.
Grok 4 got 95%.

ChatGPT-O3 got 70%.

Werewolf · Post by **Werewolf** » Fri Aug 08, 2025 10:08 am

OpenAI have just added a "Thinking" button which forces longer thinks.

On my IQ test its score went from 50% to 70% but it's still not amazing.

towforce · Post by **towforce** » Fri Aug 08, 2025 10:31 am

Werewolf wrote: ↑Fri Aug 08, 2025 9:47 am I am very unconvinced by ChatGPT 5's reasoning ability.
We played 6 games of chess - it made illegal moves in 5 of them. All my attempts to get it to think (invoke inference) failed. It seems like OpenAI are trying to save money by taking inference out of the hands of the user.

Then I tested it on a 1990s IQ puzzle book that's not on the net. It scored a dismal 50%.
Grok 4 got 95%.

ChatGPT-O3 got 70%.

We are still in a world in which different chatbots are suited to different tasks. Another example: GPT-5 has an input window of 272,000 tokens, whereas Gemini has a million.

OpenAI have claimed that GPT-5 is better at medical diagnosis than the other chatbots: I have no reason to disbelieve this claim. If you suspect you have a mental health issue, go to Grok - it will actually give you an answer, while other chatbots will refuse to do so.

And so on and so on...

towforce · Post by **towforce** » Fri Aug 08, 2025 8:53 pm

Werewolf wrote: ↑Fri Aug 08, 2025 9:47 am I am very unconvinced by ChatGPT 5's reasoning ability.

Many of OpenAI's customers are not happy with GPT-5 - link.

Werewolf · Post by **Werewolf** » Sat Aug 09, 2025 6:31 pm

My feelings about ChatGPT-5 remain mixed. However, after 24 hours of work I have a functioning chess engine - my first.

It’s a mainly brute-force engine with a primitive evaluation function, written in C. I decided to make the thread count dynamic so that it equals the number of legal moves available (the idea was to do multithreading at the root).
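For readers curious what "one thread per legal root move" can look like, here is a minimal sketch in C with POSIX threads. It is not the engine's actual code, which was not posted. To stay short it swaps chess for a toy take-away game (remove 1 to 3 stones, whoever takes the last stone wins), and every name in it (`negamax`, `root_job`, `search_root_move`, `best_root_move`, `MAX_TAKE`) is a hypothetical chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_TAKE 3 /* toy rule (assumption): a move removes 1..MAX_TAKE stones */

/* Plain brute-force negamax on the toy game.
   Returns +1 if the side to move wins, -1 if it loses. */
static int negamax(int stones)
{
    if (stones == 0)
        return -1; /* the opponent took the last stone, so the side to move lost */
    int best = -1;
    for (int take = 1; take <= MAX_TAKE && take <= stones; take++) {
        int score = -negamax(stones - take);
        if (score > best)
            best = score;
    }
    return best;
}

/* One job per root move: the position after the move, and its score. */
struct root_job {
    int stones_after; /* stones left once the root move has been played */
    int score;        /* result from the root player's point of view */
};

/* Thread entry point: search the subtree below a single root move. */
static void *search_root_move(void *arg)
{
    struct root_job *job = arg;
    job->score = -negamax(job->stones_after);
    return NULL;
}

/* Root splitting: start one thread per legal root move, wait for all of
   them, then pick the best score. Returns the number of stones to take,
   or 0 if there is no legal move. Ties go to the smallest move. */
int best_root_move(int stones)
{
    pthread_t threads[MAX_TAKE];
    struct root_job jobs[MAX_TAKE];
    int n_moves = stones < MAX_TAKE ? stones : MAX_TAKE;

    for (int i = 0; i < n_moves; i++) {
        jobs[i].stones_after = stones - (i + 1);
        jobs[i].score = 0;
        int rc = pthread_create(&threads[i], NULL, search_root_move, &jobs[i]);
        assert(rc == 0); /* sketch only: real code should handle this failure */
        (void)rc;
    }

    int best_take = 0;
    int best_score = -2; /* below any real score */
    for (int i = 0; i < n_moves; i++) {
        pthread_join(threads[i], NULL);
        if (jobs[i].score > best_score) {
            best_score = jobs[i].score;
            best_take = i + 1;
        }
    }
    return best_take;
}
```

Calling `best_root_move(7)` searches the three root moves in parallel and returns the take that leaves the opponent on a multiple of four. Each thread writes only to its own `root_job`, so no locking is needed. The trade-off is that the threads share no information with each other, which matters more once a search uses alpha-beta bounds or a hash table.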

It works within Fritz and plays at around 1800 FIDE Elo.

Off to sleep now!

towforce · Post by **towforce** » Sun Aug 10, 2025 10:56 am

Werewolf wrote: ↑Sat Aug 09, 2025 6:31 pm My feelings about ChatGPT-5 remain mixed. However, after 24 hours of work I have a functioning chess engine - my first.

Good work - well done!

You might have to do it again in a few weeks - the market is expecting GPT-5 to lose the throne very soon - link.

Werewolf · Post by **Werewolf** » Sun Aug 10, 2025 8:59 pm

towforce wrote: ↑Sun Aug 10, 2025 10:56 am
Werewolf wrote: ↑Sat Aug 09, 2025 6:31 pm My feelings about ChatGPT-5 remain mixed. However, after 24 hours of work I have a functioning chess engine - my first.

Good work - well done!

You might have to do it again in a few weeks - the market is expecting GPT-5 to lose the throne very soon - link.

Gemini 3?

Writing a chess engine is less fun than I imagined: I am literally spending 95% of my time debugging.
