Gemini

towforce
Posts: 11987
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: . .

Re: Gemini

Post by towforce »

Pedro wrote: Tue Nov 05, 2024 12:34 pm Skynet is coming...

:D

Image

Hopefully, our AI future will be this... :)

The simple reveals itself after the complex has been exhausted.
towforce
Posts: 11987
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: . .

Re: Gemini

Post by towforce »

Gemini Pro (via NotebookLM) was instructed to write a podcast for AIs based on an encyclopaedia of philosophy.

I'm not sure why, but I find the outcome to be a bit creepy: it feels to me as though the AI is discussing humans in a similar way to how humans would discuss lower animals. Anyone else have any opinions?

The podcast (in which the AI has scripted both characters) begins at 25 seconds.


smatovic
Posts: 2984
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Gemini

Post by smatovic »

towforce wrote: Fri Nov 08, 2024 9:37 pm [...]
:D This is definitely "AIs entering the area of the Metzinger Test", when AIs join philosophers in a discussion and successfully defend their own theory of consciousness...interesting times ahead :)

--
Srdja
chesskobra
Posts: 289
Joined: Thu Jul 21, 2022 12:30 am
Full name: Chesskobra

Re: Gemini

Post by chesskobra »

smatovic wrote: Sat Nov 09, 2024 8:01 am [...]
Even more fun would be an AI pulling off a large-scale Sokal hoax on philosophers that even Sokal himself wouldn't suspect was a hoax.
smatovic
Posts: 2984
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Gemini

Post by smatovic »

chesskobra wrote: Sat Nov 09, 2024 9:40 am [...]
Well, this whole gen AI thingy pretty much shows me that we have already arrived in the postmodern era, or something like it.

https://en.wikipedia.org/wiki/The_Postmodern_Condition

--
Srdja
Werewolf
Posts: 1909
Joined: Thu Sep 18, 2008 10:24 pm

Re: Gemini

Post by Werewolf »

During the lull while I wait for new AIs to develop a chess engine with, I decided to do an interesting test. I bought an IQ Test book from the 1990s that had not been uploaded to the net.

Then I took some of the questions, carefully checked them, and tested all the best-known AIs to see what their IQ was. Of course this is very approximate.

To give an idea of the tests: they are varied, some very easy, some hard. I had to filter out all the pictorial ones because not all LLMs are multimodal. Here is a small sample of questions:

Question 1 (From Test 1, number 1): Create two words using the following ten letters in capitals each once only:
Clue: grand tune (4,6)
MYSEVODLTA

Question 7 (From Test 1, number 9): A car travels at a speed of 40 mph over a certain distance and then returns over the same distance at a speed of 60 mph. What is the average speed for the total journey?

Question 9 (From Test 3, number 15): What, with reference to this question, is the next number in the sequence below?
3, 3, 5, 1, 3, 4, 1, 2, 3, 4, 1, 2, ?

Anyway, here are the results below.
A score of below 35% is poor
A score of above 35% is average (around 100 IQ)
A score of above 45% is good
A score of above 60% is very good
A score of above 75% is excellent
A score of above 88% is outstanding

ChatGPT o1 (Preview) = 70%
ChatGPT o1 Mini = 60%
ChatGPT 4 = 55%
Gemini Pro 1.5 = 55%
ChatGPT 4o = 50%
Claude Sonnet 3.5 (new) = 45%
towforce
Posts: 11987
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: . .

Re: Gemini

Post by towforce »

Werewolf wrote: Wed Nov 13, 2024 3:44 pm [...]

Those are tricky questions!

I cheated on Q1 and looked it up (I'm not fond of anagrams).

I suspect most of the chatbots gave 50 MPH for Q7: once you realise that this is not going to be correct (because the outbound journey took 50% longer than the return journey), it's straightforward to get to the correct answer.

I am stumped by Q9, and there seems to be nowhere to look it up. Maybe the sequence means something in US culture? Does it require viewing Q8? The solution is not in the OEIS (link).

You might get higher scores by asking the chatbots to work step by step with these kinds of questions.
towforce
Posts: 11987
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: . .

Re: Gemini

Post by towforce »

Werewolf wrote: Wed Nov 13, 2024 3:44 pm 3, 3, 5, 1, 3, 4, 1, 2, 3, 4, 1, 2, ?

I'm going to say "3" on the grounds that I can see more reasons for choosing the number 3 than any of the other numbers in the range 1..5
towforce
Posts: 11987
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: . .

Re: Gemini

Post by towforce »

towforce wrote: Wed Nov 13, 2024 5:01 pm
Werewolf wrote: Wed Nov 13, 2024 3:44 pm 3, 3, 5, 1, 3, 4, 1, 2, 3, 4, 1, 2, ?

I'm going to say "3" on the grounds that I can see more reasons for choosing the number 3 than any of the other numbers in the range 1..5

If you paste the sequence into https://alteredqualia.com/visualization/hn/sequence/ , it tells you that the differences don't converge.
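
One plausible reading of "the differences don't converge" is the usual method of repeated differences: keep subtracting neighbouring terms and see whether a row ever becomes constant. I don't know how the linked tool actually works, so the Python below is only a sketch of that general idea, and the function name is made up for illustration.

```python
def constant_difference_depth(seq):
    """Return how many rounds of differencing it takes to reach a constant row.

    Returns None if no row with at least two entries is ever constant.
    This is an illustrative helper, not the linked tool's algorithm.
    """
    row = list(seq)
    depth = 0
    while len(row) >= 2:
        if all(x == row[0] for x in row):
            return depth
        # Replace the row with the differences between neighbouring terms.
        row = [b - a for a, b in zip(row, row[1:])]
        depth += 1
    return None


quiz = [3, 3, 5, 1, 3, 4, 1, 2, 3, 4, 1, 2]
print(constant_difference_depth([1, 4, 9, 16, 25]))  # squares settle after 2 rounds
print(constant_difference_depth(quiz))               # never settles, so None
```

For a sequence generated by a polynomial the rows settle quickly; for the quiz sequence they never do, which at least hints that the pattern is not arithmetic.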
Werewolf
Posts: 1909
Joined: Thu Sep 18, 2008 10:24 pm

Re: Gemini

Post by Werewolf »

I wasn't expecting someone to try this, but here are the answers:

Question 1 (From Test 1, number 1): Create two words using the following ten letters in capitals each once only:
Clue: grand tune (4,6)
MYSEVODLTA
Answer: Vast melody. Most AIs got this right.
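
For anyone who wants to check an anagram answer like this mechanically, here is a minimal Python sketch. It only confirms that a proposed answer uses each of the ten letters exactly once; it does not search for the answer. The function name is made up for illustration.

```python
from collections import Counter


def letter_counts(text):
    """Count letters only, ignoring case, spaces and punctuation."""
    return Counter(c for c in text.upper() if c.isalpha())


def uses_same_letters(letters, answer):
    """Return True if `answer` uses exactly the letters in `letters`."""
    return letter_counts(letters) == letter_counts(answer)


print(uses_same_letters("MYSEVODLTA", "Vast melody"))  # True
```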

Question 7 (From Test 1, number 9): A car travels at a speed of 40 mph over a certain distance and then returns over the same distance at a speed of 60 mph. What is the average speed for the total journey?
Answer: 48 mph. All got this right.
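
The reason it is 48 and not 50 is that average speed is total distance divided by total time, and the car spends longer on the slow leg. A small Python sketch of that calculation follows; the 120-mile distance is an arbitrary assumption for illustration (any distance gives the same result), and the function name is hypothetical.

```python
def average_speed(distance_miles, out_mph, back_mph):
    """Average speed over a there-and-back trip of `distance_miles` each way."""
    total_distance = 2 * distance_miles
    total_time = distance_miles / out_mph + distance_miles / back_mph
    return total_distance / total_time


# 120 miles each way: 3 h out at 40 mph, 2 h back at 60 mph, 240 miles in 5 h.
print(average_speed(120, 40, 60))  # 48.0
```

Equivalently, it is the harmonic mean of the two speeds: 2 * 40 * 60 / (40 + 60) = 48.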


Question 9 (From Test 3, number 15): What, with reference to this question, is the next number in the sequence below?
3, 3, 5, 1, 3, 4, 1, 2, 3, 4, 1, 2, ?
Answer: 4. There is no mathematical pattern; that's a red herring. The numbers are the number of consonants in each word of the question, hence its precise wording.
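
The consonant count is easy to verify with a few lines of Python. This is just a sketch: it assumes the vowels are a, e, i, o, u (the question contains no "y", so that choice doesn't matter here), and the function name is made up for illustration.

```python
VOWELS = set("aeiou")


def consonant_counts(sentence):
    """Return the number of consonants in each whitespace-separated word."""
    counts = []
    for word in sentence.split():
        # Drop punctuation such as the comma in "What," and the "?" in "below?".
        letters = [c for c in word.lower() if c.isalpha()]
        counts.append(sum(1 for c in letters if c not in VOWELS))
    return counts


question = ("What, with reference to this question, is the next number "
            "in the sequence below?")
print(consonant_counts(question))
# [3, 3, 5, 1, 3, 4, 1, 2, 3, 4, 1, 2, 4, 3]
```

The first twelve counts match the sequence given, and the thirteenth word ("sequence") gives the 4.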