Werewolf wrote: ↑Fri Dec 13, 2024 5:54 pm...I did test Gemini 2.0 Flash. It scored 55%, the same as 1.5 Pro.
I am confused by the current versioning. The only thing I've seen tested (at Tom's Hardware) is that 2.0 Flash is superior to 1.5.
"Flash" in chatbots means smaller, lighter and faster - and I understand that, using recent innovations, a "flash" version of a chatbot can be almost as good as the full version.
The versions I have available are:
1.5 Pro (tackle complex tasks)
1.5 Flash (get everyday help)
1.5 Pro With Deep Research
2.0 Flash Experimental
Obviously, if you want to use the previously mentioned research functionality, you have to use the third option: clear and simple.
If, however, you want, say, personal coaching, then you just want to use the most intelligent model they have. Research means waiting several minutes, and the output is a report. This is not usually what you want for coaching, which is more likely to be a back-and-forth dialogue. But which is the more intelligent model - 1.5 Pro or 2.0 Flash?
Searching (link) brings back the answer that 2.0 Flash has capabilities that 1.5 Pro doesn't - but I'm not seeing anything being reported on intelligence. I suppose I'll just have to give it a try. My best guess is that 2.0 Flash is more intelligent.
When I did my testing several months ago, I found that Gemini was, for me, the most suitable coaching partner. I decided to stick to it, rather than keep retesting many chatbots. Now it's necessary to try different versions even under the same brand.
A positive indicator for 2.0 as a coach, though: an experimental version of Gemini is top of the leader board (link), so it's likely to be one that people like using.
It's surprising that Inflection's pi.ai isn't on the leader board: their big claim for it is that it's built to have a high emotional intelligence, so you'd expect people to like it.