My top two takeaways from this are:
1. No one should still be using GPT-3.5 or Gemini 1.0. They may get the job done, but 98% of the time one of these LLMs (Gemini 1.5 Pro, Bing Copilot, ChatGPT-4o, or Perplexity) will give you a better response.
2. The most reliable way to get the best response is to compare multiple AI chatbots and choose the best answer. This is because:
If you are only using ChatGPT-3.5, Llama 3, Claude Sonnet, or Mistral, 100% of the time there is another LLM with a better answer.
If you are only using Gemini 1.0, 96% of the time there is another LLM with a better answer.
If you are only using Perplexity, 86% of the time there is another LLM with a better answer.
If you are only using ChatGPT-4o, 73% of the time there is another LLM with a better answer.
If you are only using Gemini 1.5 Pro, 73% of the time there is another LLM with a better answer.
If you are only using Bing Copilot, 73% of the time there is another LLM with a better answer.
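If you want to repeat this with your own questions, here is a minimal sketch of how percentages like the ones above can be computed. It assumes you record one winning model per question. The function name `pct_better_elsewhere` and the sample data are hypothetical placeholders for illustration, not the actual results from this comparison.

```python
from collections import Counter


def pct_better_elsewhere(winners, models):
    """For each model, return the percent of questions where a
    different model was judged to have the best answer."""
    total = len(winners)
    wins = Counter(winners)  # models with no wins count as 0
    return {m: round(100 * (total - wins[m]) / total) for m in models}


# Placeholder data (NOT the results from this post):
# the model judged best for each of 5 made-up questions.
models = ["Model A", "Model B", "Model C"]
winners = ["Model A", "Model B", "Model A", "Model A", "Model B"]

for model, pct in pct_better_elsewhere(winners, models).items():
    print(f"{model}: another LLM had a better answer {pct}% of the time")
```

Under this formula, a model that wins 6 out of 22 questions lands at 73%, which would be consistent with the 73% figures above, though the per-question tallies are not listed here.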
I personally used chatplayground.ai to compare all of them.
I stopped using ChatGPT-3.5 a long time ago because chatplayground.ai gives you all the pro LLMs for the same price as GPT-4.
It is very important to note this research was done on a very small dataset of 22 questions.
I picked the best LLM answer for each question by simply putting myself in the shoes of the person asking that question and deciding which output was the most helpful.
Perplexity is good at giving you a complete answer. Most LLMs give you bullet points, but Perplexity prefers to write out a full answer.
Bing Copilot is great because it cites its sources and will even recommend videos to help you.
ChatGPT-4o gives really descriptive and usually longer answers, but prefers to answer in bullet points.
Gemini 1.5 Pro is great because its conversational tone makes it feel like it understands the context of your question better.
Bing Copilot can be great, but 20-30% of the time the answers it gives are not even usable.
Because Bing Copilot uses sources to give you answers, its answers feel much more human-like and at times more useful than those of other LLMs that just list a bunch of basic bullet points.
There should be no reason anyone is still using ChatGPT-3.5 today.
Gemini 1.0 gives good answers, but they are not as detailed or helpful as the ones from Gemini 1.5 Pro.
ChatGPT-4o was able to generate really nice data tables that Gemini 1.5 Pro wasn't able to.
The LLMs I compared:
ChatGPT-3.5
ChatGPT-4o
Gemini 1.0
Gemini 1.5 Pro
Bing Copilot
Claude Sonnet
Llama 3
Mixtral 8x7b
Mistral Large
Perplexity