Hacker News new | past | comments | ask | show | jobs | submit login

> They simply compare the prompting strategies that work best with each model

Incorrect.

# Gemini marketing website, MMLU

- Gemini Ultra 90.0% with CoT@32*

- GPT-4 86.4% with 5-shot* (reported)

# gemini_1_report.pdf, MMLU

- Gemini Ultra 90.0% with CoT@32*

- Gemini Ultra 83.7% with 5-shot

- GPT-4 87.29% with CoT@32 (via API*)

- GPT-4 86.4% with 5-shot (reported)

Gemini marketing website compared best Gemini Ultra prompting strategy with a worse-performing (5-shot) GPT-4 prompting strategy.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: