A use of LLMs by search engines that I would like to see applied:
# At indexing time
- The search engine uses an LLM to convert all page content into a "common language" (English?!) before indexing.
- It uses an LLM to rank the page (detecting how spammy or informational it is)
# At query time
- My query is translated to the "common language"
- The result list shows the most appropriate results, ranked only by spammy/informational value and not biased by page language.
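
To make the idea concrete, here is a rough sketch of the two phases in Python. Everything in it is an assumption for illustration: `to_common_language` and `quality_score` are hypothetical stand-ins for the LLM calls (a toy word table and a crude distinct-word ratio), and the page names are made up. It is not how any real search engine works.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical stand-in for an LLM translation call: a tiny Italian -> English table.
TOY_DICTIONARY = {
    "biografia": "biography",
    "di": "of",
    "giulio": "julius",
    "cesare": "caesar",
}


def to_common_language(text: str) -> str:
    """Pretend-LLM translation into the common language (English here)."""
    return " ".join(TOY_DICTIONARY.get(word, word) for word in text.lower().split())


def quality_score(text: str) -> float:
    """Pretend-LLM spam/informational score in [0, 1].

    Assumption: the share of distinct words, so repetitive pages score low.
    """
    words = text.split()
    return len(set(words)) / len(words) if words else 0.0


@dataclass
class IndexedPage:
    url: str
    common_text: str
    quality: float


def index_pages(pages: Dict[str, str]) -> List[IndexedPage]:
    """Indexing time: translate each page, then score it."""
    index = []
    for url, content in pages.items():
        common = to_common_language(content)
        index.append(IndexedPage(url, common, quality_score(common)))
    return index


def search(index: List[IndexedPage], query: str) -> List[str]:
    """Query time: translate the query, match, rank by quality only."""
    terms = set(to_common_language(query).split())
    hits = [page for page in index if terms <= set(page.common_text.split())]
    hits.sort(key=lambda page: page.quality, reverse=True)
    return [page.url for page in hits]


# Made-up pages: one Italian, one repetitive English page, one unrelated page.
pages = {
    "it.example/cesare": "biografia di giulio cesare",
    "en.example/spam": "caesar caesar caesar julius",
    "en.example/rome": "history of rome",
}
index = index_pages(pages)
print(search(index, "Giulio Cesare"))
print(search(index, "julius caesar"))
```

Both queries go through the same translation step, so the Italian and the English query match the same pages, and the page's original language never enters the ranking.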
It depends. Suppose we want to look up the biography of a historical figure, for example Julius Caesar. Google will surely return the link to English Wikipedia. But let's assume that Italian Wikipedia has more in-depth information about Caesar. Would you prefer to see a less complete English version, or get the most complete Italian version (and auto-translate it into English with any available tool to read it)?