Does it have to crawl the whole internet or can it just scrape the top 5 or so most important sites?

Asking clarifying questions is a good interviewee practice. Well done.

Their response reminds me of the Monty Python bit about the airspeed velocity of an unladen swallow.

Please define what "the internet" is first ^^

Define if you mean "an internet" or "the Internet" before that :-)

