What distinguishes LLMs from classical computing is that they're very much not pedantic. Because the model is predicting what human text would follow a given piece of content, you can generally expect it to react approximately the way a human would in writing.
In this example, if a human responded that way I would assume they were being passive-aggressive, were autistic, or spoke English as a second language. A neurotypical native speaker acting in good faith would invariably interpret the question as a request for action, not a literal question.
I assume it's more a part of an explicitly programmed set of responses than a standard inference. But you're right that I should be cautious.
ChatGPT, for example, says it can retrieve URL contents (for RAG). When it does an inference, it then shows a message indicating the retrieval is happening. In my very limited testing it has responded appropriately. E.g., it can talk about what's on the HN front page right now.
Similarly, Claude.ai says it can't do such retrieval (except through API use?), and it doesn't appear to do so either.