I don't begrudge the authors for this at all. After all, these are people donating their time, for free, to work on libraries solving hard problems that I can then use as part of my job. Nothing has stopped me from forking these libraries and changing the behavior myself, other than a lack of time and the fact that, while I dislike this behavior, it's only a problem somewhat infrequently. All that being said, I really wish the response to this issue had been something other than what feels like a language-lawyerly reading of the spec (requests isn't an "agent", so that sentence in the spec doesn't apply) and the theory that someone _might_ be able to do something with an incomplete response _some_ of the time, and that the vastly more common case should therefore be made much more complicated.
But, anyway, I'm glad to hear that this issue is being addressed and I thank the authors for their work!
It seems like what you are saying is that because a file might be corrupt we shouldn't worry about a completely unrelated case where the file is fine but the transfer is incomplete. By that logic, why do we worry about trying to report errors at all?
I'm saying that if you get a response from a webserver, you have to think about it being truncated. Period. No matter what the libraries you're using do for the many different cases of problem that can cause truncation.
You seem to have read a lot into my words that I didn't intend to put there.
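The truncation check being argued for above can be sketched without assuming anything about a particular HTTP library. This helper (a sketch; the function name is mine) only compares a declared Content-Length against the bytes actually received:

```python
def looks_truncated(headers, body: bytes) -> bool:
    """Return True when a Content-Length header is present but does not
    match the number of body bytes actually received.

    `headers` is any mapping of header names to values; `body` should be
    the raw wire bytes (before any decompression), since Content-Length
    describes the encoded payload, not the decoded one.
    """
    expected = headers.get("Content-Length")
    if expected is None:
        return False  # nothing declared, nothing to check against
    return len(body) != int(expected)
```

With requests, for example, you could feed it `resp.headers` plus the bytes read from `resp.raw`, since `resp.content` may already be decompressed and so differ in length from the header.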
BTW aiohttp currently has the problem of a too-strict HTTP protocol parser, so it throws errors for many rare cases of bad webservers that browsers don't have a problem with. As someone writing a crawler, I need to be able to work with whatever a browser will display, ...
I've been interested lately in exploring aiohttp, and your comment about it being too strict is certainly very enlightening. And I want to second your comment about libraries following the lead of what browsers will do. My strong feeling is that the battle over what HTTP libraries should do has been fought and decided by browsers, and that whatever browsers do is what HTTP libraries should do as well, regardless of how ugly it is (IIRC, when I last checked, browsers did pay attention to the Content-Length header and wouldn't display results that were shorter than it - but if I'm misremembering, I would happily change my position with respect to honoring this header). The purist in me hates to say that, but the pragmatist wants to get things done, and fighting against browser behavior feels counter-productive at this stage.
This is incredibly handy, both for the usual things one might want to tail -f (log files), and as a cheap-but-very-functional pub/sub system.
One thing I've learned is to watch out for silly proxies (client or reverse) that want to read the whole thing and then re-serve it as definite-length (i.e., not chunked, with Content-Length) or which impose a timeout on the transfer.
HTTP needs a way to say that a chunked encoding is of indefinite length, though arguably the Range: header in my case ought to be all the hint the proxies (and libraries!) need.
Chunked transfer encoding is used when the sender doesn't yet know the length of the resource's representation.
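The framing itself is simple enough to sketch. A minimal decoder (for illustration only; it ignores chunk extensions and trailers, which real parsers must handle) shows how the receiver learns the body is complete from the terminating zero-size chunk rather than from a Content-Length:

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked-encoded body.

    Each chunk is "<hex size>\r\n<data>\r\n"; a zero-size chunk ends the
    body, which is how the receiver knows the transfer is complete even
    though no Content-Length was ever sent.
    """
    out = bytearray()
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        size = int(raw[pos:eol], 16)      # chunk size, in hexadecimal
        if size == 0:                     # terminating zero-size chunk
            return bytes(out)
        start = eol + 2
        out += raw[start:start + size]
        pos = start + size + 2            # skip the data and its CRLF
```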
This is an awful justification. They should do this by raising an exception and attaching the response data to it, so the user who wants to "muddle through" can catch the exception and "muddle through" explicitly.
Regardless, this is a clear violation of https://en.wikipedia.org/wiki/Fail-fast and disappointing behavior from a package that prides itself on clear, obvious API behavior.
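As a sketch of that alternative (the exception class and function names here are hypothetical, not requests' actual API):

```python
class IncompleteResponse(Exception):
    """Hypothetical exception that keeps the partial payload reachable."""
    def __init__(self, partial: bytes, expected: int):
        super().__init__(f"got {len(partial)} of {expected} expected bytes")
        self.partial = partial
        self.expected = expected

def read_body(body: bytes, content_length: int) -> bytes:
    """Fail fast on truncation instead of passing it through silently."""
    if len(body) != content_length:
        raise IncompleteResponse(body, content_length)
    return body

# A caller who wants to "muddle through" now has to opt in explicitly:
try:
    data = read_body(b"partial", 100)
except IncompleteResponse as exc:
    data = exc.partial  # consciously accept the truncated payload
```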
Although I really like requests, this is a violation of one of the PEP 20 heuristics:
Errors should never pass silently.
Unless explicitly silenced.
An exception would be an elegant way to handle the problem and would retain the incomplete data to be handled. That's exactly what exceptions are for. Raising it could be the default to prevent surprises, possibly deactivated with a flag.
This might be a UX problem and not a technical one. The author seems to think that `Response.ok` should be an all-purpose check.
It's been a while since I used requests, but IIRC response.ok is basically syntactic sugar; it seems to me that in most valid use cases where you'd want this sugar (over being explicit in your actions), you'd want to verify that the communication was correct. And malformed HTTP is not correct. I imagine that if you implemented a wrapper ok2, combining a correctness check with response.ok, you'd see 90% of response.ok uses become ok2.
It seems to me to be a sensible check (validate that the HTTP message meets the standard) that should exist in any HTTP library at requests' level. And response.ok seems like a wasted API slot if it's not meeting the full needs of its sugaring.
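A hypothetical `ok2` of that shape might look like this (duck-typed here so it isn't tied to requests' internals; with a real `requests.Response`, the caveat that `.content` may be decompressed, and so differ in length from the wire bytes, would apply):

```python
def ok2(resp) -> bool:
    """Sketch of the stricter check: the status must be OK *and* any
    declared Content-Length must match the number of body bytes received.

    `resp` is anything with .ok, .headers, and .content attributes.
    """
    if not resp.ok:
        return False
    expected = resp.headers.get("Content-Length")
    return expected is None or len(resp.content) == int(expected)
```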
Except that the status line is literally "HTTP/1.1 200 OK", and the 'ok' check is simply using the domain terminology. This is not a bad thing. And having 'response.ok' be a general "I have examined every possible aspect of this response for problems and found none" is probably impossible, despite being what you apparently expect.
The solution is not to rename the method, it's to also warn or error when there's something fishy in the response. Which is apparently what requests 3.x will do.
There's also the issue that the Content-Length header, while generally well-adopted, is optional (sending either Content-Length or Transfer-Encoding is a SHOULD in RFC 7230, not a MUST), and sending an incorrect Content-Length, while annoying, is not actually against the RFC (the only MUST NOT prohibitions on an incorrect Content-Length value in the response are when the request was a HEAD or a conditional GET).
When a Content-Length is given in a message where a message-body is allowed, its field value MUST exactly match the number of OCTETs in the message-body.
Occurrences of MUST NOT for the server in that section are:
A sender MUST NOT send a Content-Length header field in any message that contains a Transfer-Encoding header field.
(so can't send both Content-Length and Transfer-Encoding)
A server MAY send a Content-Length header field in a response to a HEAD request (Section 4.3.2 of [RFC7231]); a server MUST NOT send Content-Length in such a response unless its field-value equals the decimal number of octets that would have been sent in the payload body of a response if the same request had used the GET method.
(if you send Content-Length on HEAD, it must match the length of the response you would have sent for GET)
A server MAY send a Content-Length header field in a 304 (Not Modified) response to a conditional GET request (Section 4.1 of [RFC7232]); a server MUST NOT send Content-Length in such a response unless its field-value equals the decimal number of octets that would have been sent in the payload body of a 200 (OK) response to the same request.
(you can send Content-Length on conditional GET; if you do, must match length of the response you would have sent for GET)
A server MUST NOT send a Content-Length header field in any response with a status code of 1xx (Informational) or 204 (No Content). A server MUST NOT send a Content-Length header field in any 2xx (Successful) response to a CONNECT request (Section 4.3.6 of [RFC7231]).
(some other situations that can't use Content-Length)
It may be an oversight, but nothing in that section requires that the Content-Length, in general, must match the size of the response body.
I'll agree and admit that implication is not explicit specification, but I think it's a reasonable example to follow.
Well, not exactly. From the documentation of .ok: "This is not a check to see if the response code is 200 OK."
Technically, it checks if status code is not between 400 and 600.
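That check is easy to mirror in a standalone helper (a sketch of the behavior just described, not requests' actual source):

```python
def is_ok(status_code: int) -> bool:
    """Mirror of the described check: requests' Response.ok returns False
    exactly when raise_for_status() would raise, i.e. for 4xx and 5xx
    codes, so anything outside 400-599 counts as "ok"."""
    return not (400 <= status_code < 600)
```

Note that this makes 1xx and 3xx responses "ok" too, which is part of why it reads more like "no client or server error" than "the request succeeded".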
FWIW Response.status_code is an int, so it might need to be a member function on Response.
No, it acts like an int; whether it actually is one is (or should be) irrelevant :)
return 200 <= self < 300
I'm glad to see that requests 3.0 does this by default for an invalid Content-Length.