From: Ian Jackson Date: Tue, 22 Aug 2017 16:53:56 +0000 (+0100) Subject: ResponseConsumer: tolerate daft exception PotentialDataLoss X-Git-Tag: hippotat/1.0.0~55^2~7 X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~ian/git?p=hippotat.git;a=commitdiff_plain;h=916021af0447d61202fd909a722c28b08a4a13a3 ResponseConsumer: tolerate daft exception PotentialDataLoss The docs for this say: PotentialDataLoss may be raised by a transfer encoding decoder's noMoreData method to indicate that it cannot be determined if the entire response body has been delivered. This only occurs when making requests to HTTP servers which do not set Content-Length or a Transfer-Encoding in the response because in this case the end of the response is indicated by the connection being closed, an event which may also be due to a transient network problem or other error. [1] https://twistedmatrix.com/documents/16.6.0/api/twisted.web.http.PotentialDataLoss.html retrieved 22.8.2017 If the origin server (or proxy) chooses to use `Connection: close' and not to provide a Content-Length, then this exception will always be raised. hippotatd uses Twisted's http server so does not trigger this, but there might be an (intercepting) proxy. If so, then hippotat will simply not work. Additionally, the statement in the Twisted documentation that connection close might be due to a "transient problem or other error" is not correct. TCP is able to distinguish a graceful shutdown from an ungraceful close, and origin servers ought to arrange that they do not inappropriately perform a graceful close in error situations. Of course a badly-implemented proxy might get this wrong, but in that case a general web client would have no way to detect this situation if it wants to function at all. So certainly turning this situation into an exception by default seems wrong. For hippotat, ignoring the potential problem hippotat just means potentially processing a truncated response, which might mean lost packets (which should be retransmitted) or truncated packets (which will be detected by the IP header length field, and retransmitted). Signed-off-by: Ian Jackson --- diff --git a/hippotat b/hippotat index 200cf3a..7130f91 100755 --- a/hippotat +++ b/hippotat @@ -43,7 +43,12 @@ class GeneralResponseConsumer(twisted.internet.protocol.Protocol): self._log(DBG.HTTP_CTRL, 'connectionMade') def connectionLostOK(self, reason): - return reason.check(twisted.web.client.ResponseDone) + return (reason.check(twisted.web.client.ResponseDone) or + reason.check(twisted.web.client.PotentialDataLoss)) + # twisted.web.client.PotentialDataLoss is an entirely daft + # exception. It will occur every time if the origin server does + # not provide a Content-Length. (hippotatd does, of course, but + # the HTTP transaction might be proxied.) class ResponseConsumer(GeneralResponseConsumer): def __init__(self, cl, req, resp):