Reconnection for TCP/IP level network errors


#1

I’m working on a backoff strategy for a robot. The API documentation states:

“Back off linearly for TCP/IP level network errors. These problems are generally temporary and tend to clear quickly. Increase the delay in reconnects by 250ms each attempt, up to 16 seconds.”

I understand these errors to occur when, for whatever reason, the client cannot communicate with the server (e.g., no Internet service). However, I’m not sure whether HTTP status codes of 500 or greater (e.g., 503 Service Unavailable) should be treated as TCP/IP level network errors too, because in order to receive these error codes, a successful connection between client and server must already have been made.

Could someone please help me understand this?

Thanks.


#2

I think your intuition is correct: an HTTP status code is not a TCP/IP error, so you should use the exponential backoff for it. The slower backoff for HTTP errors is there so that your connection does not get rate limited. A 5XX error is a bit unusual, as it indicates an error that may have happened before or after the connection attempt was logged by the rate limiter. To be safe, I’d say use exponential backoff for this case (although most 503 issues should clear after the first reconnect attempt).
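
Here is a rough Python sketch of how I would pick the delay. The linear numbers (250 ms step, 16 s cap) come from the documentation you quoted, assuming the first retry waits 250 ms. The exponential starting delay and cap are placeholder values I made up for illustration, and the function names are hypothetical too, so check the API documentation for the real figures.

```python
from typing import Optional


def linear_delay(attempt: int, step: float = 0.25, cap: float = 16.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based) after a
    TCP/IP level error: grows by `step` each attempt, up to `cap`."""
    return min(step * attempt, cap)


def exponential_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based) after an
    HTTP error: doubles each attempt, up to `cap`.
    `base` and `cap` are placeholders, not values from the API docs."""
    return min(base * 2 ** (attempt - 1), cap)


def reconnect_delay(attempt: int, http_status: Optional[int]) -> float:
    """Choose the backoff by error type.

    `http_status` is None when no HTTP response was received at all
    (the connection itself failed); otherwise it is the status code
    returned by the server, e.g. 503."""
    if http_status is None:
        return linear_delay(attempt)
    return exponential_delay(attempt)
```

With this, a dropped connection on the third retry waits `reconnect_delay(3, None)` seconds, while a 503 on the third retry waits `reconnect_delay(3, 503)` seconds. Reset the attempt counter once a connection succeeds.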


#3

Thank you, Arne.