So we’ve finally managed to track down the cause of these frequent 420 responses from the API.
It looks like a combination of our streamer code doing a poor job of stopping its threads and the Twitter API being much more vigilant about detecting multiple open connections with the same credentials.
The underlying cause for us was that our streamer runs in several threads (for talking to the API, flushing tweets to disk, etc.). Previously, when we restarted the streamer, we simply told the main thread to die, which killed all its daemonised child threads, of which the connection to the API is one.
However, the Python threading documentation warns: ‘Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly.’
So what was actually happening was that on some rare occasions we ended up with an orphaned TCP socket still open to the Twitter API, which now does a better job of detecting that and telling us to go away with a 420.
We’ve updated our code to properly disconnect the streamer thread and wait for it to die before the main thread issues a full restart. We’ve not seen any 420 responses since doing that. So obvious once you find it!
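
For anyone hitting the same thing, here is a minimal sketch of the pattern described above, not our actual streamer code. The names `StreamerThread`, `disconnect` and `restart` are hypothetical, and the `connection_open` flag only stands in for the real socket to the API. The idea is to ask the thread to stop, let it close its own connection, and `join()` it before starting a replacement.

```python
import threading


class StreamerThread(threading.Thread):
    """Hypothetical stand-in for the thread that holds the API connection."""

    def __init__(self):
        super().__init__(daemon=True)
        self._stop_requested = threading.Event()
        self.connection_open = False

    def run(self):
        self.connection_open = True  # stands in for opening the socket
        try:
            while not self._stop_requested.is_set():
                # Real code would read from the stream here; we just wait.
                self._stop_requested.wait(0.1)
        finally:
            # Always runs when the loop exits, so the connection is
            # released by the thread itself rather than left orphaned.
            self.connection_open = False

    def disconnect(self):
        """Ask the thread to leave its loop and close the connection."""
        self._stop_requested.set()


def restart(old, timeout=5.0):
    """Stop the old streamer, wait for it to die, then start a new one."""
    old.disconnect()
    old.join(timeout)
    if old.is_alive():
        # Do not open a second connection while the first may still be up.
        raise RuntimeError("streamer thread did not stop in time")
    new = StreamerThread()
    new.start()
    return new
```

The `is_alive()` check after `join()` matters because `join()` with a timeout returns silently whether or not the thread has finished. In a real streamer the read from the API may block, so `disconnect` would probably also need to close the underlying connection to wake the thread up; that part depends on the client library and is not shown here.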