I set up a filter stream and computed the time differences between the created_at times of consecutively received tweets. I was surprised to find that some of them are out of temporal order: of 13,220 tweets, 19 arrived out of order, some by as much as 2,000 seconds. Is there any temporal ordering guarantee in the streaming APIs?
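For anyone who wants to reproduce this check, here is a minimal sketch of the comparison described above. It assumes each tweet is a dict whose created_at string uses the classic layout (for example "Wed Oct 10 20:19:24 +0000 2018"); the function name find_out_of_order is made up for illustration and is not part of any Twitter library.

```python
from datetime import datetime

# Assumed created_at layout, e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def find_out_of_order(tweets):
    """Return (index, lag_seconds) for each tweet older than the one received just before it."""
    out_of_order = []
    previous = None
    for i, tweet in enumerate(tweets):
        current = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        if previous is not None and current < previous:
            # This tweet was created before its predecessor in arrival order.
            out_of_order.append((i, (previous - current).total_seconds()))
        previous = current
    return out_of_order
```

Calling find_out_of_order on the tweets in the order they were received gives the positions and sizes of the inversions, which is roughly how a count like 19 of 13,220 would be obtained.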
There’s no guarantee of order. It’s a distributed system, and some parts move at different speeds from others.
Thanks for the reply! My thought is that out of order by ~2,000 seconds might be indicative of some problems.
Generally we see that at least 99.9% of Tweets are delivered in under a second, but there are sometimes cases where an internal queue is slow to drain for some reason and the Tweets it contains don’t get delivered as fast. If absolute ordering is critical for you then you’ll have to build an index on time and sort on that - the volumes at which the streaming API operates make guaranteed ordering a very difficult problem.
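One way to sort on time on the client side is a small reorder buffer: hold each tweet for a bounded delay and release tweets in created_at order once they are older than that delay. The sketch below is only an illustration under assumptions. The function name reorder and the max_delay_seconds default are hypothetical, tweets are assumed to be dicts with a created_at string in the classic layout, and nothing here comes from the streaming API itself.

```python
import heapq
from datetime import datetime, timedelta

# Assumed created_at layout, e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def reorder(tweets, max_delay_seconds=3600):
    """Yield tweets in created_at order, tolerating lateness up to max_delay_seconds."""
    delay = timedelta(seconds=max_delay_seconds)
    heap = []      # entries are (created_at, arrival_index, tweet)
    newest = None  # latest created_at seen so far
    for arrival_index, tweet in enumerate(tweets):
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        # arrival_index breaks ties so the tweet dicts are never compared.
        heapq.heappush(heap, (created, arrival_index, tweet))
        if newest is None or created > newest:
            newest = created
        # Anything at or before the watermark is assumed safe to release.
        while heap and heap[0][0] <= newest - delay:
            yield heapq.heappop(heap)[2]
    # Stream ended: flush whatever is still buffered, in time order.
    while heap:
        yield heapq.heappop(heap)[2]
```

The trade-off is latency against correctness: a larger delay catches later stragglers (the 2,000-second case in this thread would need a delay above that) but holds every tweet back longer, and a tweet that arrives later than the delay will still come out of order. On a low-volume filter stream the watermark only advances when a new tweet arrives, so a wall-clock timeout may be a better fit there.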
Good to know. I will be mindful to handle rare out-of-order exceptions on my end.