If I have a Search widget that will be used by many (many) people at once, will there be a problem?


#1

I’m working on a little slideshow that shows while people install the Ubuntu operating system, and I am hoping to add a Twitter search widget (or maybe something hand-made with the Search API) that shows recent tweets containing #ubuntu. This would appear for basically everyone running the installer. Though I don’t have the numbers on hand for fresh installs done near each release, Ubuntu is quite popular. I expect there would be a significant number of search queries (200,000?), all from different IP addresses, all for that same thing. I realize there is no specific rate limit with this widget — just a limit per IP address per hour. What I wonder is how this kind of traffic gets handled on Twitter’s end, and if there is a possibility of the feature breaking (or making Twitter angry).

I have no way of knowing (and this number is probably a bit much given the surprising ratio of normal users to crazy ones), but for the sake of argument let’s just pretend there are 800,000 installs on the same day. I’m assuming the search queries end up being cached, but would Twitter mind that many similar queries suddenly popping up within a day? Is that an ordinary occurrence, or would it be better to cache the results on Ubuntu’s end in advance? If so, how can we go about doing that?


#2

As long as the Search requests are all coming from different IP addresses, the per-IP rate limit should not be a problem.

I appreciate you thinking about the sudden, unexpected traffic from many distributed connections to the Search API. If you want to mitigate any chance of this agitating our systems, I would advise using the Streaming API from a centralized server to collect the relevant tweets, then having your application reach out (securely) to your own store of tweets instead. This may require infrastructure you do not wish to maintain, so it may be wise to have the application fall back to querying the Search API directly when your store is unavailable.
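
A minimal sketch of the client side of that arrangement, in Python: try the central store first and move on to the next source if it fails. The store URL, the JSON format (a plain array of tweet texts), and the function names are all assumptions for illustration; a Search API fetcher would be passed in as a second source.

```python
import json
import urllib.request
from typing import Callable, List, Sequence

# Placeholder URL for the project's own tweet store (an assumption, not a real endpoint).
STORE_URL = "https://tweets.example.org/ubuntu.json"


def fetch_from_store(url: str = STORE_URL, timeout: float = 5.0) -> List[str]:
    """Download a JSON array of tweet texts from the central store."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def first_available(sources: Sequence[Callable[[], List[str]]]) -> List[str]:
    """Return tweets from the first source that responds; empty list if all fail."""
    for source in sources:
        try:
            return source()
        except (OSError, ValueError):
            # Network errors (URLError is an OSError) or malformed JSON:
            # move on to the next source.
            continue
    return []
```

Calling `first_available([fetch_from_store, fetch_from_search])`, where `fetch_from_search` is a hypothetical function that queries the Search API, would only hit Twitter when the central store cannot be reached.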

If you do end up using the Search API, ensure that you’re using a unique User-Agent HTTP header to describe the application so that if there are any problems with the level of traffic, it’s easy to identify the source and for us to perform any necessary mitigations. I don’t anticipate you running into any issues though.
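
For illustration, this is roughly how a descriptive User-Agent could be attached to each request using Python's standard library. The endpoint URL, the application name, and the contact address are placeholders, not real values.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint and application identifier; substitute the real values.
SEARCH_URL = "https://search.example.com/search.json"
USER_AGENT = "ubuntu-installer-slideshow/1.0 (+https://example.org/contact)"


def build_search_request(query: str) -> urllib.request.Request:
    """Build a search request that identifies the application via User-Agent."""
    # urlencode percent-encodes the '#' so it is not treated as a URL fragment.
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": query})
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

The resulting object can be passed to `urllib.request.urlopen`. Including a version number and a contact URL in the string makes it easier to trace traffic back to a specific release.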