I have an issue with how the paging of search results interacts with API rate limits and the suggested guidelines for using the since_id parameter.
If an API consumer makes an initial search request without a since_id parameter, the results are currently returned in reverse chronological order. The result set could, depending on the query, be huge, so the consumer may hit the rate limit whilst paging through the results, before they have actually fetched all available pages. If they store all of the tweets they did manage to fetch and use since_id in any subsequent requests, they will only fetch tweets posted since the original request was made. The tweets that would have been available on the later page numbers (>180) of the original request are then hard to fetch without significant code to detect that the rate limit was hit and to queue a follow-up request that uses max_id to fetch the results not retrieved initially.
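To illustrate the workaround described above, here is a minimal sketch of the resume-with-max_id logic. The `search` helper and integer tweet IDs are purely illustrative stand-ins for the real API; only the since_id-exclusive / max_id-inclusive semantics are taken from Twitter's paging guidelines:

```python
# Simulated search index: tweet IDs newest first, as the search API
# returns them. All names here are illustrative, not the real API.
TWEETS = list(range(1000, 0, -1))  # newest (1000) .. oldest (1)

def search(since_id=None, max_id=None, count=100):
    """Return up to `count` tweets, newest first, honouring the
    Twitter-style filters: since_id is exclusive, max_id inclusive."""
    results = [t for t in TWEETS
               if (since_id is None or t > since_id)
               and (max_id is None or t <= max_id)]
    return results[:count]

def fetch_with_backfill(pages_before_limit):
    """Page through results until the (simulated) rate limit hits,
    then remember where to resume so a later run can continue with
    max_id instead of losing the deeper pages."""
    fetched, max_id = [], None
    for _ in range(pages_before_limit):   # rate limit hit after N pages
        page = search(max_id=max_id)
        if not page:
            break
        fetched.extend(page)
        max_id = page[-1] - 1             # resume below the oldest tweet seen
    return fetched, max_id
```

A consumer cut off after three pages would store `max_id` and, once the rate-limit window resets, call `search(max_id=...)` to continue from exactly where the earlier run stopped.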
This is also an issue if the query is common enough that more than 180 pages of results have been produced since the first request was made. Hitting the rate limit this time causes a gap in the search results that is hard to identify without, again, significant code to work around the rate limits: the consumer must recognise that a further request is needed, using the since_id of the first result from the first request and a max_id based on the last result received in the second request.
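The gap-detection step above can be sketched as follows. The function name and integer IDs are hypothetical; the only assumed API behaviour is that since_id is exclusive and max_id inclusive, hence the `- 1` when building the follow-up request:

```python
def plan_gap_fill(first_run_newest_id, second_run_results):
    """After a rate-limited second request, work out whether a slice of
    results is missing between the two runs, and if so return the
    since_id/max_id pair that would fetch that missing middle slice.

    first_run_newest_id: ID of the first (newest) result from run one.
    second_run_results: IDs received in run two, newest first.
    """
    oldest_received = second_run_results[-1]
    if oldest_received <= first_run_newest_id + 1:
        return None                         # no gap: the runs meet or overlap
    return {"since_id": first_run_newest_id,    # exclusive lower bound
            "max_id": oldest_received - 1}      # inclusive upper bound
```

For example, if the first run's newest tweet was 500 and the rate-limited second run only got down to 898, the planner would return `since_id=500, max_id=897`, the request needed to close the gap.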
This would be easily solved if paged results were either returned in chronological order or, to avoid breaking existing implementations, a sort option were offered to reverse the current order.