Paginating through a users full timeline using Premium Search API


#1

Hello,

I’m trying to collect all user tweets for a small group of users (ca 150). I’ve just purchased the Premium Search API, and have got it generally working. However, I’m getting far fewer than 500 max results per query. (I use from:ID, fromdate:ACCOUNTCREATIONDATE, and todate:TODAY.) My results return wildly irregular numbers of tweets, as a paginate using the ‘next’ parameter.

What’s going on here? And how can I use my queries more efficiently to return a complete user timeline?

Thank you!
Derek


#2

I see from https://twittercommunity.com/t/premium-full-archive-not-returning-500-tweets/117829 and other discussions that my trouble is because: when you paginate through results, you paginate either by 500 results or by month (i.e., 31-day period). Which means you end up wasting a ton of requests for blank months.

This is not clear from the API documentation at all and should be specified. It’s also a terrible bottleneck for studying user timelines and seems highly punitive to people researching user activity.


#3

DerekKMiller I totally agree. Their literature does not clearly specify for timelines. However, I did find by messing around with my query, that you can put in multiple users using the OR statement. Your results will be organized by time (rather than user timeline), but you can sort them before you write them to your file. So, your query can be up to a certain number or characters (depending on the paid or free version) and you can put a bunch of user timelines in separated by “OR from:”. For example,

screen_names=[‘RepDanDonovan’,
‘LeaPeterson‏’,
‘Abby4Iowa’,
‘DannyTarkanian’,
‘RepErikPaulsen’,
‘JayWebberNJ’,
‘AngieCraigMN’]

query_name='from:'+screen_names[0]+' OR from:'+screen_names[1]+' OR from:'+screen_names[2]+' OR from:'+screen_names[3]+ \
           ' OR from:' + screen_names[4] + ' OR from:' + screen_names[5] + ' OR from:' + screen_names[6]

# "fromDate": "201802020000", "toDate": "201802240000"}'
search_params = {'query': query_name,
                 'toDate': toDate,
                 'fromDate': fromDate,
                 'maxResults': 500
 }

That should help because then you can max your results each time if you have enough users. Or at least get close to 500.
I hope this helps a little. Laura


#4

Amazing! Thank you! That’s a huge help.