I have an Academic Research account and am currently grabbing full archive tweets via R. I need a sample of 10,000, but it seems that it’s limiting me to 3,000 per call. Is this normal or should I be able to grab more?

Thanks!

What’s the R script you’re using? It’s wrapping the API and pagination somehow, because the API limit is 500 tweets per call (in v2 full archive search, which is what I’m assuming it’s calling), so the limit of 3,000 may be coming from somewhere else.

I’m using rtweet (pic of script attached). Is there a reason why the limit is so low? I’m trying to capture a large sample of tweets during a specific time frame so I’m able to randomly sample 1,000 of them. Unfortunately, rtweet grabs tweets mostly from the end of my search window (end of April).

If you have an Academic Research account, you should be using the v2 API - a separate API that rtweet doesn’t support, but one that gives you much better access limits.

rtweet’s fullarchive calls use the Premium API, which is very restrictive at the free sandbox access level (it gives you 50 calls that can retrieve 100 tweets each). If you’re paying for a Premium API subscription, I would advise against using rtweet - it hard-codes a request size of 100 tweets per call, which is not optimal on a paid plan, where each call can return up to 500 tweets. I would recommend using twarc (GitHub - DocNow/twarc, a command line tool and Python library for archiving Twitter JSON) and loading the tweets into R for analysis later.
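For reference, a typical rtweet full-archive call looks something like this - a sketch only, since the dev environment name and query are placeholders you’d replace with your own:

```r
library(rtweet)

# Premium full-archive search via rtweet.
# "fullarchive_env" is a placeholder - use the dev environment name
# from your Twitter developer dashboard.
tweets <- search_fullarchive(
  q = "your search terms",
  n = 3000,                   # rtweet pages through in 100-tweet requests
  fromDate = "202104010000",  # yyyymmddhhmm
  toDate = "202104300000",
  env_name = "fullarchive_env"
)
```

At the free sandbox level, a run like this burns through your 50-call allowance quickly, which is why the cap appears so low.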

I unfortunately don’t have any coding experience and don’t know how to use Python so I don’t think that’s something I would be able to do.

R definitely counts as coding :joy:

I think for R this might work for you: GitHub - cjbarrie/twittterv2r (a repo containing code to loop through usernames/hashtags and collect tweets from the Full Archive v2 API endpoint, using the new pagination_token query params). This will use the Academic Research account v2 API, not the Premium API, which has no special academic access.

I’m not sure I understand how to download the development package. Could you give a little more background on that? And could I do a call for specific terms rather than hashtags and users? And would this also solve the problem of the tweet cap limit?

In R,

install.packages("devtools")

should be enough to install devtools; if there are errors, you may have to ask elsewhere for someone who knows more R stuff.

After devtools is installed, you will be able to run

devtools::install_github("cjbarrie/academictwitteR")

to install the above package.

If all that fails, you can always try to use the smaller snippet: Get tweets from the Academic Research product track. · GitHub

Yes - to me it looks like they just named that function awkwardly, so

get_hashtag_tweets("(\"green eggs\" OR \"ham\") from:drseuss", "2020-01-01T00:00:00Z", "2020-01-05T00:00:00Z", bearer_token, data_path = "data/")

should work for searching for tweets containing the phrase “green eggs” or the word “ham” from the @drseuss account. To formulate these kinds of queries, see Building queries | Twitter Developer. Note that each " inside the query has to be escaped as \" in R.

So for the above, the Twitter query itself is

("green eggs" OR "ham") from:drseuss

And to make it a valid R string, it’s

"(\"green eggs\" OR \"ham\") from:drseuss"
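If the backslash escaping gets confusing, you can check what R will actually send by printing the string with cat(), which shows it without R’s escaping backslashes (this is just base R, nothing package-specific):

```r
query <- "(\"green eggs\" OR \"ham\") from:drseuss"

# cat() prints the string as Twitter will receive it:
cat(query)
# ("green eggs" OR "ham") from:drseuss
```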

The R code above calls the academic access endpoint, so your cap will be 10 million tweets per month.

Hope that helps!