Today, we launched a product track on our new API tailored to serve the needs of the academic research community doing research with Twitter data. This post provides a technical overview of what’s available in this Academic Research product track, and how you can get started using it.
What’s available in the Academic Research product track
Free access to the complete archive of historical public Tweets
Many of you may already be familiar with the recent search endpoint in the Twitter API v2, which lets you access Tweets from the last 7 days. Now the Academic Research product track also includes access to the full-archive search endpoint, letting you get public Tweets from the entire history of public conversation on Twitter. When using the full-archive search endpoint, you can specify the dates for historical Tweets using the start_time and end_time parameters. (More details on how to use the full-archive search endpoint are below.)
Significantly higher monthly Tweet cap
The monthly Tweet cap is the number of Tweets you can retrieve using the Twitter API v2. On the Standard product track at the Basic level of access, this is set to 500,000. On the Academic Research product track, the Basic level of access includes an initial monthly Tweet cap of 10 million Tweets. Note: this Tweet cap currently applies to the following endpoints:
If you wish to speculatively capture data at scale for future research purposes, you can use the sampled stream endpoint, because it does not count towards your monthly Tweet cap.
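To get a feel for what the monthly Tweet cap means in practice, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this post (the two Basic-level caps and the 500-Tweet maximum per full-archive search request described later on). The function and constant names are our own, and it assumes every request returns a full page, which real queries often will not.

```python
# Figures quoted in this post; the names are illustrative, not part of the API.
STANDARD_BASIC_CAP = 500_000
ACADEMIC_BASIC_CAP = 10_000_000
MAX_RESULTS_PER_REQUEST = 500  # full-archive search maximum per request


def full_page_requests(cap: int, page_size: int = MAX_RESULTS_PER_REQUEST) -> int:
    """Number of completely full pages that fit within a monthly cap."""
    return cap // page_size


def cap_ratio(larger: int, smaller: int) -> float:
    """How many times larger one cap is than another."""
    return larger / smaller


if __name__ == "__main__":
    print(full_page_requests(ACADEMIC_BASIC_CAP))             # 20000
    print(full_page_requests(STANDARD_BASIC_CAP))             # 1000
    print(cap_ratio(ACADEMIC_BASIC_CAP, STANDARD_BASIC_CAP))  # 20.0
```

In other words, the Academic Research cap is 20 times the Standard Basic cap, or roughly 20,000 full pages of full-archive search results per month.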
We know that some research requires even more data, so we plan to add higher access levels across all our product tracks in the near future. We are also exploring how we may introduce more flexible access terms, to help account for periods of time where you may consume more or less data throughout the year. As always, if you have thoughts about this you can share them on our product feedback channel.
Enhanced filtering capabilities across Twitter API v2
Several endpoints of the Twitter API support a variety of different operators to specify exactly the data you want returned. These operators are combined to build rules, queries, or filters when using endpoints such as recent search, filtered stream or full-archive search. There are certain operators that are only available in the Academic Research product track, as they are intended to help researchers return more precise data. These operators include:
$ (aka cashtag), bio, bio_name, bio_location, place, place_country, point_radius, bounding_box, -is:nullcast, has:cashtags and has:geo.
Increased rule cap for the filtered stream endpoint
If you are using the Academic Research product track, you will be able to add 1,000 concurrent rules when using the filtered stream endpoint, and each rule can be 1,024 characters long. In the Standard product track, this limit is set to 25 concurrent rules and each rule can be 512 characters long. These increases are intended to support more precise and relevant data collection, and also help researchers maximize their monthly Tweet cap.
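If you manage a large rule set, it can help to check it against these limits before sending it to the filtered stream endpoint. The sketch below is one way to do that locally. The limits come from the paragraph above; the names `RuleLimits`, `LIMITS` and `check_rules` are hypothetical, and it assumes that rule length is counted the same way Python's `len()` counts characters, which may not match the API exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleLimits:
    max_rules: int        # concurrent rules allowed
    max_rule_length: int  # characters allowed per rule


# Limits as stated in this post for each product track.
LIMITS = {
    "academic": RuleLimits(max_rules=1000, max_rule_length=1024),
    "standard": RuleLimits(max_rules=25, max_rule_length=512),
}


def check_rules(rules, track="academic"):
    """Return a list of problems found; an empty list means the set fits."""
    limits = LIMITS[track]
    problems = []
    if len(rules) > limits.max_rules:
        problems.append(
            f"{len(rules)} rules exceeds the limit of {limits.max_rules}"
        )
    for index, rule in enumerate(rules):
        # Assumption: len() approximates how the API counts rule length.
        if len(rule) > limits.max_rule_length:
            problems.append(
                f"rule {index} is {len(rule)} characters; "
                f"limit is {limits.max_rule_length}"
            )
    return problems
```

For example, a 600-character rule would be flagged for the Standard track but would pass for the Academic Research track.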
Longer query length for the recent search endpoint
If you are using the Academic Research product track, your query in the recent search endpoint can be 1,024 characters long. In the Standard product track, this limit is 512 characters.
How to apply for access
Whether you already have a Twitter developer account or you are just getting started today, everyone interested in gaining access to the Academic Research product track is required to submit an Academic Research application (note that you will need to log in with the Twitter account you wish to use for Academic Research access). Applicants must meet all of the following criteria:
- You are a master’s student, doctoral candidate, post-doc, faculty, or research-focused employee at an academic institution or university.
- You have a clearly defined research objective, and you have specific plans for how you intend to use, analyze, and share Twitter data from your research.
- You will use this product track for non-commercial purposes. Learn about commercial use restrictions here.
Learn more about what you need for a successful Academic Research application here.
Once you have been approved for this product track, you will see the academic research project that you applied with in your developer portal (under Projects and Apps).
You can then create a new app or connect an existing app to this academic research project.
Finally, make sure to save your API keys and bearer token in a secure location, as you will need these to connect to the endpoints such as full-archive search.
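One common way to keep the bearer token out of your source code is to read it from an environment variable at run time. The sketch below assumes you have stored the token in a variable named `TWITTER_BEARER_TOKEN`; that name and the `auth_headers` helper are our own choices, not something the developer portal sets up for you.

```python
import os


def auth_headers(env_var: str = "TWITTER_BEARER_TOKEN") -> dict:
    """Build the Authorization header from a token stored in the environment."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message rather than sending an empty token.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary matches the `Authorization: Bearer XXXXX` header used in the cURL commands in this post, and can be passed to whichever HTTP client you prefer.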
Note: The full-archive search endpoint, higher Tweet caps and filtering enhancements (such as the additional operators mentioned above) are only available if you use the keys and token from an app connected to your academic research project.
How to get historical data with full-archive search
Once you have access to the Academic Research product track, and you have an app connected to the academic research project, you can use the full-archive search endpoint to get historical data for any topic. The cURL command below shows how you can get historical Tweets from the @TwitterDev handle:
curl --request GET 'https://api.twitter.com/2/tweets/search/all?query=from:twitterdev' --header 'Authorization: Bearer XXXXX'
Replace XXXXX with your own bearer token. By default, only the 10 most recent Tweets will be returned. If you want more than 10 Tweets per request, you can use the max_results parameter and set it to a maximum of 500 Tweets per request. Similarly, by default, only Tweets from the last 30 days will be returned. If you want older Tweets, you will have to specify the date range using the start_time and end_time parameters, as shown in the example below:
curl --request GET 'https://api.twitter.com/2/tweets/search/all?query=from:twitterdev&max_results=500&start_time=2016-03-31T15:00:00Z&end_time=2017-01-30T15:00:00Z' --header 'Authorization: Bearer XXXXX'
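If you are building these requests in code rather than by hand, a small helper can assemble the URL and format the timestamps for you. The following is a minimal sketch using only the Python standard library. The endpoint URL, parameter names and 500-Tweet maximum come from this post; the helper names `to_api_time` and `full_archive_url` are hypothetical. It only builds the URL and does not send a request.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode

FULL_ARCHIVE_URL = "https://api.twitter.com/2/tweets/search/all"


def to_api_time(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2016-03-31T15:00:00Z (UTC)."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def full_archive_url(query, start=None, end=None, max_results=None):
    """Build a full-archive search URL with optional date range and page size."""
    params = {"query": query}
    if max_results is not None:
        if max_results > 500:
            raise ValueError("max_results can be at most 500 per request")
        params["max_results"] = max_results
    if start is not None:
        params["start_time"] = to_api_time(start)
    if end is not None:
        params["end_time"] = to_api_time(end)
    # Keep ':' readable and encode spaces as %20, as in the cURL examples.
    return f"{FULL_ARCHIVE_URL}?{urlencode(params, quote_via=quote, safe=':')}"


if __name__ == "__main__":
    print(full_archive_url(
        "from:twitterdev",
        start=datetime(2016, 3, 31, 15, 0, tzinfo=timezone.utc),
        end=datetime(2017, 1, 30, 15, 0, tzinfo=timezone.utc),
        max_results=500,
    ))
```

Called with the values shown, this should produce the same URL as the cURL example above.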
Check out this tutorial that explains in detail how you can get started with and use the full-archive search endpoint to get historical data.
Using the enhanced filtering capabilities in the Academic Research product track
An example of the more precise filtering capabilities in the Academic Research product track is getting geo-tagged Tweets. Operators such as has:geo, as well as bio_location, place, place_country, point_radius and bounding_box, are available in the Academic Research product track and can be used with the filtered stream, recent search and full-archive search endpoints to get geo-tagged Tweets. Below are some examples of how these operators can be used with the full-archive search endpoint.
If you want to get only those Tweets that have geo data, you can use the has:geo operator. For example, the following cURL request will get only those Tweets from the @TwitterDev handle that have geo data:
curl --request GET 'https://api.twitter.com/2/tweets/search/all?query=from:twitterdev%20has:geo' --header 'Authorization: Bearer XXXXX'
Similarly, you can limit Tweets that have geo data to a specific country using the place_country operator:
curl --request GET 'https://api.twitter.com/2/tweets/search/all?query=from:twitterdev%20place_country:US' --header 'Authorization: Bearer XXXXX'
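The two requests above combine operators by separating them with a space, which is percent-encoded as %20 in the URL. If you are assembling queries from several operators in code, a sketch like the one below may help. The helper names `build_query` and `encode_query` are hypothetical, and it assumes, as the examples above suggest, that space-separated operators must all match.

```python
from urllib.parse import quote


def build_query(*clauses: str) -> str:
    """Join operator clauses with single spaces, skipping empty ones."""
    return " ".join(clause.strip() for clause in clauses if clause.strip())


def encode_query(query: str) -> str:
    """Percent-encode a query for use in a URL, keeping ':' readable."""
    return quote(query, safe=":")


if __name__ == "__main__":
    geo_only = build_query("from:twitterdev", "has:geo")
    print(encode_query(geo_only))  # from:twitterdev%20has:geo

    us_only = build_query("from:twitterdev", "place_country:US")
    print(encode_query(us_only))   # from:twitterdev%20place_country:US
```

The encoded strings can be placed after `query=` in the full-archive search URL, exactly as in the cURL commands above.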
For more details on the complete list of available operators and how to use them, refer to the guides for building a rule for the recent search, full-archive search and filtered stream endpoints.
We hope you are excited about this Academic Research product track, and how it can support your next research project. We plan to continue releasing new endpoints, functionality, and access options to all product tracks on the Twitter API v2, including specialized features for academic research. To follow our planned releases, check out the product roadmap. As always, if you have questions or want to connect with other researchers, you can do so in our academic research category in this forum.