Harvest and analyze geographic information from Tweets for Research



I plan to collect and analyze tweets via your API in the context of my PhD thesis.
I am particularly interested in geographic information (latitude & longitude) and, to a lesser extent, the content of tweets.

Specifically, I would like to know whether I am allowed to:

  • Collect such data and integrate them into a database for analysis;
  • Publish my findings from these analyses in my thesis dissertation (or in a scientific article);
  • Represent these data visually (as maps, for example).

Thank you in advance.


Our terms have pretty specific language around the extraction of geographic data from the API. It’s an area where you can’t really go.

“You will not attempt or encourage others to: … use or access the Twitter API to aggregate, cache (except as part of a Tweet), or store place and other geographic location information contained in Twitter Content.”

So anything you do with geo-based data still has to be directly tied to tweets and respect the users’ wishes with that content.

You also cannot redistribute geo-based data or any Twitter content in your research. You can distribute IDs pointing to tweets, or IDs or screen names pointing to users, but the onus is on the next researcher to obtain that data from Twitter on their own.
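As an illustration of sharing IDs rather than content, here is a minimal Python sketch that writes only Tweet IDs to a CSV file and drops everything else. The record layout (a dict with an "id" key) and the function name are assumptions made for the example, not part of the Twitter API, and the sketch is not a statement about what the terms permit.

```python
import csv
import io


def export_tweet_ids(tweets, out):
    """Write only the Tweet IDs to `out`, one per row, skipping duplicates.

    `tweets` is assumed to be an iterable of dicts with an "id" key
    (a hypothetical layout); text and geographic fields are never written.
    """
    writer = csv.writer(out)
    writer.writerow(["tweet_id"])
    seen = set()
    for tweet in tweets:
        tweet_id = str(tweet["id"])
        if tweet_id not in seen:
            seen.add(tweet_id)
            writer.writerow([tweet_id])
    return len(seen)


# Made-up records, for illustration only.
sample = [
    {"id": 1, "text": "hello", "geo": "hypothetical"},
    {"id": 2, "text": "world", "geo": "hypothetical"},
    {"id": 1, "text": "hello", "geo": "hypothetical"},
]
buffer = io.StringIO()
exported = export_tweet_ids(sample, buffer)
```

Another researcher would then have to request the tweets behind those IDs from Twitter themselves.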


Thank you for your quick reply.

So, if I understood correctly, I am not allowed to harvest, integrate (in a database) and analyse the geo-based data “anonymously”. In other words, I cannot separate the geographic information from its respective tweets.

Plus, I have to mention each Tweet ID if I create a map based on geo-based data.
And I am allowed to publish my findings from these analyses, but without redistributing the geo-based data I used to draw my conclusions.

Is that correct?

Thank you in advance.


Are you sure?
It’s a widely adopted line of research; a quick Google Scholar search gives http://scholar.google.es/scholar?start=10&q=related:22yvJmq4jTgJ:scholar.google.com/&hl=en&as_sdt=0,5 . Mr. Researcher, I think it’s okay…


My above response stands.


Question regarding a similar issue:
I would like to display a statistical aggregation of hashtags based on location, and display a graph describing the popularity of a hashtag in certain locations (e.g. how popular LeBron James is in Paris compared to London). The location is based upon whatever the user entered as their location, not on geolocation.
Is it acceptable per the terms of use?


I am interested in a similar setup to the one described by Oren above - I’d like to compare the Twitter popularity of brands in given geographical areas (e.g. do people in Stockholm tweet more about ICA than people in Dusseldorf about Lidl).
For that I would count the tweets and not store or process the location further.
Is that OK with the terms of use?
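
To make the question concrete, the counting I have in mind could be sketched roughly as follows in Python. The field names ("text", "user_location") and the function name are hypothetical, the substring matching is deliberately naive, and this only illustrates the setup; whether it is acceptable under the terms is exactly what I am asking.

```python
from collections import Counter


def count_brand_mentions(tweets, brand, cities):
    """Count tweets mentioning `brand`, grouped by city.

    The profile location is only used to pick a counter to increment;
    it is not stored or returned. Field names are assumptions.
    """
    counts = Counter({city: 0 for city in cities})
    needle = brand.lower()
    for tweet in tweets:
        # Naive substring match: a short brand name can match inside other words.
        if needle not in tweet["text"].lower():
            continue
        profile_location = (tweet.get("user_location") or "").lower()
        for city in cities:
            if city.lower() in profile_location:
                counts[city] += 1
                break  # count each tweet at most once
    return dict(counts)


# Made-up records, for illustration only.
sample = [
    {"text": "Shopping at Lidl today", "user_location": "Dusseldorf, Germany"},
    {"text": "Lidl has good bread", "user_location": "Dusseldorf"},
    {"text": "No brand here", "user_location": "Dusseldorf"},
    {"text": "lidl again", "user_location": None},
]
totals = count_brand_mentions(sample, "Lidl", ["Dusseldorf", "Stockholm"])
```

Only the per-city totals are kept at the end; the individual tweets and their locations are discarded.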