4e - use or access the Twitter API to aggregate, cache (except as part of a Tweet), or store place and other geographic location information contained in Twitter Content


I would like to do data mining on user Tweets based on a keyword search. Is this saying I cannot collect Tweets? Is this saying that I cannot look at the geo-tag of the tweets?



This is purely about geolocation data and its relationship to the tweet. You can’t cache or store geolocation data you find in tweets, except as they relate to the tweets themselves. In other words, don’t build a place database matching geolocations to place_ids or any other kind of geo data from the fields attached to tweets.


I am using Twitter4j to collect tweets from the Streaming API. I notice there are 3 forms of location: the location from the user profile, the geolocation from the tweet, and location information in the tweet text itself (i.e., someone tweets that they are in Paris). Which of these locations can I use? This is for a Master's research project.


It depends on your needs.

The location field from a user profile is self-declared and may have little to nothing to do with reality or actual places.

The geolocation information attached to the tweet, when present, was added programmatically and is likely more accurate than a self-declared location. As long as you’re not storing this geodata separately from the tweet content, and your use of the data will not surprise a user or violate their privacy, this information is ripe for usage as well. You may also want to watch for the place_id attribute on tweets, which identifies a specific place instead of just lat/long coordinates.

Self-authored info about places (“I’m at the zoo”) within public tweets is pretty fair game for you to analyze as you see fit.
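
To make the three sources concrete, here is a rough Java sketch of one way to keep them together on a single record, so the geodata is never stored apart from the tweet it came from. Everything in it is an assumption for illustration: `CollectedTweet` and `bestLocationSource` are hypothetical names rather than Twitter4j types, you would have to fill the fields from whatever your client library gives you, and the preference order (place_id, then geotag, then profile) is just one reading of the reliability notes above. It is not a statement of what the policy permits.

```java
public class TweetLocations {

    /** Hypothetical stand-in for a collected tweet; NOT a Twitter4j type. */
    static final class CollectedTweet {
        final long id;
        final String text;             // self-authored place mentions live here
        final String profileLocation;  // self-declared; may be null or fictional
        final Double latitude;         // tweet geotag; null when absent
        final Double longitude;        // tweet geotag; null when absent
        final String placeId;          // place_id attribute; null when absent

        CollectedTweet(long id, String text, String profileLocation,
                       Double latitude, Double longitude, String placeId) {
            this.id = id;
            this.text = text;
            this.profileLocation = profileLocation;
            this.latitude = latitude;
            this.longitude = longitude;
            this.placeId = placeId;
        }

        /** True only when both coordinates are present. */
        boolean hasGeotag() {
            return latitude != null && longitude != null;
        }
    }

    /**
     * Names the location source to prefer for a tweet.
     * The ordering is an assumption, not something the API defines.
     */
    static String bestLocationSource(CollectedTweet t) {
        if (t.placeId != null) {
            return "place_id";
        }
        if (t.hasGeotag()) {
            return "geotag";
        }
        if (t.profileLocation != null && !t.profileLocation.trim().isEmpty()) {
            return "profile";
        }
        return "none";
    }
}
```

Because the coordinates and place_id are fields of the tweet record itself, deleting or exporting a tweet takes its geodata with it, which is the "not stored separately" idea described above.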


So is it okay to use the location field from the user profile for PhD research work? (The intended work relies heavily on spatial info over social networks.)


I hate to revive a dead thread, but does this provision prohibit storing an aggregate count of tweets at a particular location/place at a particular time?