I belong to an academic team in France that has been purchasing access to the Twitter gardenhose feed (GNIP) for the last two years. We are beginning to work with geotagged tweets, using the .geo.coordinates field to get GPS coordinates (latitude and longitude). We’ve noticed that the actual percentage of geotagged tweets (i.e. those containing precise GPS coordinates rather than places) is lower (0.3%) than the one reported in most studies (around 1-2%). This issue has been flagged in other threads ("Twitter search api always return geo="<null>""), yet we’d like to make sure that other developers are getting similar volumes.
Btw, if this is indeed the case, does anyone know when this drop started occurring?
There are three root-level Geo objects in the payload – two of which are supported and may be used to identify geotagged Tweets (see the Geo objects documentation for further reference):
coordinates - an exact location, given as [long, lat]
place - a specific, named location with corresponding geo coordinates
geo - deprecated, as documented near the bottom of this page. You should use the coordinates field instead.
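As a rough illustration of how the two supported fields could be used to measure the share of geotagged Tweets in a sample, here is a minimal Python sketch. It assumes each Tweet has already been parsed from JSON into a dict in the original (non-Activity Streams) format; the function names geo_kind and geo_shares are hypothetical and not part of any Twitter library.

```python
from collections import Counter

def geo_kind(tweet):
    """Classify an original-format Tweet dict as 'point', 'place', or 'none'.

    'point' means the root-level coordinates object is present (exact location);
    'place' means only the root-level place object is present.
    """
    if tweet.get("coordinates"):
        return "point"
    if tweet.get("place"):
        return "place"
    return "none"

def geo_shares(tweets):
    """Return the fraction of Tweets in each category (empty dict if no Tweets)."""
    counts = Counter(geo_kind(t) for t in tweets)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: counts[kind] / total for kind in ("point", "place", "none")}
```

Counting 'point' and 'place' separately matters here, because the 0.3% versus 1-2% comparison depends on whether a study counts only exact coordinates or also named places.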
Unfortunately, we can’t share exact figures around the percentage of Tweets that contain geo (place / coordinate).
Great, thanks John!
Just FYI, it might be because the data acquired from GNIP has a different format than the one from the public API, but our data contains only the geo object in the payload, while places are described in the root-level location object (there is no root-level place Geo object).
Ah yes, the Activity Streams format available in the enterprise APIs (formerly Gnip) will have a slightly different structure. If present, there will be a root-level geo object representing point data (x,y coordinates), which is the same as the coordinates object in the original format, and a root-level location object, which is the same as the place object in the original format.
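To make that field correspondence concrete, here is a minimal Python sketch that reads the point and place objects from a payload in either format. It assumes the payload is already parsed into a dict and that the caller knows which format it is in; FIELD_MAP and extract_geo are hypothetical names used only for illustration. The sketch does not reorder or validate the values inside the point object, so the latitude/longitude order should be checked against the documentation for each format.

```python
# Root-level field names per payload format, as described in this thread.
FIELD_MAP = {
    "original": {"point": "coordinates", "place": "place"},
    "activity_streams": {"point": "geo", "place": "location"},
}

def extract_geo(payload, fmt):
    """Return {'point': ..., 'place': ...} for a parsed payload.

    fmt must be 'original' or 'activity_streams'. Missing or null fields
    come back as None.
    """
    fields = FIELD_MAP[fmt]
    return {kind: payload.get(key) for kind, key in fields.items()}
```

With this mapping, the same counting logic can be applied to both data sources, for example by treating a Tweet as precisely geotagged whenever extract_geo(payload, fmt)["point"] is not None.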