I am using the Twitter Streaming API to filter tweets by location, i.e. by specifying a bounding box, but I receive many tweets from outside the bounding box. According to https://dev.twitter.com/streaming/overview/request-parameters:
The streaming API uses the following heuristic to determine whether a given Tweet falls within a bounding box:
- If the coordinates field is populated, the values there will be tested against the bounding box. Note that this field uses geoJSON order (longitude, latitude).
- If coordinates is empty but place is populated, the region defined in place is checked for intersection against the locations bounding box. Any overlap will match.
- If none of the rules listed above match, the Tweet does not match the location query. Note that the geo field is deprecated, and ignored by the streaming API.
However, it seems to be the other way round, i.e. the place field is checked first (and possibly exclusively)! I specified a bounding box (a city in England) and started collecting tweets. I then tweeted from my mobile in another city outside the bounding box, geo-tagging the tweet “England, UK” and enabling my coordinates. Even though my coordinates were outside the bounding box, my tweet was collected, just because the place field was populated with “England, UK”, which intersects the specified bounding box (a city in England).
On the other hand, when I tweeted again, enabling my coordinates and choosing “MyCity, England” as the geo-tag, my tweet was not picked up. This makes me think that only the place field is checked during filtering.
This is obviously a bug. Are there any solutions or fixes?
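For reference, this is roughly how I would re-apply the documented heuristic on the client side as a post-filter. It is only a minimal sketch: the function names and the sample coordinates are my own, and it assumes each tweet has already been parsed from the stream's JSON into a dict with the usual `coordinates` and `place.bounding_box` fields.

```python
def point_in_bbox(lon, lat, bbox):
    """bbox is (west, south, east, north), the same order as the locations parameter."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


def place_bbox(place):
    """Return (west, south, east, north) for a tweet's place, or None if unavailable."""
    try:
        ring = place["bounding_box"]["coordinates"][0]
    except (KeyError, IndexError, TypeError):
        return None
    if not ring:
        return None
    lons = [p[0] for p in ring]  # geoJSON order: longitude first
    lats = [p[1] for p in ring]
    return (min(lons), min(lats), max(lons), max(lats))


def bboxes_intersect(a, b):
    """True if two (west, south, east, north) boxes overlap at all."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def matches_location(tweet, bbox, strict=False):
    """Apply the documented rules: coordinates first, then place, else no match.

    With strict=True, tweets without exact coordinates are rejected
    instead of falling back to the place bounding box.
    """
    coords = tweet.get("coordinates")
    if coords and coords.get("coordinates"):
        lon, lat = coords["coordinates"]
        return point_in_bbox(lon, lat, bbox)
    if strict:
        return False
    pb = place_bbox(tweet.get("place"))
    return pb is not None and bboxes_intersect(pb, bbox)


# Illustrative values only (not my real bounding box or coordinates).
city_bbox = (-2.3, 53.4, -2.1, 53.6)
england = {"bounding_box": {"coordinates": [[[-6.4, 49.9], [1.8, 49.9],
                                             [1.8, 55.8], [-6.4, 55.8]]]}}
outside_with_coords = {"coordinates": {"type": "Point", "coordinates": [-1.5, 53.8]},
                       "place": england}
place_only = {"coordinates": None, "place": england}
```

With this filter, `outside_with_coords` (my first test tweet's situation) would be rejected because its coordinates fall outside `city_bbox`, whereas `place_only` would still match through the place overlap unless `strict=True` is passed. It obviously does not fix the server-side behaviour; it just discards the unwanted tweets after they arrive.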