Hi,
I'm trying to compare the relative number of tweets across a few cities.
As I understand it, the Streaming API is the only suitable option for that (even though it returns only about 1% of tweets).
I'm using Python with tweepy, with the following code:
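import json
import sys

from tweepy import OAuthHandler, Stream
from tweepy.streaming import StreamListener

class StdOutListener(StreamListener):
    def on_data(self, data):
        # Each message arrives as a raw JSON string
        tweet_object = json.loads(data)
        # Print the text plus the fields relevant to location filtering
        print(tweet_object["text"].encode(sys.stdout.encoding, 'replace'))
        print(tweet_object["coordinates"])
        print(tweet_object["place"])
        print(tweet_object["lang"])
        return True

    def on_error(self, status):
        print(status)

if __name__ == '__main__':
    # consumer_key, consumer_secret, access_token and access_token_secret are defined elsewhere
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, l)
    stream.filter(locations=[52.0978767, 20.8512898, 52.3679992, 21.2710983])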
However, it returns tweets like this one, for example:
??????: ??? ??? ??? ??? ?? ??? ??? ??? http://t.co/zXAzq8mSqA
?????? #??? #?????????
None
{u'country_code': u'SA', u'url': u'https://api.twitter.com/1.1/geo/id/001ad0741538b980.json', u'country': u'\u0627\u0644\u0645\u0645\u0644\u0643\u0629 \u0627\u0644\u0639\u0631\u0628\u064a\u0629 \u0627\u0644\u0633\u0639\u0648\u062f\u064a\u0629', u'place_type': u'admin', u'bounding_box': {u'type': u'Polygon', u'coordinates': [[[44.6569686, 17.0167619], [44.6569686, 29.1031061], [55.6666671, 29.1031061], [55.6666671, 17.0167619]]]}, u'full_name': u'Eastern, Kingdom of Saudi Arabia', u'attributes': {}, u'id': u'001ad0741538b980', u'name': u'Eastern'}
ar
As far as I can tell, this tweet should not match that filter: its coordinates property is empty (None), and the bounding box in its place property covers a completely different area.
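To make that comparison concrete, here is a minimal sketch of the overlap check I would expect to happen. It is not the API's actual matching code: the helper name boxes_overlap and the (west, south, east, north) layout are my own assumptions. I'm not sure in which order the values in the locations list are read, so the sketch tries both readings against the place bounding box printed above.

```python
def boxes_overlap(a, b):
    """Return True if two (west, south, east, north) boxes intersect."""
    a_west, a_south, a_east, a_north = a
    b_west, b_south, b_east, b_north = b
    return (a_west <= b_east and b_west <= a_east
            and a_south <= b_north and b_south <= a_north)

# Filter values exactly as passed to stream.filter(locations=...)
filter_values = [52.0978767, 20.8512898, 52.3679992, 21.2710983]

# Bounding box of the returned place, taken from its [longitude, latitude]
# corner points and written as (west, south, east, north)
place_box = (44.6569686, 17.0167619, 55.6666671, 29.1031061)

# Reading 1 (assumption): the list holds longitude, latitude pairs
as_lon_lat = tuple(filter_values)

# Reading 2 (assumption): the list holds latitude, longitude pairs,
# which is how I typed the values in
lat1, lon1, lat2, lon2 = filter_values
as_lat_lon = (lon1, lat1, lon2, lat2)

print(boxes_overlap(as_lon_lat, place_box))
print(boxes_overlap(as_lat_lon, place_box))
```

If the two readings give different results, the order of the values in the locations list would be the first thing to check.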
What am I doing wrong?