Streaming archive tweets


#1

I am a student working on my capstone project, titled “use of twitter social media in promoting democracy in Nigeria Republic through political campaign and election monitoring”. While surfing the web, I came across the wonderful book “Mining the Social Web, 2nd Edition” and bought a copy from O’Reilly. I followed the Appendix A instructions and the video to configure an IPython notebook. It took me about a week to get it working. Thank God it is now working fine, and I am playing with it to get hands-on practice with the different concepts.
I have designed my capstone project as follows:
First, I plan to collect the accounts of all Nigerian Twitter users. Then I would like to stream their tweets for about 4 to 5 months (i.e. 3 months before the general elections and a month after). Once I have the data, I want to build a classifier that, after training, will go through all the tweets and classify them by whether or not they relate to the political campaign. After that, I would like to build a second classifier that takes these political campaign/election tweets and sorts them into three classes: class 1 for candidate 1, class 2 for candidate 2, and class 3 for minor parties. When all of this is done, I want to display the following on a map:
1- the distribution of Twitter users across the country,
2- the distribution of Twitter users who posted tweets about the political campaign or election,
3- the distribution of Twitter users who support each of the three candidates.
I would appreciate your advice on how I can achieve my project goals.
I would also like to know whether there is an easy way, using the Twitter API, to stream the user accounts of a specific region or country.
Also, the geo attribute is not always enabled, so not all tweets contain the longitude and latitude attributes. Is there an easy way to assign geo coordinates to the tweets after they are streamed?
Lastly, I would like to know whether there is a way to stream historical or archived tweets, because I need tweets covering about four months.


#2

Sorry, but I have to tell you that what you are trying to do is not easily doable with the public Twitter REST or Streaming APIs. You might be able to get access to a Data Grant Program, or you might want to take a look at Gnip, a third-party (but Twitter-affiliated) data provider.


#3

Not really, there is no easy way to stream accounts by region or country. If you can identify the accounts you are interested in, you can use their account IDs with the filter endpoint.
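
To make the filter approach a bit more concrete, here is a minimal Python sketch that only prepares the request parameters; it does not open a connection. It assumes the v1.1 `statuses/filter` streaming endpoint, whose `follow` parameter takes a comma-separated list of user IDs. The 5,000-ID cap per connection is an assumption based on the default access level, so check it against the current docs. `build_follow_params` and the example IDs are made up for illustration.

```python
# Hypothetical helper: prepares `follow` parameters for statuses/filter.
# It builds request parameters only and makes no network calls.

MAX_FOLLOW_IDS = 5000  # assumed per-connection cap; verify against the docs


def build_follow_params(user_ids, batch_size=MAX_FOLLOW_IDS):
    """Return one parameter dict per streaming connection."""
    # De-duplicate while preserving order, and normalise IDs to strings.
    unique_ids = list(dict.fromkeys(str(uid) for uid in user_ids))
    batches = [
        unique_ids[i:i + batch_size]
        for i in range(0, len(unique_ids), batch_size)
    ]
    return [{"follow": ",".join(batch)} for batch in batches]


# Example with made-up account IDs and a tiny batch size.
params = build_follow_params([12, 783214, 12, 6253282], batch_size=2)
print(params)
```

Each dict would then be passed as the parameters of one streaming connection, using whichever client library you already have set up.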

No, there is no easy way to assign coordinates to tweets afterwards, and we have some ongoing work around geo which unfortunately limits the number of geotagged results returned by the search API.
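
For the tweets that do carry location data, a sketch like the following can pull out a point to plot. It assumes the v1.1 tweet JSON, where `coordinates` is a GeoJSON point in [longitude, latitude] order and `place.bounding_box` is a polygon. The bounding-box centre is only a coarse stand-in (often city level), not the author's actual position. `tweet_lon_lat` is a hypothetical helper name, and tweets with neither field yield `None`.

```python
def tweet_lon_lat(tweet):
    """Return (lon, lat) for a tweet dict, or None if it has no geo data."""
    # Exact point, present only when the user enabled precise geotagging.
    point = tweet.get("coordinates")
    if point and point.get("coordinates"):
        lon, lat = point["coordinates"]
        return (lon, lat)

    # Fallback: centre of the tagged place's bounding box (coarse).
    place = tweet.get("place") or {}
    box = place.get("bounding_box") or {}
    rings = box.get("coordinates")
    if rings and rings[0]:
        corners = rings[0]
        lon = sum(c[0] for c in corners) / len(corners)
        lat = sum(c[1] for c in corners) / len(corners)
        return (lon, lat)

    return None


# Made-up tweets covering the three cases.
exact = {"coordinates": {"type": "Point", "coordinates": [3.3792, 6.5244]}}
coarse = {
    "coordinates": None,
    "place": {"bounding_box": {
        "type": "Polygon",
        "coordinates": [[[3.0, 6.0], [3.0, 7.0], [4.0, 7.0], [4.0, 6.0]]],
    }},
}
untagged = {"coordinates": None, "place": None}

for t in (exact, coarse, untagged):
    print(tweet_lon_lat(t))
```

Anything that comes back as `None` simply cannot be placed on a map from the tweet payload alone.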

The only way to get historical tweets would be through the Gnip APIs, which require paid access. The public Twitter search API only provides access to around 7 days of data.


#4

Thank you very much, both ePirat and andypiper, for your contributions. I appreciate it. I will try Gnip, as you suggested.
Best regards!