How to collect 5 million users' tweets


#1

Hello, for my Ph.D. thesis I need to collect the tweets of 5 million users over one month. I was planning to use the Streaming API with the filter option, but it only lets me collect tweets from 5000 users. Do you have any better suggestions? Or would I need to create 1000 apps and use their API keys to collect all the data I need? That sounds impossible to do.


#2

Before starting such a program, you should carefully check our Developer Rules of the Road and related policies, especially regarding permission to collect and store tweet data.

The Streaming API is limited to listening to 5000 users on the filter endpoint. For more than this, you should use a certified data partner like Gnip.
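To make the gap concrete, here is a minimal sketch of the arithmetic, assuming the 5000-user limit applies per filter connection. The function and constant names are hypothetical, and the code does not call any Twitter API; it only shows how many separate connections 5 million users would imply.

```python
FILTER_USER_LIMIT = 5000  # users per filter connection, as stated above


def connections_needed(total_users, limit=FILTER_USER_LIMIT):
    """Return how many filter connections would be needed to cover total_users."""
    # Integer ceiling division, so a partial final batch still counts as one.
    return (total_users + limit - 1) // limit


def batches(user_ids, limit=FILTER_USER_LIMIT):
    """Yield consecutive slices of user_ids, each at most `limit` long."""
    for start in range(0, len(user_ids), limit):
        yield user_ids[start:start + limit]


print(connections_needed(5_000_000))  # prints 1000
```

A thousand separate connections is far outside what the filter endpoint is meant for, which is why a data partner is the practical route at this scale.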


#3

Thank you for your help. I fully respect the rules and users' privacy. I will only store the data for mining purposes and will not share it with anyone.


#4

Also, please note that creating many apps in order to use their API keys, as suggested in your first post, goes against §3 of the Developer Rules of the Road.

Have you already identified 5 million users of interest, or are you just looking for a sample of tweets across any 5 million users?


#5

Thank you, Isaach. Twitter does not let me create more than 3-4 apps anyway. I have already identified the 5 million users, but I could not find a way to collect their tweets. Any help would be appreciated.


#6

The free public API just isn’t really designed or intended for that volume of access. You might want to check out the commercial data products available from Gnip: www.gnip.com/products.