Downloading over 1 million tweets - REST or Streaming?


#1

Hi, as the title says, I need to download over 1 million tweets for analytics purposes, and I wonder whether I should go with the REST API or the Streaming API.

The important detail is that I need only tweets with positive or negative content in Polish. Right now I’m using the REST API with the query “: ) OR : ( and lang=pl”, and it works flawlessly, but I wonder whether the Streaming API would be a better fit for over 1 million tweets.

Also, after reading a lot of articles explaining maxID and sinceID (including Twitter’s official guide), I’m still confused.

Previously, my code did the following (C#, LINQ to Twitter):

  1. Initial query: set sinceID = 1 and leave maxID blank.
  2. Find the lowest ID in the results, subtract 1, and set maxID to that value.
  3. Find the greatest ID in the results and, if it is greater than the current sinceID, use it as the new sinceID.
  4. Query for 100 more tweets with the updated maxID and sinceID, then go back to step 2.

With this I was getting error code 195 (invalid parameter). Apparently sinceID should be set only once, after the initial query, saved to a database/file/whatever, and reused later. Is this conclusion correct? I thought I should update sinceID each time I received a new batch of tweets from the query.
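
To make the question concrete, here is a minimal sketch of the paging pattern as I now understand it: sinceID stays fixed for one whole pass, only maxID moves down page by page, and sinceID is updated once, after the pass is done. This is plain Python, not LINQ to Twitter code. `fetch_page` and `make_fake_search` are hypothetical stand-ins for the real search call, and the sketch assumes since_id is exclusive and max_id is inclusive.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Tweet = Dict[str, int]  # hypothetical shape: only the "id" field is used here
FetchPage = Callable[[int, Optional[int], int], List[Tweet]]


def collect_new_tweets(
    fetch_page: FetchPage, since_id: int, page_size: int = 100
) -> Tuple[List[Tweet], int]:
    """Walk backwards from the newest tweet down to since_id (exclusive)."""
    collected: List[Tweet] = []
    max_id: Optional[int] = None  # left blank on the first request

    while True:
        page = fetch_page(since_id, max_id, page_size)
        if not page:
            break  # nothing older is left above since_id
        collected.extend(page)
        # Lowest ID seen minus 1, so the next page does not repeat that tweet.
        max_id = min(t["id"] for t in page) - 1
        if max_id <= since_id:
            break  # the window (since_id, max_id] is now empty

    # since_id changes only here, once per pass; persist it for the next run.
    next_since_id = max((t["id"] for t in collected), default=since_id)
    return collected, next_since_id


def make_fake_search(ids: Iterable[int]) -> FetchPage:
    """Hypothetical in-memory search: newest first, like the real endpoint."""
    ordered = sorted(ids, reverse=True)

    def fetch_page(since_id: int, max_id: Optional[int], count: int) -> List[Tweet]:
        matching = [
            i for i in ordered
            if i > since_id and (max_id is None or i <= max_id)
        ]
        return [{"id": i} for i in matching[:count]]

    return fetch_page


if __name__ == "__main__":
    search = make_fake_search(range(1, 251))
    tweets, next_since_id = collect_new_tweets(search, since_id=0)
    print(len(tweets), next_since_id)
```

If sinceID is raised to the greatest ID of every page while maxID is lowered at the same time, sinceID ends up above maxID after the first page and the window is empty, which may be why the combined parameters were rejected.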


#2

Here’s a recent SO post with an example on how to do that with LINQ to Twitter: