I’m working on a kind of Twitter wall. Users can log in with Twitter and create their own wall, which displays the tweets matching certain terms/hashtags.
I’m still looking for the best strategy to get the data out of the Twitter APIs.
Here are my thoughts so far:
Strategy 1: Streaming API
- Open a single stream (POST statuses/filter) for all walls
- Each hashtag is added to the track parameter
- When new tweets arrive, they will be processed and sent to the corresponding wall
- (“one account, one application, one open connection” cf. https://dev.twitter.com/discussions/14935)
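To make the routing step concrete, here is a minimal sketch of how one shared stream could be fanned out to the walls. It assumes the `track` parameter is a comma-separated list of terms and uses a simplified token match that only approximates Twitter's own matching rules. The names `build_index`, `track_parameter` and `route` are hypothetical helpers, not part of any Twitter library, and the code does not open a real connection.

```python
import re
from collections import defaultdict


def build_index(walls):
    """Map each lowercased term to the set of wall ids that track it."""
    index = defaultdict(set)
    for wall_id, terms in walls.items():
        for term in terms:
            index[term.lower()].add(wall_id)
    return index


def track_parameter(index):
    """Join the unique terms into one comma-separated string (assumed format)."""
    return ",".join(sorted(index))


def route(tweet_text, index):
    """Return the wall ids whose terms appear as tokens in the tweet text.

    Simplification: a token matches both with and without its leading
    '#' or '@', so the term 'python' also matches '#python'.
    """
    tokens = set()
    for token in re.findall(r"[#@]?\w+", tweet_text.lower()):
        tokens.add(token)
        tokens.add(token.lstrip("#@"))
    matched = set()
    for term, wall_ids in index.items():
        if term in tokens:
            matched |= wall_ids
    return matched


walls = {"w1": ["#python", "apple"], "w2": ["Apple"]}
index = build_index(walls)
```

Because terms are deduplicated in the index, two walls tracking the same term only consume one slot in the `track` parameter.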
Problems with the Streaming API
- Streaming API is limited to 400 keywords to track
  - What to do if there are more than 400 keywords to track?
- Streaming API is limited to 1% of the tweets of the firehose
  - It’s very difficult to get above 1% of the firehose, but if you’re tracking a term like “apple” it’d be pretty easy to exceed the 1%. (cf. https://dev.twitter.com/discussions/6349)
  - How can I handle such popular terms? Blacklist them?
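One possible way to deal with both limits, sketched under my own assumptions: drop blacklisted terms, give the streaming slots to the terms used by the most walls, and hand the overflow to some fallback (for example polling). The 400 figure is the limit quoted above; `plan_tracking` and the ranking rule are hypothetical, not something the API prescribes.

```python
TRACK_LIMIT = 400  # keyword limit quoted above


def plan_tracking(term_to_walls, blacklist=frozenset(), limit=TRACK_LIMIT):
    """Split terms into (streamed, overflow) lists.

    Blacklisted terms are dropped entirely. The remaining terms are ranked
    by how many walls use them (ties broken alphabetically), and only the
    first `limit` terms are kept for the stream.
    """
    allowed = {t: w for t, w in term_to_walls.items() if t not in blacklist}
    ranked = sorted(allowed, key=lambda t: (-len(allowed[t]), t))
    return ranked[:limit], ranked[limit:]


term_to_walls = {
    "#a": {"w1", "w2"},
    "#b": {"w1"},
    "#c": {"w3"},
    "apple": {"w1", "w2", "w3"},
}
# A tiny limit is used here only to make the overflow visible.
streamed, overflow = plan_tracking(term_to_walls, blacklist={"apple"}, limit=2)
```

Whether blacklisting is acceptable for users is a product decision; the sketch only shows where such a filter would sit.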
Strategy 2: REST Search API
Problems with the REST Search API
- Polling
- Could get very expensive to poll the API for a lot of users.
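To get a feeling for the polling cost, here is a rough sketch that combines several terms into one query with the search `OR` operator and estimates how often each query can be refreshed under a rate limit. The batch size and the rate-limit numbers are placeholder assumptions that would need to be checked against the current API documentation; `batch_queries` and `polling_interval` are hypothetical helpers.

```python
def batch_queries(terms, batch_size):
    """Deduplicate terms and group them into 'a OR b OR c' query strings."""
    unique = sorted(set(terms))
    return [
        " OR ".join(unique[i:i + batch_size])
        for i in range(0, len(unique), batch_size)
    ]


def polling_interval(num_queries, requests_per_window, window_seconds):
    """Seconds between two polls of the same query if the whole request
    budget is spread evenly over all queries. Returns None for no queries."""
    if num_queries == 0:
        return None
    return num_queries * window_seconds / requests_per_window


queries = batch_queries(["#b", "#a", "#c", "#a"], batch_size=2)
# Placeholder budget: 180 requests per 900 seconds, shared by 90 queries.
interval = polling_interval(90, requests_per_window=180, window_seconds=900)
```

The interval grows linearly with the number of queries, which is the reason polling gets expensive as the number of walls increases.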
Do you have any suggestions or recommendations as to which strategy would fit best? Are there existing solutions for these problems?
Best regards