Starting with the streaming API


#1

Hi All,
I have been working with the streaming API for a while, and now that I understand how it works I would like to take it to the next level and run it in production. I have a few questions and remarks that I would like feedback on before continuing:

  1. I wrote the first version using curl and Perl. Do any of you have experience with that? I would like to know whether it is a decent base for adding retry, failure-handling and monitoring mechanisms, or whether I should look at a different language (Python?) or use sockets with a net API. The tweets are then stored in a DB to be consumed by the front-end app.

  2. I am also planning to use the REST API to request any tweets missed during a restart, using the since_id and max_id parameters.

  3. The app will track certain keywords, which may change over time. When a new keyword is added or an old one is removed, how should I handle that?
    One way is to close the stream and open a new one with the new list. Another is to open a second stream on the same account and let Twitter close the old stream automatically, avoiding any tweet loss. A third option is to open a new stream with different credentials and then close the old one. In the worst case the parameters will change 1-2 times a day (normally once per week).
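
To make point 2 concrete, here is a rough sketch of how I imagine paging through the gap after a restart. It assumes a hypothetical $fetch_page callback that wraps the REST call, takes (since_id, max_id) and returns an array ref of tweet hashes, newest first. It also assumes a 64-bit perl so the tweet ids fit in a native integer. None of this is tested against the real API.

```perl
use strict;
use warnings;

# Page backwards through the gap between the last stored tweet and now.
# $fetch_page is a hypothetical callback (not a real library function):
# it takes (since_id, max_id) and returns an array ref of tweet hashes,
# newest first, each with an {id} field.
sub backfill {
    my ($last_stored_id, $fetch_page) = @_;
    my @missing;
    my $max_id;    # undef on the first request: start from the newest tweet
    while (1) {
        my $page = $fetch_page->($last_stored_id, $max_id);
        last unless @$page;
        push @missing, @$page;
        # max_id is treated as inclusive, so step just below the oldest id seen.
        $max_id = $page->[-1]{id} - 1;
        last if $max_id <= $last_stored_id;
    }
    return \@missing;
}
```

The result is newest first, so it would need reversing before insertion if the DB should receive tweets in chronological order.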

Thanks for your replies in advance!


#2

Hi

I’m asynchronously fetching tweets with AnyEvent::Twitter::Stream [1] and storing them with AnyEvent::DBI [2]. No curl. I’ve just started testing, but so far so good. I had to apply this one-line patch to Net::OAuth though [3].

Regarding 3), my plan is to overlap for a while. But maybe someone knows if just starting a second one is good enough?
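
While the two streams overlap, the same tweet will arrive twice, so something has to drop the duplicates. A minimal sketch of what I have in mind, assuming each decoded tweet carries an id field and that remembering the last few thousand ids is enough (make_deduper is my own helper name, not part of any module):

```perl
use strict;
use warnings;

# Returns a closure that reports whether a tweet id has been seen before.
# Only the most recent $limit ids are remembered, so memory stays bounded.
sub make_deduper {
    my ($limit) = @_;
    my (%seen, @order);
    return sub {
        my ($id) = @_;
        return 0 if $seen{$id};          # duplicate from the other stream
        $seen{$id} = 1;
        push @order, $id;
        # Forget the oldest id once the window is full.
        delete $seen{ shift @order } if @order > $limit;
        return 1;
    };
}
```

Both on_tweet callbacks would share one closure, e.g. my $is_new = make_deduper(5000); and then store the tweet only if $is_new->($tweet->{id}) returns true. A unique key on the tweet id in the DB would probably work just as well.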

1 - https://metacpan.org/module/AnyEvent::Twitter::Stream
2 - https://metacpan.org/module/MLEHMANN/AnyEvent-DBI-2.3/DBI.pm
3 - https://rt.cpan.org/Public/Bug/Display.html?id=73705


#3

Thanks for the tip. AnyEvent::Twitter::Stream works better than the curl implementation.

I guess for 3 I will have to experiment with both options and see how it pans out.
Also, how did you check the error handling (backoff timing) with AnyEvent::Twitter?


#4

You mean the messages from Twitter? I haven’t been able to get “stall_warnings” to work, but other messages are picked up with on_… callbacks like this:

on_connected => sub { say "Connected!"; } ,
on_error => sub { say "Got error:"; ddx shift;} ,
on_eof => sub { say "Got EOF"; } ,
on_keepalive => sub { say "Got keepalive"; } ,
on_delete => sub { say "Got delete"; } ,

on_tweet => sub { …

Of course, in production we have to handle all of these situations.
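
For the backoff timing, something like the following could compute how long to wait before reconnecting from on_error or on_eof. This is only a sketch: reconnect_delay is a made-up helper, and the base and cap values are my assumptions about sensible limits (linear steps for network errors, doubling for HTTP errors), so check them against Twitter's current reconnect guidance.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Delay in seconds before reconnect attempt number $attempt (1-based).
# The 0.25 s / 16 s and 5 s / 320 s values are assumptions, not taken
# from the module; adjust them to whatever Twitter currently recommends.
sub reconnect_delay {
    my ($kind, $attempt) = @_;
    if ($kind eq 'network') {
        return min(0.25 * $attempt, 16);           # linear backoff, capped
    }
    return min(5 * 2 ** ($attempt - 1), 320);      # exponential backoff, capped
}
```

The idea would be to bump the attempt counter in on_error and on_eof, schedule the reconnect with AnyEvent->timer(after => reconnect_delay($kind, $attempt), cb => ...), and reset the counter in on_connected.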