Streaming Client won't stream tweets

rate-limits
streaming
ruby
token-limits
limits
bots
news

#1

Hey all,
I’m using Twitter’s streaming client (in Ruby) to stream tweets for a bot I have that was working until the other day. Here’s how it’s supposed to work.

My account (a locked account, used only for this purpose by a journalism startup) follows a large number of mostly American elected officials, and when any one of them posts a tweet containing one of a select number of strings, the account will automatically quote-tweet that post with “Tweeted by @USERNAME” or “Retweeted by @USERNAME”.

It stopped working at the end of last week, when I followed a bunch of new accounts (I updated the followed accounts to include the new Congress, new governors, new state AGs and a few other officials.) The bot account is now following 1,437 accounts, and was probably following a little over 1,000 accounts this time last week.

Any ideas why my streaming bot won’t work? My theory is that I’ve exceeded some follow limit for the streaming client, but if I’m way off, please let me know (and if I’m right, does anyone know exactly what that limit is?)

Thanks!
Ben

P.S. The bot is designed to run on an AWS EC2 instance with a daemon.


#2

Stopped working meaning it stopped receiving tweets or stopped tweeting tweets?

If it can’t tweet anymore, it’s probably due to the repetitive nature of the tweets: always the same pattern, too many near-duplicates (apart from the quoted tweet, I mean). Adding some variation to the tweets (a timestamp, maybe, or some other stats) might help. Or you may have hit a POST limit (300 in 3 hours for the entire application, or something like that) or another tweet limit (2,400 per day).

Not sure about your implementation, but what client and endpoint are you using? Your bot shouldn’t need to follow anyone, because you can just list the accounts you want to monitor in the follow parameter: https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter.html (max 5,000 accounts for that; is that what you mean by following 1,437?)


#3

It stopped receiving tweets (which, by extension, means it also isn’t sending any new tweets.) I can tell this because I’m running the script locally now, and no tweets from followed accounts are showing up in the terminal, as I designed it to do (it’s supposed to show every tweet coming through the stream, not just the ones that match the parameters.)

It’s been running for about 6 months with no problem, outputting tweets with roughly the same format: “Tweeted by @USERNAME http://twitter.com/USERNAME/TWEETID”.

I’m using the Ruby twitter gem (https://github.com/sferik/twitter), and it’s running off an AWS EC2 instance, although I’m having the same exact issue when I try to run it locally.

And I am using the follow parameter to set which accounts to draw tweets from, but I’ve designed the script to refresh every 20 minutes, so that it updates that parameter for newly followed accounts. Again, all of this was working perfectly from July-ish until Friday, when I followed a bunch of new accounts (albeit far fewer than 5,000)


#4

Ah, I get you now. I would expect 20 minutes to be a reasonable reconnect interval, but maybe that’s too frequent? Are there any limit notices or error messages in the stream on connection? Maybe add a condition to only reconnect the stream if there’s a change in the number of follows, so it won’t reconnect every 20 minutes (that’s if you don’t already have this).
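A minimal sketch of that "only reconnect on change" condition, in plain Ruby. `follows_changed?` is a hypothetical helper name; in the real bot the two arrays would presumably come from successive `rClient.friend_ids` calls. It compares the sets of IDs rather than just their counts, so swapping one account for another still registers as a change.

```ruby
require 'set'

# Hypothetical helper: true only when the set of followed IDs differs,
# ignoring order and duplicates.
def follows_changed?(current_ids, fresh_ids)
  current_ids.to_set != fresh_ids.to_set
end

current = [101, 102, 103]
puts follows_changed?(current, [103, 102, 101])      # same accounts, new order: false
puts follows_changed?(current, [101, 102, 103, 104]) # one account added: true
```

The stream would then be restarted only when this returns true, instead of on a fixed timer.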

If you print out the raw stream contents, it might help to pin this down. Depending on the accounts you follow, it may actually be a case of low tweet volume:

If you consume a low-volume stream, some libraries (e.g. Java’s GZIPInputStream or many Ruby stream consumers) handle decompression of incoming data poorly, and will need to be overridden in order to decompress the Tweets as they are received, without waiting for a set threshold of data to be received.

I’d try adding a debug account (or my own account) to the follows and see if the stream picks up its tweets properly. That’s the only thing I can think of to try, really.


#5

Right now, the only refresh is the 20 minute window (but I could play around with triggering a refresh only on added/removed follows.) Attaching the file right now. Again, the only difference between now and last week is that the account follows more accounts than before.

#!/usr/bin/env ruby
require 'twitter'
require 'pry'

config = {
  consumer_key: "TKTK",
  consumer_secret: "TKTK",
  access_token: "TKTK",
  access_token_secret: "TKTK"
}

rClient = Twitter::REST::Client.new(config)
sClient = Twitter::Streaming::Client.new(config)

followings = rClient.friend_ids.attrs[:ids]

while true
  begin
    puts "hello #{DateTime.now}"

    if !followings
      followings = rClient.friend_ids.attrs[:ids]
    end

    sClient.filter(:follow => followings.join(','), :stall_warnings => true) do |tweet|
      if DateTime.now.minute % 20 == 4 && DateTime.now.second == 20
        puts "restarting #{DateTime.now}"
        followings = nil
        return thisIsAFakeVariableForErrors
      end

      twete = tweet.text.downcase.prepend(" ")

      tweet_bool = twete.include?('cannabi') || twete.include?('marijuana') || twete.include?('marihuana') || twete.include?('legalization') || twete.include?('legalize it') || !!twete.match(/[- ]hemp[-. ]/) || twete.include?('420') || !!twete.match(/[- ]weed[-. ]/) || !!twete.match(/[- ]thc[-. ]/) || twete.include?('tetrahydrocannabinol')

      ## Filter out mentions
      ## Attribute retweets

      if tweet_bool
        puts "#{tweet.user.screen_name}: #{tweet.text}"
        if followings.include?(tweet.user.id)
          if tweet.retweet?
            rClient.update("Retweeted by @#{tweet.user.screen_name}. #{tweet.uri.to_s}")
          else
            puts tweet.text
            rClient.update("Tweeted by @#{tweet.user.screen_name}. #{tweet.uri.to_s}")
          end
        end
      end
    end
  rescue
    puts "oops #{DateTime.now}"
    sleep 1
  end
end


#6

Well, I can’t really see or tell if anything’s wrong there, to be honest, apart from maybe the sleep 1. If there’s an error posting a tweet, and it keeps happening, maybe you’ll end up in the rescue block reconnecting over and over, leading to Twitter refusing the connection. In that case you should see something in the stream like “connection refused”. (That’s if I’m reading that rescue block right.)


#7

Yeah, the rescue block will print an “oops TIMESTAMP” every 20 minutes or so. What happened when you tried to run it?


#8

Have you read the automation rules? This sounds like spammy behaviour that would be blocked on the platform.


#9

I re-applied for a new API key, describing exactly this, and it was approved. Also, the account in question is a locked account, viewable by only 4 or 5 people. This would certainly be spammy behavior if it was just being sent into the ether, but it’s not.


#10

The visibility and follower count of the account are not relevant here, sorry. When you sign up to and agree to the developer policy, you accept the automation rules. Thank you.


#11

I don’t know what to tell you, your own mods accepted it, and it worked like a charm until I made some change on Friday


#12

Also, your automation rules state:

Automated Retweets: Provided you comply with all other rules, you may Retweet or Quote Tweet in an automated manner for entertainment, informational, or novelty purposes. Automated Retweets often lead to negative user experiences, and bulk, aggressive, or spammy Retweeting is a violation of the Twitter Rules.

I am quote-tweeting purely for informational purposes, filtered through a specific parameter. What in the terms am I doing wrong? I genuinely don’t know. And if I am violating the terms, why didn’t I get a notification from Twitter telling me that?


#13

IMO the “wrong” part of this is the bulk quote-tweeting: even though it’s informational, it’s still in bulk and produces spammy-looking tweets.

The “notification” of violating terms is most likely going to be just API restriction by anti-spam measures. So while your app was approved as an idea, maybe the implementation has to change.

For “notifying” people of such tweets, how about just dumping links to the found tweets into a Slack channel instead of tweeting them? (This Slack approach has worked well for me in the past.)
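A rough sketch of what that could look like, assuming a Slack incoming webhook (which accepts a JSON body with a `text` field). `build_slack_request` is a hypothetical helper, and the webhook URL below is a placeholder, not a real endpoint. The sketch only builds the request; the line that would send it is left commented out.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical helper: builds (but does not send) a POST to a Slack
# incoming webhook. Returns the parsed URI and the request object.
def build_slack_request(webhook_url, screen_name, tweet_url)
  uri = URI.parse(webhook_url)
  request = Net::HTTP::Post.new(uri.request_uri, 'Content-Type' => 'application/json')
  request.body = JSON.generate(text: "Tweeted by @#{screen_name}: #{tweet_url}")
  [uri, request]
end

uri, request = build_slack_request(
  'https://hooks.slack.com/services/PLACEHOLDER', # placeholder webhook URL
  'example',
  'https://twitter.com/example/status/1'
)
puts request.body

# To actually send it (requires a real webhook URL):
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```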


#14

Except that my API key application was accepted literally this morning. The prior API key was active for months, and was generated before Twitter made you apply.


#15

Also @andypiper, it’s not spam if it’s going to 5 people who actively want these tweets and need them to do their jobs. If this account’s access to the streaming client has been automatically revoked, without even the courtesy of notifying the account holder, it’s astounding to me that your company can crack down on this while Nazis run rampant on your platform.

I’m not trying to blame you for this personally, but surely you must understand why I find this incredibly frustrating.


#16

For what it’s worth, I’m still convinced this is a technical problem. My guess is still that your stream was cut off because of frequent connects and disconnects (https://developer.twitter.com/en/docs/tutorials/consuming-streaming-data.html#reconnecting). That, in turn, was most likely due to being write-restricted for posting tweets that look too “spammy” or for posting too many in a short space of time. I haven’t run your code to reproduce this, but I still think it’s worth fixing the reconnect delay (implementing exponential backoff), making sure you’re under the POST rate limits and, if you are not, tweeting aggregate data or something instead.
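As a sketch of the exponential backoff idea, assuming nothing about the twitter gem itself: `backoff_delay` and `with_backoff` are hypothetical helpers, and the 5-second base and 320-second cap are illustrative values that should be checked against the reconnecting guide linked above. The `sleeper` argument exists only so the sketch can be exercised without actually waiting.

```ruby
# Hypothetical helper: delay in seconds before retry number `attempt`
# (0-based), doubling each time and capped at `max`.
def backoff_delay(attempt, base: 5, max: 320)
  [base * (2**attempt), max].min
end

# Hypothetical wrapper: runs the block and retries it with a growing delay
# when it raises. Re-raises once `max_attempts` tries have failed.
def with_backoff(max_attempts: 6, sleeper: ->(seconds) { sleep seconds })
  attempt = 0
  begin
    yield
  rescue StandardError => e
    raise if attempt >= max_attempts - 1
    delay = backoff_delay(attempt)
    warn "stream error (#{e.class}): retrying in #{delay}s"
    sleeper.call(delay)
    attempt += 1
    retry
  end
end
```

In the bot, the block passed to `with_backoff` would be the `sClient.filter(...)` call, replacing the bare `rescue` with `sleep 1`.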

The way it is now, any error in tweeting will force a disconnect. So if a few people have a conversation that you’re tracking, every single time they tweet your stream will pick it up, your code will fail to post a tweet because it’s too much of a duplicate, and the stream will reconnect (because execution goes to the rescue block and then runs the entire thing again inside the infinite while loop), eventually hitting some rate limit for disconnects.

Also, the rate limit for rClient.friend_ids is 15 requests per 15 minutes, so if you hit it, that part of the code will throw an error, again forcing either a disconnect or a reconnect later on.

Try putting the rClient.update parts in their own rescue blocks and see if the stream messages have any errors for too many connections.
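A minimal sketch of giving the posting step its own rescue, so a rejected tweet is logged rather than tearing down the stream. `safe_post` is a hypothetical helper, and `fake_update` is a stand-in for `rClient.update` that raises on a repeated status; the real exception classes raised by the gem are not assumed here.

```ruby
# Hypothetical helper: runs the block that posts a tweet and reports
# failure instead of letting the exception escape into the stream loop.
def safe_post(label)
  yield
  true
rescue StandardError => e
  warn "could not post #{label}: #{e.class}: #{e.message}"
  false
end

# Stand-in for rClient.update: rejects a status it has already seen.
posted = []
fake_update = lambda do |text|
  raise ArgumentError, "duplicate status" if posted.include?(text)
  posted << text
end

safe_post("first")  { fake_update.call("Tweeted by @example. https://example.com/1") }
safe_post("repeat") { fake_update.call("Tweeted by @example. https://example.com/1") }
puts posted.length # 1: the repeat was rejected, but nothing crashed
```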


#17

OK, I just tried running your code properly and it’s exactly as I described: it reconnects infinitely, and too quickly, every time posting a tweet fails. There are also more issues:

twete = tweet.text.downcase.prepend(" ")

This line also fails every time a deleted-tweet notice is issued, which also forces a disconnect and reconnect.
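One way to avoid that failure is to check what kind of object the stream handed over before calling `text` on it. The sketch below uses stand-in structs so it runs without the gem; in the real script the check would presumably be against `Twitter::Tweet`. `normalized_text` is a hypothetical helper.

```ruby
# Stand-ins for the objects the stream yields, so the sketch runs without
# the gem. FakeDeleted plays the role of a deleted-tweet notice.
FakeTweet   = Struct.new(:text)
FakeDeleted = Struct.new(:id)

# Hypothetical helper: returns the lowercased, space-prefixed text for
# tweets and nil for anything else.
def normalized_text(object, tweet_class: FakeTweet)
  return nil unless object.is_a?(tweet_class)
  " #{object.text.downcase}"
end

stream = [FakeTweet.new("Legalize It"), FakeDeleted.new(42)]
stream.each do |object|
  text = normalized_text(object)
  next if text.nil? # skip delete notices and other non-tweet messages
  puts text
end
```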

However, your ACTUAL bug, the reason why you no longer receive tweets in the stream after adding more users, is that the twitter library isn’t handling the connection to the stream properly: https://github.com/sferik/twitter/blob/master/lib/twitter/streaming/client.rb#L117 Even though it’s making a POST request, the user IDs are encoded as URL parameters, and having 1,000 or more user IDs breaks this. I tried it with a few dozen IDs and it worked; adding in over 2,000 fails (I don’t know what the exact limit is). From the docs:

requests with too many parameters may cause the request to be rejected for excessive URL length. Use a POST request to avoid long URLs.

https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter
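To get a feel for the sizes involved, here is a rough estimate of how long the `follow` query string becomes as the ID count grows. `follow_query_length` is a hypothetical helper; it assumes 10-digit IDs and that commas are percent-encoded as `%2C` (which is what `URI.encode_www_form` does, though the gem may encode differently). The server-side length limit itself is not known here.

```ruby
require 'uri'

# Hypothetical helper: length of the query string produced when `count`
# user IDs are sent as a comma-separated `follow` parameter.
def follow_query_length(count, digits: 10)
  ids = Array.new(count) { |i| (10**(digits - 1) + i).to_s }
  URI.encode_www_form(follow: ids.join(',')).length
end

[50, 1000, 1437, 2000].each do |n|
  puts "#{n} ids -> #{follow_query_length(n)} characters"
end
```

Under these assumptions each extra ID adds 13 characters, so 1,437 IDs already produce a query string of well over 18,000 characters.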

Your best bet is to use a better library for streaming tweets (https://github.com/tweetstream/tweetstream) and to completely separate the code that reads and filters data from the code that posts tweets or sends notifications.

