Hi,

My filtered stream on an academic account suddenly started disconnecting every five minutes without throwing an error. Reconnecting works fine, but I fear I am losing data through the frequent disconnects and reconnects. I tried the same code on two different machines with different internet connections and different Twitter accounts. I use Node.js with twitter-v2, which creates a request, splits the readable stream data, and passes the data to an async iterator. There is no timeout anywhere in the code that would explain the five-minute interval.
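
To make the splitting step concrete, here is a minimal sketch of how stream chunks can be reassembled into complete lines before parsing. This is not the actual twitter-v2 code. The names createLineSplitter and parseLine are hypothetical, and the sketch assumes the stream delivers one JSON object per line, separated by "\r\n", with blank lines as keep-alive signals.

```javascript
// Sketch only: createLineSplitter and parseLine are hypothetical helpers.
// Assumption: messages are "\r\n"-delimited JSON, blank lines are keep-alives.
function createLineSplitter() {
    // TextDecoder in streaming mode keeps multi-byte characters intact
    // when they are split across two chunks.
    const decoder = new TextDecoder('utf-8');
    let buffer = '';
    return function push(chunk) {
        buffer += typeof chunk === 'string'
            ? chunk
            : decoder.decode(chunk, { stream: true });
        const lines = buffer.split('\r\n');
        // The last piece is an incomplete line (or ''), keep it for later.
        buffer = lines.pop();
        return lines;
    };
}

// Classify one complete line as a keep-alive or a parsed JSON message.
function parseLine(line) {
    if (line.trim() === '') {
        return { type: 'keep-alive' };
    }
    return { type: 'message', payload: JSON.parse(line) };
}
```

Each 'data' chunk would be passed to push(), and every returned line to parseLine(). That way a tweet split across two chunks is parsed only once it is complete, instead of failing JSON.parse and being mistaken for a keep-alive.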

Is anyone else experiencing this issue? If so, under what conditions? Any ideas what the reason could be?

Thanks for any ideas!
Thilo

Are there any messages that could give you something to go on in the stream itself?

I run twarc (DocNow/twarc on GitHub, a command line tool and Python library for archiving Twitter JSON) for streaming and haven’t seen this kind of error, and the API status seems up: https://api.twitterstat.us/

If it’s predictably every 5 minutes, there may be a timeout of some sort, as in this Stack Overflow question: “Node https request ECONRESET after 5 minutes”.

Update – the problem persists in nodejs and is not specific to the package I am using:

  • The code snippet below uses needle (as in Twitter’s JavaScript examples) and still reliably fails after ~5 minutes.
  • I updated my Node.js version to 14. Since then, the stream does not simply end but ‘hangs’ after five minutes (no more tweets, no more keep-alive signals). From a data collection perspective this is even worse.

Can anyone reproduce this?

// Uses TWITTER_BEARER_TOKEN from environment

const needle = require('needle');

// Create 30 second timer to catch stream timeouts
// (Twitter sends 'keep alive' signal ~every 20 seconds)
let timer
const refreshTimeout = function () {
    if (timer) clearTimeout(timer)
    timer = setTimeout(() => {
        console.log(`${new Date()}: Stream unresponsive.`)
        process.exit(0)
    }, 30 * 1000)
}


// Send request to API filtered stream endpoint
const stream = needle.get('https://api.twitter.com/2/tweets/search/stream', {
    headers: { "Authorization": `Bearer ${process.env.TWITTER_BEARER_TOKEN}` }
});

// Create listeners for stream events and log activity
stream
    .on('data', function (data) {
        refreshTimeout()
        process.stdout.clearLine();
        process.stdout.cursorTo(0);
        try {
            const json = JSON.parse(data)
            if (json.data) {
                process.stdout.write(`${new Date()} <- Time of last tweet.`)
            } else {
                process.stdout.write(`${new Date()} <- Message from Twitter: ${JSON.stringify(json)}`)
            }
        } catch (error) {
            process.stdout.write(`${new Date()} <- Keep alive signal received.`)
        }
    })
    .on('err', function (error) {
        console.log(`\n${new Date()}: Error (${error.message}).`)
    })
    .on('end', function () {
        console.log(`\n${new Date()}: Stream ended.`)
    });


I’m experiencing the same issue. It’s not always exactly 5 minutes, but that does seem to be a common timespan for this to occur. I guess I’ll look into a Python alternative for this.

Let me know how this works out - in my case Python has the exact same issue.

Same here, unfortunately. I did try my Node.js PoC on my 5G connection for a short time and it seemed free of any issues. I’m starting to suspect my router or ISP. Next up: Node on Amazon EC2.

Node on an EC2 instance has similar issues, unfortunately:

[23-5-2021 17:25:43.634] Stream disconnected with error. Retrying in 5s. TwitterError: Stream unresponsive
    at Timeout.<anonymous> (/home/ec2-user/tglink/node_modules/twitter-v2/build/TwitterStream.js:43:38)
    at listOnTimeout (node:internal/timers:557:17)
    at processTimers (node:internal/timers:500:7)
[23-5-2021 17:30:59.082] Stream disconnected with error. Retrying in 5s. TwitterError: Stream unresponsive
    at Timeout.<anonymous> (/home/ec2-user/tglink/node_modules/twitter-v2/build/TwitterStream.js:43:38)
    at listOnTimeout (node:internal/timers:557:17)
    at processTimers (node:internal/timers:500:7)

I found that different versions of Node react differently to the error in the stream. While versions 14 and 16 seem to time out, version 12 just ends the stream. The latter is an advantage because it allows for an immediate reconnection. That’s how I use it for the time being.
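
For the versions where the stream hangs instead of ending, a watchdog that triggers a reconnect (rather than exiting the process) may be an option. The following is only a sketch: createWatchdog is a hypothetical helper, and what to do in the onStale callback (for example tearing down the request and reconnecting) is left to the caller. The timer functions are injectable so the logic can be checked without waiting.

```javascript
// Sketch only: createWatchdog is a hypothetical helper, not part of
// needle or twitter-v2. It calls onStale when touch() has not been
// called for timeoutMs milliseconds.
function createWatchdog(timeoutMs, onStale, setTimer = setTimeout, clearTimer = clearTimeout) {
    let handle = null;
    return {
        // Call on every chunk, including keep-alive signals.
        touch() {
            if (handle !== null) clearTimer(handle);
            handle = setTimer(() => {
                handle = null;
                onStale();
            }, timeoutMs);
        },
        // Call when the stream ends normally so the timer does not fire.
        stop() {
            if (handle !== null) clearTimer(handle);
            handle = null;
        },
    };
}
```

In the snippet above, touch() would take the place of refreshTimeout() in the 'data' listener, and stop() would go into the 'end' listener.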

I get the same behaviour using Python code and the requests package to access the filtered stream API. The stream disconnects precisely every 5 minutes and 2 seconds, with requests throwing a requests.exceptions.ChunkedEncodingError (“invalid chunk size”). I also haven’t been able to figure out what exactly causes this. It does not seem to be related to the number of Tweets collected up to that point.

So far, my workaround is to catch the error, dump the data I have collected up to that point to disk, and restart the stream. This causes me to run into some connection limit every once in a while. I’m not sure which one, though, as I wasn’t able to find much in terms of documentation.
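
One way to make such a restart loop less likely to hit a connection limit is to wait longer after each consecutive failed connection. Below is a rough sketch in Node.js, the language used earlier in this thread. backoffDelayMs and runWithReconnect are hypothetical names, connectOnce is a hypothetical function that resolves to true if the connection delivered any data before it ended, and the 5 second base and 320 second cap are assumed values, not documented limits.

```javascript
// Sketch only: all names and default values here are assumptions.
// Delay doubles with each consecutive failure, capped at maxMs.
function backoffDelayMs(failures, baseMs = 5000, maxMs = 320000) {
    return Math.min(maxMs, baseMs * 2 ** failures);
}

// connectOnce: hypothetical async function that opens the stream and
// resolves to true if it received data before disconnecting.
async function runWithReconnect(connectOnce, options = {}) {
    const {
        maxFailures = 10,
        sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    } = options;
    let failures = 0;
    while (failures < maxFailures) {
        let receivedData = false;
        try {
            receivedData = await connectOnce();
        } catch (error) {
            console.error(`${new Date()}: Stream error (${error.message}).`);
        }
        // A connection that delivered data resets the failure counter.
        failures = receivedData ? 0 : failures + 1;
        await sleep(backoffDelayMs(failures));
    }
    throw new Error(`Giving up after ${maxFailures} consecutive failed connections.`);
}
```

With these defaults, a stream that works for five minutes and then drops is retried after 5 seconds each time, while repeated immediate failures back off to 10, 20, 40 seconds and so on.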

I am experiencing exactly the same issue and am waiting for a solution.

Hi all,

We have identified the root cause of this issue and are working on a solution. Unfortunately, I don’t have an ETA for a fix yet, but I’ll let you know when we have more information.

Best,

Jessica
