Hi, I’m using the v2 streaming API like this:

request = requests.get('?expansions=author_id&tweet.fields=context_annotations,created_at,entities,id,text&user.fields=id,name,username,public_metrics', headers=headers, stream=True)

and then

for data in request.iter_lines():

Now, it sometimes runs fine for a while, but today it has been breaking after every couple dozen tweets with an ECONNRESET, which I think is what triggers the ChunkedEncodingError.

The main problem is that it seems to take Twitter a long time to register the dropped connection, and I keep getting ‘TooManyConnections’ responses when I try to reconnect, I guess until the rate limit window resets.

The output of my program then looks something like this:

Am I doing something wrong?
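For context, my read loop is roughly the sketch below (simplified; `handle_payload` is a stand-in for my actual processing, and everything apart from the 30-second sleep visible in the log is illustrative):

```python
import json
import time

import requests

STREAM_URL = "https://api.twitter.com/2/tweets/search/stream"  # v2 filtered stream


def parse_stream_line(line):
    """Return the decoded JSON payload, or None for empty keep-alive lines."""
    if not line:
        return None
    return json.loads(line)


def run_stream(headers, params):
    """Connect, read until the connection breaks, sleep 30 s, reconnect."""
    while True:
        print("New stream")
        try:
            with requests.get(STREAM_URL, headers=headers, params=params,
                              stream=True, timeout=(5, 90)) as response:
                print("Start reading")
                for line in response.iter_lines():
                    payload = parse_stream_line(line)
                    if payload is not None:
                        handle_payload(payload)  # hypothetical handler
        except requests.exceptions.ChunkedEncodingError:
            # The ECONNRESET surfaces here, wrapped by urllib3/requests
            pass
        print("Sleeping for 30 seconds before trying again")
        time.sleep(30)
```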

Start reading
.....................................................................Exception in thread Thread-352:
Traceback (most recent call last):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/contrib/pyopenssl.py", line 313, in recv_into
    return self.connection.recv_into(*args, **kwargs)
  File "<MYDIR>/lib/python3.8/site-packages/OpenSSL/SSL.py", line 1734, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "<MYDIR>/lib/python3.8/site-packages/OpenSSL/SSL.py", line 1558, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (104, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 425, in _error_catcher
    yield
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 752, in read_chunked
    self._update_chunk_length()
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 682, in _update_chunk_length
    line = self._fp.fp.readline()
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/contrib/pyopenssl.py", line 318, in recv_into
    raise SocketError(str(e))
OSError: (104, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<MYDIR>/lib/python3.8/site-packages/requests/models.py", line 750, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 560, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 781, in read_chunked
    self._original_response.close()
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 443, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "<MYDIR>/project/api/management/commands/run_stream.py", line 129, in handle_twitter_stream
    for data in request.iter_lines():
  File "<MYDIR>/lib/python3.8/site-packages/requests/models.py", line 794, in iter_lines
    for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
  File "<MYDIR>/lib/python3.8/site-packages/requests/models.py", line 753, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))
Late heartbeat. Resetting stream.
Sleeping for 30 seconds before trying again
New stream
Start reading
..................................................................................................................................................................Exception in thread Thread-423:
Traceback (most recent call last):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/contrib/pyopenssl.py", line 313, in recv_into
    return self.connection.recv_into(*args, **kwargs)
  File "<MYDIR>/lib/python3.8/site-packages/OpenSSL/SSL.py", line 1734, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "<MYDIR>/lib/python3.8/site-packages/OpenSSL/SSL.py", line 1558, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (104, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 425, in _error_catcher
    yield
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 752, in read_chunked
    self._update_chunk_length()
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 682, in _update_chunk_length
    line = self._fp.fp.readline()
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/contrib/pyopenssl.py", line 318, in recv_into
    raise SocketError(str(e))
OSError: (104, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<MYDIR>/lib/python3.8/site-packages/requests/models.py", line 750, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 560, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 781, in read_chunked
    self._original_response.close()
  File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "<MYDIR>/lib/python3.8/site-packages/urllib3/response.py", line 443, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "<MYDIR>/project/api/management/commands/run_stream.py", line 129, in handle_twitter_stream
    for data in request.iter_lines():
  File "<MYDIR>/lib/python3.8/site-packages/requests/models.py", line 794, in iter_lines
    for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
  File "<MYDIR>/lib/python3.8/site-packages/requests/models.py", line 753, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))
Late heartbeat. Resetting stream.
Sleeping for 30 seconds before trying again
New stream
Start reading
{'title': 'ConnectionException', 'detail': 'This stream is currently at the maximum allowed connection limit.', 'connection_issue': 'TooManyConnections', 'type': 'https://api.twitter.com/2/problems/streaming-connection'}
Sleeping for 30 seconds before trying again
New stream
Start reading
{'title': 'ConnectionException', 'detail': 'This stream is currently at the maximum allowed connection limit.', 'connection_issue': 'TooManyConnections', 'type': 'https://api.twitter.com/2/problems/streaming-connection'}
Sleeping for 30 seconds before trying again
New stream
Start reading
{'title': 'ConnectionException', 'detail': 'This stream is currently at the maximum allowed connection limit.', 'connection_issue': 'TooManyConnections', 'type': 'https://api.twitter.com/2/problems/streaming-connection'}
Sleeping for 30 seconds before trying again
New stream
Start reading

I’m having the same issue as you describe. (Note: the same issue is also reported here: requests.exceptions.ChunkedEncodingError:)

Many answers (mostly regarding the old statuses/filter API) suggest this happens because your application is not keeping up with the volume from Twitter, or because your connection is unstable.

I’ve ruled out both of these causes by testing as follows:

  1. Creating the simplest script I could think of to consume the stream (see below). The script doesn’t even write to the file system; it only prints the delay, so I’m fairly sure my application is not falling behind.
  2. Running it on several machines on Amazon Web Services (first on a t3.nano, later on an m3.medium) to ensure my local internet connection was not the problem.

The script:

import requests
import argparse
import time
import json
import datetime


def main():
    aparser = argparse.ArgumentParser(
        description='Twitter Filtered stream connection test'
    )
    aparser.add_argument(
        'bearer_token',
        help=(
            'The bearer_token for the connection to twitter'
        ),
        type=str
    )

    args = aparser.parse_args()
    bearer_token = args.bearer_token

    session = requests.Session()
    session.headers.update({
        "Authorization": "Bearer {}".format(bearer_token),
        "User-Agent": "connectiontest_x26x55d"
    })

    # Get the rules, to print
    response = session.get(
        'https://api.twitter.com/2/tweets/search/stream/rules'
    )
    print('Using the following rules:')
    print(response.text)

    # Now start consuming. Just print id's to make sure it doesn't lag
    params = {
        'expansions': 'attachments.media_keys,author_id,geo.place_id',
        'user.fields': 'id,location,name,username',
        'media.fields': 'type,url,preview_image_url',
        'place.fields': 'full_name,geo,id,place_type',
        'tweet.fields': 'attachments,author_id,created_at,id,text,lang'
    }
    with session.get(
                'https://api.twitter.com/2/tweets/search/stream',
                stream=True, timeout=(1, 21),
                params=params
            ) as response:
        if response.status_code != 200:
            raise requests.exceptions.HTTPError(
                'Initial stream connection failed with code {}: {}'.format(
                    response.status_code, response.text
                ),
                response=response
            )

        for response_line in response.iter_lines():
            if response_line:  # Empty lines are keep-alive signals
                payload = json.loads(response_line)
                if 'data' in payload:
                    created_at = datetime.datetime.fromisoformat(
                        payload['data']['created_at'].rstrip('Z')
                    )
                    delay = (datetime.datetime.utcnow() - created_at).total_seconds()
                    print(
                        f"{int(time.time())}: "
                        f"Received id {payload['data']['id']}"
                        f" with delay {delay}"
                    )
                else:
                    # e.g. errors
                    print(f"{int(time.time())}: {payload}")


if __name__ == '__main__':
    main()

Using the script listed under (1), I got the following output (tested with a combination of Japanese flood-related keywords):

Using the following rules:
{"data":[{"id":"1377191185438007302","value":"((浸水) OR (洪水) OR (濫) OR (水害) OR (豪雨) OR (川から水) OR (水 (路 OR 宅)) OR (水につか) OR (水没し) OR ((川 OR 水) (災 OR 死者 OR 命救 OR 
捜索 OR 救助 OR 避難 OR 浸 OR あふれ)) OR (災害派遣) OR (が水につ) OR ((ダム OR 橋 OR 堤防) (決壊 OR 避難 OR 壊)) OR (津波)) -is:retweet"},{"id":"1377191185438007303","value":"(台風) -is:re$
weet"}],"meta":{"sent":"2021-04-07T13:20:38.500Z"}}
1617801641: Received id 1379786165729914880 with delay 10.931484
1617801645: Received id 1379786179122393088 with delay 11.203124
Traceback (most recent call last):
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/urllib3/response.py", line 697, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/urllib3/response.py", line 438, in _error_catcher
    yield
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/urllib3/response.py", line 764, in read_chunked
    self._update_chunk_length()
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/urllib3/response.py", line 701, in _update_chunk_length
    raise InvalidChunkLength(self, line)
urllib3.exceptions.InvalidChunkLength: InvalidChunkLength(got length b'', 0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/requests/models.py", line 753, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/urllib3/response.py", line 572, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/urllib3/response.py", line 793, in read_chunked
    self._original_response.close()
  File "/usr/local/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/urllib3/response.py", line 455, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/temp/ingestion-twitter/scripts/basic_test.py", line 76, in <module>
    main()
  File "/home/ec2-user/temp/ingestion-twitter/scripts/basic_test.py", line 57, in main
    for response_line in response.iter_lines():
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/requests/models.py", line 797, in iter_lines
    for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
  File "/home/ec2-user/temp/ingestion-twitter/.env/lib/python3.9/site-packages/requests/models.py", line 756, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))

With the above I believe I’ve ruled out the ‘application lagging’ explanation: the last printed delay is still close to the expected 10 seconds, and with these keywords the volume averages less than one Tweet per second, so there can’t be a large buffer building up on Twitter’s side. I also don’t believe that the connection in an AWS data center is so unstable that I would see these disconnects on a regular basis.

I can also confirm @twitapp7’s observation that, after implementing reconnection behaviour, the first few attempts (sometimes for even a few minutes) return a TooManyConnections error. This again leads me to believe that the connection was not intentionally closed by Twitter…

Perhaps someone from Twitter can share their insights into this issue?
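In the meantime, my reconnect logic now backs off exponentially instead of sleeping a fixed 30 seconds, which at least stops hammering the endpoint while the stale server-side connection is still being counted. A minimal sketch (the 5-second initial wait and 320-second cap are illustrative values, and the helper names are my own):

```python
import time

import requests


def next_backoff(current, cap=320):
    """Double the wait after each failed attempt, up to a cap."""
    return min(current * 2, cap)


def connect_with_backoff(session, url, params, initial=5):
    """Keep trying to (re)connect to the stream. Backing off exponentially
    on TooManyConnections (HTTP 429) gives Twitter time to expire the
    stale server-side connection before the next attempt."""
    wait = initial
    while True:
        try:
            response = session.get(url, params=params, stream=True,
                                   timeout=(5, 90))
            if response.status_code == 200:
                return response
            response.close()  # e.g. 429 / TooManyConnections
        except requests.exceptions.ConnectionError:
            pass
        print(f"Connect attempt failed, sleeping {wait} seconds")
        time.sleep(wait)
        wait = next_backoff(wait)
```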

Update:
Sometimes I also get an ‘operational disconnect’ with the same code. Again, the printed delays do not indicate that I’m falling behind:

[Only the last lines of output are given]:
1617807651: Received id 1379811370548350981 with delay 10.278312
1617807660: Received id 1379811407621845003 with delay 11.280881
1617807690: Received id 1379811534147186689 with delay 10.285681
1617807704: Received id 1379811592800399366 with delay 10.195402
1617807781: Received id 1379811914788691968 with delay 11.166596
1617807803: Received id 1379812007285719044 with delay 11.157595
1617807818: Received id 1379812073606025217 with delay 10.904147
1617807829: Received id 1379812116736057344 with delay 11.118613
1617807847: Received id 1379812193437315072 with delay 10.445898
1617807851: Received id 1379812210734637056 with delay 10.499471
1617807853: Received id 1379812219874111494 with delay 10.678375
1617807879: Received id 1379812326128291851 with delay 11.080194
1617807912: Received id 1379812464670371847 with delay 11.148005
1617807913: Received id 1379812468290023425 with delay 11.159672
1617807915: Received id 1379812478159253504 with delay 10.311304
1617807918: Received id 1379812492252049408 with delay 10.662429
1617807918: Received id 1379812493082562565 with delay 10.828756
1617807925: Received id 1379812519120756743 with delay 11.085912
1617807925: Received id 1379812521255727106 with delay 10.587438
1617807935: Received id 1379812560958939137 with delay 11.051479
1617807935: Received id 1379812561529364482 with delay 10.17276
1617807960: Received id 1379812667964026884 with delay 10.741813
1617807961: Received id 1379812672733020163 with delay 10.72104
1617807979: Received id 1379812747030884366 with delay 10.474055
1617808011: Received id 1379812883538710528 with delay 10.934909
1617808012: Received id 1379812886856441858 with delay 10.788992
1617808013: Received id 1379812890266329088 with delay 10.601896
1617808032: Received id 1379812970658627593 with delay 10.736751
1617808045: Received id 1379813024974790656 with delay 10.674392
1617808061: Received id 1379813093287452677 with delay 10.994856
1617808078: Received id 1379813162661203968 with delay 10.501191
1617808087: Received id 1379813200200302596 with delay 10.441362
1617808088: Received id 1379813202830127104 with delay 11.11923
1617808108: Received id 1379813286934368266 with delay 11.110829
1617808134: {'errors': [{'title': 'operational-disconnect', 'disconnect_type': 'OperationalDisconnect', 'detail': 'This stream has been disconnected for operational reasons.', 'type': 'https://api.twitter.com/2/problems/operational-disconnect'}]}

I had the same error, found this on SO, works for me :slight_smile:

Would you mind adding the link to “this”? :slight_smile: This thread is possibly another version of the same issue as this one: Filtered stream request breaks in 5 min intervals – hence +1 from here.

Fixed :grinning_face_with_smiling_eyes:

On a second attempt: nope, not fixed. The error is still appearing after I hosted it on my Heroku dyno.
Would appreciate a fix by Twitter, thanks.

TooManyConnections on Heroku is commonly caused by parallelism: you need to make sure that only one worker process on one dyno connects to Twitter, not multiple. I think Heroku spawns two worker processes by default. I don’t know Heroku that well, so that’s a question for their docs.
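If it is the parallelism, a hypothetical single-worker setup would be a Procfile like the one below (`run_stream.py` stands in for whatever script opens the stream), combined with scaling to a single dyno via `heroku ps:scale worker=1`:

```
worker: python run_stream.py
```

That way exactly one process on exactly one dyno holds the streaming connection.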

That might be part of it, but I’m not sure it’s solely a Heroku issue.
Even if I take down my Heroku dyno (by turning on maintenance mode for a long period of time), I still experience this “too many connections” problem when running it on my own laptop.
