I am trying to fetch the profiles for a list of ~1 million user IDs. I am using tweepy; the relevant code fragment is:
import time
import tweepy

# consumerkey, consumersecret and followersIDL are assumed to be defined elsewhere
auth = tweepy.AppAuthHandler(consumerkey, consumersecret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
followersL = []
# users/lookup accepts at most 100 IDs per request
for i in range(0, len(followersIDL), 100):
    while True:  # retry the same batch until it succeeds
        try:
            followersL.extend(api.lookup_users(user_ids=followersIDL[i:i+100]))
            time.sleep(3)
        except tweepy.TweepError as error:
            print("...Exception : api_code {} len(followersL) = {} : {}".format(
                error.api_code, len(followersL),
                time.strftime("%a, %d %b %Y %H:%M:%S ", time.localtime())))
            time.sleep(300)
            continue
        break
After collecting about 390,000 profiles, the loop gets stuck in the exception handler: every retry of the same batch fails. I increased the retry delay from time.sleep(300) to time.sleep(3600*2), but that has not helped. The relevant exception is:
tweepy.error.TweepError: Failed to send request: HTTPSConnectionPool(host='api.twitter.com', port=443): Max retries exceeded with url: /1.1/users/lookup.json (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x1c5976240>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
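The `[Errno 8] nodename nor servname provided, or not known` part of that message is what `getaddrinfo` reports when a hostname cannot be resolved, so one thing worth ruling out is whether `api.twitter.com` resolves at all from this machine when the failures start. A minimal sketch of such a check, using only the standard library; the helper name `can_resolve` and its `resolver` parameter are hypothetical and not part of tweepy:

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address, else False.

    `resolver` is injectable (an assumption of this sketch) so the check
    can be exercised without touching the network.
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # Raised when name resolution fails, e.g. [Errno 8] on macOS.
        return False
```

Calling `can_resolve("api.twitter.com")` inside the `except` branch would show whether the failure comes from local name resolution rather than from the API itself.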
I am perplexed by this because I believe I am respecting the users/lookup rate limit by sleeping 3 seconds between requests.
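The pacing arithmetic behind that belief, as a minimal sketch. The figure of 300 requests per 15-minute window for app-only auth on users/lookup is my assumption and should be checked against the current Twitter documentation; the constant and function names are hypothetical.

```python
# Assumed limit for app-only auth on users/lookup (verify against the docs).
WINDOW_SECONDS = 15 * 60
ASSUMED_REQUESTS_PER_WINDOW = 300
IDS_PER_REQUEST = 100


def requests_per_window(sleep_seconds, window_seconds=WINDOW_SECONDS):
    """Upper bound on requests sent in one window when sleeping between calls."""
    return window_seconds // sleep_seconds


def hours_to_finish(total_ids, sleep_seconds, ids_per_request=IDS_PER_REQUEST):
    """Lower bound on runtime in hours, counting only the sleeps."""
    n_requests = -(-total_ids // ids_per_request)  # ceiling division
    return n_requests * sleep_seconds / 3600
```

Under that assumption, `requests_per_window(3)` does not exceed `ASSUMED_REQUESTS_PER_WINDOW`, and `hours_to_finish(390_000, 3)` gives the minimum time the script had been running when it stalled.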
QUESTION: How do I get past this apparent absolute limit of ~390k user profile lookups?