Sat May 06 2017 - Unable to download tweets for this date


#1

I paid for a full-archive subscription.

Using this query: {"companies":["3MNews","AmexBusiness","Chevron","CocaColaCo","DowDuPontCo","HomeDepot","JNJNews","Nike","pfizer_news","UnitedHealthGrp","VerizonNews","WaltDisneyCo"],
"fromDate":["200706100000"],
"toDate":["201811290000"]}

It was working well, but it suddenly stalled at May 6. There, I wasted 146 requests that came back empty, and I suspect the two are related.

INFO - Successfully downloaded --> From: Sat May 06 09:01:27 +0000 2017 To: Sat May 06 09:01:30 +0000 2017 - 2018-11-29 19:07:39
INFO - Successfully downloaded --> From: Sat May 06 09:01:24 +0000 2017 To: Sat May 06 09:01:27 +0000 2017 - 2018-11-29 19:07:42
INFO - Successfully downloaded --> From: Sat May 06 09:01:20 +0000 2017 To: Sat May 06 09:01:24 +0000 2017 - 2018-11-29 19:07:46
INFO - Successfully downloaded --> From: Sat May 06 09:01:17 +0000 2017 To: Sat May 06 09:01:20 +0000 2017 - 2018-11-29 19:07:49
INFO - Successfully downloaded --> From: Sat May 06 09:01:14 +0000 2017 To: Sat May 06 09:01:17 +0000 2017 - 2018-11-29 19:07:53
INFO - Successfully downloaded --> From: Sat May 06 09:01:10 +0000 2017 To: Sat May 06 09:01:14 +0000 2017 - 2018-11-29 19:07:57
INFO - Successfully downloaded --> From: Sat May 06 09:01:07 +0000 2017 To: Sat May 06 09:01:10 +0000 2017 - 2018-11-29 19:08:00
INFO - Successfully downloaded --> From: Sat May 06 09:01:04 +0000 2017 To: Sat May 06 09:01:07 +0000 2017 - 2018-11-29 19:08:04
INFO - Successfully downloaded --> From: Sat May 06 09:01:01 +0000 2017 To: Sat May 06 09:01:04 +0000 2017 - 2018-11-29 19:08:07
INFO - Successfully downloaded --> From: Sat May 06 09:00:57 +0000 2017 To: Sat May 06 09:01:01 +0000 2017 - 2018-11-29 19:08:11
INFO - Successfully downloaded --> From: Sat May 06 09:00:54 +0000 2017 To: Sat May 06 09:00:57 +0000 2017 - 2018-11-29 19:08:14

I had downloaded 150000 tweets without any problem before this query, so my app is working well.
I also tried downloading with May 05 as the from date, and that worked well.
Ideas? Is there any special problem with May 06?

Thank you.


#2

Thank you for submitting a new topic.

I will dig in to see if I can find out what is going on here.


#3

After doing some investigations, I found that I was able to get through all of May 6th, so it isn’t that specific date that caused this situation.

This is still something that I’d like to put some deeper thought into. I am going to sync with our product team to see if they can investigate further.


#4

A couple of questions to help our team investigate:

What did the payload of these May 6th requests look like?

With the query, are you looking for Tweets that mention those companies, or Tweets that are posted by those companies? Or maybe something different?

Did you manually stop the May 6th loop, or did it eventually end up moving on to a later time period?


#5

I was looking for tweets posted by those companies, using the from: operator. I stopped the process manually. Each payload returned contained 0 tweets, a next token, and the dates.
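
A loop that keeps following the next token while every page comes back empty will burn requests, as happened here. Below is a minimal sketch of a guard against that. It assumes a hypothetical `fetch_page(next_token)` callable that returns a dict with a `results` list and an optional `next` token. That response shape, the function names, and the `max_empty_pages` threshold are assumptions for illustration, not part of any Twitter library.

```python
from typing import Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict],
             max_empty_pages: int = 5) -> Iterator[dict]:
    """Yield tweets page by page, stopping after too many empty pages in a row.

    fetch_page is a hypothetical callable: it takes the previous "next" token
    (or None for the first request) and returns a dict shaped like
    {"results": [...], "next": "token"}; that shape is an assumption.
    """
    next_token = None
    empty_streak = 0
    while True:
        page = fetch_page(next_token)
        results = page.get("results", [])
        if results:
            empty_streak = 0
            yield from results
        else:
            empty_streak += 1
            if empty_streak >= max_empty_pages:
                raise RuntimeError(
                    f"{empty_streak} consecutive empty pages; stopping to save requests"
                )
        next_token = page.get("next")
        if next_token is None:
            return  # no more pages


# Simulated pages standing in for real API responses.
fake_pages = {
    None: {"results": [{"id": 1}, {"id": 2}], "next": "a"},
    "a": {"results": [], "next": "b"},
    "b": {"results": [{"id": 3}]},
}
tweets = list(paginate(lambda token: fake_pages[token]))
print([t["id"] for t in tweets])
```

Raising an error rather than silently stopping keeps the stall visible in the log, so it can be investigated before more requests are spent.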

It’s quite strange because I could download perfectly from 09 Feb 2008 to 05 May 2017 and from 07 May 2017 to 29 Nov 2018. But I could not get any data from 06 May 2017.
Thank you for your help.


#6

Could you please provide an example payload from May 6th?


#7

I can’t send you a payload because my program only saves them temporarily, to group all the data into one file.
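
Keeping the raw responses around makes a case like this much easier to debug. Here is a minimal sketch, assuming each response is a JSON-serializable dict. The helper names `append_payload` and `read_payloads` and the file name are hypothetical.

```python
import json
import os
import tempfile


def append_payload(path: str, payload: dict) -> None:
    """Append one raw API response to a JSON Lines file (one JSON object per line)."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(payload, ensure_ascii=False) + "\n")


def read_payloads(path: str) -> list:
    """Load every stored response back, in the order it was written."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo with a temporary file and made-up payloads.
log_path = os.path.join(tempfile.mkdtemp(), "raw_payloads.jsonl")
append_payload(log_path, {"results": [], "next": "token-1"})
append_payload(log_path, {"results": [{"id": 860782988967235584}]})
stored = read_payloads(log_path)
print(len(stored))
```

Appending one line per response means an interrupted run still leaves every payload received so far on disk.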


#8

Thank you for providing this information. I’m going to work with our product team to identify what happened here. I might have some more questions here shortly.

More to come soon.


#9

Have you tried running the query for just the May 6th date again? We are interested to see if you are able to pull the data now.


#10

I’m not able to pull the data down.

INFO - Successfully downloaded --> From: Sat May 06 09:07:10 +0000 2017 To: Sat May 06 22:34:37 +0000 2017 - 2018-12-11 10:12:24
INFO - Successfully downloaded --> From: Sat May 06 09:07:07 +0000 2017 To: Sat May 06 09:07:10 +0000 2017 - 2018-12-11 10:12:25
INFO - Successfully downloaded --> From: Sat May 06 09:07:04 +0000 2017 To: Sat May 06 09:07:07 +0000 2017 - 2018-12-11 10:12:28
INFO - Successfully downloaded --> From: Sat May 06 09:07:01 +0000 2017 To: Sat May 06 09:07:04 +0000 2017 - 2018-12-11 10:12:32

There are only 3 seconds between requests.
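
The span between the From and To values can be computed straight from the log. Here is a small sketch, assuming the log keeps the timestamp format shown above and that day and month names are parsed under an English locale. `window_seconds` is a hypothetical helper.

```python
from datetime import datetime

# Format of the timestamps in the log lines above, e.g. "Sat May 06 09:07:07 +0000 2017".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def window_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two created_at-style timestamps."""
    t0 = datetime.strptime(start, TWITTER_TIME_FORMAT)
    t1 = datetime.strptime(end, TWITTER_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Values copied from the second and first log lines above.
narrow = window_seconds("Sat May 06 09:07:07 +0000 2017", "Sat May 06 09:07:10 +0000 2017")
wide = window_seconds("Sat May 06 09:07:10 +0000 2017", "Sat May 06 22:34:37 +0000 2017")
print(narrow, wide)
```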


#11

I modified the code to get the results.
(I used -is:retweet to filter out retweets.)
It downloaded 500 tweets, but they all have the same text:

@twice_once1001 \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer.

The only thing that changes is the first @name.

{‘created_at’: ‘Sat May 06 09:07:11 +0000 2017’, ‘id’: 860782988967235584, ‘id_str’: ‘860782988967235584’, ‘text’: ‘@twice_once1001 \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer. #Breaking2 #justdoit\nhttps://t.co/7oXHGl9XkT’, ‘source’: ‘<a href=“https://www.qwvr.co” rel=“nofollow”>Arrow.</a>’, ‘truncated’: False, ‘in_reply_to_status_id’: None, ‘in_reply_to_status_id_str’: None, ‘in_reply_to_user_id’: 807908005253496832, ‘in_reply_to_user_id_str’: ‘807908005253496832’, ‘in_reply_to_screen_name’: ‘twice_once1001’, ‘user’: {‘id’: 415859364, ‘id_str’: ‘415859364’, ‘name’: ‘Nike’, ‘screen_name’: ‘Nike’, ‘location’: ‘Beaverton, Oregon’, ‘url’: ‘http://nike.com’, ‘description’: ‘Just Do It.’, ‘translator_type’: ‘none’, ‘derived’: {‘locations’: [{‘country’: ‘United States’, ‘country_code’: ‘US’, ‘locality’: ‘Beaverton’, ‘region’: ‘Oregon’, ‘sub_region’: ‘Washington County’, ‘full_name’: ‘Beaverton, Oregon, United States’, ‘geo’: {‘coordinates’: [-122.80371, 45.48706], ‘type’: ‘point’}}]}, ‘protected’: False, ‘verified’: True, ‘followers_count’: 7669604, ‘friends_count’: 137, ‘listed_count’: 10480, ‘favourites_count’: 6300, ‘statuses_count’: 35369, ‘created_at’: ‘Fri Nov 18 22:31:18 +0000 2011’, ‘utc_offset’: None, ‘time_zone’: None, ‘geo_enabled’: True, ‘lang’: ‘en’, ‘contributors_enabled’: False, ‘is_translator’: False, ‘profile_background_color’: ‘FFFFFF’, ‘profile_background_image_url’: ‘http://abs.twimg.com/images/themes/theme14/bg.gif’, ‘profile_background_image_url_https’: ‘https://abs.twimg.com/images/themes/theme14/bg.gif’, ‘profile_background_tile’: True, ‘profile_link_color’: ‘FF8400’, ‘profile_sidebar_border_color’: ‘FFFFFF’, ‘profile_sidebar_fill_color’: ‘EFEFEF’, ‘profile_text_color’: ‘333333’, ‘profile_use_background_image’: True, ‘profile_image_url’: ‘http://pbs.twimg.com/profile_images/953320896101412864/UdE5mfkP_normal.jpg’, ‘profile_image_url_https’: ‘https://pbs.twimg.com/profile_images/953320896101412864/UdE5mfkP_normal.jpg’, 
‘profile_banner_url’: ‘https://pbs.twimg.com/profile_banners/415859364/1516124378’, ‘default_profile’: False, ‘default_profile_image’: False, ‘following’: None, ‘follow_request_sent’: None, ‘notifications’: None}, ‘geo’: None, ‘coordinates’: None, ‘place’: None, ‘contributors’: None, ‘is_quote_status’: False, ‘quote_count’: 0, ‘reply_count’: 0, ‘retweet_count’: 0, ‘favorite_count’: 0, ‘entities’: {‘hashtags’: [{‘text’: ‘Breaking2’, ‘indices’: [81, 91]}, {‘text’: ‘justdoit’, ‘indices’: [92, 101]}], ‘urls’: [{‘url’: ‘https://t.co/7oXHGl9XkT’, ‘expanded_url’: ‘https://twitter.com/Nike/status/860779881965076480/video/1’, ‘display_url’: ‘twitter.com/Nike/status/86…’, ‘unwound’: {‘url’: ‘https://twitter.com/Nike/status/860779881965076480/video/1’, ‘status’: 200, ‘title’: ‘Nike on Twitter’, ‘description’: ‘“https://t.co/7oXHGl9XkT”’}, ‘indices’: [102, 125]}], ‘user_mentions’: [{‘screen_name’: ‘twice_once1001’, ‘name’: ‘まにょん’, ‘id’: 807908005253496832, ‘id_str’: ‘807908005253496832’, ‘indices’: [0, 15]}], ‘symbols’: }, ‘favorited’: False, ‘retweeted’: False, ‘possibly_sensitive’: False, ‘scopes’: {‘followers’: False}, ‘filter_level’: ‘low’, ‘lang’: ‘en’, ‘matching_rules’: [{‘tag’: None}]}

{‘created_at’: ‘Sat May 06 09:07:11 +0000 2017’, ‘id’: 860782988942090240, ‘id_str’: ‘860782988942090240’, ‘text’: ‘@AlissaBowling \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer. #Breaking2 #justdoit\nhttps://t.co/7oXHGl9XkT’, ‘source’: ‘<a href=“https://www.qwvr.co” rel=“nofollow”>Arrow.</a>’, ‘truncated’: False, ‘in_reply_to_status_id’: None, ‘in_reply_to_status_id_str’: None, ‘in_reply_to_user_id’: 719530093, ‘in_reply_to_user_id_str’: ‘719530093’, ‘in_reply_to_screen_name’: ‘AlissaBowling’, ‘user’: {‘id’: 415859364, ‘id_str’: ‘415859364’, ‘name’: ‘Nike’, ‘screen_name’: ‘Nike’, ‘location’: ‘Beaverton, Oregon’, ‘url’: ‘http://nike.com’, ‘description’: ‘Just Do It.’, ‘translator_type’: ‘none’, ‘derived’: {‘locations’: [{‘country’: ‘United States’, ‘country_code’: ‘US’, ‘locality’: ‘Beaverton’, ‘region’: ‘Oregon’, ‘sub_region’: ‘Washington County’, ‘full_name’: ‘Beaverton, Oregon, United States’, ‘geo’: {‘coordinates’: [-122.80371, 45.48706], ‘type’: ‘point’}}]}, ‘protected’: False, ‘verified’: True, ‘followers_count’: 7669604, ‘friends_count’: 137, ‘listed_count’: 10480, ‘favourites_count’: 6300, ‘statuses_count’: 35369, ‘created_at’: ‘Fri Nov 18 22:31:18 +0000 2011’, ‘utc_offset’: None, ‘time_zone’: None, ‘geo_enabled’: True, ‘lang’: ‘en’, ‘contributors_enabled’: False, ‘is_translator’: False, ‘profile_background_color’: ‘FFFFFF’, ‘profile_background_image_url’: ‘http://abs.twimg.com/images/themes/theme14/bg.gif’, ‘profile_background_image_url_https’: ‘https://abs.twimg.com/images/themes/theme14/bg.gif’, ‘profile_background_tile’: True, ‘profile_link_color’: ‘FF8400’, ‘profile_sidebar_border_color’: ‘FFFFFF’, ‘profile_sidebar_fill_color’: ‘EFEFEF’, ‘profile_text_color’: ‘333333’, ‘profile_use_background_image’: True, ‘profile_image_url’: ‘http://pbs.twimg.com/profile_images/953320896101412864/UdE5mfkP_normal.jpg’, ‘profile_image_url_https’: ‘https://pbs.twimg.com/profile_images/953320896101412864/UdE5mfkP_normal.jpg’, ‘profile_banner_url’: 
‘https://pbs.twimg.com/profile_banners/415859364/1516124378’, ‘default_profile’: False, ‘default_profile_image’: False, ‘following’: None, ‘follow_request_sent’: None, ‘notifications’: None}, ‘geo’: None, ‘coordinates’: None, ‘place’: None, ‘contributors’: None, ‘is_quote_status’: False, ‘quote_count’: 0, ‘reply_count’: 0, ‘retweet_count’: 0, ‘favorite_count’: 0, ‘entities’: {‘hashtags’: [{‘text’: ‘Breaking2’, ‘indices’: [80, 90]}, {‘text’: ‘justdoit’, ‘indices’: [91, 100]}], ‘urls’: [{‘url’: ‘https://t.co/7oXHGl9XkT’, ‘expanded_url’: ‘https://twitter.com/Nike/status/860779881965076480/video/1’, ‘display_url’: ‘twitter.com/Nike/status/86…’, ‘unwound’: {‘url’: ‘https://twitter.com/Nike/status/860779881965076480/video/1’, ‘status’: 200, ‘title’: ‘Nike on Twitter’, ‘description’: ‘“https://t.co/7oXHGl9XkT”’}, ‘indices’: [101, 124]}], ‘user_mentions’: [{‘screen_name’: ‘AlissaBowling’, ‘name’: ‘AlissaMarie✟’, ‘id’: 719530093, ‘id_str’: ‘719530093’, ‘indices’: [0, 14]}], ‘symbols’: }, ‘favorited’: False, ‘retweeted’: False, ‘possibly_sensitive’: False, ‘scopes’: {‘followers’: False}, ‘filter_level’: ‘low’, ‘lang’: ‘en’, ‘matching_rules’: [{‘tag’: None}]}


#12

This information is helpful.

Can you please provide the specific request that you used when you received the following:


#13

query = '{"query":"from:AmexBusiness from:Chevron from:CocaColaCo from:HomeDepot from:JNJNews from:Nike from:VerizonNews -is:retweet", "fromDate":"201705050000", "toDate":"201705070000", "maxResults":500}'


#14

The exact query that you provided is looking for Tweets that are all posted by those different companies, AND’ed together, meaning that the Tweet had to be posted by AmexBusiness AND by Chevron AND by CocaColaCo… etc.

I’m guessing that you meant to use a query like this:

{
	"query": "(from:AmexBusiness OR from:Chevron OR from:CocaColaCo OR from:HomeDepot OR from:JNJNews OR from:Nike OR from:VerizonNews) -is:retweet",
	"maxResults": "500",
	"fromDate": "201705050000",
	"toDate": "201705070000"
}

Does that make sense?
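
One way to avoid dropping the OR is to build the rule from a list of account names. Here is a minimal sketch. `build_from_rule` is a hypothetical helper, not part of any Twitter library, and the payload keys simply mirror the request above.

```python
import json


def build_from_rule(accounts, exclude_retweets=True):
    """Join from: clauses with OR so a Tweet from any listed account matches."""
    clauses = " OR ".join(f"from:{name}" for name in accounts)
    rule = f"({clauses})"
    if exclude_retweets:
        rule += " -is:retweet"
    return rule


accounts = ["AmexBusiness", "Chevron", "CocaColaCo", "HomeDepot",
            "JNJNews", "Nike", "VerizonNews"]
payload = {
    "query": build_from_rule(accounts),
    "maxResults": "500",
    "fromDate": "201705050000",
    "toDate": "201705070000",
}
print(json.dumps(payload, indent=2))
```

The parentheses matter: without them, `-is:retweet` would bind only to the last `from:` clause rather than to the whole group.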


#15

Yes. Sorry for my mistake :sweat_smile:


#16

Does that address this issue, or is there still something that we should be investigating?


#17

The logical OR was not the problem. It was my mistake not to include it when you asked me for the query I used to obtain those results. The problem is not solved: I’m still getting hundreds of identical tweets (Sat May 06 2017 - Unable to download tweets for this date) on 6 May.


#18

Can you please explain what you mean when you say…


#19

I obtained payloads that all have the same ‘text’ field: ‘@AlissaBowling \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer. #Breaking2 #justdoit\nhttps://t.co/7oXHGl9XkT’.

Example: ‘text’: ‘@twice_once1001 \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer. #Breaking2 #justdoit\nhttps://t.co/7oXHGl9XkT’

The only thing that changes between payloads is what is written after the @, as you can see in the payloads I posted before. Moreover, I’m unable to find these tweets on Twitter.

{‘created_at’: ‘Sat May 06 09:07:11 +0000 2017’, ‘id’: 860782988942090240, ‘id_str’: ‘860782988942090240’, ‘text’: ‘@AlissaBowling \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer. #Breaking2 #justdoit\nhttps://t.co/7oXHGl9XkT’, ‘source’: ‘<a href=“https://www.qwvr.co” rel=“nofollow”>Arrow.</a>’, ‘truncated’: False, ‘in_reply_to_status_id’: None, ‘in_reply_to_status_id_str’: None, ‘in_reply_to_user_id’: 719530093, ‘in_reply_to_user_id_str’: ‘719530093’, ‘in_reply_to_screen_name’: ‘AlissaBowling’, ‘user’: {‘id’: 415859364, ‘id_str’: ‘415859364’, ‘name’: ‘Nike’, ‘screen_name’: ‘Nike’, ‘location’: ‘Beaverton, Oregon’, ‘url’: ‘http://nike.com’, ‘description’: ‘Just Do It.’, ‘translator_type’: ‘none’, ‘derived’: {‘locations’: [{‘country’: ‘United States’, ‘country_code’: ‘US’, ‘locality’: ‘Beaverton’, ‘region’: ‘Oregon’, ‘sub_region’: ‘Washington County’, ‘full_name’: ‘Beaverton, Oregon, United States’, ‘geo’: {‘coordinates’: [-122.80371, 45.48706], ‘type’: ‘point’}}]}, ‘protected’: False, ‘verified’: True, ‘followers_count’: 7669604, ‘friends_count’: 137, ‘listed_count’: 10480, ‘favourites_count’: 6300, ‘statuses_count’: 35369, ‘created_at’: ‘Fri Nov 18 22:31:18 +0000 2011’, ‘utc_offset’: None, ‘time_zone’: None, ‘geo_enabled’: True, ‘lang’: ‘en’, ‘contributors_enabled’: False, ‘is_translator’: False, ‘profile_background_color’: ‘FFFFFF’, ‘profile_background_image_url’: ‘http://abs.twimg.com/images/themes/theme14/bg.gif’, ‘profile_background_image_url_https’: ‘https://abs.twimg.com/images/themes/theme14/bg.gif’, ‘profile_background_tile’: True, ‘profile_link_color’: ‘FF8400’, ‘profile_sidebar_border_color’: ‘FFFFFF’, ‘profile_sidebar_fill_color’: ‘EFEFEF’, ‘profile_text_color’: ‘333333’, ‘profile_use_background_image’: True, ‘profile_image_url’: ‘http://pbs.twimg.com/profile_images/953320896101412864/UdE5mfkP_normal.jpg’, ‘profile_image_url_https’: ‘https://pbs.twimg.com/profile_images/953320896101412864/UdE5mfkP_normal.jpg’, ‘profile_banner_url’: 
‘https://pbs.twimg.com/profile_banners/415859364/1516124378’, ‘default_profile’: False, ‘default_profile_image’: False, ‘following’: None, ‘follow_request_sent’: None, ‘notifications’: None}, ‘geo’: None, ‘coordinates’: None, ‘place’: None, ‘contributors’: None, ‘is_quote_status’: False, ‘quote_count’: 0, ‘reply_count’: 0, ‘retweet_count’: 0, ‘favorite_count’: 0, ‘entities’: {‘hashtags’: [{‘text’: ‘Breaking2’, ‘indices’: [80, 90]}, {‘text’: ‘justdoit’, ‘indices’: [91, 100]}], ‘urls’: [{‘url’: ‘https://t.co/7oXHGl9XkT’, ‘expanded_url’: ‘https://twitter.com/Nike/status/860779881965076480/video/1’, ‘display_url’: ‘twitter.com/Nike/status/86…’, ‘unwound’: {‘url’: ‘https://twitter.com/Nike/status/860779881965076480/video/1’, ‘status’: 200, ‘title’: ‘Nike on Twitter’, ‘description’: ‘“https://t.co/7oXHGl9XkT”’}, ‘indices’: [101, 124]}], ‘user_mentions’: [{‘screen_name’: ‘AlissaBowling’, ‘name’: ‘AlissaMarie✟’, ‘id’: 719530093, ‘id_str’: ‘719530093’, ‘indices’: [0, 14]}], ‘symbols’: }, ‘favorited’: False, ‘retweeted’: False, ‘possibly_sensitive’: False, ‘scopes’: {‘followers’: False}, ‘filter_level’: ‘low’, ‘lang’: ‘en’, ‘matching_rules’: [{‘tag’: None}]}


#20

Hi @DBProject3

You can find both these Tweets on Twitter here:

As you can see from the different id numbers, these are two different Tweets, even if the text is similar in each one of them.
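
A quick way to check this on a batch of results is to compare the `id` values and, separately, the text with the leading @mention stripped. Here is a minimal sketch using only the fields shown in the payloads above. The helper name `group_by_body` is hypothetical, and the sample dicts are trimmed to `id` and `text`.

```python
import re
from collections import defaultdict

# Matches one or more leading "@name" mentions plus the whitespace after them.
LEADING_MENTIONS = re.compile(r"^(?:@\w+\s*)+")


def group_by_body(tweets):
    """Map each Tweet body (leading mentions removed) to the set of Tweet ids using it."""
    groups = defaultdict(set)
    for tweet in tweets:
        body = LEADING_MENTIONS.sub("", tweet["text"])
        groups[body].add(tweet["id"])
    return dict(groups)


sample = [
    {"id": 860782988967235584,
     "text": "@twice_once1001 \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer. #Breaking2 #justdoit\nhttps://t.co/7oXHGl9XkT"},
    {"id": 860782988942090240,
     "text": "@AlissaBowling \nEliud Kipchoge - 2:00:25\nThe barrier just got that much closer. #Breaking2 #justdoit\nhttps://t.co/7oXHGl9XkT"},
]
groups = group_by_body(sample)
for body, ids in groups.items():
    print(len(ids), "distinct ids share the body:", body.splitlines()[0])
```

If a body maps to many distinct ids, the results are separate replies with the same message, not one Tweet returned repeatedly.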

Currently, your rule will match all Tweets within the specified time range that were posted by AmexBusiness, or Chevron, or CocaColaCo, etc., excluding retweets. In total, 296,111 Tweets match your query as it stands (including the two above). You will need to change your rule if you don’t want to receive that many Tweets.

{
	"query": "(from:AmexBusiness OR from:Chevron OR from:CocaColaCo OR from:HomeDepot OR from:JNJNews OR from:Nike OR from:VerizonNews) -is:retweet",
	"maxResults": "500",
	"fromDate": "201705050000",
	"toDate": "201705070000"
}
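
For a sense of scale, the total quoted above implies a lower bound on the number of requests. Here is a minimal sketch, assuming every page returns the full `maxResults` of 500 Tweets (real pages may be smaller, which only raises the count).

```python
import math

matching_tweets = 296_111   # total quoted above
max_results = 500           # "maxResults" from the request above

# Lower bound: each request returns at most max_results Tweets.
min_requests = math.ceil(matching_tweets / max_results)
print(min_requests)  # 593
```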