Unprocessable entity from 30 day search using searchtweets


#1

Trying to retrieve tweets from the 30 day premium archive using searchtweets is resulting in:

HTTP Error code: 422: Unprocessable Entity: This is returned due to invalid parameters in a query or when a query is too complex for us to process. –e.g. invalid PowerTrack rules or too many phrase operators, rendering a query too complex.

Steps to reproduce:

  1. Clone https://github.com/twitterdev/search-tweets-python.
  2. Create a Python 3 virtualenv and run pip install -e . from the repository root.
  3. Get a bearer token.
  4. Create a ~/.twitter_keys.yaml:
search_tweets_30_dev:
  account_type: premium
  endpoint: https://api.twitter.com/1.1/tweets/search/30day/sandbox.json
  bearer_token: <bearer token>
  5. Create a config file (trade_rumors.yaml):
search_rules:
    from-date: 2017-11-01
    to-date: 2017-01-31
    pt-rule: "league"

search_params:
    max-results: 100
    max-pages: 1

output_params:
    save_file: True
    filename_prefix: trade_rumors
  6. Execute: search_tweets.py --credential-file-key search_tweets_30_dev --config-file trade_rumors.yaml --debug

Result:

search_tweets.py --credential-file-key search_tweets_30_dev --config-file trade_rumors.yaml --debug
DEBUG:root:{
    "credential_file": null,
    "credential_yaml_key": "search_tweets_30_dev",
    "env_overwrite": true,
    "config_filename": "trade_rumors.yaml",
    "account_type": null,
    "count_bucket": null,
    "from_date": null,
    "to_date": null,
    "pt_rule": null,
    "results_per_call": 100,
    "max_results": 500,
    "max_pages": null,
    "results_per_file": 0,
    "filename_prefix": null,
    "print_stream": true,
    "debug": true
}
DEBUG:root:{
    "from_date": "2017-11-01",
    "to_date": "2017-01-31",
    "pt_rule": "league",
    "max_results": 500,
    "max_pages": 1,
    "save_file": true,
    "filename_prefix": "trade_rumors",
    "credential_yaml_key": "search_tweets_30_dev",
    "env_overwrite": true,
    "config_filename": "trade_rumors.yaml",
    "results_per_call": 100,
    "results_per_file": 0,
    "print_stream": true,
    "debug": true,
    "bearer_token": "XXXXX",
    "endpoint": "https://api.twitter.com/1.1/tweets/search/30day/sandbox.json"
}
DEBUG:root:{
    "from_date": "2017-11-01",
    "to_date": "2017-01-31",
    "pt_rule": "league",
    "max_results": 500,
    "max_pages": 1,
    "save_file": true,
    "filename_prefix": "trade_rumors",
    "credential_yaml_key": "search_tweets_30_dev",
    "env_overwrite": true,
    "config_filename": "trade_rumors.yaml",
    "results_per_call": 100,
    "results_per_file": 0,
    "print_stream": true,
    "debug": true,
    "bearer_token": "XXXX",
    "endpoint": "https://api.twitter.com/1.1/tweets/search/30day/sandbox.json"
}
DEBUG:root:ResultStream: 
	{
    "username": null,
    "endpoint": "https://api.twitter.com/1.1/tweets/search/30day/sandbox.json",
    "rule_payload": {
        "query": "league",
        "maxResults": 100,
        "toDate": "201701310000",
        "fromDate": "201711010000"
    },
    "tweetify": false,
    "max_results": 500
}
INFO:searchtweets.utils:writing to file trade_rumors.json
INFO:searchtweets.result_stream:using bearer token for authentication
DEBUG:searchtweets.result_stream:sending request
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "POST /1.1/tweets/search/30day/sandbox.json HTTP/1.1" 422 196
WARNING:searchtweets.result_stream:retrying request; current status code: 422
DEBUG:searchtweets.result_stream:sending request
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "POST /1.1/tweets/search/30day/sandbox.json HTTP/1.1" 422 196
WARNING:searchtweets.result_stream:retrying request; current status code: 422
DEBUG:searchtweets.result_stream:sending request
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "POST /1.1/tweets/search/30day/sandbox.json HTTP/1.1" 422 196
WARNING:searchtweets.result_stream:retrying request; current status code: 422
DEBUG:searchtweets.result_stream:sending request
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "POST /1.1/tweets/search/30day/sandbox.json HTTP/1.1" 422 196
ERROR:searchtweets.result_stream:HTTP Error code: 422: Unprocessable Entity: This is returned due to invalid parameters in a query or when a query is too complex for us to process. –e.g. invalid PowerTrack rules or too many phrase operators, rendering a query too complex.
ERROR:searchtweets.result_stream:rule payload: {'query': 'league', 'maxResults': 100, 'toDate': '201701310000', 'fromDate': '201711010000'}
Traceback (most recent call last):
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/ENV/bin/search_tweets.py", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/tools/search_tweets.py", line 191, in <module>
    main()
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/tools/search_tweets.py", line 185, in main
    for tweet in stream:
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/searchtweets/utils.py", line 138, in write_result_stream
    yield from write_ndjson(_filename, stream)
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/searchtweets/utils.py", line 93, in write_ndjson
    for item in data_iterable:
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/searchtweets/result_stream.py", line 202, in stream
    self.execute_request()
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/searchtweets/result_stream.py", line 253, in execute_request
    rule_payload=self.rule_payload)
  File "/Users/justinlittman/Data/sfm3/search-tweets-python/searchtweets/result_stream.py", line 101, in retried_func
    raise requests.exceptions.HTTPError
requests.exceptions.HTTPError

Thanks in advance for your assistance.


#2

Hi Justin,

It appears you are making a request with the following dates:
from-date: 2017-11-01
to-date: 2017-01-31

There are two issues with this request. First, the 30-day endpoint only makes Tweets from the previous 30 days available, so at the moment only Tweets posted since roughly January 14 can be retrieved.

Second, the fromDate timestamp must be earlier than the toDate; your request has them reversed.

If you do not include any time parameters in your request, the API defaults to the last 30 days.
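For anyone else hitting this, both problems can be caught before the request is ever sent. Here is a minimal stdlib sketch of such a check; the function name and the window_days parameter are illustrative, not part of searchtweets itself, and it assumes dates in the library's YYYY-mm-DD config format:

```python
from datetime import datetime, timedelta

def check_search_dates(from_date, to_date, window_days=30):
    """Validate a from/to date pair for a 30-day search request.

    Raises ValueError if the range is reversed or if from_date
    falls outside the rolling window the endpoint serves.
    """
    fmt = "%Y-%m-%d"
    start = datetime.strptime(from_date, fmt)
    end = datetime.strptime(to_date, fmt)
    if start >= end:
        raise ValueError(
            "from-date %s must be earlier than to-date %s"
            % (from_date, to_date))
    oldest = datetime.utcnow() - timedelta(days=window_days)
    if start < oldest:
        raise ValueError(
            "from-date %s is outside the %d-day window"
            % (from_date, window_days))

# The dates from the config above fail the ordering check,
# since 2017-11-01 is later than 2017-01-31:
# check_search_dates("2017-11-01", "2017-01-31")  # raises ValueError
```

The ordering check fires first here, so it catches the reversed dates regardless of when the script is run.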

Hope this helps!

Thanks.


#3

Let’s just say this isn’t the first time that I’ve made date errors. Thanks for the speedy assist!
