REST API 1.1 does not return tweets 3 days old


#1

Hello everyone;

I am trying to retrieve tweets from Feb 17 2016 about the Ankara bombing in Turkey. However, even though I use the ‘since’ and ‘until’ options in the query, it returns tweets from Feb 21, which is the current date. I have tried many different queries but still no luck. My query is as follows:
q=ankara%20since%3A2016%2D02%2D17
Any help would be greatly appreciated.
Thank you.


#2

I just tried the same, with the following query:

/1.1/search/tweets.json?q=ankara%20since%3A2016-02-16%20until%3A2016-02-20

and it returns no Tweets at all. I suspect that is because those Tweets are too old: the REST API for search, unlike the website search, only indexes a small number of Tweets. Maybe @andypiper can clarify this a bit.


#3

I don’t believe this is an age-of-Tweets issue, since the dates are within the past few days and the Search API should index around 7 days. Omitting the since and until parameters seems to return results. I’ll have to dig into this over the next couple of days when I have some time.
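
For reference, a quick way to check whether a since date falls inside the roughly 7-day window mentioned above. This is only a sketch: the 7-day figure is the approximate one from this post, not an exact guarantee, and the function name `within_search_window` is made up for illustration.

```python
from datetime import date, timedelta

# Approximate index depth mentioned in this thread; an assumption, not a hard limit.
SEARCH_WINDOW_DAYS = 7

def within_search_window(since: date, today: date,
                         window_days: int = SEARCH_WINDOW_DAYS) -> bool:
    """Return True if `since` is no more than `window_days` before `today`."""
    return today - since <= timedelta(days=window_days)

# Dates from this thread: asking on Feb 21 for Tweets since Feb 16.
print(within_search_window(date(2016, 2, 16), date(2016, 2, 21)))
```

By that measure the requested range should still be covered, which is why the age of the Tweets alone does not look like the explanation.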


#4

The query I used works fine on the website, so if the search API should actually return Tweets for the last 7 days, it seems something is broken?


#5

Yes, most likely something in the date parameters is causing an issue. I’ll see what I can discover.


#6

Strange - I’m trying the same:

tweets.json?q=ankara%20since%3A2016-02-16%20until%3A2016-02-20

I can get results. Maybe it was an intermittent thing?


#7

I am still getting no tweets:

twurl "/1.1/search/tweets.json?q=ankara%20since%3A2016-02-16%20until%3A2016-02-20" | python -m json.tool
{
    "search_metadata": {
        "completed_in": 0.082,
        "count": 15,
        "max_id": 701533011666309120,
        "max_id_str": "701533011666309120",
        "query": "ankara%2520since%253A2016-02-16%2520until%253A2016-02-20",
        "refresh_url": "?since_id=701533011666309120&q=ankara%2520since%253A2016-02-16%2520until%253A2016-02-20&include_entities=1",
        "since_id": 0,
        "since_id_str": "0"
    },
    "statuses": []
}
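
One detail in the output above may be worth checking: the echoed "query" field contains %2520 and %253A, which are the percent-encodings of %20 and %3A themselves. That pattern would be consistent with the query being encoded twice somewhere along the way, though nothing in this thread confirms that as the cause. A minimal Python sketch of the effect, using only the standard library:

```python
from urllib.parse import quote, unquote

raw_query = "ankara since:2016-02-16 until:2016-02-20"

# Encoding once: " " becomes %20 and ":" becomes %3A.
encoded_once = quote(raw_query, safe="")
# Encoding the already-encoded string again turns each "%" into "%25".
encoded_twice = quote(encoded_once, safe="")

print(encoded_once)
print(encoded_twice)

# Value echoed back in search_metadata.query above.
echoed = "ankara%2520since%253A2016-02-16%2520until%253A2016-02-20"
# Decoding it once still leaves percent escapes behind.
print(unquote(echoed))
```

If that is what is happening, the search would be run against the literal percent-encoded text rather than the intended operators, so making sure the query is encoded exactly once would be the first thing to try.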

#8

My intention is to get the tweets for the query ‘ankara’ on the dates 2016-02-17 and 2016-02-18. As I said, I have tried many things, but the API responses seem to be unstable.
The date I am trying to retrieve is less than one week old, so the date shouldn’t be an issue.


#9

Well, fwiw, here are the tweet IDs returned for the search “ankara since:2016-02-16 until:2016-02-20” that I was able to retrieve just now: https://gist.githubusercontent.com/igorbrigadir/8c1376c0f6df9c01b2af/raw/c4741ff2b681a5c9be1735e60c7abfa056caffda/ankara_2016-02-16_2016_02_20

I was using Twitter4J with result_type=mixed, if that matters. Here are the individual calls made paging through results, if it’s of any use: https://gist.github.com/igorbrigadir/6230b3ce7ede079bbf99 (“Ok: 521193” at the end of a line means HTTP 200, 521193 bytes received)
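
For anyone wanting to reproduce the paging, here is a rough sketch of the usual max_id approach: each request asks for Tweets with IDs at or below max_id, so the next page starts one below the lowest ID seen so far. Note that `fetch_page` is a hypothetical stand-in for whatever client makes the actual call (Twitter4J, twurl, and so on), not a real library function, and the fake backend exists only to make the example runnable.

```python
from typing import Callable, List, Optional

def collect_ids(fetch_page: Callable[[Optional[int]], List[int]],
                max_pages: int = 50) -> List[int]:
    """Page backwards through search results and return all Tweet IDs seen.

    `fetch_page(max_id)` is a hypothetical callable returning the Tweet IDs
    of one page of results, newest first; `max_id=None` means the first page.
    """
    collected: List[int] = []
    max_id: Optional[int] = None
    for _ in range(max_pages):
        page = fetch_page(max_id)
        if not page:
            break  # an empty page means there is nothing older to fetch
        collected.extend(page)
        # max_id is inclusive, so step one below the oldest ID on this page.
        max_id = min(page) - 1
    return collected

# Fake backend for demonstration only: serves IDs 10..1, three at a time.
def fake_fetch(max_id: Optional[int], _all=list(range(10, 0, -1))) -> List[int]:
    eligible = [i for i in _all if max_id is None or i <= max_id]
    return eligible[:3]

print(collect_ids(fake_fetch))
```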


#10

Thanks for this detailed update. I’m still seeing a gap in results testing locally using twurl, so my current best guess is that it could be related to caching between data centers / hitting different backends for some reason. I’ll continue to dig.


#11

Have just tried your exact query and params:

twurl "/1.1/search/tweets.json?q=ankara%20since%3A2016-02-16%20until%3A2016-02-20&count=100&result_type=mixed&with_twitter_user_id=true" | python -m json.tool
{
    "search_metadata": {
        "completed_in": 0.118,
        "count": 100,
        "max_id": 701750193553850368,
        "max_id_str": "701750193553850368",
        "query": "ankara%2520since%253A2016-02-16%2520until%253A2016-02-20",
        "refresh_url": "?since_id=701750193553850368&q=ankara%2520since%253A2016-02-16%2520until%253A2016-02-20&result_type=mixed&include_entities=1",
        "since_id": 0,
        "since_id_str": "0"
    },
    "statuses": []
}

#12

Have you received the tweets with this query: “ankara%2520since%253A2016-02-16%2520until%253A2016-02-20”?


#13

ePirat,
this is what I get from the same query you ran before. The max_id_str seems to be different, which makes me think the system still does not provide a stable answer for the same query.

{
    "search_metadata": {
        "completed_in": 0.165,
        "count": 100,
        "max_id": 702227733489827840,
        "max_id_str": "702227733489827840",
        "query": "ankara%2520since%253A2016-02-16%2520until%253A2016-02-20",
        "refresh_url": "?since_id=702227733489827840&q=ankara%2520since%253A2016-02-16%2520until%253A2016-02-20&result_type=mixed&include_entities=1",
        "since_id": 0,
        "since_id_str": "0"
    },
    "statuses": []
}
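
A possible explanation for the changing max_id: Tweet IDs embed a millisecond timestamp in their upper bits, offset from a fixed Twitter epoch, so the max_id in an empty result may simply reflect when each request was made rather than an unstable index. That reading is an assumption, not something confirmed in this thread. The decoding itself can be sketched like this, with `snowflake_to_datetime` being an illustrative helper name:

```python
from datetime import datetime, timezone

# Offset (ms since the Unix epoch) used by Twitter's Snowflake IDs.
TWITTER_EPOCH_MS = 1288834974657

def snowflake_to_datetime(tweet_id: int) -> datetime:
    """Extract the UTC time encoded in a Snowflake-style Tweet ID."""
    # The top bits hold milliseconds since the Twitter epoch; the low 22 bits
    # hold worker and sequence numbers, which are discarded here.
    timestamp_ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

# max_id values from the three responses posted in this thread.
for max_id in (701533011666309120, 701750193553850368, 702227733489827840):
    print(max_id, snowflake_to_datetime(max_id).isoformat())
```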


#14

I have tried my application again with the same query, but I am still getting only the most recent tweets from today.


#15

No solution from Twitter yet? Andypiper?


#16

No, I’m afraid I’m not sure why this query is not working. It seems to work for some people, which leads me to suspect a caching issue somewhere, but I’ve not had time to investigate in detail. @IgorBrigadir did kindly post a list of Tweet IDs, which you are free to retrieve as an alternative solution.
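
As a rough sketch of that alternative: the statuses/lookup endpoint accepts up to 100 Tweet IDs per request, so the list of IDs can be split into batches and turned into request paths. The helper name `lookup_paths` and the placeholder IDs are purely illustrative; the real IDs would come from the gist posted earlier.

```python
from typing import Iterable, Iterator, List

# statuses/lookup accepts up to 100 Tweet IDs per request.
BATCH_SIZE = 100

def lookup_paths(tweet_ids: Iterable[str],
                 batch_size: int = BATCH_SIZE) -> Iterator[str]:
    """Yield one statuses/lookup request path per batch of Tweet IDs."""
    batch: List[str] = []
    for tweet_id in tweet_ids:
        batch.append(tweet_id)
        if len(batch) == batch_size:
            yield "/1.1/statuses/lookup.json?id=" + ",".join(batch)
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield "/1.1/statuses/lookup.json?id=" + ",".join(batch)

# Placeholder IDs for illustration only; substitute the IDs from the gist.
example_ids = [str(n) for n in range(1, 251)]
paths = list(lookup_paths(example_ids))
print(len(paths))
```

Each resulting path can then be requested with any authenticated client.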


#17

Thank you for all your effort.