I’m querying the full archive API with the ‘from_date’ and ‘to_date’ parameters set to specific dates. The problem is that the results start from the latest tweet and come back in reverse chronological order. I need a sample of tweets that covers the whole time span of the query. Is there any way to do that?
The underlying problem is that the number of matching tweets in the specified time span far exceeds the number of tweets my account (a premium account) is allowed to retrieve in a month.
There is no easy way to sample this. You will have to make individual calls for small time windows, retrieve all tweets from each of those, and build your sample that way. There’s a work in progress for this in twarc: Random sample option by igorbrigadir · Pull Request #459 · DocNow/twarc · GitHub
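To illustrate the small-window approach, here is a minimal sketch that only picks the windows; it does not call the API. The function name `sample_windows`, the 10-minute window length, and the dates are assumptions made up for this example, not part of twarc or the premium API.

```python
import random
from datetime import datetime, timedelta


def sample_windows(start, end, window, n, seed=None):
    """Pick n non-overlapping windows of length `window` at random from [start, end).

    Hypothetical helper: it returns (window_start, window_end) pairs in
    chronological order and makes no network requests.
    """
    # Number of whole windows that fit into the span.
    total_slots = (end - start) // window
    if n > total_slots:
        raise ValueError("more windows requested than fit in the time span")
    rng = random.Random(seed)
    # Sampling slot indices without replacement guarantees no overlap.
    slots = sorted(rng.sample(range(total_slots), n))
    return [(start + i * window, start + (i + 1) * window) for i in slots]


# Example: 50 random 10-minute windows spread over January 2020.
windows = sample_windows(
    datetime(2020, 1, 1), datetime(2020, 2, 1), timedelta(minutes=10), 50, seed=1
)
```

Each pair could then be used as the ‘from_date’ and ‘to_date’ of one small query, so the retrieved tweets are spread across the whole span instead of clustering at the most recent end.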
It will also help to get counts of tweets beforehand: for example, you can run twarc2 counts instead of twarc2 search to get just the counts.
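Once you have counts, one possible way to use them is to split your monthly quota across periods in proportion to how many tweets each period contains. This is only a sketch: `allocate_budget` is a hypothetical helper, the per-day counts below are invented numbers, and it assumes you have already parsed the counts into a dict yourself.

```python
def allocate_budget(counts, budget):
    """Split `budget` tweets across periods in proportion to their counts.

    `counts` maps a period label to the number of matching tweets.
    Uses integer arithmetic and hands leftover units to the periods
    with the largest remainders, so the result sums to `budget`.
    """
    total = sum(counts.values())
    if budget >= total:
        # The quota covers everything; no sampling needed.
        return dict(counts)
    alloc, remainders = {}, {}
    for period, count in counts.items():
        alloc[period], remainders[period] = divmod(budget * count, total)
    leftover = budget - sum(alloc.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for period in by_remainder[:leftover]:
        alloc[period] += 1
    return alloc


# Invented daily counts and a quota of 600 tweets.
daily_counts = {"2021-01-01": 1200, "2021-01-02": 300, "2021-01-03": 4500}
print(allocate_budget(daily_counts, 600))
# {'2021-01-01': 120, '2021-01-02': 30, '2021-01-03': 450}
```

Allocating proportionally keeps busy days from being under-represented; if you would rather have even coverage over time, an equal share per period is the simpler alternative.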