Truncated, garbled tweets from API

php
api

#1

I posted this on StackOverflow, but thought I would try here too:

When I try to grab tweets from a public timeline, many get truncated with garbage characters and I can’t seem to grab the whole text.

I tried both the old and the new versions of twitteroauth that I could find.

This was running fine for a long time, so I’m sure it is something new to the API that I can’t seem to identify.

I’ve tried adding extended mode with no luck (the text is still garbled):

$res = $connection->get("https://api.twitter.com/1.1/statuses/user_timeline.json?user_id=".$id."&count=".$numTweets."&tweet_mode=extended");

Tried adding a curl-encoding option to twitteroauth, still truncated.

Tried using the new twitteroauth:

$res = $connection->get('statuses/user_timeline', array('screen_name' => $screenName));
Truncated, garbage.

PHP 5.6.8 (cli) (built: Apr 15 2015 15:07:09) Copyright © 1997-2015 The PHP Group Zend Engine v2.6.0, Copyright © 1998-2015 Zend Technologies

Here is an example. Could this be some memory issue? Any help would be greatly appreciated:

$res->fulltext DIRECTLY FROM RESPONSE GIVEN BY twitteroauth: “RT @GeoscienceAus: #PicOfTheDay of the #AliceSprings #satellite antenn a taken by the Chief of our #CommunitySafety & Earth Monitoring Div DΓǪ”

ACTUAL TWEET FROM twitter.com: “#PicOfTheDay of the #AliceSprings #satellite antenna taken by the Chief of our #CommunitySafety & Earth Monitoring Div Dr Andy Barnicoat!”


#2

Hey, have you found a solution for this “truncated” reply problem? I am also getting truncated replies from Twitter.


#3

Are you using tweet_mode=extended when calling the API?


#4

Yes, as shown in the first code snippet.


#5

No, nothing yet. Sorry.


#6

Are you able to share specific Tweet IDs that demonstrate this garbled truncation issue? Are you able to retrieve them fully using twurl?

It has been a while since I’ve done a lot with PHP, but the nature of that rendering reminds me of some kind of buffer overflow / shorting issue.


#7

Here is one that fails:

Account: @NASA_Landsat
id_str:809633973538353152
full_text: “RT @GeoscienceAus: #PicOfTheDay of the #AliceSprings #satellite antenn a taken by the Chief of our #CommunitySafety & Earth Monitoring Div DΓǪ”

I’ll try to load up twurl as soon as possible.


#8

So in using the API Console, I see that the first full_text field is truncated (even though it is full_text), with the horizontal ellipsis char code:

"created_at": "Fri Dec 16 05:39:16 +0000 2016",
"id": 809633973538353200,
"id_str": "809633973538353152",
"full_text": "RT @GeoscienceAus: #PicOfTheDay of the #AliceSprings #satellite antenna taken by the Chief of our #CommunitySafety & Earth Monitoring Div D…",
"truncated": false,

But if I scroll down the list to the retweeted section, then it is fine:

"retweeted_status": {
"created_at": "Fri Dec 16 05:31:11 +0000 2016",
"id": 809631942425444400,
"id_str": "809631942425444353",
"full_text": "#PicOfTheDay of the #AliceSprings #satellite antenna taken by the Chief of our #CommunitySafety & Earth Monitoring Div Dr Andy Barnicoat! https://t.co/smKxGdMYOb",
"truncated": false,


#9

Can confirm. http://i.imgur.com/tgHBpvc.png

    $twitter = Rewst_Connect_Twitter::getService($this->_account['oauth_token']);

    $tweetId = $this->input('id', 'int');

    $request = $twitter->getRequests('statuses/show', [
        'id' => "809633973538353152",
        'tweet_mode' => 'extended'
    ], 'array');

Using my own wrapper but that isn’t the issue as you can see with the “full_text” field in the output.

What’s interesting also is that it’s working perfectly fine if you use the id of the first retweet (809631942425444353).


#10

Thanks Daniel.

Andy, can you advise as to the timeline for a fix?


#11

I’ll need to raise internally to check if this is the expected behaviour; if a fix is required, I would not expect it before January at this stage.


#12

Can you recommend a workaround until January? Obviously the retweet full_text is a non-starter, unless we retweet all the tweets we ever want to use I guess (yikes).


#13

Do you have examples of other ones that fail? I suspect this is only an issue with retweeted statuses, and I’m curious whether that holds.

If it is just an issue with tweets that are retweeting another tweet, then you could rely on the retweeted_status field, as it will always be there.

So with PHP you could run a check like this:

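// Default to the top-level text of the tweet.
$text = $response['full_text'];
// For a retweet, prefer the untruncated text of the original tweet.
if (!empty($response['retweeted_status']['full_text'])) {
    $text = $response['retweeted_status']['full_text'];
}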

#14

Hang on a sec, this is correct behaviour and always has been with retweets. The retweeted_status text or full_text contains the original text of the tweet.

If the retweeted status is 140 chars, then it can’t fit alongside the “RT @user:” prefix, so it gets truncated, but the truncated attribute has never been set in that case.

TL;DR: when parsing retweets, use retweeted_status.full_text and not full_text.
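
As a rough sketch of that rule, the helper below picks the text to display from a decoded tweet. The name tweet_display_text is hypothetical and not part of twitteroauth, and the sketch assumes the response was decoded into associative arrays (as in the check from post #13) rather than objects. The sample JSON is a trimmed copy of the fields quoted earlier in this thread.

```php
<?php
// Hypothetical helper for illustration; it is not part of twitteroauth.
// Assumes the tweet was decoded into an associative array.
function tweet_display_text(array $tweet)
{
    // A retweet carries the original tweet in 'retweeted_status'.
    $source = (isset($tweet['retweeted_status']) && is_array($tweet['retweeted_status']))
        ? $tweet['retweeted_status']
        : $tweet;

    // Extended mode uses 'full_text'; the default mode uses 'text'.
    if (isset($source['full_text'])) {
        return $source['full_text'];
    }
    return isset($source['text']) ? $source['text'] : '';
}

// Trimmed-down version of the response discussed in this thread.
$json = '{
  "id_str": "809633973538353152",
  "full_text": "RT @GeoscienceAus: #PicOfTheDay of the #AliceSprings #satellite antenna taken by the Chief of our #CommunitySafety & Earth Monitoring Div D\u2026",
  "retweeted_status": {
    "id_str": "809631942425444353",
    "full_text": "#PicOfTheDay of the #AliceSprings #satellite antenna taken by the Chief of our #CommunitySafety & Earth Monitoring Div Dr Andy Barnicoat! https://t.co/smKxGdMYOb"
  }
}';

$tweet = json_decode($json, true);
echo tweet_display_text($tweet), PHP_EOL;
```

The code sticks to PHP 5.6 syntax, since that is the version mentioned in the first post. If you keep the default object decoding, the same logic applies with `->` property access instead of array keys.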


#15

This is what I was thinking, but was going to double-check the expectation with the new longer Tweets. You’re correct that you should definitely be looking at the “child” retweet object in parsing these types of Tweets.


#17

Let’s keep this thread on (original) topic, @brandyellen - I’ll come respond on your other thread. Thanks.


#18

Sorry, this appears to be the solution. I was thrown at first because the truncated field was set to false, I hadn’t seen the ellipsis before, and I was too lazy to actually count the characters:

"full_text": "RT @geomatlab: #Oceanography JGR: Using Landsat 8 data to estimate suspended particulate matter in the Yellow River estuary https://t.co/uI…",
"truncated": false,
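
For anyone who would rather not count by hand, here is a small sketch that counts the characters in the text above and flags the clipped-retweet pattern. Both function names are hypothetical, and the pattern check is only a heuristic: a tweet someone typed by hand could also start with "RT @" and end with an ellipsis.

```php
<?php
// Hypothetical helpers for illustration only.

// Count Unicode code points without relying on the mbstring extension.
function count_code_points($text)
{
    return preg_match_all('/./us', $text, $matches);
}

// True when the text starts with "RT @user: " and ends with U+2026 (the ellipsis).
function looks_like_clipped_retweet($text)
{
    return preg_match('/^RT @\w+: .*\x{2026}$/us', $text) === 1;
}

// The full_text value quoted above; \xE2\x80\xA6 is the UTF-8 encoding of U+2026.
$clipped = "RT @geomatlab: #Oceanography JGR: Using Landsat 8 data to estimate "
    . "suspended particulate matter in the Yellow River estuary https://t.co/uI\xE2\x80\xA6";

echo count_code_points($clipped), PHP_EOL;               // 140
var_dump(looks_like_clipped_retweet($clipped));          // bool(true)
```

The count comes out at exactly 140, which matches the explanation in post #14: the text was cut to fit the old limit once the RT prefix was added.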

I regret causing anyone any unnecessary work on this topic.