fromJSON deserializing skips 'url' node of every other Twitter user tree. Any ideas?


#1

Hi All,

I’m at my wit’s end in troubleshooting this and hope that the community can help, regardless of their preferred programming language.

I’m basically doing Twitter user lookups for a set of user ids via the appropriate API get request (users/lookup). However, it seems something happens in parsing the JSON, via RJSONIO’s fromJSON function, that results in it effectively nullifying the ‘url’ of every other user’s JSON tree.

Here’s my code (in R, but should be intuitive to everyone):

library(httr)     # GET, add_headers, content
library(RJSONIO)  # fromJSON

url_req_users <- paste0("https://api.twitter.com/1.1/users/lookup.json?user_id=", user_ids)
req_users <- GET(url_req_users, add_headers(Authorization = token))
raw_req_users <- content(req_users, as = "text", type = "application/json")
rt_users <- fromJSON(raw_req_users)
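A side note on building the request URL: users/lookup takes a comma-separated list of up to 100 IDs per request, so if user_ids is a vector rather than a single string, paste0 alone would produce one URL per ID. A minimal sketch of batching, assuming the IDs are held as character strings; build_lookup_urls is a hypothetical helper name, not part of httr or RJSONIO:

```r
# Hypothetical helper: split IDs into batches and build one lookup URL per batch.
build_lookup_urls <- function(ids, chunk_size = 100) {
  base <- "https://api.twitter.com/1.1/users/lookup.json?user_id="
  # Assign each ID to a batch number: 1, 1, ..., 2, 2, ...
  batch <- ceiling(seq_along(ids) / chunk_size)
  chunks <- split(ids, batch)
  # Collapse each batch into a single comma-separated query value.
  urls <- vapply(chunks, function(x) paste0(base, paste(x, collapse = ",")),
                 character(1))
  unname(urls)
}

# IDs kept as strings so large values are never reformatted as numbers.
user_ids <- c("15242531", "497425416", "1476765594", "151581782", "144585909")
lookup_urls <- build_lookup_urls(user_ids)
print(lookup_urls)
```

Each element of lookup_urls could then be passed to GET in turn. Keeping the IDs as strings is a deliberate choice here, since some Twitter IDs exceed R's integer range.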

I then preview the data and get this for one record:

> rt_users[[1]]$url
[1] "https://t.co/JDT0ZEWO8X"

But then I get this for the next record (for every other record, actually):

> rt_users[[2]]$url
NULL
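To see how widespread the NULLs are, rather than checking records one at a time, the parsed list can be tallied in one pass. The sketch below uses a small hand-built list as a stand-in for the parsed response (the IDs and URLs are made up), so it runs without any API call:

```r
# Stand-in for the list returned by fromJSON; all values are invented.
rt_users <- list(
  list(id_str = "1", url = "https://t.co/aaaaaaaaaa"),
  list(id_str = "2", url = NULL),                      # url present but null
  list(id_str = "3", url = "https://t.co/bbbbbbbbbb"),
  list(id_str = "4")                                   # url missing entirely
)

# TRUE where a record carries a non-NULL url.
has_url <- vapply(rt_users, function(u) !is.null(u[["url"]]), logical(1))

# IDs of the records without a url, for checking against the live profiles.
missing_ids <- vapply(rt_users[!has_url], function(u) u[["id_str"]], character(1))

# Share of records without a url.
share_missing <- mean(!has_url)

print(missing_ids)
print(share_missing)
```

Comparing missing_ids against the actual profiles shows whether the parser dropped a value or the user simply never set one.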

Any ideas? For what it’s worth, I’d really prefer to stick with RJSONIO, as I use it in other parts of my application.

Thanks, in advance, for your help!

-Jude C.


#2

One thing to check is whether a user has a URL set in their profile; not all users do. Also, the url field in the user object is different from any URLs that appear in the user description. Those can be found in "entities":{"description":{"urls":[...]}}; see the thread "Why do User Entities have only ‘urls’ field and not others?" for a bit more on entities in descriptions.

Do post an example if you find a user that has a url in the profile and it doesn’t appear (or is failing to parse) in the JSON response.
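As a rough illustration of the two places a URL can live, here is a hand-built user object shaped like the relevant part of a users/lookup record. The values are placeholders, and only the fields needed for the comparison are included:

```r
# Hand-built stand-in for one parsed user record; all values are placeholders.
user <- list(
  id_str = "1",
  url = "https://t.co/example1",                        # profile URL field
  description = "Data person. Blog: https://t.co/example2",
  entities = list(
    description = list(
      urls = list(
        list(url = "https://t.co/example2",
             expanded_url = "https://example.org/blog")
      )
    )
  )
)

# The profile URL: NULL when the user has not set one.
profile_url <- user[["url"]]

# URLs mentioned inside the description text, a separate field altogether.
description_urls <- vapply(user[["entities"]][["description"]][["urls"]],
                           function(x) x[["url"]], character(1))

print(profile_url)
print(description_urls)
```

A user can have description URLs and still have a NULL profile url, which is worth ruling out before suspecting the parser.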


#3

Hey there Igor,

Yes, >90% of the users in question have a URL set in their profile. Here’s the URL for the 2nd user mentioned above: https://t.co/QpZ7AUsmDf

Also, I don’t get the URL from their description. I’m getting it from the ‘url’ field just before the first instance of the profile image url (see https://dev.twitter.com/rest/reference/get/users/lookup), or at least I believe so, since I’m just reducing the larger list down to a few fields (id_str, name, screen_name, url, description). However, I’m only getting the URL for every other user (i.e. I only get URLs for user #1, 3, 5, 7, etc).
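For reference, reducing each record to those five fields can be sketched as below. This is only an assumed approach, not the code actually used in the application: rt_users is a hand-built stand-in with made-up values, and pick_field is a hypothetical helper that turns a NULL into NA so every row keeps the same columns.

```r
# Stand-in for the parsed response; all values are invented.
rt_users <- list(
  list(id_str = "1", name = "User A", screen_name = "user_a",
       url = "https://t.co/aaaaaaaaaa", description = "First user",
       followers_count = 10),
  list(id_str = "2", name = "User B", screen_name = "user_b",
       url = NULL, description = "Second user", followers_count = 20)
)

fields <- c("id_str", "name", "screen_name", "url", "description")

# Hypothetical helper: return the field as a string, or NA if it is NULL/absent.
pick_field <- function(u, f) {
  v <- u[[f]]
  if (is.null(v)) NA_character_ else as.character(v)
}

# One named character vector per user, then bind the rows into a data frame.
rows <- lapply(rt_users, function(u) {
  vapply(fields, function(f) pick_field(u, f), character(1))
})
users_df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

print(users_df)
print(sum(is.na(users_df$url)))  # number of users without a profile URL
```

With NULLs mapped to NA, the missing URLs stay visible in the table instead of silently shortening a column.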

Any ideas?

Hope you’re having a good day.

-JC


#4

Hey Igor. I’m just following up. Any ideas?


#5

Odd indeed! Is there a set of user ids where this keeps happening?

I ran this and it seems to have worked as expected (using twitteR; maybe the issue is in how RJSONIO parses JSON?)

library(twitteR)
setup_twitter_oauth(consumer_key, consumer_secret, access_token, token_secret)

user_ids <- list(
    26985539,   # No URL
    184926151,  # No URL
    227070529,  # Has URL
    250134112,  # Has URL
    3187999843, # Private Account, No URL
    3433981359, # Has URL
    22271344    # Has URL
    )

users <- lookupUsers(user_ids)

for (user in users) {
    print(paste("User ID: ", user$id, " URL: ", user$url))
}

Outputs:

[1] "Using direct authentication"
[1] "User ID:  26985539  URL:  "
[1] "User ID:  184926151  URL:  "
[1] "User ID:  227070529  URL:  http://t.co/D4Qw3bNeCU"
[1] "User ID:  250134112  URL:  http://t.co/l7OPDtH7fC"
[1] "User ID:  3187999843  URL:  "
[1] "User ID:  3433981359  URL:  http://t.co/iiph8QYIz4"
[1] "User ID:  22271344  URL:  http://t.co/FDgTCw0A3S"


#6

Yeah, I fear it’s an RJSONIO thing. :-\ – And I was hoping not to use so many higher level packages, as this is for a marketing data science portfolio.

Anyways, here’s a short subset of those IDs that don’t result in a URL:
15242531,497425416,1476765594,151581782,144585909,342853151,22211844,738702469,383331867,11431782,50842890,14556184,1313835319,4080927977,4165723461,595530055,57435738,19862894,3295596556,267527802

Let me know if you happen to find anything, Igor. Thank you again for your help! And if we don’t figure it out, I guess I’ll just do the lookup via twitteR.

Hope all’s well.

-JC


#7

Oh, snap. I figured it out, Igor! Yes, it seemed that every other ID (in my larger set) had no URL upon parsing, but then I listed all the URLs side by side and saw no real pattern, so I went and checked the URLs in these users’ actual profiles.

It turns out that about 40% of Twitter users don’t actually provide a URL!

Sorry, I just assumed that most everyone on Twitter includes a URL in their profiles. I thought it was a rare exception when, in fact, it’s almost a 50/50 rule.

Thanks for your help.

-JC


#8

Great, glad to see you figured this out, Jude - and thanks too to Igor for his very detailed help.

