Is it possible to request a random username via the API?

user
api

#1

I’ve written a little app that shows the most recent tweets of 8 Twitter users, drawn sort-of randomly. I would like to make the selection of usernames actually random.

I don’t want the user of a random tweet, or a random followee or follower of any given user. Those techniques bias towards active, popular or highly networked users.

At the other extreme, I do not want to generate random 1-15 character strings, since the user namespace is highly clustered. There are many more usernames beginning with ‘mar’ (as in Marcela, Mark, Maryam, Martin, etc.) than there are usernames beginning with ‘_w7’.

I want drawing @Tiana3, an egg who set up a spam account in 2008, tweeted once, and quit, to be exactly as likely as drawing @katyperry.

Given that the namespace is both sparse and clustered, a really random (or even nearly random) username can only come from Twitter (and, I hope, from the REST API).


#2

Hey Clay! Great to have you here in the dev forums.

Unfortunately, we do not have a specific API method that would let you do this directly, i.e. draw a random username from the hat.

I’ve not tested this, but since every screen_name is backed by a unique ID, how about generating a random ID number and querying for it? I suspect you’ll get a fair number of deleted or suspended account hits, so you’d need to handle errors. Also, since user ID values recently started to be snowflake IDs rather than shorter values, you’d need to handle that case too, probably by generating numbers in two pools and then selecting between them. It’s very hacky, but it might be a way forward?
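A minimal, untested sketch of the two-pool idea in Python. Everything named here is an assumption for illustration: `LEGACY_MAX_ID` is a made-up ceiling for the older, shorter IDs, the snowflake layout (millisecond timestamp shifted left by 22 bits, counted from the epoch 1288834974657) is the commonly described one rather than something confirmed in this thread, and `lookup` is a hypothetical callable that you would back with whatever user-lookup call you use, returning `None` for missing, deleted or suspended accounts.

```python
import random

# Assumptions, not confirmed values: adjust to whatever you observe.
TWITTER_EPOCH_MS = 1288834974657   # commonly cited snowflake epoch (ms)
LEGACY_MAX_ID = 4_000_000_000      # hypothetical ceiling for short, older IDs


def random_legacy_id(rng):
    """Draw uniformly from the pool of older, shorter IDs."""
    return rng.randint(1, LEGACY_MAX_ID)


def random_snowflake_id(rng, start_ms, end_ms):
    """Draw a snowflake-shaped ID whose timestamp lies in [start_ms, end_ms]."""
    timestamp = rng.randint(start_ms, end_ms) - TWITTER_EPOCH_MS
    low_bits = rng.getrandbits(22)  # machine and sequence bits, randomised
    return (timestamp << 22) | low_bits


def random_candidate_id(rng, legacy_weight, start_ms, end_ms):
    """Pick a pool with probability legacy_weight, then draw an ID from it."""
    if rng.random() < legacy_weight:
        return random_legacy_id(rng)
    return random_snowflake_id(rng, start_ms, end_ms)


def draw_user(lookup, rng, legacy_weight, start_ms, end_ms, max_attempts=1000):
    """Retry until lookup() finds an account or the attempts run out.

    lookup is a hypothetical callable: it takes an ID and returns a user
    object, or None when the ID is unused, deleted or suspended.
    """
    for _ in range(max_attempts):
        candidate = random_candidate_id(rng, legacy_weight, start_ms, end_ms)
        user = lookup(candidate)
        if user is not None:
            return user
    return None
```

The `legacy_weight` parameter would ideally match the share of accounts that still have short IDs, which is another number you would have to estimate yourself.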


#3

Andy, thanks.

I’ve played with some random generation of both names and IDs, but since both Twitter’s namespace and snowflake-space are huge, names or numbers generated at random almost never match an actual user’s ID. (If Twitter has had ~1B usernames generated over the last 10 years, and the namespace @[a-z0-9_]{1,15} is ~3.3*10^23, I’d have to check 300 trillion random strings to get, on average, one match.)
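For anyone who wants to check that arithmetic, here is a quick Python sketch. The 37-character alphabet comes from the pattern above; the 1 billion figure is just my rough guess from this post, not a real count.

```python
ALPHABET_SIZE = 37            # a-z, 0-9 and underscore
MAX_LENGTH = 15
REGISTERED_NAMES = 10**9      # rough guess from the post, not a real count

# Names of exactly 15 characters dominate the namespace.
fifteen_char_names = ALPHABET_SIZE ** MAX_LENGTH

# Full namespace: every length from 1 to 15.
namespace = sum(ALPHABET_SIZE ** n for n in range(1, MAX_LENGTH + 1))

# Average number of uniformly random strings tried per hit.
expected_tries = namespace / REGISTERED_NAMES

print(f"15-character names: {fifteen_char_names:.2e}")
print(f"whole namespace:    {namespace:.2e}")
print(f"tries per match:    {expected_tries:.2e}")
```

Both namespace figures come out at a few times 10^23, so the expected number of tries is in the hundreds of trillions either way.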

Worse, working backwards like that will under-match most usernames, which are highly clustered around short memorable strings. There are roughly 2.6 billion valid potential names of each of the forms @mar[a-z0-9_]{1,6} and @w7qh_8t5[a-z0-9_]{1,6}, but the former will have at least tens of thousands of matches, and probably hundreds of thousands. I’d be surprised if the latter had even one.

So I’ll just throw a mix of dictionaries and randomness at the problem, and assume that that will be weird enough, even though it won’t be as weird as Twitter itself.