While going through some tweets I had collected through the REST API, I found that the ‘lang’ field had tagged several tweets as ‘iw’ and ‘in’. The blog post that introduces the ‘lang’ metadata cites BCP 47 (http://tools.ietf.org/html/bcp47) as the basis for the two-letter codes being used. BCP 47 notes:
"By contrast, the subtags ‘he’ and ‘iw’ share a ‘Description’ value of “Hebrew”; this is permitted because ‘iw’ is deprecated and its ‘Preferred-Value’ is ‘he’."
However, I could not find ‘in’ in the list of ISO 639-1 two-letter codes. From my dataset’s context, I’m guessing that it refers to Indonesian, but the correct code for that is ‘id’.
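For anyone hitting the same thing, here is a minimal sketch of how the tags could be normalized on the client side in the meantime. It assumes each tweet has already been parsed into a dict with a ‘lang’ key; the names DEPRECATED_LANG_CODES and normalize_lang are my own and not part of any Twitter library. The mapping lists the deprecated two-letter subtags together with their preferred values as I understand them from the IANA language subtag registry (‘ji’ for Yiddish is included for completeness, although I have not seen it in my data).

```python
# Deprecated two-letter language subtags mapped to their preferred values.
# Assumption: this small table mirrors the IANA language subtag registry.
DEPRECATED_LANG_CODES = {
    "iw": "he",  # Hebrew
    "in": "id",  # Indonesian
    "ji": "yi",  # Yiddish
}


def normalize_lang(code):
    """Return the preferred code for a deprecated tag, else the code unchanged."""
    if code is None:
        return None
    code = code.lower()
    return DEPRECATED_LANG_CODES.get(code, code)


# Hypothetical, already-parsed tweets; only the 'lang' key matters here.
tweets = [{"lang": "iw"}, {"lang": "in"}, {"lang": "en"}]
normalized = [normalize_lang(tweet.get("lang")) for tweet in tweets]
print(normalized)
```

Codes that are not in the table are passed through untouched, so this should be safe to run over a mixed dataset.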
Just wanted to let the devs know that these tags are incorrect. Thanks!