Mark:
The sampling is based on a hash that is completely agnostic to any substantive metadata, so the sample should be a fair and proportional representation across all cross-sections: whether a Tweet contains links, hashtags, @-mentions, or @-replies, which app/client generated it, and so on.
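
To illustrate why a metadata-agnostic hash gives proportional cross-sections, here is a minimal sketch. It is not the actual sampling algorithm, which is not described here. The function `in_sample`, the SHA-256 bucketing rule, the 1% rate, and the synthetic `has_link` field are all assumptions made for the example. The point is only that the keep/drop decision never looks at the Tweet's content, so any attribute should appear in the sample at roughly the same rate as in the full population.

```python
import hashlib
import random


def in_sample(tweet_id: int, rate: float = 0.01) -> bool:
    """Hypothetical rule: keep a Tweet when a hash of its ID lands in the
    lowest `rate` fraction of 10,000 buckets. Only the ID is hashed, so the
    decision cannot depend on links, hashtags, mentions, or client."""
    digest = hashlib.sha256(str(tweet_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < round(rate * 10_000)


def link_share(tweets) -> float:
    """Fraction of Tweets in `tweets` that carry a link."""
    return sum(t["has_link"] for t in tweets) / len(tweets)


# Synthetic population (assumption): about 30% of Tweets contain a link.
rng = random.Random(0)
population = [{"id": i, "has_link": rng.random() < 0.3} for i in range(200_000)]

# Apply the hash rule; the metadata is carried along but never inspected.
sample = [t for t in population if in_sample(t["id"])]

print(f"population link share: {link_share(population):.3f}")
print(f"sample link share:     {link_share(sample):.3f}")
print(f"sample size:           {len(sample)}")
```

The two link shares should come out close to each other, and the same comparison could be repeated for any other attribute (hashtags, @-replies, client), since none of them feeds into the hash.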