Twitter Cards Not Displaying


#1

We have certain pages that are not displaying the Twitter card, despite the page looking fine via the Card Validator tool. Here is the link I’m having issues with: https://www.commonsensemedia.org/guide/essential-school-tools

This is what we see when we validate this link:
*.commonsensemedia.org is whitelisted for summary_large_image card
INFO: Page fetched successfully
INFO: 22 metatags were found
INFO: twitter:card = summary_large_image tag found
INFO: Card loaded successfully

Any help is appreciated!


#2

The robots.txt file (https://d2e111jq13me73.cloudfront.net/robots.txt) for the server where the image is hosted prevents Twitterbot from accessing your image. Please check our troubleshooting FAQ post and links for information.
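
To make the point above concrete: the robots.txt that matters is the one on the host serving the card image, which may differ from the host serving the page. Below is a minimal sketch, assuming the page declares its image in a `twitter:image` meta tag. The sample HTML, the image path, and the names `TwitterMeta` and `robots_url_for` are made up for illustration; this is not how Twitter's fetcher is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


class TwitterMeta(HTMLParser):
    """Collects <meta name="twitter:..."> tags from a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or attrs.get("property") or ""
        if name.startswith("twitter:"):
            self.tags[name] = attrs.get("content", "")


def robots_url_for(resource_url):
    """Return the robots.txt URL on the same scheme and host as resource_url."""
    parts = urlsplit(resource_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Illustrative markup only; the image path is invented for this example.
SAMPLE_HTML = """
<head>
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:image" content="https://d2e111jq13me73.cloudfront.net/sites/default/files/example.jpg">
</head>
"""

parser = TwitterMeta()
parser.feed(SAMPLE_HTML)
image_robots = robots_url_for(parser.tags["twitter:image"])
print(image_robots)
```

If the image URL and the page URL are on different hosts, both robots.txt files are worth checking, since a page can be fetched successfully while its image is still blocked.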


#3

Hi Andy,

Thanks for the reply. I checked our robots.txt, and it currently has Allow rules for our image paths. Also, some of our content is showing an image in the validator (https://www.commonsensemedia.org/the-common-sense-census-media-use-by-tweens-and-teens-infographic for example).

Thanks!


#4

I’m a bit surprised as to how the image appears in one case and not in the other, since the robots.txt file blocks all robots from crawling the image URLs (checked using this tool - http://tools.seobook.com/robots-txt/analyzer/ and confirmed with the behaviour of our fetcher).

My current hypothesis is that there’s a caching thing at work here, and maybe the image for one was cached previously when the image was accessible. There’s a crawler cache which gets updated every 7 days, so it could be that one or the other image will resolve in the future.


#5

Hi Andy,

So you do not think it’s our robots.txt file, but maybe a caching issue? I checked the two URLs I listed in this ticket against the robots.txt analyzer and they both got the same “Robots allowed: All robots” message. Also, if we were blocking the Twitterbot, wouldn’t that be displayed in the Twitter Card Validator, or no?


#6

Did you check the image URLs, or the post URLs? That could be the difference.

Yes, again, I’m surprised there is a difference in the way the two URLs are appearing. I think it might be a cache thing, but I’m finding it difficult to tell what is happening in this particular case.


#7

I tested both the image URLs and the post URLs. The links I tested are below; both are allowed by robots.txt and both have been on our site for months, so I do not think it is cache. We only just started noticing that not all of our image cards are loading when a link is shared on Twitter. I saw some other threads where a Twitter staff member said you recently updated your infrastructure. Could this be a result of that?



#8

I think it was me that mentioned the recent change - it is possible that it is behind this, but I’ve seen most of the issues resolved by now.

Again, I’ve checked the robots file in the SEO analyser tool this morning, and it says that all robots are denied access to that image, which is why our fetcher is not grabbing it. There’s a chance that we were reading the file differently before, but are interpreting it more strictly now, which could explain the change in behaviour against your site. I realise that this is frustrating and confusing, so I can only apologise for that!


#9

Thanks Andy. I’m still a bit perplexed by this issue, so will bring this up with our engineers. I will let you know if we have any additional questions.


#10

Hi Andy,

We have checked into our robots.txt and can confirm that everything is working as it is supposed to. Can you confirm that Twitter supports the Allow directive with your crawlers that hit robots.txt?

Thanks!
Cassie


#11

Yes, it does - see our troubleshooting page for our suggested robots.txt directive.


#12

Hi Andy,

We have been looking over our robots.txt and below are two directives that are part of our image file paths.
User-agent: *
Allow: /sites/default/files/styles/*
Disallow: /sites/

So, given the above this URL: https://www.commonsensemedia.org/sites/default/files/styles/blog_article/public/blog/csm-blog/2016-08-10-10tvkids-blog-1138x658.jpg why would this URL not pass, since it fits into the “allow pattern”? Do you see that passing or disallowing?
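
One way two robots.txt parsers could disagree about these rules is in how they treat the trailing `*`. The sketch below compares two interpretations: a wildcard-aware matcher, and a literal-prefix matcher that gives `*` no special meaning. It is only an illustration of that difference, not Twitter's fetcher logic, which this thread does not describe. The function names are made up, and the precedence rule (longest matching pattern wins, Allow wins ties) is an assumption borrowed from the convention Google documents.

```python
import re

# The two directives quoted above, in order.
RULES = [
    ("allow", "/sites/default/files/styles/*"),
    ("disallow", "/sites/"),
]

PATH = ("/sites/default/files/styles/blog_article/public/blog/csm-blog/"
        "2016-08-10-10tvkids-blog-1138x658.jpg")


def prefix_match(pattern, path):
    """Literal matching: the pattern is a plain prefix and '*' is just a character."""
    return path.startswith(pattern)


def wildcard_match(pattern, path):
    """Wildcard matching: '*' matches any run of characters, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


def is_allowed(path, rules, matcher):
    """Assumed precedence: longest matching pattern wins, Allow wins ties, no match means allowed."""
    best = None
    for kind, pattern in rules:
        if matcher(pattern, path):
            key = (len(pattern), kind == "allow")
            if best is None or key > best[0]:
                best = (key, kind)
    return best is None or best[1] == "allow"


print("wildcard-aware:", is_allowed(PATH, RULES, wildcard_match))
print("literal prefix:", is_allowed(PATH, RULES, prefix_match))
```

In this sketch the wildcard-aware matcher allows the image, while the literal-prefix matcher never matches the Allow line (no path contains a literal `*` at that position) and falls through to `Disallow: /sites/`. Writing the rule as `Allow: /sites/default/files/styles/` without the trailing `*` is allowed under both interpretations here, so it may be worth testing.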


#13

Hi Andy,

Following up on this: do you have any insight into my question above?

Thanks!
Cassie


#14

Hi Cassie, I have not had a chance to dig in with the developers on whatever change has happened, so I don’t have anything to share - so sorry about that. It’s very odd from what I can debug. Apologies, I’ll continue to see if I can get time for anyone to take a look.


#15

Thanks Andy, we’re anxiously awaiting your reply. We have turned off all of our Twitter share icons for the time being, because we do not like the user experience of sharing a “blank” Twitter card.


#16

Hi Andy, I just wanted to check if there was any update on this issue. We have removed all our Twitter buttons, but we want to turn them back on as soon as possible. We’re just waiting to resolve this image issue.


#17

What links are not working for you?


#18

Thanks Andy!

I work with Cassie (other user on this thread) so they’re the same links she has noted above.


#19

I am one of the developers working on this with Amy and Cassie. The main issue with the images not showing is that we have NOT changed anything in our robots.txt prior to the issue starting. Twitter previously read the share images and displayed them with the same robots.txt we have now, and we do not have similar issues with any other share service. Online robots.txt validators such as Google’s show that our rules are fine for the images we are trying to share.

So the issue is why does Twitter not like the robots.txt rules we have? We have not gotten an answer to that here so wondering how we can escalate this issue.

To summarize:

As we have been told that Twitter supports the Allow directive for robots.txt, our current set of directives that would be in context here should work:

User-agent: *
Allow: /sites/default/files/styles/*
Disallow: /sites/

For an example URL such as https://www.commonsensemedia.org/sites/default/files/styles/share_link_image_large/public/blog/csm-blog/2016-08-10-10tvkids-blog-1138x658.jpg

the above rules should work.

In addition, last Friday I made a change to test the same rules as above, but this time specifically for Twitterbot:

User-agent: Twitterbot
Allow: /sites/default/files/styles/*
Disallow: /sites/
User-agent: *
Disallow: /

Using the Twitter dev test tool for cards, the image still does not show (using dev.commonsensemedia.org domain).

So we are still where we started with this ticket: waiting on Twitter to tell us why the rules we have are blocking Twitter from fetching images, even though a reading of the rules themselves suggests the images should NOT be blocked for Twitterbot.
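
For reference, here is a minimal sketch of how a crawler might pick which group of the Twitterbot-specific file applies to it. It assumes a group starts at a `User-agent` line, that a named group takes priority over `*`, and that agent names are matched case-insensitively as substrings of the crawler's user-agent string. The helper names and the `Twitterbot/1.0` string are example values, and this is not a description of Twitter's actual parser.

```python
ROBOTS_TXT = """\
User-agent: Twitterbot
Allow: /sites/default/files/styles/*
Disallow: /sites/
User-agent: *
Disallow: /
"""


def parse_groups(text):
    """Split robots.txt text into {lowercased agent name: [(directive, pattern), ...]}."""
    groups = {}
    agents = []        # agents named by the group currently being read
    seen_rule = False  # True once the current group has at least one rule
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:
                # A User-agent line after rules starts a new group.
                agents, seen_rule = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow") and agents:
            seen_rule = True
            for agent in agents:
                groups[agent].append((field, value))
    return groups


def rules_for(groups, user_agent):
    """Return the rules of the first named group matching user_agent, else the '*' group."""
    ua = user_agent.lower()
    for agent, rules in groups.items():
        if agent != "*" and agent in ua:
            return rules
    return groups.get("*", [])


groups = parse_groups(ROBOTS_TXT)
print(rules_for(groups, "Twitterbot/1.0"))
print(rules_for(groups, "SomeOtherBot"))
```

Under these assumptions Twitterbot ends up with the same Allow/Disallow pair as in the `User-agent: *` version, so moving the rules into a Twitterbot group would not by itself change how the `Allow` pattern is interpreted.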


#20

Thanks. Yes I understand your confusion and frustration, I’m right there too as I’d like to see this resolved for you. We made some changes to our fetcher a couple of weeks ago and I’m still working with the teams involved to understand some of the edge cases that seem to have caused changed behaviour. When we have news we will be sure to share it.