Image in twitter:image with twitter:card = summary_large_image is not displayed in card

embeddedtweet
image
cards

#1

Issue:

  • On the blog of our client startnext.com, the specified image doesn’t show up in tweets.
  • twitter:image is set to the HTTPS URL of a JPG file hosted directly on the server. The file meets the minimum dimensions and doesn’t exceed the maximum file size or dimensions.
  • twitter:card is set to “summary_large_image”.
  • The robots.txt allows the folder /media/thumbnails/, where the JPG is located, for all user agents.
  • The JPG file is OK.
  • We don’t know why the image is not showing up in the card.
  • Please give us a hint for fixing this issue. :slight_smile:

URL affected:


Tweet: https://twitter.com/startnext/status/877789665641811968

Troubleshooting steps tried already:


#2

The robots.txt file does not allow access to the image, as the order of precedence of the rules prevents any access to /media/.


#3

Thanks for the quick reply.
So the order of precedence is from bottom to top, not from top to bottom? We are disallowing all files in /media/ from being indexed, but we want to allow bots like Twitter to index and access /media/thumbnails.

There seem to be conflicting opinions on the web.
For Google, a specific allow rule takes precedence over a more generic disallow rule: a path or file matched by the specific allow may be indexed even if its parent path is disallowed.
See: https://developers.google.com/search/reference/robots_txt?hl=en#order-of-precedence-for-group-member-records
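
To illustrate the precedence rule described in that Google document (the most specific, i.e. longest, matching path wins, and Allow wins a tie), here is a minimal Python sketch. The rule list and the example paths are hypothetical, wildcards are not handled, and this is not Google's or Twitter's actual implementation.

```python
def is_allowed(rules, path):
    """Evaluate (directive, prefix) rules with longest-match precedence.

    The rule with the longest matching prefix decides the result.
    If an Allow and a Disallow rule match with equal length, Allow wins.
    If no rule matches, the path is allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        # An empty prefix (e.g. "Disallow:") restricts nothing, so skip it.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# Hypothetical rules resembling the setup described in this thread.
rules = [("Disallow", "/media/"), ("Allow", "/media/thumbnails/")]

print(is_allowed(rules, "/media/thumbnails/example.jpg"))  # True
print(is_allowed(rules, "/media/other/example.jpg"))       # False
```

Under this interpretation, the order in which the two rules appear in the file makes no difference to the result.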

Would your advice be to swap the order of the allow / disallow rules for /media/ and /media/thumbnails?


#4

Your official documentation covers the same case: https://dev.twitter.com/cards/getting-started#crawling
According to it, our robots.txt is correct.
It first disallows everything in /media/ and then allows /media/thumbnails as an exception, just as in the documentation.

But your answer seems to say something different. Which is right?


#5

I agree that there are conflicting sources of information on this - and I apologise if our documentation is inaccurate, I’ll look into that as soon as I can. We intend to follow the Google specification on this.

I can tell you that when I attempt to fetch the image directly through our crawler, it reports that the robots.txt file is blocking access, so I would suggest either removing the disallow on the parent folder, or setting a directive before the all robots directive that grants Twitterbot access.
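
As a rough sketch of the second option, the snippet below puts a dedicated Twitterbot group before the all-robots group and checks the result locally with Python's standard-library `urllib.robotparser`. The robots.txt content, the example.com URLs, and the `can_fetch` helper are assumptions made for illustration, not the site's real file. The check only shows how Python's parser reads the rules and is no guarantee of how Twitterbot itself will behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Twitterbot gets its own group, placed before
# the group for all other robots. The Allow line comes first so the
# outcome is the same under first-match and longest-match evaluation.
ROBOTS_TXT = """\
User-agent: Twitterbot
Allow: /media/thumbnails/
Disallow: /media/

User-agent: *
Disallow: /media/
"""


def can_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URLs, not the real image location.
IMAGE_URL = "https://www.example.com/media/thumbnails/example.jpg"
OTHER_URL = "https://www.example.com/media/private/example.jpg"

print(can_fetch("Twitterbot", IMAGE_URL))  # thumbnail folder
print(can_fetch("Twitterbot", OTHER_URL))  # rest of /media/
```

To check a live file instead of an inline string, `RobotFileParser` also offers `set_url()` and `read()`.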


#6

Hey Andy,
thanks for your advice!
So we know the reason for the issue now.

Can we test the crawler result for ourselves after we have changed the robots.txt?

Regards,
Markus


#7
