Originally, Tweets were text only. Later, support for attaching a single image was introduced (entities.media).
When Tweets evolved further to support multiple images, GIFs, and videos, the extended_entities object was added. Applications that have not been updated to handle multiple images can still fall back to entities.media and display the first one.
If a Tweet contains more than one image, a GIF, or a video, the extended_entities object will be present.
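
The fallback described above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: it assumes the Tweet has already been parsed from JSON into a dict, and that extended_entities carries a media array in the same way entities does. The helper name get_media and the sample payload are hypothetical.

```python
from typing import Any, Dict, List


def get_media(tweet: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the Tweet's media objects, preferring extended_entities.

    Hypothetical helper: assumes `tweet` is a parsed Tweet JSON object
    with optional "extended_entities" and "entities" keys.
    """
    # Prefer the full media list (multiple images, GIF, or video).
    extended = tweet.get("extended_entities") or {}
    media = extended.get("media")
    if media:
        return media

    # Fall back to the single-image list, or an empty list for text-only Tweets.
    entities = tweet.get("entities") or {}
    return entities.get("media") or []


# Made-up sample payload with two photos, for illustration only.
sample_tweet = {
    "text": "Two photos",
    "entities": {"media": [{"id": 1, "type": "photo"}]},
    "extended_entities": {
        "media": [
            {"id": 1, "type": "photo"},
            {"id": 2, "type": "photo"},
        ]
    },
}

for item in get_media(sample_tweet):
    print(item["id"], item["type"])
```

Checking extended_entities first means an updated application sees every attached item, while a text-only Tweet simply yields an empty list.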