I would love to know if there is a good site/source for files of different media types.
In my own side projects, I'm often dealing with old, badly-documented formats with limited examples (like data files for games). I usually start with the "file" command to try to identify the filetype, then look on Wikipedia and filext.com to find links to format specifications. Usually, I can also find the name of any programs that create/edit/view that file type, and that's a jumping-off point to find examples (given the domain, it'll be anything from another game that uses the format to a 90s-era fan page with modified or fan-created data files).
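If "file" isn't available (or you want the same first step inside a script), a rough approximation is to compare the leading bytes against a few well-known magic numbers. This is only a sketch: the names `SIGNATURES`, `sniff` and `sniff_file` are made up for illustration, and the table is a tiny subset of what the real "file" magic database covers.

```python
# Minimal magic-number sniffing sketch. The table below is a small,
# hand-picked subset of well-known signatures, not a complete database.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x1f\x8b", "application/gzip"),
    (b"BM", "image/bmp"),  # very short signature, so false positives are possible
]

FALLBACK = "application/octet-stream"


def sniff(data: bytes) -> str:
    """Return a guessed MIME type for the given leading bytes."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return FALLBACK


def sniff_file(path: str) -> str:
    """Read the first few bytes of a file and guess its type."""
    with open(path, "rb") as f:
        return sniff(f.read(16))
```

Anything that comes back as the fallback type is where the Wikipedia/filext.com digging starts; for obscure game data formats that will be most files.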
I've used this site before too:
It provides links to a lot of format specifications, codec information, sometimes the mplayer samples that other comments here have linked to, etc.
It organizes files by mimetype. It's not complete, but it might be a good starting point.
You can also look at the testcase folder for afl-fuzz, which includes archives, images, and even an H264 compressed video:
On the same site, http://fileformats.archiveteam.org/wiki/Encyclopedia_of_Grap... points to an archive.org copy of a CD-ROM with sample images.
Via its BMP page, I found http://entropymine.com/jason/bmpsuite/, which looks like the definitive resource on that format.
Not nearly complete, of course. I can't find a set of test images from the ImageMagick project.
This would be an awesome GitHub project; it wouldn't even need an associated web page. You'd be amazed at all the different varieties of "legal" JPEG images, for example.
For large sets of some common media types take a look at the govdocs1 corpus:
For an odd example, sometimes a Google search will turn something up:
Here was my stab at that problem. It has somewhere near 70 files; many are variants of text/code files, iirc, but there are a lot of data-centric files as well. There is a small, neglected WordPress site linked to it.
I've used this site for videos of different sizes/filetypes before: http://www.sample-videos.com/
It is basically the "Big Buck Bunny" video in many sizes, durations and formats.
I've used it before when I built a system that processed video files.