Ask HN: Is there a site providing example files for “all” media types?
73 points by salzig on June 16, 2016 | 18 comments
Hej HN.

I would love to know if there is a good site/source for files of different media types.




What exactly do you want to do? A lot of the other answers assume that you want examples of different H.264 encodings, JPEG image varieties, etc.

In my own side projects, I'm often dealing with old, badly-documented formats with limited examples (like data files for games). I usually start with the "file" command to try to identify the filetype, then look on Wikipedia and filext.com to find links to format specifications. Usually, I can also find the name of any programs that create/edit/view that file type, and that's a jumping-off point to find examples (given the domain, it'll be anything from another game that uses the format to a 90s-era fan page with modified or fan-created data files).

I've used this site before too: https://wiki.multimedia.cx/index.php?title=Main_Page

It provides links to a lot of format specifications, codec information, sometimes the mplayer samples that other comments here have linked to, etc.
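
For the first step in the comment above (identifying an unknown file the way the "file" command does), here is a minimal Python sketch of magic-number sniffing. The signature table and the names sniff_bytes and sniff_file are illustrative assumptions, not part of file(1) or any library, and the table covers only a handful of well-known formats.

```python
# Hypothetical, heavily simplified take on what file(1) does:
# compare the first bytes of a file against known "magic" signatures.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"BM", "image/bmp"),
]

FALLBACK_TYPE = "application/octet-stream"


def sniff_bytes(data: bytes) -> str:
    """Return a media type guess for the given leading bytes."""
    for magic, media_type in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return media_type
    return FALLBACK_TYPE


def sniff_file(path: str) -> str:
    """Read the first few bytes of a file and guess its media type."""
    with open(path, "rb") as fh:
        return sniff_bytes(fh.read(16))
```

Container formats limit this approach: an .xlsx file is a ZIP archive, so this sketch would report it as application/zip.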


I started a project in this vein a while ago:

https://github.com/nbeaver/mimetype-menagerie

It organizes files by mimetype. It's not complete, but it might be a good starting point.

You can also look at the testcase folder for afl-fuzz, which includes archives, images, and even an H264 compressed video:

https://github.com/arisada/afl-fuzz/tree/master/testcases
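
As a rough illustration of organizing samples by MIME type, the sketch below groups filenames using Python's standard mimetypes module. It guesses from the file extension only, the function name group_by_media_type is hypothetical, and it is not meant to reflect how either linked repository is actually structured.

```python
import mimetypes
from collections import defaultdict

FALLBACK_TYPE = "application/octet-stream"


def group_by_media_type(filenames):
    """Group filenames by the media type guessed from their extension."""
    groups = defaultdict(list)
    for name in filenames:
        # guess_type returns (type, encoding); type is None when unknown.
        media_type, _encoding = mimetypes.guess_type(name)
        groups[media_type or FALLBACK_TYPE].append(name)
    return dict(groups)


if __name__ == "__main__":
    samples = ["a.png", "b.png", "c.html", "d.zzzunknown"]
    for media_type, names in sorted(group_by_media_type(samples).items()):
        print(media_type, names)
```

Extension-based guessing is cheap but easy to fool, so a real collection would probably want to verify file contents as well.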


You may want to look at http://fileformats.archiveteam.org/wiki/Category:Graphics. Collecting sample files from there is a bit of a hassle, but there are quite a few obscure formats.

On the same site, http://fileformats.archiveteam.org/wiki/Encyclopedia_of_Grap... points to an archive.org copy of a CD ROM with sample images.

Via its BMP page, I found http://entropymine.com/jason/bmpsuite/, which looks like the definitive resource on that format.


I've used the ffmpeg samples before:

https://samples.ffmpeg.org/

Not nearly complete, of course. I can't find a set of test images from the ImageMagick project.

This would be an awesome GitHub project; it wouldn't even need an associated web page. You'd be amazed at all the different varieties of "legal" JPEG images, for example.


Incidentally, https://samples.ffmpeg.org/ and https://samples.mplayerhq.hu/ are actually mirrors of the same collection, it seems.


Define "all" media types. Do you mean something like an example of each and every one of these? https://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf...


yes, kinda.


Apache Tika is a text extraction toolkit. They store a wide selection of file types for their parser tests: https://github.com/apache/tika/tree/master/tika-parsers/src/...

For large sets of some common media types, take a look at the govdocs1 corpus: http://digitalcorpora.org/corp/files/govdocs1/by_type/

For an odd example, sometimes a Google search will turn something up: https://www.google.com/?q=filetype:xlsx


The test files for the parser are available at: https://github.com/apache/tika/tree/master/tika-parsers/src/...


https://github.com/alexschiller/file-format-commons

Here was my stab at that problem. Somewhere near 70 files; many are variants on text files/code files iirc, but there are a lot of data-centric files as well. There is a small, neglected WordPress site linked to it.


Would like to find something like this as well.

I've used this site for videos of different sizes/filetypes before: http://www.sample-videos.com/


I've used this site in the past to test various video formats and sizes: http://www.sample-videos.com/

It is basically the "Big Buck Bunny" video in many sizes, durations and formats.


I don't know if there's any source with a wide variety of formats, but there are various sites with some samples. Here's one for H.264 videos, mostly movie trailers, encoded using various parameters:

http://www.h264info.com/clips.html

I've used it before when I built a system that processed video files.


http://www.iana.org/assignments/media-types/media-types.xhtm... is the canonical list of types. It does not include example files, though.


Is this about Cascading Style Sheet media types? Or MIME types for video, audio, etc.?


video, documents, whatever.


For video and audio, by far the best place is https://samples.mplayerhq.hu/, from the devs of MPlayer.


Dunno. If not, please go make it. Sounds like a useful public resource. It's one of those classic cross-cutting concerns. It would fit in nicely with a web where any piece of info or service you'd want is sitting at a URL, just a tab or curl away.



