Storing lots of files
3 points by lotsofiles on Oct 23, 2010 | 3 comments
I am looking for ideas on storing millions of small files. I want to build a system for uploading images. I am planning to resize the images and store thumbnails as well as the original image.

Originally I was planning to store the images on a shared file system that both the web server and the app servers could access. I would create a directory structure based on the ID of the uploading user, breaking the long integer ID into /xxx/yyy/zzz/image.png, and then store the photo info in the db.
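To make the /xxx/yyy/zzz layout concrete, here is a minimal sketch of one way to derive it. The function name `sharded_path` and the choice of zero-padding the ID to nine digits are assumptions for illustration only; any fixed-width scheme that keeps each directory small would do.

```python
from pathlib import PurePosixPath


def sharded_path(user_id: int, filename: str) -> PurePosixPath:
    """Map a numeric user ID to /xxx/yyy/zzz/filename (hypothetical helper)."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    # Assumption: IDs fit in nine decimal digits; pad with leading zeros.
    digits = str(user_id).zfill(9)
    if len(digits) > 9:
        raise ValueError("user_id does not fit in nine digits")
    # Split into three groups of three digits, one per directory level.
    parts = [digits[i:i + 3] for i in range(0, 9, 3)]
    return PurePosixPath("/", *parts, filename)


print(sharded_path(1234567, "image.png"))  # /001/234/567/image.png
```

With three digits per level, no directory holds more than 1000 subdirectories, which is what keeps listing and moving individual subtrees manageable.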

The problem with the file system approach is that moving a file system with millions of files could become unwieldy.

Then I started to think about storing them in a database, either MySQL or MongoDB, using a master-slave setup where writes go to the master as blobs and are then read by an app that serves the image. I was thinking of putting a cache in front of the db: either a memcached setup for the thumbnails or a cache written to local disk for serving the images.
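The local-disk cache idea could look roughly like the sketch below. It is not tied to MySQL or MongoDB: `fetch` stands in for whatever reads the blob from the database, and `cached_read`, `fake_db_fetch`, and the key format are all hypothetical names made up for this example.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def cached_read(key: str, fetch: Callable[[str], bytes], cache_dir: Path) -> bytes:
    """Return the bytes for key, calling fetch only on a cache miss."""
    # Hash the key so any string maps to a safe, flat file name.
    path = cache_dir / hashlib.sha256(key.encode("utf-8")).hexdigest()
    if path.exists():
        return path.read_bytes()
    data = fetch(key)
    # Write to a temporary file, then rename it into place, so a
    # concurrent reader does not see a half-written cache entry.
    with tempfile.NamedTemporaryFile(dir=cache_dir, delete=False) as tmp:
        tmp.write(data)
    os.replace(tmp.name, path)
    return data


# Stand-in for the database read; records each call so misses are visible.
calls = []


def fake_db_fetch(key: str) -> bytes:
    calls.append(key)
    return b"thumbnail-bytes-for-" + key.encode("utf-8")


cache_dir = Path(tempfile.mkdtemp())
first = cached_read("42/thumb.png", fake_db_fetch, cache_dir)
second = cached_read("42/thumb.png", fake_db_fetch, cache_dir)
print(len(calls))  # the second read is served from disk
```

This sketch has no eviction or invalidation, so it only suits images that never change once uploaded.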

I am soliciting any other ideas on how best to handle this situation. Thanks.




Not enough information: what are the access patterns like? How often do we add new files? Can we tolerate a disk read most of the time? Are we going to see Zipf-distributed popularity or even popularity?

BCC has happily chugged through a few million GIFs and PDFs using nested directories, on a single VPS. (Low concurrency requirements and all files being accessed less than ten times in a lifetime makes it easy.)

Anyhow, do the simplest thing that works, and rearchitect when you become Facebook. (You probably won't need to. But spend your resources to achieve success, not to guard against the problems of successful people.)


I'd go with the db + filesystem approach. It's listed here under "best of both worlds": http://www.hashmysql.org/index.php?title=Storing_files_in_th...


I've been keeping millions of files in single directories on XFS without problems. The big upside is that you won't need to learn any funky/new management tools.



