
A Keras multithreaded DataFrame generator for millions of image files - timehaven
https://techblog.appnexus.com/a-keras-multithreaded-dataframe-generator-for-millions-of-image-files-84d3027f6f43
======
timehaven
This is a tutorial-style post that offers an alternative approach to dealing
with large sets of training data, without resorting to copying and moving
files to hard-coded directories named `train`, `validation` and `test`.
Instead, you keep the files where they "naturally" reside on your system and
track their locations with a Pandas DataFrame, feeding their names to a
Keras generator. This approach scales well when dealing with millions of image
files and hundreds of gigabytes of data.
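
To make the idea concrete, here is a minimal sketch of what such a generator might look like. It is not the code from the post. The column names (`path`, `label`, `split`), the class name `DataFrameBatchIterator`, and the `fake_load` function are all assumptions made up for illustration; `fake_load` stands in for real image decoding so the example runs without any image files on disk.

```python
import threading

import numpy as np
import pandas as pd


class DataFrameBatchIterator:
    """Endless, thread-safe iterator over (inputs, labels) batches.

    Hypothetical sketch: file paths live in a DataFrame column instead of
    being encoded in train/validation/test directory names.
    """

    def __init__(self, df, load_fn, batch_size=32, shuffle=True, seed=0,
                 path_col="path", label_col="label"):
        self.df = df.reset_index(drop=True)
        self.load_fn = load_fn
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.path_col = path_col
        self.label_col = label_col
        self.rng = np.random.default_rng(seed)
        self.lock = threading.Lock()
        self.order = self._new_order()
        self.pos = 0

    def _new_order(self):
        # Row positions for one pass over the data, optionally shuffled.
        idx = np.arange(len(self.df))
        if self.shuffle:
            self.rng.shuffle(idx)
        return idx

    def __iter__(self):
        return self

    def __next__(self):
        # Only the index bookkeeping is locked; the (slow) file loading
        # below runs outside the lock so worker threads can overlap I/O.
        with self.lock:
            if self.pos >= len(self.order):
                self.order = self._new_order()
                self.pos = 0
            batch_idx = self.order[self.pos:self.pos + self.batch_size]
            self.pos += self.batch_size
        rows = self.df.iloc[batch_idx]
        x = np.stack([self.load_fn(p) for p in rows[self.path_col]])
        y = rows[self.label_col].to_numpy()
        return x, y


def fake_load(path):
    # Stand-in for real image decoding: returns a fixed-shape array.
    return np.zeros((8, 8, 3), dtype=np.float32)


# Files stay wherever they are; the DataFrame records location and split.
files = pd.DataFrame({
    "path": [f"/data/site_{i % 3}/img_{i}.jpg" for i in range(10)],
    "label": [i % 2 for i in range(10)],
    "split": ["train"] * 8 + ["validation"] * 2,
})

train_batches = DataFrameBatchIterator(
    files[files["split"] == "train"], fake_load, batch_size=4)
x_batch, y_batch = next(train_batches)
```

Selecting the training or validation set becomes a DataFrame filter rather than a file copy, and because the lock guards only the position counter, several threads can call `next()` on the same iterator without handing out the same rows twice within a pass.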

