I would consider showing summary numbers for each dataset in the default view, such as the number of observations and variables. That's probably the biggest weakness, IMO, of current data portals (including data.gov): you have to click through every link only to find out there's not much data in the set. In your case, this applies to several of the datasets you've included. Take the Gun Ownership and Crime Rates set, for example. Unless I'm missing something, it has fewer than 40 observations and relies on highly questionable nationwide numbers from the FBI. That can't possibly be of any use in a machine learning context, can it? I'm surprised it's even the basis for an academic paper (though kudos to the authors for posting their work). If you still think it's worth keeping that dataset, it'd be nice to know the number of observations before having to click through.
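
For what it's worth, here is a rough sketch of how those counts could be computed for a listing page. It assumes each dataset is available as a CSV with a header row, which may not hold for everything on the site. The function name `dataset_shape` and the toy data are made up for illustration.

```python
import csv
import io


def dataset_shape(csv_file):
    """Return (n_observations, n_variables) for a CSV with a header row.

    Assumption: the first row is the header and each later non-blank
    row is one observation.
    """
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> no variables
    n_rows = sum(1 for row in reader if row)  # blank lines are skipped
    return n_rows, len(header)


# Hypothetical toy data, not taken from any real dataset.
sample = io.StringIO(
    "state,ownership_rate,crime_rate\n"
    "A,0.31,412\n"
    "B,0.22,390\n"
)
n_obs, n_vars = dataset_shape(sample)
print(f"{n_obs} observations x {n_vars} variables")
```

Showing that one line next to each dataset title would let people skip the tiny sets without clicking through.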