Hacker News
Model Zoo – Pretrained deep learning models (modelzoo.co)
148 points by jonbaer 37 days ago | 24 comments

As a researcher who has work "featured" here, I am confused.

I was going through the projects and found my own, "Quasi-Recurrent Neural Network"[1]. I was interested to see what kind of model would have been trained, since the QRNN is a component, not a model. Upon clicking it, I'm shown the README I wrote on GitHub and a "Get model" button that takes you to the GitHub repo, where I know there isn't a model.

This is true for other projects I looked at.

A model zoo without the models seems misleading especially when the claim is "Model Zoo - Pretrained deep learning models" ...

That the site already has an Advertise button on top of this makes me a little concerned.

[1]: https://modelzoo.co/model/quasi-recurrent-neural-network-qrn...

To use a pre-trained model as a component, you might want to look at https://www.tensorflow.org/hub/

The current model offering is limited, but it seems to be heading in the right direction.

Hey, I'm the one who made the site. Thanks for your feedback. Most of the models listed on the site are taken from lists of deep learning resources (and framework specific model zoos), so I did not validate every repo to ensure they actually offer pretrained models. I guess a more accurate text for the link would be "view code" or "more info" instead of "get model". Indeed a lot of papers don't release their models but just the code.

The goal of the site was to create a common platform to search and aggregate models (or code) available for reimplementation. I'm planning to add tools to allow users to flag or report errors on pages, since most of the content is automatically scraped.

I hate to be "that guy" but I have pretty serious problems with the site and your response here. Respectfully, you're admitting that you're not doing the thing the website principally claims. It doesn't matter what you intended - the site is claiming that it aggregates pretrained models.

Not only does it not do that, but it apparently only scrapes third-party resources, with little to no manual oversight of whether a given code repository even contains a pretrained model. But wait: your plan going forward is to offload the moderation onto users?! So instead of you being responsible for the content, the users (who ostensibly came to your site because they couldn't find what they were looking for) are now obligated to do due diligence. What's the difference between this and just searching GitHub for paper titles or keywords?

The final issue I have is more meta. I don't really see the value of the site as implemented. Why are you automatically scraping all of these resources? Why don't you curate them yourself and demonstrate that competency to the community? As it stands this is blatantly misleading and seems like a transparent attempt to cash in views for buzzwords, regardless of whether or not the user is ultimately helped by the content.

Sorry for being harsh, but this is kind of brazenly inept. I can understand that automatically scraping these resources gives you a lot of leeway to scale up inclusion to make it more viable. However you really can't just turn on a scraper, direct it at a few keywords and tell your users to sort it out. Users will want this site to make their lives easier instead of wading into the complexity themselves. You're not reducing that complexity, you're just adding another layer of abstraction to it.

> I guess a more accurate text for the link would be "view code" or "more info" instead of "get model"

It's not about an accurate link caption. It's that you made a site whose whole purpose is getting models, you call it "model zoo", and then you don't link to models and don't check whether models exist behind all your links! A site for aggregating models (for the projects that provide them) and code already exists: it's GitHub.

Your site simply does not work.

> I did not validate every repo to ensure they actually offer pretrained models.

please do.

> so I did not validate every repo to ensure they actually offer pretrained models

You don't have to manually verify each one--that's what machines are for. If you find a file that is recognized as a model, list it and link to the model. Obviously a README is not a model.
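
A minimal sketch of such an automated check, assuming the scraper already has a list of file paths for each repository. The extension set and the function names `find_model_files` and `listing_label` are illustrative assumptions, not anything the site is known to use, and a check like this would miss weights hosted outside the repository (for example, downloads linked from a README).

```python
from pathlib import PurePosixPath

# Assumed, non-exhaustive set of extensions commonly used for serialized
# model weights. Some (e.g. ".pb", ".npz") can also hold non-model data,
# so a match is only a hint that a pretrained model may be present.
MODEL_EXTENSIONS = {
    ".pt", ".pth", ".ckpt", ".h5", ".hdf5",
    ".pb", ".onnx", ".caffemodel", ".tflite", ".npz",
}


def find_model_files(repo_paths):
    """Return the paths whose extension suggests a serialized model."""
    return [
        path for path in repo_paths
        if PurePosixPath(path).suffix.lower() in MODEL_EXTENSIONS
    ]


def listing_label(repo_paths):
    """Choose link text based on whether a likely model file was found."""
    return "Get model" if find_model_files(repo_paths) else "View code"
```

For a repository that only contains `README.md` and `train.py`, `listing_label` would fall back to "View code" instead of promising a model.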

Just an FYI - neat though this is, there does exist a related underrated thing called OpenML that hosts data, models, and individual runs: https://www.openml.org/ . Really want this to be more well known...

OpenML looks great! I hope it gets a bit of attention from this thread. Thanks for sharing!

Thanks; this is sort of what I was expecting when I clicked on the original link.

So if I understand correctly, as it is right now, it is just a Github scraper with an advertisement button, right?

Caffe has a thing called exactly "model zoo"[1], which is also a collection of pre-trained models. The submitted site looks like a borrowed idea (+ borrowed name).

[1]: https://github.com/BVLC/caffe/wiki/Model-Zoo

I always figured 'model zoo' was a general term for a collection of trained models.

Full disclosure: I'm one of the creators, but I'm working on a platform that has pretrained models with notebooks that I've personally verified work correctly and are well documented and explained. We also host the models ourselves so you can easily download them via curl or wget.

If anyone is looking for curated pre-trained DL models that have IPython Notebooks that run out of the box, check out https://modeldepot.io

If you have some pretrained models that you'd love to share, feel free to share them via the submit button :)

We might not have the volume of modelzoo.co, but we have a focus on quality and understandability, especially for those that are newer to the ML/DL field.

Great product, I'll definitely share it!

This looks quite nice in the sense that it's an index of deep learning projects, but it only seems to copy the README and link to the GitHub projects of those papers.

I thought by clicking "Get Model" I would get the model right away but it just redirects me to the github page of the project.

There is certainly value in getting information about all these models in one place, but I feel more friction could be eliminated by providing the ability to download the model files directly.

This seems to be a directory of neural network and RL related projects not a model zoo at all.

It's a pity - I'm still looking for a WikisumWeb pretrained model!

This thread could serve as an informative discussion on how to properly cite open, collaborative ML tools and resources. If anyone reading is active in the OpenML and OpenAI communities, could you provide some best practices for attribution and curation of these resources? There are already some pretty obvious recommendations: the project name or description of the site should be accurate. Describe what the site actually does, and credit those whose models are used. Hopefully someone can provide some formal specifications for how to do this.

Could this kind of model registry enable efficient model search based on input/output queries, a la this paper about semantic code search? https://kstolee.github.io/papers/JSS2015.pdf

Perhaps the registry would need to be a lot larger before semantic search would be really useful?

This looks like a pretty neat idea. Do you cross-check the models to see if they are working? Or do you just select them based on stars?

Most of the models are retrieved from "awesome" lists or other framework specific model zoos - I didn't have time to check all the models. I'm planning to add tools for reporting non-functional models for this purpose.

You didn't have time? Was this site developed under a deadline?

I'll just add to the other comments here that I think you need to try a little harder to make this site do what it claims to do.

"Model Zoo" is the thing that is going to fuck us up when the machines finally rise up

blank page with JS disabled
