Show HN: End-to-end deep learning experimentation platform for Tensorflow (github.com/polyaxon)
151 points by xboost on July 21, 2017 | 21 comments


I'm taking a course on Deep Learning right now, because my boss asked me to. It's Udacity course number UD730.

The first homework assignment was image recognition on notMNIST. That dataset has been used thousands of times before, so I decided to make it more interesting! I got data for 1500 Chinese fonts, so I could make a better OCR for Chinese characters.

I put that into the example, made the "pickle" files, trained the model, got about 90% accuracy on the test set... but now what?

How do I take that model and make a real image OCR that can take a photo from my phone and analyse it?

I can write scripts to get data, and follow the steps in a course to make a model, but unless I have a useful output, I don't know how I'm going to apply this to my other programs. If you're offering "end to end", then documentation and examples of some kind of API would be great. (e.g. send a PNG to this REST service, and get the prediction back as text).


> How do I take that model and make a real image OCR that can take a photo from my phone and analyse it?

I assume you're talking about some kind of endpoint to submit the image to, and offload the computation entirely.

However, I just wanted to clarify for people who are new to this that running a neural net on a mobile device usually requires entirely different models, because of the memory/compute limitations. It's a rapidly developing field, but it is not quite as accessible as developing for desktops yet.

I do believe TensorFlow allows you to run models on mobile now, though [0].

[0]: https://www.tensorflow.org/mobile/


Thank you for the suggestions. All of those look like servers for the Tensorflow model (i.e. take what I already have on my laptop, and put it in the cloud).

Real life doesn't give me folders of training data to sort. It gives me photos and videos.

Applying a systems engineering approach, I think I need to:

1. Break the picture down into thousands of small square tiles.

2. Look for a character inside each tile.

3. If a character is found, add the character to the output text.

That brings new questions. How big should each square tile be? What if the characters are not perfectly flat? How do I avoid getting duplicates from tiles that are next to each other?

Whether the tile->text conversion is done with a simple classifier or a "deep learning model", the bulk of the work is done with non-TensorFlow programming. I don't know of any open-source projects I could build on and retrain with my data. So the most likely result is that I'll show off the fancy TensorFlow work as a "portfolio project" and archive it, never to be used in production.
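
To make that glue-code part concrete, here's a rough sliding-window sketch of steps 1-3. The tile size, stride, and the classify_tile() callback are all placeholders, not anything from the course or the project:

    def ocr_image(image, classify_tile, tile_size=32, stride=16, min_confidence=0.9):
        """Slide a square window over a 2-D grayscale image (a numpy array).

        classify_tile(tile) stands in for whatever model you trained: it should
        return (character, confidence) for a tile_size x tile_size array.
        """
        h, w = image.shape
        found = []
        for y in range(0, h - tile_size + 1, stride):
            for x in range(0, w - tile_size + 1, stride):
                tile = image[y:y + tile_size, x:x + tile_size]
                char, confidence = classify_tile(tile)
                if confidence >= min_confidence:
                    found.append((y, x, char))
        # Neighbouring tiles will often see the same character, so some form of
        # non-maximum suppression / de-duplication is still needed here.
        return found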


It might be fairly simple to write up a REST API with Flask and expose your models?
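
Something along these lines, for example. The predict_png() function is a placeholder for however you load and run your trained model; the route and field name are just illustrative:

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def predict_png(png_bytes):
        # Placeholder: decode the PNG, preprocess it, and run your trained model.
        raise NotImplementedError

    @app.route('/predict', methods=['POST'])
    def predict():
        png_bytes = request.files['image'].read()  # multipart form field named "image"
        return jsonify({'prediction': predict_png(png_bytes)})

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)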


It is pretty easy to deploy these models on Algorithmia.


Use an API like TensorFlow Serving (which actually uses gRPC, iirc) or DeepDetect.
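
A client-side call against TensorFlow Serving's gRPC API looks roughly like the sketch below (based on the standard TF Serving client examples; the model name, signature name, input key, and shape are assumptions you'd replace with whatever you exported):

    import numpy as np
    import tensorflow as tf
    from grpc.beta import implementations
    from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

    channel = implementations.insecure_channel('localhost', 9000)
    stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'chinese_ocr'        # assumed model name
    request.model_spec.signature_name = 'predict'  # assumed signature name

    image = np.zeros((1, 32, 32, 1), dtype=np.float32)  # your preprocessed tile
    request.inputs['images'].CopyFrom(
        tf.contrib.util.make_tensor_proto(image, shape=image.shape))

    result = stub.Predict(request, 10.0)  # 10-second timeout
    print(result)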


Could you explain how it differs from Keras? I've looked quickly at the blog intro and examples, and it looks very similar to me.


As mentioned on the GitHub page, the project was inspired by Keras and other great projects, but many decisions did not completely fit with the way Keras does things. There will be support for Keras models in the future, but currently we are trying to finish the work on the web API, the web UI, and the CLI.


> but many decisions did not completely fit with the way Keras does things

That's interesting. Could you expand on this some more?


Sure, one of the reasons we couldn't base the code on Keras at the time we started working on the project was that Keras was not yet integrated with the Estimator API, meaning that we couldn't have used the distributed training offered by the TensorFlow team.


Ok, sounds like a good reason. But what is the situation now? Did the Keras team remove that limitation in the meantime?


I understand your concerns; we had the same reaction from some of our friends. I can assure you that the integration is planned, and any redundancy, duplication, or patches that we have right now for some TF code will be removed.

Thank you for taking the time to have a look at the project, and I am very happy that we are receiving such constructive criticism.


It was always possible to train Keras models in a distributed setting (I was doing it in late 2015). And there's built-in, one-line integration with the Estimator API coming in the next version of TensorFlow.


If I want to do MNIST with my own custom images (PNG, either for training or inference), what steps do I have to follow here to get it working?


Hi, thanks for the comment. Here's an example of a configuration file for a convolutional denoising autoencoder, where some preprocessing is applied in the input pipeline: https://github.com/polyaxon/polyaxon/blob/master/examples/co...


Thanks xboost. Could you elaborate more on that? Is there a way to convert image files to TF records directly (as a pipeline parameter)? Great project by the way! The easier it becomes to set up end-to-end training, the easier it will be to use custom data.


Sure, we are preparing some ways to automate dataset creation and versioning. For now, the way to feed data is through the built-in TensorFlow numpy/pandas input functions; there are some examples where this pipeline configuration is used. Otherwise, there are different pipeline modules that can be used to feed the data, especially for training, where TF records can be faster.

The way to create a TF record is still manual; here's an image data converter that was used to create the MNIST dataset:

https://github.com/polyaxon/polyaxon/blob/master/polyaxon/da...

More data converters will be available soon.
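
Until those converters land, a minimal TF 1.x sketch of writing image/label pairs into a TFRecord file looks roughly like this. The feature names ('image', 'label') and the output path are purely illustrative, not part of the project's schema:

    import numpy as np
    import tensorflow as tf

    def write_tfrecord(images, labels, path):
        """images: uint8 array of shape (N, H, W, C); labels: int array of shape (N,)."""
        with tf.python_io.TFRecordWriter(path) as writer:
            for image, label in zip(images, labels):
                example = tf.train.Example(features=tf.train.Features(feature={
                    'image': tf.train.Feature(
                        bytes_list=tf.train.BytesList(value=[image.tobytes()])),
                    'label': tf.train.Feature(
                        int64_list=tf.train.Int64List(value=[int(label)])),
                }))
                writer.write(example.SerializeToString())

    write_tfrecord(np.zeros((10, 28, 28, 1), dtype=np.uint8),
                   np.zeros(10, dtype=np.int64), 'train.tfrecord')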

Once you have a record, a numpy array, or a pandas DataFrame, you can basically use any operation/layer on your data by providing the feature name and the list of operations to apply. In general, only operations that are necessary for data augmentation should be done in the input data pipeline; otherwise, everything should be done during the creation of the TF records, to minimize the computation.

One last thing: for reinforcement learning, the way we feed data is through the feed_dict, because interaction with an environment is necessary.


Lately, whenever I see a new deep learning library, I think of https://xkcd.com/927/


I really can't stand that cartoon anymore, especially since 99% of the time it's used to criticize efforts that are not attempting to be either universal or standardized. Instead, it's linked whenever anyone does anything that resembles anything that has been done before. I think people try to use it as "ha ha, I get that reference!" But, in a feedback thread, it comes across as discouragement for its own sake.


I had no intention of criticizing any efforts. One should do whatever makes them happy, and as a frequent user of these libraries, it is to my own advantage for them to improve.

For the same reasons, I have nothing against people who "do anything that resembles anything that has been done before," and I certainly wouldn't attempt to discourage the author. I'm sorry if that cartoon is over-referenced (in attempts to be negative); I can't really control that, but I think it's something to think about when starting a new project (i.e. what is your goal, does it differ from other projects, is that important to you?)

Sorry if my comment seemed like I was shooting this down, I certainly had no intent of doing that.


I agree, it's incredibly overused



