



The open source alternatives you list seem to provide only experiment logging. MLflow seems to support more (such as model deployment).

Not to claim that the deployment processes are _good_, just that MLflow seems more general than the open source alternatives listed here.


How about SageMaker? Can we include it in this list? I played with SageMaker some time ago, and it helps you build a whole pipeline to host your models, in addition to hosting your notebooks and bridging the gap between data scientists and data engineers.


Anecdotally, we considered using the hosted versions of Jupyter and Apache Zeppelin that are part of AWS SageMaker and EMR. We couldn't figure out a simple/familiar workflow for keeping the notebooks under version control. So, we agreed to run the notebooks locally, use a familiar Git-based workflow, and interact with the AWS infrastructure through the local notebook instances.


Does Zeppelin work naturally with Git? I've been struggling to get the right setup with just Jupyter.


Well, good question. The file format for Jupyter is not ideal for 'code craftsmanship', as pointed out by another comment. There are utilities to strip out some of the metadata from the Jupyter files, such as rendered output and run counters, but that is a trade-off to be decided by your team:

https://github.com/kynan/nbstripout
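
To make the trade-off concrete, here is a minimal sketch of what "stripping" a notebook involves. It assumes the nbformat 4 JSON layout (a top-level "cells" list, with "outputs" and "execution_count" on code cells). It is not how nbstripout itself is implemented, and the function name strip_notebook is made up for illustration.

```python
import copy
import json


def strip_notebook(nb: dict) -> dict:
    """Return a copy of a notebook dict without rendered output or run counters.

    Assumes the nbformat 4 layout; strip_notebook is a hypothetical helper,
    not part of Jupyter or nbstripout.
    """
    cleaned = copy.deepcopy(nb)  # leave the caller's notebook untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered output
            cell["execution_count"] = None  # drop the run counter
    return cleaned


if __name__ == "__main__":
    # A tiny hand-written notebook, used only for demonstration.
    example = {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": 7,
                "metadata": {},
                "outputs": [
                    {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
                ],
                "source": ["print('hi')"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    print(json.dumps(strip_notebook(example), indent=1))
```

The source of each cell survives, so diffs show only code and markdown changes. The cost is that the committed notebook no longer shows results, which is the part your team has to decide on.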


For deep learning, deepdetect can be useful in both the dev and prod phases.



