PyTorch, a year in (pytorch.org)
306 points by ghosthamlet 10 months ago | 42 comments

Great update, it's been an exciting year for the project.

I love PyTorch for tinkering and experimenting.

In my experience, there's very little 'impedance mismatch' with PyTorch, meaning the framework rarely gets in my way. I never find myself 'wrestling' with the API. I expect this is only going to get better now that one of the project's explicit goals is to match numpy's API and semantics as much as possible over time.

Congratulations to the PyTorch community. You guys have done a great job!

In terms of impedance mismatch, I wish the PyTorch API were more similar to NumPy's, e.g. using .shape instead of .size, and using the same method names where possible. It seems like a small detail, but it could make PyTorch a little more intuitive.

The latest versions of PyTorch (compiled from repo) already do this:

  >>> x = torch.rand(4, 3)
  >>> x.size()
  torch.Size([4, 3])
  >>> x.shape
  torch.Size([4, 3])
Their explicitly stated goal is to match numpy semantics as much as possible :-)

I'd love to see the foundational layer of PyTorch integrated into Numpy, so that e.g. Numpy matrix-multiplications can be performed on the GPU without rewriting (much) code.
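This already works reasonably well in the other direction: a minimal sketch of moving a NumPy-style matmul onto PyTorch (and the GPU, when one is available). This is illustrative code, not part of either library's official integration story, and it falls back to NumPy-only if PyTorch isn't installed.

```python
import numpy as np

# The NumPy baseline: a plain CPU matrix multiplication.
a = np.random.rand(64, 32)
b = np.random.rand(32, 16)
c_np = a @ b

try:
    import torch

    # PyTorch tensors mirror much of the NumPy API, so the same
    # expression runs unchanged; only the device placement differs.
    ta, tb = torch.from_numpy(a), torch.from_numpy(b)
    if torch.cuda.is_available():
        ta, tb = ta.cuda(), tb.cuda()
    c_torch = (ta @ tb).cpu().numpy()
    match = np.allclose(c_np, c_torch)
except ImportError:
    match = None  # PyTorch not installed; the NumPy result stands alone
```

The point is that the call site (`a @ b`) is identical; only the tensor construction and device transfer change.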

Perhaps Numba might help, accelerated by the CUDA toolkit.


They mention increased NumPy API compatibility in the article; that seems to cover what you're asking for. FWIW, I agree: there are some things NumPy does that TensorFlow doesn't that I'd like to see, .T for transpose being one of them, though maybe I'm not understanding something there. I'm probably just bikeshedding a little though ;)
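For concreteness, the `.T` shorthand being missed looks like this in NumPy (TensorFlow's equivalent, `tf.transpose(x)`, is more verbose):

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# NumPy's .T attribute transposes without a function call.
# In TensorFlow you would write tf.transpose(x) instead.
print(x.T.shape)  # -> (3, 2)
```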

It is our framework of choice, especially when prototyping and implementing new differentiable programs. We use Caffe2 in production right now, mainly because of Windows support for some of our customers. But since both of them support the ONNX exchange format, we can prototype and train in PyTorch and then deploy the model using the Caffe2 CPU version.
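The export side of that workflow is a one-liner in PyTorch. A hedged sketch (the toy model and filename are illustrative, and the block degrades gracefully if PyTorch or its ONNX exporter is unavailable in the environment):

```python
try:
    import torch
    import torch.nn as nn

    # A toy model standing in for whatever was prototyped and trained.
    model = nn.Sequential(nn.Linear(10, 5), nn.ReLU())

    # ONNX export works by tracing: the dummy input is run through the
    # model once to record the graph, which is then serialized.
    dummy = torch.randn(1, 10)
    torch.onnx.export(model, dummy, "model.onnx")
    exported = True
except Exception:
    exported = False  # torch missing, or export unsupported here
```

The resulting `model.onnx` file is what Caffe2 (or another ONNX consumer) would load for inference.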

I just want to call out something with ONNX. It is slated to improve this year, but be very careful what you try to do with it. A lot of basic things don't work yet.

See: http://pytorch.org/docs/0.4.0/onnx.html

The other frameworks are using (in my opinion) slightly misleading language as to how "ready" it is. ONNX to Caffe2 is Facebook's use case, so I could see that being supported fine. It's hard to go much beyond that though.

Support for ONNX will be bottlenecked by what PyTorch can export right now. The file format just hit 1.0, but it will take some time for the ecosystem around it (including export) to mature.

Disclaimer: I am a framework vendor who has spent the last few months messing with it for end users writing model import, for TensorFlow's file format as well.

Yes, ONNX support is not mature, even for PyTorch --> Caffe2. There are problems with basic things like affine BatchNormalization. But still, the situation is a lot better now... before, we had to write our own converters from one framework to another.

> Caffe2 to onnx is facebook's use case so I could see that being supported fine. It's hard to go much beyond that though.

ONNX to Caffe2*. Though it's possible to go from Caffe2 to ONNX to some custom hardware accelerator (TensorRT?)

Ahh yes sorry for the typo. Fixed!

And it will be possible but again support isn't actually implemented by anyone yet.

They just "signed on".

TensorFlow is the only supported format right now for TensorRT.

Wow. I'm a passerby who had heard of PyTorch on HN and has been on the sidelines about machine learning and deep learning. I just read this summary and feel inspired to kick the tyres and start learning.

Maybe I will find some use for it in my sysadmin world.

Thanks for such a summary.

Machine learning is about the math, not about the frameworks.

Downvoted because while it is true that the math is important, a framework like PyTorch allows for more idiomatic Python code. TensorFlow is in its own world, being mostly written in C++ and trying to cater to multiple client languages. That means the same math (the same networks) will be coded differently in the two libraries/frameworks.

Are you always this eager to douse someone's enthusiasm for trying a new technology?

Machine Learning is about the math but is also about the code (including frameworks), and most of the time also about the domain specific contours of the problem. (ime)

That said, there's been a religious flamewar over which-is-better-TensorFlow/Keras-or-PyTorch that's dragged down a lot of productive discussion about ML/DL frameworks.

Especially since the answer is obviously PyTorch. :)

Seriously though, we're in the early days of ML and still understanding the correct patterns and abstractions.

Web dev went through this, to the point where we had "JS fatigue", until things started settling on the reactive model and React / Vue / Angular 2.

"Seriously though, we're in the early days of ML"

In the early days of "differentiable programming" perhaps. Certainly not ML.

This has got to be one of the most important things for keeping an ecosystem active. If you don't have the spectacle of the fireworks and the heat of the flame war to keep people's interest, you are in danger. Controversy and religion are excellent and important health indicators, because they reflect passion within your community, while giving passersby a fascinating train wreck to watch and discuss.

Attention is an exceedingly precious commodity, potentially the most precious and valuable commodity in the marketplace. Skilled developers must learn how to direct this so that it stops being commandeered by posers all the time.

Chainer is the best. Plain NumPy means that somebody else thinks about broadcasting rules and other stuff that has nothing specific to do with neural nets, and CuPy just implements them.
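Those broadcasting rules are exactly the NumPy behavior that a NumPy-mirroring library like CuPy inherits for free, e.g.:

```python
import numpy as np

# A (2, 1) column and a (3,) row broadcast to a (2, 3) result:
# NumPy aligns trailing dimensions and stretches size-1 axes.
col = np.array([[1.0], [2.0]])       # shape (2, 1)
row = np.array([10.0, 20.0, 30.0])   # shape (3,)

print((col + row).shape)  # -> (2, 3)
```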

Imho, as a user, it is rather silly to find that a framework requires me to define a dependency graph instead of just coding the operations directly. In my career so far, no compiler has asked me to do the same. So why would neural networks be an exception?

Optimization and embedded devices. In theory, knowing the full graph upfront allows the framework to optimize and fuse some operations.

For embedded devices you may not have access to Python. You can precompile the graph to the target device in such cases.
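A toy illustration of that fusion argument (pure Python, not any real framework's API): with the whole graph known up front, runs of elementwise ops can be collapsed into a single kernel, avoiding intermediate memory traffic.

```python
def fuse_elementwise(ops, elementwise=frozenset({"add", "mul", "relu"})):
    """Collapse consecutive elementwise ops into single fused ops."""
    fused, run = [], []
    for op in ops:
        if op in elementwise:
            run.append(op)          # keep accumulating the fusable run
        else:
            if run:                 # flush the run before a barrier op
                fused.append("fused(" + "+".join(run) + ")")
                run = []
            fused.append(op)
    if run:
        fused.append("fused(" + "+".join(run) + ")")
    return fused

graph = ["add", "mul", "relu", "matmul", "add"]
print(fuse_elementwise(graph))
# -> ['fused(add+mul+relu)', 'matmul', 'fused(add)']
```

An eager framework, seeing one op at a time, can't easily do this; a define-then-run framework can, which is one reason the annoying graph abstraction persists.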

Note that in practice, PyTorch is as fast as or faster than TensorFlow, and newer versions allow you to "post-compute" the graph and export to ONNX, allowing embedded inference using Caffe2.

Indeed. Frameworks seem like embryonic languages, which are extremely awkward to work with (compared to Python) but easy to compile to the GPU.

A relevant essay: "On Machine Learning and Programming Languages" https://julialang.org/blog/2017/12/ml&pl

Because if you are using a language designed around one execution model, but you need another execution model, defining a graph is one of the better ways to do it, even if it is annoying. You're basically writing a compiler that takes your problem and converts it into another representation that will actually be run.

In the maximally abstract sense, Python isn't necessarily the best choice for this, but since the closest manifestation of the correct way to do this that I know of is probably Haskell, that seems unlikely to beat out Python any time soon. There are certainly a lot of worse languages.

Ultimately, you'll want a language that has the necessary primitives built in natively, but it doesn't sound like we're to that level of maturity in the field yet.
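The "define the graph, then run it" split can be sketched in a few lines of plain Python (toy code, not any real framework's API): building the expression does no arithmetic, and a separate run step evaluates it with concrete values, much like `session.run` in graph-mode TensorFlow.

```python
class Node:
    """A node in a tiny deferred-execution expression graph."""
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

    def run(self, feed):
        if self.op == "input":
            return feed[self.inputs[0]]          # look up a fed value
        vals = [n.run(feed) for n in self.inputs]  # evaluate children first
        if self.op == "add":
            return vals[0] + vals[1]
        if self.op == "mul":
            return vals[0] * vals[1]
        raise ValueError("unknown op: " + self.op)

# Build the graph first -- no computation happens yet.
x = Node("input", "x")
y = Node("input", "y")
z = Node("add", Node("mul", x, x), y)   # z = x*x + y

# ...then execute it with concrete values.
print(z.run({"x": 3, "y": 4}))  # -> 13
```

The `run` call is the "compiler back end" the parent comment describes: the graph is just a second representation of the program that something else interprets or lowers.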

Chainer does have its tradeoff though. It's not exactly smooth to generate CUDA kernels from pure Python code.

Also, Chainer is held back by the fact that it is made and maintained by a Japanese company and community, or at least the majority of them are Japanese.

Most of the community is Japanese, and many model implementations are made by Japanese users, hence blogged about and documented in Japanese.

Chainer has had a fate similar to Ruby's. The language barrier is real.

> It's not exactly smooth to generate CUDA kernels from pure Python code.

Can you give an example? What's not smooth?

Yo! I've read articles on your blog quite a few times while working on my thesis. Great work!

I've been using PyTorch for quite a while, and had to use TensorFlow recently. Maybe it's my PyTorch _priors_, but using TensorFlow felt weird and unintuitive. Any good resources for PyTorch folks to get into TensorFlow?

TensorFlow's eager execution might challenge PyTorch. I for one love the imperative style of PyTorch, but I also realize the advantages of TF's precomputed-graph style, especially when it comes to distribution. Interesting times!

As a member of the Lua community I cry every time...

But congrats on the hard work! I've been planning to try it out for a while now and the amount of resources and docs is great.

Convince me to use Lua. While Lua gets the job done, as an outsider who has dabbled in Lua (Torch, Love2D, ESP8266) I think the Lua ecosystem has two major problems:

- 5.1 vs 5.2 vs LuaJIT. This is worse than Python 2 vs 3

- Lack of a universal standard on how to do OOP, etc. Python has PEP8.

Sometimes coding in Lua feels like JavaScript, only somewhat worse.

I can cope with 1-based indexing (think of it like using Matlab), but the other problems are holding me back.

It's not my job to convince you of anything. But while I agree with the lack of universal standard, PEP8 is great, I disagree with all the rest.

The changes between Lua versions are nothing like the abysmal difference between Python 2 and Python 3. Not only is 5.1 code very likely to run on 5.3 and LuaJIT without issues; if there ever are any issues, they are always very minor and solvable by requiring a simple compat library.

As someone who has done a lot of Lua and JavaScript, I have always had the feeling that coding in Lua is better by orders of magnitude. It's a far more consistent language.

Thanks for the reply. Since I am learning deep learning, I still need to read and run Lua Torch code every now and then. But when I have the choice, I choose PyTorch.

P.S. I did not downvote you. Somebody else did

That image-to-image transform of the horse and zebra is something I've never seen before. Soon (or perhaps already) we won't be able to trust anything we see with our own eyes anymore; the potential for manipulation is scary. The repercussions are unthinkable with respect to mob mentality and gullibility.

We live in a post-truth society now [1]. OK, maybe we're not all the way there yet, but it's interesting (and scary) to speculate about what society might look like if it becomes much harder to verify whether a given statement is true or false.

[1] https://en.wikipedia.org/wiki/Post-truth_politics

Why is this a campaign link? Not a direct link?

I wish HN would strip tracking cgi args in software.

I wonder how anyone can even install torch:

  luarocks install torch
  Error: No results matching query were found.

  /opt/torch$ ./install.sh
  /opt/torch/extra/cutorch/lib/THC/generic/THCTensorMath.cu(393): error: more than one operator "==" matches these operands:

PyTorch was born from Lua Torch, but they're not interchangeable, and raising installation issues with Lua Torch when the article is discussing PyTorch is likely to confuse people.

PyTorch helpfully provides clear installation instructions for each platform and package manager at http://pytorch.org/ and the team have consistently been careful to ensure they work simply.

It's pretty easy to get started from their homepage: http://pytorch.org/

For example, for Python 2.7 / pip / OS X:

  pip install http://download.pytorch.org/whl/torch-0.3.0.post4-cp27-none-...
  pip install torchvision

I have used Torch without issues, but that was a while ago. It looks like Lua Torch has been more or less deprecated in favor of PyTorch.
