
Disclaimer: I am a data scientist.

I feel the complete opposite. I really enjoy working with Python over any other language. R does linear models and time series better, and MATLAB has its charm, but overall I prefer Python. Python is so easy to read and quick to program in. I am so glad I am not in the Java/C++ world anymore, but I know people in different roles have to deal with different issues.




> I really enjoy working with python over any other language.

I assume you mean "over any other language I have tried"?

As someone with a mathematical background myself, I am always surprised at how many data scientists and quants are ignoring more mathematically principled languages like F#, OCaml and Haskell.


> F#, OCaml and Haskell

Can I quickly prototype a new deep learning model and scale it to a 32 GPU cluster with very little effort in those languages?


If you put in as much time as you have learning Python, then the answer is probably yes.


> Probably yes

What does it mean? Have you done it in any of those languages? Have you seen it done in any of those languages?


> What does it mean? Have you done it in any of those languages?

I did. I've been doing image processing recently and I use OCaml for prototyping. I tried Python (I used it a lot for that a long time ago) and failed; it felt too awkward. I've described my experience here [1].

If you have no experience whatsoever with the ML family [2] and are doing all your stuff in Python, you'll most likely be much more productive with Python, of course.

But I find ML-like languages way more pleasant, and I'm far more productive with ML and libraries like Owl [3], which are more fundamental and don't have fancy stuff, than with Python and fancy libs like numpy/scipy.

Also, Julia could be a good choice, hitting a sweet spot between fancy libraries and a fancy language.

[1] https://news.ycombinator.com/item?id=20457505

[2] https://en.wikipedia.org/wiki/ML_(programming_language)

[3] https://ocaml.xyz/


Right now I’m experimenting with a pretty complicated model (60+ layers of multiple types), and I plan to train it on several hundred GB of data, using an 8-16 node cluster (4 GPUs per node). Does Owl have a well-tested and well-documented autograd library with distributed GPU support (e.g. Horovod)? With a good choice of optimizers, regularizers, normalizers, etc., so I can focus on my model and not on debugging the plumbing or implementing basic ops from scratch? And last, but not least, it must be as fast as TF/PyTorch.

If the answer is “no”, then it does not matter whether I’m an OCaml expert, because I’m still going to be more productive with Python.

p.s. Julia is nice though, hopefully it will keep growing.
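
For anyone wondering what the "autograd" requirement above refers to: here is a toy sketch, in plain Python, of the bookkeeping such a library automates. It is only an illustration; the `Value` class and everything in it are made up for this example and are not the API of Owl, TF or PyTorch. A real library adds tensors, GPU kernels, distribution and a lot of testing on top of this idea.

```python
class Value:
    """A scalar that remembers how it was computed (toy reverse-mode autodiff)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent), one per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Order the graph so a node's gradient is complete before it is
        # pushed to its parents (post-order DFS, then walk it in reverse).
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad  # chain rule


# Hypothetical one-neuron "model": y = w * x + b, loss = y^2
x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
loss = y * y
loss.backward()
```

After `loss.backward()`, `w.grad`, `x.grad` and `b.grad` hold the derivatives of the loss with respect to each input. Multiply this by every op, dtype and device and you get a sense of the plumbing the question is about.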


I feel what you're saying is that regardless of how subpar a language is compared to the alternatives, as long as it has community-built libraries that solve your specific problems, you're more productive using them than anything else.

Which is of course a fair point. A language by itself is probably not even in the top 3 considerations when choosing new tech. Stuff like runtime, ecosystem and the amount of available developers would probably be more important in most cases.


> A language by itself is probably not even in the top 3 considerations when choosing new tech. Stuff like runtime, ecosystem and the amount of available developers would probably be more important in most cases.

Totally depends on the domain. In serious mission-critical software you won't use libraries, but you will use the language.


Yeah, I don't disagree. But even there you would have similar considerations besides the language. Like most still end up with C/C++ there even though there are others like Crystal, Nim, but you just don't find developers who know them easily, nor do you have any ecosystem support.


> Like most still end up with C/C++ there even though there are others like Crystal, Nim,

Because C++ and C are significantly better than Nim and Crystal.

There are also Ada and SPARK for aerospace and other very critical stuff.

> just don't find developers who know them easily

We don't look for OCaml/Ada developers, we hire programmers, and they program OCaml and Ada. It's not a big deal for a good programmer to learn a language, especially while programming side by side with seasoned programmers.


> how subpar a language is compared to alternatives

In my 6 years with Python, the only dissatisfaction with the language I felt was from parallel programming. I switched to Python from C, and at the time, I missed C's transparency and control over the machine, but that was compensated for by Python's conciseness and convenience. Then I had to dig into C++ and I didn't like it at all. Then I played with CUDA, OpenMP and Cilk+, but I wished all of that were natively available in a single, universal language. Then I started using Theano, then TensorFlow, and now I'm using PyTorch, and I am more or less happy with it. It suits my needs well. If something else emerges with a clear advantage, I'll switch to it, but I'm not seeing it yet.


Although I'm not sure what you mean by "deep learning", you can take a look at Spark: https://spark.apache.org/mllib/

As a bonus, it IS Python (numpy) in the background mixed with Scala. So you can use each language where they make the most sense - Python for the maths number crunching and Scala for the business logic and the architecture.

I think Spark also has .NET bindings (so you can also tick F# on that list...).


As much as I love the languages you mentioned: I think it's a major weakness of them that they don't have the linear algebra libraries integrated such that you can do this the same way Python does.

There are a lot of reasons why this is.


Not until the libraries get built. They didn't exist for Python either until fairly recently in the history of all these languages.


For those unaware: Haskell has a REPL (ghci), and you can make files more script-like with the (currently most popular) build tool stack[0] if you include:

  #!/usr/bin/env stack
  -- stack --resolver lts-6.25 script --package turtle
(as you see it includes easy dependency management :) )

0: https://docs.haskellstack.org/en/stable/README/


This doesn't answer their question though. A REPL isn't magical.


I don't know about F# and OCaml, but Haskell's numerical libraries really pale compared to numpy.


It's language vs libraries. If you have a library that has a function

    get_the_shit_done_quick ()
then you don't care much about the language.

When you don't have such a function, you need an expressive language to write it (and a bulk of Python libs are not written in Python, though mostly for performance reasons).

So it's all about finding a sweet spot between fancy libraries, which do the shit for you, and a fancy language, which lets you express things that are absent from the libraries.

This sweet spot differs from domain to domain, from user to user. Even in numerical stuff someone could have a requirement for a better language, although this domain is indeed well defined enough to have plenty of fancy libraries.
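
A rough sketch of that trade-off, in plain Python so it runs anywhere. The mean is a library call you never think about; the line fit is written by hand, pretending no library offered it (in reality numpy/scipy do, so treat `fit_line` as a hypothetical stand-in for whatever your libraries are missing).

```python
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, written by hand."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = fmean(xs), fmean(ys)  # library call: the language barely matters here
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # From here on it is the language, not a library, doing the work.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
```

The more of your problem looks like the `fmean` line, the less the language matters. The more it looks like the rest of the function, the more it does.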


Language vs libraries isn't just about an expressive language to build in when you don't have a library. The likelihood of a library's availability also depends on the barrier to entry. An amazing language that isn't usable by biologists won't have many libraries that solve biologists' problems.

To your original point of being "surprised at how many data scientists and quants are ignoring more mathematically principled languages like F#, OCaml and Haskell," I'd much rather use one of those languages, but I'd have to build the foundations myself. Today, they aren't the right tool for the job. They don't have the libraries I need, which means I don't build further libraries for them, making other people less likely to build on them, so they aren't the right tool for the job tomorrow either. I'd say it's a network effects thing primarily.


Well, yeah, compared to R and MATLAB, I am willing to believe Python excels, but the person you are replying to is probably not doing data science, so he has options besides the 3 just mentioned.



