
Are R and Python really your only choices? Basic competency in machine learning is on my todo list, and I don't relish the thought of using either language.



One advantage of Python is that it makes machine learning really easy, or at least very accessible. You can write and test a classifier in about 5 lines of Python using scikit-learn. Another is that virtually all the latest deep learning packages ship with Python frontends by default nowadays. For stats you could also use SPSS.
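To make the "about 5 lines" claim concrete, here is a minimal sketch. The dataset (iris), the model (logistic regression), and the 25% hold-out split are my own choices for illustration, not anything specific from the thread; any scikit-learn estimator would slot in the same way.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset (illustrative choice) and hold out 25% for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the classifier and score it on the held-out data.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Swapping `LogisticRegression` for another estimator only changes one line, which is most of why trying things out feels cheap.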

The other advantage of Python is that as a scripting language it's very powerful for data wrangling and pre-processing, without needing all the boilerplate that e.g. C++ would require.
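As a rough illustration of the low-boilerplate wrangling point, here is a sketch using only the standard library. The column names, the sample rows, and the mean-imputation strategy are all made up for the example; real projects would more likely reach for pandas.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data with missing fields (invented for this example).
RAW = """name,age,income
alice,34,72000
bob,,51000
carol,29,
"""

def load_rows(text):
    """Parse CSV text, normalise names, and turn empty numeric fields into None."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": rec["name"].strip().title(),
            "age": float(rec["age"]) if rec["age"] else None,
            "income": float(rec["income"]) if rec["income"] else None,
        })
    return rows

def impute_mean(rows, column):
    """Replace missing values in one column with the mean of the known values."""
    known = [r[column] for r in rows if r[column] is not None]
    fill = mean(known)
    for r in rows:
        if r[column] is None:
            r[column] = fill
    return rows

rows = load_rows(RAW)
for col in ("age", "income"):
    impute_mean(rows, col)

for r in rows:
    print(r)
```

The equivalent in C++ would need a CSV parser, explicit types for each column, and a build step before you could even look at the data.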


Not to join a flame war, but R makes it pretty easy to test multiple models on a single dataset as well. I have also noticed it handles stats and missing data better out of the box.


I have played around with scikit-learn and love how simple and easy it is to work with, but the story for scaling it doesn't seem super straightforward - is this something anyone here has experience with?

I built a recommendation system in Spark earlier this year that took terabytes of input and ran on a 40-node EMR cluster in less than half an hour. It wasn't trivial to make it run in a clustered environment, but it wasn't very hard either.


Out of curiosity, were you using spark-scala or pyspark?


I was using Scala.


If you consider SPSS an alternative, you'll probably have no real use for R. I agree that Python is more approachable for people with a CS background (unless you're a fan of array processing languages), but R actually is a nice language for data-centric tasks.


Julia is another option. You can even call R and Python code from a Julia REPL.

You generally want an interactive language, though, because there is an iterative cycle in prototyping models.


They are not the only ones, but Python is definitely the one most likely to make you productive fastest.


R and Python are not your only options. Check out the article for some other languages people are using.

R and Python are probably the two with the most support/community materials around them - lots of tutorials, libraries, guides etc.


Which is the main reason PHP is still around. If you are starting with a new language or a new project, it is better to have examples and guides available.

I prefer Python to "R" because it gives me better search results.


Given that the article mentions many library implementations in Java, JVM languages are on the table. The article mentions Scala, which may be a good choice, but that puts Clojure on the table as well.


Just added Clojure to the query to see what it gives. You can do it too, simply follow the link I give in the article. Clojure is a bit less popular than Julia, hence not significant at this point. See https://www.indeed.com/jobtrends/q-python-and-%28%22machine-...


> don't relish the thought of using either language

Why not?

> only choices

Obviously not, according to the linked article. However, many people like Python and/or R. Perhaps you should find out why before dismissing their choices.



