
>You can't, you need to know what you are parsing, a number, a complex number, a symbol etc.

pandas (Python) has the upper hand here. Much, if not most, real-world data has values of the wrong type interspersed. pandas will still let you read in the table and then deal with these problematic values. For example, reading in the data and then dropping all values that don't conform to the expected type can likely be done in 2-3 lines.
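To make the "2-3 lines" claim concrete, here is a rough sketch of one way to do it. The inline CSV and the column names `id` and `price` are made up for illustration, and `pd.to_numeric` with `errors="coerce"` is just one approach among several, not necessarily what anyone upthread had in mind.

```python
import io

import pandas as pd

# Hypothetical data: the "price" column should be numeric but has junk mixed in.
raw = io.StringIO("id,price\n1,9.99\n2,unknown\n3,12.50\n4,oops\n")

# The read succeeds even with the bad values; "price" simply comes in as strings.
df = pd.read_csv(raw)

# Turn anything that doesn't parse as a number into NaN, then drop those rows.
df["price"] = pd.to_numeric(df["price"], errors="coerce")
clean = df.dropna(subset=["price"])

print(clean)
```

The point is that the read itself never fails: the bad values only become NaN when you ask for a numeric column, so you choose when (and whether) to drop them.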

But the advantage GP may be speaking of is that you can still do a lot of useful stuff with the data even if you leave the bad values in there.

For all its warts, pandas really is amazing.




You should try using R. You're missing out.


In the context of this conversation (dynamic vs static), R is in the same boat as Python.

And I don't know about today, but when I used pandas years ago, it was much faster than R.


R's data.table package is much faster and much nicer to use than pandas.



