
And I still have no idea what I could use it for...



It's great for speeding up 'hot' functions in your Python code, and it makes it easy to call C libraries from Python.
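For example (a minimal sketch, not from the comment above; the function name and library choice are just illustrative), you give a hot function C types and cimport the C library you want to call:

    # norm.pyx -- hypothetical hot function with C-typed arguments
    from libc.math cimport sqrt   # calls the C math library directly

    def euclidean_norm(double x, double y):
        # x, y and the sqrt call are plain C here, so there is no
        # Python-object overhead inside the function body
        return sqrt(x * x + y * y)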


It's easy to drop Cython into an existing project where you need some performance and start gradually "cythonizing" modules from the inside out. The rest of the code doesn't need to care.
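A minimal build setup for that kind of gradual adoption could look like this (the module name is hypothetical); everything else keeps importing the module exactly as before:

    # setup.py -- hypothetical: compile just one module with Cython
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        # only mypkg/hotspot.pyx is compiled; the rest stays pure Python
        ext_modules=cythonize(
            ["mypkg/hotspot.pyx"],
            compiler_directives={"language_level": "3"},
        ),
    )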

With a bit of care (and benchmarking) you can get very respectable speed. The main drawback is that the further you go, the more C knowledge you need in order to not blast your own feet off.

If you're just after a bit more performance in general, a drop-in solution like PyPy might be enough.


We sped up our ML code 40x using it.


Would the data loading be where you're getting the most benefit?

I'm curious, since most of the big libraries are already just CUDA calls anyway, but I'm always interested in anything that speeds up the full process.


I can't speak for the parent commenter, but there is often code processing the input/output of machine learning models that benefits from high-performance implementations. To give two examples:

1. We recently implemented an edit tree lemmatizer for spaCy. The machine learning model predicts labels that map to edit trees. However, in order to lemmatize tokens, the trees need to be applied. I implemented all the tree wrangling in Cython to speed up processing and save memory; the trees are encoded as compact C unions (a rough sketch of that pattern follows after these two examples):

https://github.com/explosion/spaCy/blob/master/spacy/pipelin...

2. I am working on a biaffine parser for spaCy. Most implementations of biaffine parsing use a Python implementation of MST decoding, which is unfortunately quite slow. Some people have reported that decoding dominates parsing time (rather than applying an expensive transformer + biaffine layer). I have implemented MST decoding in Cython and it barely shows up in profiles:

https://github.com/explosion/spacy-experimental/blob/master/...
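For the curious, "compact C unions" in Cython look roughly like this (a hypothetical sketch with illustrative field names, not the actual spaCy definitions). Because a union stores only one member at a time, a whole tree node fits in a few machine words rather than a full Python object per node:

    # edit_tree_sketch.pxd -- hypothetical layout, not spaCy's real code
    cdef struct MatchNodeC:
        int prefix_len
        int suffix_len
        int prefix_tree    # index of the subtree applied to the prefix
        int suffix_tree    # index of the subtree applied to the suffix

    cdef struct SubstNodeC:
        int orig           # string-table id of the substring to replace
        int subst          # string-table id of the replacement

    cdef union NodeC:
        MatchNodeC match_node
        SubstNodeC subst_node

    cdef struct EditTreeC:
        bint is_match_node    # discriminates which union member is active
        NodeC inner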


In this case it was multicore computation without the GIL, if I remember correctly.
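For context (this is not the parent's actual code), GIL-free multicore work in Cython usually goes through cython.parallel.prange; the extension has to be compiled with OpenMP flags for the loop to actually run on multiple cores:

    # rowsums.pyx -- hypothetical sketch of releasing the GIL across cores
    import numpy as np
    from cython.parallel import prange

    def row_sums(double[:, ::1] data):
        cdef Py_ssize_t i, j
        cdef Py_ssize_t n = data.shape[0]
        cdef Py_ssize_t m = data.shape[1]
        out_arr = np.zeros(n)
        cdef double[::1] out = out_arr
        # the loop body is pure C, so the GIL can be released and the
        # rows get distributed over OpenMP threads
        for i in prange(n, nogil=True):
            for j in range(m):
                out[i] += data[i, j]
        return out_arr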


We had to parse dozens of 20 GB files daily, with a very complex, non-linear structure. With Cython (eventually we migrated to PyPy) we gained around a 20-60x speedup.


It's C with Python syntax, plus syntactic sugar for handling Python objects at the C level, including refcounting, which is the hard part.
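A tiny illustration of that sugar (hypothetical function, purely for illustration): you can loop over ordinary Python objects in otherwise C-level code, and Cython emits the Py_INCREF/Py_DECREF bookkeeping you would otherwise write by hand against the C API:

    # wordstats.pyx -- hypothetical mix of C-level and Python-level code
    def count_long_words(list words, int min_len):
        cdef int count = 0
        for word in words:
            # "word" is an ordinary Python object; Cython generates the
            # refcounting and error-handling C-API calls behind the scenes
            if len(word) >= min_len:
                count += 1
        return count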

If you're already using Numba successfully, probably nothing that you couldn't do already.

If you want something that lives much closer to C, it's perfect.



