
A BitTorrent client https://github.com/anacrolix/torrent, and several projects using it. The original idea started in Python which just didn't cope. Things are probably better now in Python with green-threading being a standard concept, but you just can't easily get the throughput you need without minimal overhead, and that overhead is just too high in Python.



Which is strange, given that the first version of BitTorrent was written in Python.

Also, the default torrent client in Ubuntu is Deluge, which is in Python too. I download 20 torrents at 20M/s without any issue with it.

But I get that downloading from 1000 clients using threads is annoying to code, and you are right that today, with asyncio, the story is different.
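To make the asyncio point concrete, here is a minimal sketch of fanning out to 1000 peers from a single thread. It is only an illustration: `fetch_piece` is a hypothetical stand-in that sleeps instead of doing real network I/O, and `download_from_peers` and `max_in_flight` are names made up for this example, not part of any torrent library.

```python
import asyncio


async def fetch_piece(peer_id: int) -> bytes:
    # Hypothetical stand-in for a network read from one peer;
    # a real client would await a socket read here.
    await asyncio.sleep(0.01)
    return peer_id.to_bytes(4, "big")


async def download_from_peers(peer_count: int, max_in_flight: int = 200) -> list:
    # Cap the number of simultaneous requests (assumed limit, tune as needed).
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(peer_id: int) -> bytes:
        async with limit:
            return await fetch_piece(peer_id)

    # gather() returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*(bounded(i) for i in range(peer_count)))


pieces = asyncio.run(download_from_peers(1000))
print(len(pieces))
```

Each peer is a coroutine rather than a thread, so the per-connection cost is small; whether the resulting throughput is enough for a real client is a separate question.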

And I definitely understand the appeal of Go for network concurrency for a lot of projects.

I just don't get how Python didn't fit the bill for this particular one. It's just a client, after all.

Maybe I'm missing something.


First guess: packaging and distributing Python binaries is more difficult than it is with Go.

(I like Python a lot; I've written tens of thousands of lines of it. But its code packaging systems leave much to be desired.)


Oh yeah, no argument there. Even now, with the fantastic Nuitka that compiles Python seamlessly to a standalone executable, you still don't have the cross-compilation story Go has. And you have to be careful with libc.

But the thread is about highly scalable things, isn't it?


Packaging is inherently harder in Python, because of the emphasis on modularity. But conda does a good job. I've been very happy with Miniconda.


We are talking about end user packaging.

Lib packaging is a solved problem. For installing, people are just using pipenv. For creating, it's just a two-line setup.py file and an ini file to fill in now (http://setuptools.readthedocs.io/en/latest/setuptools.html#c...)


Yeah, in its early days the torrent network was small, and peer counts were limited. The original and many later Python implementations used event loops, which sidestep concurrency implementation overheads (like threads), and very often do a lot of the heavy lifting in C. I'm not sure that Deluge is Python for the torrent part. The standard is to use libtorrent these days.

Python can handle a torrent client with appropriate tools, but you just have to be extra careful about algorithms etc.
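For anyone unfamiliar with the event-loop style mentioned above, a rough sketch using only the standard library's `selectors` module follows. It is not how any particular client is written: local `socketpair()` connections stand in for peers, and `run_event_loop` and the `piece-N` payloads are invented for the example.

```python
import selectors
import socket


def run_event_loop(pair_count: int = 50) -> dict:
    sel = selectors.DefaultSelector()
    writers = []
    received = {}

    for i in range(pair_count):
        # Each socketpair plays the role of one peer connection.
        reader, writer = socket.socketpair()
        reader.setblocking(False)
        sel.register(reader, selectors.EVENT_READ, data=i)
        writer.sendall(f"piece-{i}".encode())
        writers.append(writer)

    # One thread services every connection as it becomes readable.
    while len(received) < pair_count:
        for key, _events in sel.select(timeout=1.0):
            chunk = key.fileobj.recv(4096)
            if chunk:
                received[key.data] = chunk
            sel.unregister(key.fileobj)
            key.fileobj.close()

    for writer in writers:
        writer.close()
    sel.close()
    return received


result = run_event_loop(50)
print(len(result))
```

The readiness polling itself happens in C inside the OS and the interpreter, which is the "heavy lifting in C" part; the Python code only runs when there is data to handle.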


Deluge is really bad with larger numbers of torrents. After 1000 it's basically unusable. Most people who do long-term seeding use multiple instances of Transmission or rTorrent. There are some manager tools to help with that; these usually also do the balancing.

I used to split them up with around 3000 seeding / client which worked fine.


Ah, OK. That's a very specific use case, though. Regular users won't do that. But I can understand that non-asyncio Python is not good for that specific usage.


Isn't Deluge's actual torrent handling all implemented by libtorrent, which is C++?


Python is C. The UI is GTK, which is C. Python is always mostly sugar on top of C.


What is scalable about a client?


You're associating the word scalable with the typical web-scale definition. Don't do this. The OP didn't specify what kind of scaling, and a torrent client has a lot of scalability (vertical?) in the way it manages connections and DHT.


It can scale to all 4 of your cores and burn your battery. Kidding aside, most BitTorrent clients handle hundreds or thousands of connections and sparse files as the splits and the number of running clients increase, so scalability is a concern here.


I don't think that is the definition of scalability. It just means it is resource-efficient. Scalability usually means that a service can handle load beyond a single node's capacity, at least for me.



