Show HN: Privacy-focused, ad-free, non-tracking torrent search engine (skytorrents.in)
1183 points by kasumis_ on Jan 18, 2017 | 346 comments



Hi,

Love the site especially the speed. But disappointed that the search indexes the name of the container of the torrent only and not the name of the files within the torrent.

To date the only site I know that does this is "filelisting dot com" but their website is very slow.

Do you have any plans to extend your product to add this feature? Maybe as a premium option?

Super useful to find single documents contained within archived bundles of files.


If 3 more users vote for this feature, it will be provided for free to everyone, and it will be provided at the same speed as the current website.


Who are you? The tooth fairy? :-)

Yes please, implement this feature.


You should set up an issue/feature request tracker instead of turning HN into one. This sort of stuff makes the discussion unreadable.


For feature voting I've seen other apps use Product Pains[0]

[0] - https://productpains.com/


Thanks for the shoutout! We definitely see huge value in letting your users post & vote on feature requests:

- You know what your users want most

- People feel like their voice is heard

- It builds a community around your product

Try it out!


we've truly reached peak internet now


I love this! Voting for a feature is fantastic!


Add my vote too. Thanks for your great work!


Consider this an upvote. Need a kidney? I'd be willing to offer one for this feature.


My cents too for the feature


+1 for the feature


If only HN had some sort of upvote feature.


Yes please!


Why .in? Is this based in India?


right on +1


Yes, please.


+1


Yes please!


+1


Indexing individual files is surely a great feature. Voting for it, too.


You get my +1.


Have my vote please as well.


Kindly add my vote too, for this feature.


Archived bundles of files having significant names are important! Torrents of opaque disk images and archives (e.g. pirated software) or video files (e.g. pirated movies and TV series episodes) might be popular, but the strong suit of BitTorrent is carefully crafted and often very large collections of books, music, emulator ROM files etc.


It's frustrating when you are downloading one big rar file just because you need one of the small files inside.


With transmission you can actually select or deselect individual files.


But not when they're trapped in a .rar or a .iso, at least not in my version of Transmission.


I wrote a BitTorrent client that specifically only downloads the parts you need to perform some operation: https://github.com/anacrolix/torrent


But can it look inside archives? If not, then mainstream clients can already do what you mention.


Reading the page, it sounds like they're talking about the "torrentfs" mounted filesystem component, where read requests will download just that part of the file.

So theoretically, you could open the archive in a GUI, extract the file you want, and the only part it would download would be the archive header/file listing and the part of the archive corresponding to the file you wanted.

Of course, this wouldn't work for tarballs or solid RARs, but for regular archives it could conceivably work.
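
To make the "only download the part you read" idea concrete, here is a minimal Python sketch of the arithmetic involved. It is not the API of anacrolix/torrent; the function name and parameters are made up for illustration, and it assumes the usual BitTorrent layout where all files are concatenated and cut into fixed-size pieces.

```python
def pieces_for_range(file_offset, start, length, piece_length):
    """Return the piece indices covering `length` bytes at `start` within a file.

    file_offset  -- where the file begins in the torrent's concatenated data
    start        -- offset of the read inside that file
    length       -- number of bytes requested
    piece_length -- size of each piece in bytes
    (All names here are hypothetical, chosen for this sketch.)
    """
    if length <= 0:
        return []
    absolute_start = file_offset + start
    first = absolute_start // piece_length
    last = (absolute_start + length - 1) // piece_length
    return list(range(first, last + 1))


if __name__ == "__main__":
    # A 300,000-byte archive member stored 5,000,000 bytes into the archive,
    # with 256 KiB pieces: only two pieces need to be fetched for it.
    print(pieces_for_range(0, 5_000_000, 300_000, 262_144))  # → [19, 20]
```

A mounted filesystem would run something like this for every read request, so opening a non-solid archive only touches the pieces holding its file listing plus the pieces holding the member you extract.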


Most bittorrent programs I've used have this feature.


we may consider giving priority to bigger torrents.


Search result priority is largely irrelevant for rare or specific files, since there are going to be few results (at least, few clearly different torrents) and each result is going to be inspected by looking at the file list.

Indexing file names is the search quality improvement you need most. For example, I tried searching for "Pretty Polly", by the Poison Girls; moderately popular niche music. There are 0 results for "pretty polly" and 2 results for "poison girls", one of which is a complete discography collection containing that song. Imagine the results for more popular and prolific authors.
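
A rough sketch of what indexing file names could look like, in Python. This is not how the site works (nothing in the thread says what it uses); the sample torrent and its file paths are invented for illustration, and a real index would need ranking and far more compact storage.

```python
import re
from collections import defaultdict

# Invented sample data: torrent name -> file paths inside it.
SAMPLE = {
    "Poison Girls - Complete Discography": [
        "Singles/Pretty Polly.mp3",
        "Singles/Another Track.mp3",
    ],
    "Some Unrelated Album": ["01 - Intro.mp3"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(torrents):
    """Map every token from torrent names AND contained file names to torrents."""
    index = defaultdict(set)
    for name, files in torrents.items():
        for text in [name, *files]:
            for token in tokenize(text):
                index[token].add(name)
    return index


def search(index, query):
    """Return torrents matching every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


if __name__ == "__main__":
    idx = build_index(SAMPLE)
    print(search(idx, "pretty polly"))
```

With only torrent names indexed, the "pretty polly" query finds nothing; once file names go into the same index, the discography shows up.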


Ability to search inside torrents is being considered and will be added soon.


> disappointed that the search indexes the name of the container of the torrent only and not the name of the files within the torrent. To date the only site I know that does this is "filelisting dot com" but their website is very slow.

Check out http://torrentproject.se/


Suggestion: get rid of the torrent files and just serve magnet urls in plain text without a clickable button. This way you get rid of the knowledge of who-downloaded-what, and also save some bandwidth.


> just serve magnet urls in plain text without a clickable button

That would make it impossible to copy the magnet link from the search results page, which I believe is a really nice usability perk. For users that are less technically inclined, clickable links could be crucial (even if it gives you the theoretical possibility to track the click).

I guess it depends on if you are prepared to be a search engine only for the 1337 crowd, or if you want casual users to have access to a decent search engine too. The Hacker News crowd could still just right click and copy the magnet link, so essentially you're just removing features for those who don't understand how technology works but are still dependent on it.

Perhaps the best idea would be to keep the link and have a text area for selecting and copying like requested above.

Consider this.


Clickable magnet links are fine, since the query doesn't actually go to the server.


I think the point is that the clicks could be tracked via Javascript, even if the links themselves point elsewhere.


If we're allowing javascript, then you might as well use links. It would be just as easy to check for selected text to find out what the user copied from the page.


Ah, that's true, I kind of assumed no client-side trickery, as it would be easily detectable.


Yes, if they stand by their word of not using JavaScript. That is easy enough to verify by someone knowledgeable, but I suppose ditching any magnet links would be more of a marketing move ("we can't track your clicks, because there's nothing to click!"). I guess selecting text (at least by selecting it in a textarea) could easily be tracked with js too though...


> This way you get rid of the knowledge of who-downloaded-what, and also save some bandwidth.

I don't get this. It's a magnet link with magnet protocol. It will be handled by your torrent client, no additional request to the server whatsoever if you click on them. It's the same as plain text copying, but more convenient.


That's still trackable via JS for instance. Selecting & copying plain text are (AFAIK) not trackable, hence the suggestion from parent.


Nearly everything that you do on a page is trackable if you have JS enabled including how you moved your mouse on the page (https://api.jquery.com/mousemove/).

To keep the technical details short: the events are logged with JS and sent over either ajax or a websocket in the background as you navigate the site.


Text select and copy raises the respective events in supported browsers. See https://developer.mozilla.org/en-US/docs/Web/Events/copy for the copy event.


Selecting and copying are both definitely trackable.


Is "right click and copy url" trackable as well? Genuine question.


Yes, it is. I've tried it on Facebook several times (since I try to avoid clicking on external links on the platform) and it shows related links the moment I finish the right click and copy link location action. There is some variation depending on where one clicks (on the image that has a link or the actual hyperlink), but Facebook definitely knows in most cases what URLs someone is copying to the clipboard. The first time I saw it I was very annoyed.


Because the website notorious for spying on every single user action was spying on a user action?


I get annoyed by it too, because sometimes I want to read an article but don't necessarily want fb to continue serving those articles to me. I don't really know what happens inside the Facebook machine, but I imagine clicking a link indicates you are interested in a topic. Sometimes I just want to read trash.


So google the title in an incognito window, or alternatively and preferably, stop using that site altogether.


Yes, I tend to do something along those lines when I don't want to be tracked. To the original point, it's an annoying extra step to have to do but not unreasonable enough to stop using the site, I'm simply answering your question on why someone might find it annoying.


Yes and yes - Right click raises an event, and copy specifically raises an event.

If in doubt: If it happens over or in a browser window, it is probably trackable from JavaScript.

See: https://developer.mozilla.org/en-US/docs/Web/Events/copy


Over a browser window? So if my 1Password window overlaps with my browser window, the page I'm on could sniff the entire 1Password window? Seems like a bunch of FUD..


It's not really polite to call it FUD right off the bat, especially when you're probably just misunderstanding. A browser obviously can't get events from a different app.


Moving your mouse over the window, despite not interacting with it, will allow the page to track your cursor, and could provide a surprising amount of information. That's why I wrote "over".


You built a strawman, but it can actually be dangerous to interact with an application on the same screen as a browser!

https://jameshfisher.github.io/cursory-hack/


Subject to the same origin policy.

https://en.wikipedia.org/wiki/Same-origin_policy


That protects you (to some extent) from page A accessing resources from page B.

Same origin policy doesn't do anything to stop you from being tracked, though.


Yeah, of course, I was replying to a post about the concern about a website getting your entire 1password database.


Maybe not in itself, but "user mouse pointer is above the url" and "user right clicked" are.


It is trackable in itself through the `copy` clipboard event (https://developer.mozilla.org/en-US/docs/Web/Events/copy). Both keychord and contextual menu will trigger the event.


The parent was asking whether the "Copy Link Address" (Chrome) or similar in other browsers is trackable though, not the more general copy to clipboard.

I don't think that one is.


Presumably if you select and copy from a page's source view (eg via Inspect in FF) then you'd overcome tracking of specific interactions?

Or if you just block scripts.


The site serves no JS. Of course you shouldn't trust the site, so you should disable JS explicitly (via NoScript or other means) too.


In that case a clickable magnet link isn't trackable. AFAIK that requires JS.


Not if the link itself is unique for each user, which is trivial to implement. This is how email campaigns track what you clicked in an email (emails, after all, do not serve JS).


doesn't that only work for sites the link creator has control of? i don't see how the technique can be used with links a search engine finds and then gives out to anon users.


This is how it would work.

1. The search engine adds itself to the tracker list in the magnet URI (&tr= query parameters), but with a unique subdomain and using HTTP (i.e. it adds http://asdfkja.skytorrent.in/announce to the tracker list)

2. The client announces itself to the tracker, sending the unique "Host" in the GET request (asdfkja.skytorrent.in in the example).
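
The two steps above can be sketched in Python as follows. This only illustrates the idea; `tracker.example`, the token format, and both function names are assumptions for the example, not anything the site is known to do.

```python
import secrets
from urllib.parse import quote


def tagged_magnet(info_hash_hex, base_domain="tracker.example"):
    """Step 1: build a magnet URI whose tracker host carries a per-visitor token."""
    token = secrets.token_hex(8)  # random, unique per page view
    tracker = f"http://{token}.{base_domain}/announce"
    magnet = f"magnet:?xt=urn:btih:{info_hash_hex}&tr={quote(tracker, safe='')}"
    return token, magnet


def token_from_host(host_header, base_domain="tracker.example"):
    """Step 2: recover the token from the Host header of an announce request."""
    host = host_header.split(":")[0].lower()  # drop an optional port
    suffix = "." + base_domain
    if host.endswith(suffix):
        return host[: -len(suffix)]
    return None


if __name__ == "__main__":
    token, magnet = tagged_magnet("a" * 40)  # placeholder info hash
    print(magnet)
    # When the client later announces, the server sees this Host header:
    print(token_from_host(f"{token}.{base_domain}" if (base_domain := "tracker.example") else ""))
```

The server would store the token next to the page view that produced it, so a later announce with that Host header links the download to the visitor. No JavaScript is involved, and copying the link as plain text changes nothing.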


In that case the plain text link would be just as trackable.


magnet links are user-side. They're like email addresses or phone numbers.


This doesn't prevent their users from being tracked.

Email addresses can be used for tracking by having each request for an address (e.g. each GET of a "contact us" page) return a different email address (e.g. contact43327@example.org). Checking which address gets used allows the GET request (and all associated information) to be associated with the email message (and all associated information).

Likewise, magnet links can be tracked by looking for torrent clients trying to access them. A sibling has pointed out that unique tracker URLs can be used for this. Another way would be to make up a unique content hash for each request, then lurk on the DHT looking for queries for those hashes.

If the operator has no qualms with transferring data (e.g. being directly exposed to copyright infringement) they could even service such requests, with the user being unaware of the fact they're being tracked: the operator alters the hash by making some innocuous "watermark" change to the content, e.g. altering a filename inside an archive; each time a chunk is requested, the operator fetches it from the "original" and passes it on.


If 2 more users recommend this, it will be done.


I never download torrent files, just look for the magnet link. Same feature, why bother downloading?


Magnet links are perfectly fine for popular content, but because of the dependency on BEP 009 (Metadata exchange) for very long tail content it's often better to get the torrent file directly.

For example, to get information about torrents that are on life support (e.g. have a couple seeds that show up for a few minutes a day), having the torrent file downloadable is invaluable, as it includes info like the size and the file names, etc. In many cases it's even possible to bring a torrent /back to life/ if you have the torrent file and got the content that was in it out of band.

If you just have the magnet link you cannot do any of this.


It's not the same. You'll be missing some of the trackers.


Magnet URLs can and usually do include all of the trackers.

http://www.bittorrent.org/beps/bep_0009.html#magnet-uri-form...

That being said, DHT and PEX (BEP 5 and BEP 11) are good enough that you don't need them.
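
For anyone who wants to check which trackers a given magnet link carries, here is a small Python sketch. It assumes the common `xt=urn:btih:<hash>` form with trackers in repeated `tr=` parameters; the URI in the demo and the hosts in it are made up.

```python
from urllib.parse import parse_qs, urlsplit


def parse_magnet(uri):
    """Extract the info hash, display name, and tracker list from a magnet URI."""
    parts = urlsplit(uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")
    params = parse_qs(parts.query)  # percent-decodes values, keeps repeats
    info_hash = None
    for xt in params.get("xt", []):
        if xt.startswith("urn:btih:"):
            info_hash = xt[len("urn:btih:"):]
    return {
        "info_hash": info_hash,
        "name": params.get("dn", [None])[0],
        "trackers": params.get("tr", []),
    }


if __name__ == "__main__":
    example = (
        "magnet:?xt=urn:btih:" + "ab" * 20
        + "&dn=example"
        + "&tr=http%3A%2F%2Ftracker.example%2Fannounce"
        + "&tr=udp%3A%2F%2Ftracker2.example%3A6969"
    )
    print(parse_magnet(example)["trackers"])
```

If the `trackers` list comes back empty, the client has to rely on DHT and PEX alone to find peers.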


That's not entirely true. DHT is generally the most effective, but PEX and trackers help a lot when there aren't many seeds, or there's a lot of noise on the DHT.


Vote to serve only magnet links.


Please don't remove downloads of .torrent files. If you don't have this it's impossible to know what files are in the torrent (or even how big they are!) without connecting to the swarm and relying on BEP 009 (Metadata exchange).

I use this all the time for downloading very long tail content, and the move to have indexing sites be magnet-only is very frustrating.
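
To show what the .torrent file gives you without touching the swarm, here is a minimal bencode decoder in Python that lists file names and sizes. It is a sketch with no error handling for malformed input, and `SAMPLE_TORRENT` is a hand-built toy (it even omits the `pieces` field), not a real torrent.

```python
def bdecode(data, i=0):
    """Decode one bencoded value from `data` at index i; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":  # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":  # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    if c.isdigit():  # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {i}")


def list_files(torrent_bytes):
    """Return (path, size) pairs from the info dictionary of a .torrent file."""
    meta, _ = bdecode(torrent_bytes)
    info = meta[b"info"]
    name = info[b"name"].decode("utf-8", "replace")
    if b"files" in info:  # multi-file torrent
        return [
            ("/".join([name] + [p.decode("utf-8", "replace") for p in f[b"path"]]),
             f[b"length"])
            for f in info[b"files"]
        ]
    return [(name, info[b"length"])]  # single-file torrent


# Hand-built toy metadata for the demo.
SAMPLE_TORRENT = (
    b"d4:infod5:filesl"
    b"d6:lengthi5e4:pathl5:a.txtee"
    b"d6:lengthi7e4:pathl3:sub5:b.txtee"
    b"e4:name3:diree"
)

if __name__ == "__main__":
    for path, size in list_files(SAMPLE_TORRENT):
        print(size, path)
```

With only a magnet link, the same information has to be fetched from peers via the metadata exchange first, which is exactly what fails when no seeds are online.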


What's the intent/desire for you to be on the delivering end of serving .torrent files?


There is no intent, it's just there as many people still prefer torrent files. We also have magnet links, so security-focused users can download without us knowing anything.


Tangible/easy mirroring of the torrent database? Then again, DHT wouldn't go down.


Yes magnet only plz


i recommend this


do this


It will be done soon, thanks for voting.


ralph plz


Alternatively: Just don't log downloads of the torrent file? Those who care can use the magnet link, those who have a use case for the torrent file can get the unlogged torrent file.

I prefer using torrent files, because I can just wget/curl them down into a directory being watched by my server which downloads torrents.


You can use aria2c on the command line to generate the torrent file from the magnet link, so it's just as easy, only a little slower.


That looks good. Thanks for the tip.


The DHT search from the magnet lookup will announce to the world what you are downloading, no?


Yes, but the site owner does not know if the person who views the page actually clicks the magnet link or not, so no one will be able to tell "due to your site, X people illegally downloaded file Y"


"Due to your site, people viewed illegal content. Go to jail for X years".

Is the current world we live in.


Yes, but we still do not download any content (because it reaches exabytes).


Yes!


> magnet urls in plain text

A magnet as a link would be more useful. I'm sure the people worried about letting others know what they download can copy the link without clicking anyway.


agree


I don't do web development or design, but is tracking the selection of text that will be copied more difficult than tracking clicks in actual practice?


Send an event into Google Analytics using GTM when a visitor copies text from your page

http://dcarlbom.com/google-tag-manager/event-tracking-gtm-wh...


Yes, clicking a button or link will send a request to the server with some information in the header, such as which browser clicked from which IP.

Selecting the text will trigger no such request, the developer will still know who opened the site and searched for what, but he cannot know the exact torrent that was copied.


Clicking a magnet link won't fire a request to the server. Yes, you can track who clicked a magnet link with JavaScript, but you can also track who selected the text with JavaScript.


The magnet links can be in the page, and downloads and copy/paste UX can be achieved via JS just the same as server-backed as far as the end-user is concerned. This way the entire execution of the page can be audited.


Don't you realize that you can't build a usable site without mountains of JS and user-tracking scripts? /sarcasm

It's really refreshing to see a fast, small efficient site that gets rid of all the bullshit. Congrats!


Yes!

At first I literally couldn't believe what I was seeing - ZERO third-party bullshit in uMatrix. Not just none to unblock to make the site work - none at all!

(Apparently you can make a website without downloading Bootstrap from someone's external CDN...)

And then... oh my God, how fast the site is. I can actually feel I have a somewhat modern, powerful computer. Usual day-to-day browsing makes me forget that.

So yeah, for this reason alone I'll be using and recommending this site. Great job!


I tried doing our site like that several times but everyone just complains it's ugly... sigh


CSS and design skills have nothing to do with having JavaScript disabled.


You'd think so, but I've heard plenty of web-devs complain that sites look "ugly" just because it was built without JS. I'm not saying it makes sense, but I've definitely heard it.


With all free services it starts this way (look at Imgur and so on) and eventually ends in an ad-ridden monster.

But we usually end up with a couple of years of great service before they figure out that they could be earning money by serving ads or selling data.


Let's be grateful for what we have now.

The cycle is usually completed by another free service starting and being usable for the next few years, and in this case there's no pain in switching (no content you uploaded and spread URLs to), so I wouldn't worry.

On the other hand, in the age of HTML "hello world" including an Angular installation, downloading Bootstrap from external CDN, and at least three different kinds of tracking scripts, I think sites like this one need to be praised loudly.


I never see any banners on Imgur. I block banners.

What Extratorrents and The Pirate Bay do tho seems to be something akin to clickjacking. I click on a link to a torrent search result (or a download link), a tab opens, and it immediately closes. Rest assured this generates them money.


That is your adblocker closing the popup as soon as it opens. For users without an adblocker the tab stays open.


Awesome


A few other folks have mentioned this, but a (monthly?) dump of the database, offered as a torrent link on the site would be fantastic.

Not only would it enable offline searching, but when the inevitable take down shows up, folks can still continue to search the database.


7 more votes to get it done.


It's really not an auction site. Set up a way to handle feature requests.


Don't worry, unless they set up a vote to "banish all US Copyright protected works" the inevitable RIAA/MPAA pressure will sort out the project by itself. I'm not delusional to claim Torrents are without legitimate purpose - they are - and I'm also not delusional enough to believe hosts/developers are powerless to address piracy of sorts.


Yes please.


+1


+1


Vote this up!


+1


That would be really fantastic, and something that is missing from every torrent search engine I've used.


this is actually a great idea.. +1


That site served blazingly fast. It's refreshing to see an ad free torrent site.

Since you aren't monetizing the site, how are you planning on keeping the site running? Donations?


It's very fast.

Currently they're NOT accepting donations.

https://www.skytorrents.in/howto


Given the relatively slim page size (23Kb for me), and given that SQLite can pretty easily host a relatively simple database like this one and serve 100K-1M queries per day[0], I would expect that whoever is hosting this could pretty easily host this on a single dedicated server which only costs ~$50. So for the foreseeable future, I suspect this site will be fine.

[0] - https://sqlite.org/whentouse.html


The cost of hosting will probably be insignificant compared to the cost of dealing with DMCA takedowns and possibly lawsuits.


That's why I built Magnetissimo: https://github.com/sergiotapia/magnetissimo

It's easy to build a crawler, the hard part is spending time on bullshit like DMCA takedowns and such. Even if you say: "I just crawl, I don't provide download links, I don't even know what is indexed", you have to deal with legal issues.

So, host your own locally or for your community. No big deal.

Even Google has to deal with DMCA!


How large is the database compiled by Magnetissimo? Is it Wikipedia-sized (a few dozens of Gigabytes), or several orders of magnitude above or below?


Haven't really measured to be honest. But it's only 1 model with a few fields. Shouldn't be humongous.


I'm sure you're right. The MPAA will get this taken down as soon as they are aware of it. Which is a shame because it's a very nice looking and functional website.

Aside: the MPAA can do this, of course, because they give lots of money to politicians. If Trump really wants to "drain the swamp" he could be more pro-freedom on file sharing and copyright. Many Trump supporters feel Hollywood supports the Democrats, so it's not like he will lose much support over this. A Republican staffer suggested something similar a while ago ( https://politics.slashdot.org/story/12/11/16/2354259/gop-bri... ) but was shot down.


Unless it's a honeypot run by the MPAA ... doffs tinfoil hat


If the MPAA put the site up, aren't they then implicitly authorising its use? Surely that would preclude any suggestion of tortious infringement.

If I put a sign on my house saying "please take anything you like" and people take stuff then it's no longer burglary. If someone else puts the sign up, who isn't authorised, then it is still theft.

Of course following that analogy through, if MPAA aren't authorised then any content owners should be able to successfully sue them for contributory infringement. People in the UK have been extradited for such things. Extradition of MPAA bosses for running a torrent site would be hilarious (but still wrong!).


They have in the past been busted doing similar things, so I wouldn't put it past them.


SQLite cannot handle indexing queries of our kind and is not used. We have kept the page slim so that the website is as fast as possible.


If you don't mind me asking, what technologies are you using?


It appears to be using Caddy at least https://caddyserver.com/


Caddy is just used as a reverse proxy, a frontend for the backends.


You could use SQLite for persistence, but a search index like Lucene for full-text searching - and Lucene runs great in constrained environments too.


Sphinx Search is orders of magnitude faster and more lightweight than Lucene.

Heck, even PostgreSQL's fulltext indexes would be faster than those Java monsters.


The top 1000 has a lot of software. The thought of running pirated software always filled me with unease, having been burned by malware in my younger days. Only tangentially related to this really well done torrent search (so snappy!), but does anyone have a feel for how safe this software is in general? Do security firms do any analysis on it?


We have an AI-based differentiator that detects fake or malware-infested torrents. However, it is in an alpha state and is only used sometimes. Once we are satisfied, it will be deployed fully. We have still deployed a different mechanism to detect fake torrents. There is also voting and commenting for the community.


Growing up I heard horror stories of people pirating versions of Windows, Office, etc that had backdoors installed. If you download pirated software there isn't a real way to know for 100% certainty that it isn't laced with something.

So I avoid it like the plague. Even if the majority don't have anything it only takes the one and then, depending on its level of sophistication, you're screwed.


Pirating Windows is pretty stupid these days. You can download the official MS ISO from their website, and buy a real key from some subreddits for like $20.

The keys they sell are those bulk keys large companies get, their IT dep guy wants to make some extra coin on the side. But they work.


Also:

- if you bought a laptop with preinstalled Windows, you can activate a fresh install with the key that's stored in the laptop (in ACPI tables IIRC). Just make sure it's the same edition and language.

- if you're a student, check if you have DreamSpark (er, Imagine) Premium. Apparently Windows Server is available even in the non-Premium program (for all students).


> you can activate a fresh install with the key that's stored in the laptop (in ACPI tables IIRC).

Free tool we developed to quickly grab that key: http://neosmart.net/OemKeyTool/


> The keys they sell are those bulk keys large companies get, their IT dep guy wants to make some extra coin on the side. But they work.

What is the legality of that? The IT-guy is probably violating his corporation's license with Microsoft, but does that affect you as buyer? I suspect that you may run afoul of anti-fencing laws in a lot of jurisdictions (where buying goods you can reasonably suspect are fenced is illegal as well).


Aren't keys tied to computers these days?

> The keys they sell are those bulk keys large companies get, their IT dep guy wants to make some extra coin on the side

Which make these keys as illegal as pirating Windows. Obviously it is a more secure solution though.


Which subs, how do you look for that?


Don't bother with the subs, just buy one off eBay for $5 instead. It's the same thing, and the subs are just as shady but charge more.



I think we can be quite certain that there are backdoors in Windows even without it being pirated, so your phrasing makes it easy to argue back.

I'd personally argue that the backdoors hidden by the pirates are significantly more likely to be used for nefarious things than the 'official' ones, as blackhats are responsible for the former, while whitehats created the latter.


My favorite was all the pirated copies of Windows 10 that were floating around that supposedly had the telemetry features removed. It always confused me who exactly those were targeting.


My guess: one's mother / uncle / neighbour who wants Windows 10 on his/her laptop, but heard about it spying on you and does not want that.

I know that the usual rule of thumb is that "normals don't understand or care about complicated technical issues" such as privacy. But Windows 10 telemetry was so widely discussed in media, that I noticed even the non-tech people around me were refusing to upgrade from Win8 to Win10 on the basis of "I heard it spies on you".


Lots of non technical people I know find that creepy without really being sure of what to do about it.


And the crying shame there is that Win8 did the same sort of spying.


Actually, no. Microsoft instrumented essentially every OS component (and sharing backend infrastructure, universal telemetry client, etc.) starting in Windows 10. Prior to that, it was just a few isolated pieces that had their own kind of telemetry.


Not quite. Windows 8 had most of the same telemetry, but it was off by default and you were given the option to turn it on during or after installation. Windows 10 flipped that on its head by giving you a page during install that said "get started quickly" which automatically turned everything on, and there was a tiny text link at the bottom which said "customize". Clicking that gave you three pages of telemetry-related toggles that you had to manually click to turn off. Even then, basic telemetry (anonymized usage data and crash reports) is still sent to them.

After Windows 10 was released, Microsoft started backporting most of its telemetry capabilities to Windows 8 and 7 and turning them on by default, rendering those versions just as "backdoored" as 10. That was what pushed me to accept that Windows was never going to get better in that regard, and so I've upgraded all my Windows 7 and 8 machines (except my Surface RT obviously) to take advantage of the new features. I'm especially enjoying native NVMe support on my new workstation build; I can't go back to a spinning HDD without feeling like I've stepped 20 years into the past.


Also wrong. The telemetry in Windows 8 was just in critical components that had been instrumented long ago, such as Windows Update, crash management, etc. None of those were using the current universal telemetry system. In fact, I don't know if all of them moved to the new telemetry infrastructure shared with the rest of the OS or if they kept what they had and just added the new telemetry on top.

All that said, I agree with your last point about upgrading being a fair compromise compared to staying in Windows 7 or gasp Windows 8.


> The telemetry in Windows 8 was just in critical components that had been instrumented long ago, such as Windows Update, crash management, etc. None of those were using the current universal telemetry system.

Are you sure? I did a lot of Windows 8 installations when I was evaluating it, and I distinctly recall the option to turn on the same tracking features that 10 had on by default. In the Windows 8 installer they were presented up front and off by default, in 10 they are hidden and on by default.

As of a month ago this is still the case; I reset a Windows 8 hybrid laptop, going through the standard installation screens, and then upgraded it to 10.


The EULA for it is the same, but the underlying system is totally different. I am 100% sure about this :-)

But, as you noted, a lot of the telemetry instrumentation has been backported so that argument is lost for someone wanting to avoid upgrading to Windows 10 from an earlier version.


The same kids who believe "Turn off your virus checker when running the crack because it shows up as a false positive"


> isn't a real way to know for 100% certainty that it isn't laced with something.

There are iso hashes on MSDN. You can read them with a free account.


With some things you can get a hash of an ISO from the true source and compare that to the downloaded ISO.
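
A minimal Python sketch of that comparison. The default algorithm is an assumption; use whichever one the publisher actually lists next to the download. The file name in the demo is a placeholder.

```python
import hashlib


def file_hash(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte ISO never sits fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex, algorithm="sha256"):
    """Compare the file's hash with the published one, ignoring case and whitespace."""
    return file_hash(path, algorithm) == published_hex.strip().lower()


if __name__ == "__main__":
    import sys

    # Usage: python verify.py downloaded.iso <published hash> [algorithm]
    if len(sys.argv) >= 3:
        algo = sys.argv[3] if len(sys.argv) > 3 else "sha256"
        ok = matches_published_hash(sys.argv[1], sys.argv[2], algo)
        print("match" if ok else "MISMATCH")
```

A match only proves the file is identical to what the publisher hashed, so the hash itself has to come from a source you trust.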


Security in this space is based on reputation and trust. Users and groups who have a long history of uploading files with no malware are more trusted than freshly created accounts.

To make an analogy, imagine a world where there existed no government food inspectors. I would expect that visiting restaurants would become more dangerous, but my behavior would also change. I would only visit established places where people before me have eaten and proven that the food isn't poisonous.


https://github.com/you-dont-need/You-Dont-Need-Javascript Here is a good collection of web-components which can be implemented without using JavaScript. Hope it helps you guys in developing the site. All the best.


Thanks for sharing that! Some of them are just a big pile of CSS which I don't know if it's a better option than JavaScript... But others were pure gold and a very clever use of HTML (like the carousel)


Not sure if it's a bug, but a common search for '1080p' returns no results (and takes around 10 seconds): https://www.skytorrents.in/search/all/ed/1/?q=1080p


Same with searching '720p'


Suggestions:

1. Enable whois privacy via Gandi

2. Hide your server IPs behind Cloudflare

3. Ensure the site is written such that it is trivial for yourself and others to use alternate domain names


No CDNs, as they lead to tracking by the CDNs.


Close your port 22 at least. Define some simple iptables rules.

Only allow logins from a specific IP address, such as a private VPN server.


I thought whois guard wasn't allowed by .in? I'm using Namecheap and it isn't possible for my .in but I'd like to add it if some other registrars allow this. Anyone have a registrar to recommend besides Gandi?


Gandi is the best. Although you pay for their no bullshit, and sometimes they won't let you do (questionable) things that other registrars allow.


I'm assuming that means they somehow allow .in registration to use their WHOIS guard? This conflicts with their published list of WHOIS guard compatible domains [0].

[0]: http://wiki.gandi.net/en/domains/private-registration/gandi-...


They don't, afaik.


Note: there is a website that shows the real ip addresses of (some?) cloudflare customers: http://www.crimeflare.com/cfs.html


Yeah, anyone who moves to Cloudflare (or any service that happens to hide origin server IPs) should then change their origin IPs.

Not everyone has got the memo on that.


Is old whois data cached? If you launch without whois privacy and turn it on later, can someone go back in history and still find the registrant?

As a side note, maybe someone like archive.org should keep time series whois data...


> can someone go back in history and still find the registrant?

Yes, there are some services which offer this, e.g.:

http://www.domainhistory.net/skytorrents.in


Will Cloudflare work for this kind of stuff? I assume you guys comply with DMCA takedowns and things?


Yeah the whois thing should be done ASAP before you start to take on more users.


#1 on HackerNews -> [Dynamic Search-Result] Page served in 39µs

What kind of dark magic is this? What's your stack?


My guess is that he's using a compiled language such as Rust or Go, with a lightweight non-blocking framework, plus intelligent caching of backend queries, plus a decent backend such as Sphinx or PostgreSQL.

Compare with interpreted languages for the frontend and Java monsters as backends (PHP and Lucene anyone?)

People forget what modern computers are really capable of, when used properly.


The only thing I see in the headers is that it's using https://caddyserver.com


Yeah it's stupidly fast.


I'd suggest using another TLD or having multiple TLDs to avoid losing it even temporarily (if at all someone decides to take this domain down). The .in TLD policies [1] require a real and accurate address of the registrant, which is not provided for this domain [2]:

> Contact information: Registrants must provide true, accurate contact information. The following contact types are required: Registrant, Administrative, Technical, Billing. As per standing policy, the contact data will displayed in the .IN WHOIS, except for the Billing contact data, which is not displayed.

[1]: https://registry.in/Policies

[2]: https://registry.in/whois/skytorrents.in / http://whois.domaintools.com/skytorrents.in


Ridiculously fast! Would love to hear about your stack. C, what else? What database, template library?


This is absolutely incredible, and the load times are incomprehensible for a modern site.

What software powers the site?

I did some looking and discovered it uses Bulma [1] instead of Bootstrap, and it's absolutely amazing. Another commenter pointed out that the site also uses Caddy for HTTP/2 and HTTPS.

Excellent work, keep it up!

[1] - http://bulma.io/


Thanks for such a search engine, one that really includes no JavaScript and no cookies in any form.

Secondly, it's amazingly fast; I believe it's more than just the C language. I tried different combinations of searches on your search engine to check whether results are being served from cache or not. And it looks like all searches (I don't know your cached/uncached DB design) are several times faster compared to many other sites. +1 for speed. Also, I feel I found a minor bug.

Even though it looks like a passion project, I don't know how long you will survive fighting the DMCA, others, and costs without monetization plans or accepting donations. I wouldn't mind donating if you accept.

As long as you survive, I am glad for your engine.


Please file the bug on the email address mentioned on the website.


Thank you for working on this project. What is your indexing strategy?

Similar search engines have a warning that you should use a VPN when downloading that I think would benefit users.


We have written up a DHT crawler to locate trackers. We are not dependent on trackers. We are also testing an AI-based detector which filters out FAKE torrents (it is not currently deployed, but we are testing it). If 2 more users here demand a VPN warning, it will be deployed.


How is this "If 2 or more users demand <x feature>, it will be done" policy working for you? Have you used it in other projects prior to this? Curious to hear any stories :).


It's working fine. We gather feedback on what users want and then do the development, instead of developing something whether users want it or not. A company once invested more than a year developing a feature, which was rolled back in 1 day after users complained.


Requiring only 3 users to vote for a feature seems quite low to me.


> A company once invested more than a year developing a feature, which was rolled back in 1 day after users complained.

Hah! I know that all too well. Great work!


Notice how they make it difficult and painful to file DMCA complaints:

https://www.skytorrents.in/dmca

Genius.


Doesn't seem too painful to me. They're basically just asking for an email with all the stuff that's legally required to be in a DMCA notice.

Here's what the DMCA requires[1]:

(i) A physical or electronic signature of a person authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.

(ii) Identification of the copyrighted work claimed to have been infringed, or, if multiple copyrighted works at a single online site are covered by a single notification, a representative list of such works at that site.

(iii) Identification of the material that is claimed to be infringing or to be the subject of infringing activity and that is to be removed or access to which is to be disabled, and information reasonably sufficient to permit the service provider to locate the material.

(iv) Information reasonably sufficient to permit the service provider to contact the complaining party, such as an address, telephone number, and, if available, an electronic mail address at which the complaining party may be contacted.

(v) A statement that the complaining party has a good faith belief that use of the material in the manner complained of is not authorized by the copyright owner, its agent, or the law.

(vi) A statement that the information in the notification is accurate, and under penalty of perjury, that the complaining party is authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.

[1] https://www.law.cornell.edu/uscode/text/17/512


the fact that DMCA takedowns are applied to services that simply index other parts of the web is completely insane


Uh, could purely be a coincidence but the favicon looks identical to the logo of UK TV corporation Sky (albeit with less colour): https://en.wikipedia.org/wiki/File:Sky_plc_logo.png


Are you really complaining that a site that is dedicated to copyright infringement might be misusing a trademark? I have a feeling that is the least of their legal concerns.


Trademark law is pretty fierce, and sites like these should generally want to minimise the legal attack surface.


Never underestimate the allure of digital heroism. I get a similar vibe as Aurous with this one. For profit or not it'll bear the brunt of an angry industry.


And now it's changed to something that looks rapidly scratched down in paint :D


Sky TV will be on it in no time


ad-free, non-tracking, not accepting donations - call me paranoid but unless I know why you are doing this I have to assume the worst. There simply is no free lunch.


Wow! It's super fast and NO Javascript!!!

Where are you hosting it and do you plan to monetize?


No, we DONOT have any plans to monetize. NO JavaScript was a design feature for security focused users. We also DONOT place any cookies (and donot track). We wanted users to have the fastest experience possible without worrying about adblocks, noscripts etc.


> we DONOT have any plans to monetize

So my next question is - any plans on open sourcing? :)


We donot have plans on open sourcing soon, but we love open-source. Making it open-source will be considered in future for sure.


You should consider open sourcing the static pages/text at least, so people can improve paragraphing, punctuation, and grammar/spell correct:

"donot" should be "do not" (separate words)

"atleast" should be "at least" (separate words)

I realize English is probably not your first language, so no big deal, but it would be nice to fix.


Here's a better version. I'd e-mail it, but I don't use PGP...

[Feel free to change do not to don't. Torrents and Javascript may be capitalised or not.]

This is a clean, ad-free, privacy focused torrent search engine. Like Google/Yahoo but just for torrents (at least for now). This project is still under heavy development. Feedback is welcome, please report any problems or bugs.

We do not track users in any form and therefore do not use cookies or javascript in any form. We do not sell any data to anyone.

The entire project is maintained up to date by smart software. Manual intervention is limited but still there. Every hour hundreds of new torrents are discovered and made available for search purposes.

This is currently in beta testing. You can send feedback or report any problems to admin (at) skytorrents.in.

Note: Any mails which do not use PGP are discarded by automated software.


FIXED Thanks.


Will be used ASAP


Since authors seem to like to make feature requests vote-based here, I second this motion!


I second open-sourcing. Another idea would be to offer dumps of your index for people to download.

(Pragmatic reason: I worry that you may go offline someday due to misguided legal action, lack of time, etc., and I'd like to have your database offline so I can continue using it even if you're gone.)


On behalf of all NoScript users, I thank you :)


> we DONOT have any plans to monetize.

i'd like to tweak the question, and ask if you intend to make money with this site, or is it a passion project/labour of love?


It's kind of passion project.


How will you pay the bills?


Not GP, but my guess is - like many self-respecting Internet citizens, with a $dayjob :).


:)


hey, just a thought, but at some point if this gets too popular you will probably get a message from the UK TV company SKY who won't be super happy about your name.


Or the favicon that looks like it was lifted straight from an old version of their website.


hah yes i hadn't spotted that but now you point it out i realise it was itching the back of my brain.


Perhaps show the NFO, or parse link to IMDB from NFO and link to that? You can also scrape NFOs to make genres and type of media (games, applications, audio, video, and subgenres of these).
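
A rough sketch of the IMDB part in Python, assuming the NFO has already been decoded to text and that it contains a link of the usual imdb.com/title/tt... form. The function name is hypothetical; nothing here reflects how the site actually works.

```python
import re

# Matches the title ID in links such as http://www.imdb.com/title/tt0133093/
IMDB_TITLE = re.compile(r"imdb\.com/title/(tt\d+)", re.IGNORECASE)


def imdb_links(nfo_text):
    """Return the distinct IMDB title URLs found in an NFO, in order of appearance."""
    links = []
    for match in IMDB_TITLE.finditer(nfo_text):
        url = f"https://www.imdb.com/title/{match.group(1).lower()}/"
        if url not in links:
            links.append(url)
    return links
```

Genre and media type would need more than a regex, since NFO layouts vary a lot between release groups.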


I'm very wary of anything closed which bills itself as privacy focused and "non-tracking". Can you prove these claims openly? Otherwise this is just a public tracker, potentially full of malware and honeypots.


Agreed here, the site is using a backend and who knows what this backend is tracking. It should be a static site and open source, so we can verify the claims.


We are planning to put up a tor onion frontend and thus tor users can browse even more anonymously.


A useful addition would be to categorize the top 1000 based on file name extension.
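
Something along these lines would do as a first pass. This is a sketch in Python that assumes you already have the list of file names; the function name and the "(none)" bucket are my own choices.

```python
import os
from collections import Counter


def count_by_extension(file_names):
    """Count files per lower-cased extension; names without one go under "(none)"."""
    counts = Counter()
    for name in file_names:
        extension = os.path.splitext(name)[1].lower()
        counts[extension or "(none)"] += 1
    return counts
```

From there, mapping extensions to categories (video, audio, books and so on) is just a lookup table.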


I see that you are based out of India. How do you plan on dealing with law enforcement issues? (Great site, btw).


Hosting a torrent site is completely LEGAL, since we are acting just like Google: as "a search engine".


Do you have legal precedent on this? That defense hasn't held in many jurisdictions.


True or not, quite a lot of sites have said this and now no longer exist.


Well that hasn't worked out well for other sites so that is why I was curious as to how you will handle this differently.


Love the speed. What's the stack for this?


What's the chance of this getting a simple API which can be fed into flexget or similar?


Are you talking about RSS? It's already there: as soon as you make a query, you get a feed to put into flexget.
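
If you would rather read such a feed yourself, a minimal Python sketch is below. The sample feed is invented for illustration and only assumes plain RSS 2.0 items with a title and a link; the real feed's fields may differ.

```python
import xml.etree.ElementTree as ET

# Invented sample; not the site's actual output.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>example search</title>
    <item><title>First result</title><link>magnet:?xt=urn:btih:AAAA</link></item>
    <item><title>Second result</title><link>magnet:?xt=urn:btih:BBBB</link></item>
  </channel>
</rss>"""


def feed_items(rss_text):
    """Return (title, link) pairs for every item in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]
```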


I totally hadn't made that connection, thanks!


Search works incorrectly.

+WORD1 +WORD2 returns WORD1 OR WORD2 instead of empty set if there is no match.
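
To spell out what I expected, here is a small Python sketch of the behaviour, assuming "+" means "this word is required". It is an illustration over a list of names, not the site's actual search code.

```python
import re


def search_all_required(names, query):
    """Return only the names that contain every +term in the query."""
    required = [
        term[1:].lower()
        for term in query.split()
        if term.startswith("+") and len(term) > 1
    ]
    results = []
    for name in names:
        words = set(re.findall(r"[a-z0-9]+", name.lower()))
        # Every required term must be present; one missing term rejects the name.
        if required and all(term in words for term in required):
            results.append(name)
    return results
```

With names like "Ubuntu 16.04 desktop iso" and "Debian 8 netinst iso", the query "+ubuntu +debian" should come back empty rather than returning both.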


Asking purely out of curiosity: Are you actually based in India? (The domain is on the Indian TLD and the address in whois records lists an Indian city).

Here in India, cyber law enforcement is a joke, and if some legal issue comes up with this site, I'd be curious to see how it works out.

And, just a hypothetical follow up question: how would it work out if the site was not even DMCA compliant?


I'm not associated with it, but yeah, it was posted in /r/india a few days ago:

https://www.reddit.com/r/india/comments/5mj3vz/setting_up_ne...

Domain is from Gandi (France) and IP address (hosting?) seems to be from the Netherlands


Does DMCA require you to delete URLs with hashes of files when you don't own the content?

Also, are you allowed to publish the takedown requests?


If someone knows Python and uses qBittorrent, a search plugin for this site would be great!

Instructions here: https://github.com/qbittorrent/qBittorrent/wiki/How-to-write...


Hi. I would concur that you guys should get some help with the UI elements. The Verified icon messing up row height aside, on my screen the no. of files column is excessively large, and the size column is so thin that often the unit MB/GB is squeezed down to the second line, which hurts readability.


Add commas to "Serving 11234246 torrents"
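
In Python, for instance, that is a one-line format spec (the function name is just for illustration; no idea what the site itself is written in):

```python
def format_count(count):
    """Format an integer with commas as thousands separators."""
    return f"{count:,}"


print(f"Serving {format_count(11234246)} torrents")  # Serving 11,234,246 torrents
```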

