Show HN: URL Canary – Get an alert when someone finds your secrets (urlcanary.com)
273 points by jstanley 11 months ago | 116 comments

You might think about embedding bounties in crypto blockchains. For example, create BTC wallets that can be unlocked using a secret sitting next to (or steganographically embedded in) the secret you're trying to protect. This gives the person uncovering the secret an incentive to activate the canary. {RI,MP}AA are apparently doing this with their music and movies, so they know when they are showing up on pirate sites.

As skeptical as I am about crypto currencies, this is a really interesting application. Basically exploiting human greed. Thank you for sharing it.

That's some awfully pessimistic and dehumanizing language. I just consider it to be paying someone for the trouble rather than jumping straight to "exploiting greed."

Since the person profiting is someone who obtained the secret through shady means - either breaking into an insecure system, or taking advantage of their access to a system which doesn't encrypt passwords - I don't consider it pessimistic or dehumanizing to describe the behavior as greed.

Greed is a subjective term. The company who is charging money could be described as greedy for making a profit when they could have lowered the price to remove any excess profit. Why is the person who understands and uses the system in a way it wasn't intended greedy?

Because they're stealing secrets from people? I confess I'm confused why this is confusing. The exercise here is to know when something is public that you want to keep secret. The proposed solution is to give the secret value independent of its secrecy, to tempt those who stole the secret into claiming that independent value. I'm fine attaching "greed" to profiting from stealing secrets.

You say that like there's something wrong with exploiting greed.

Strange. I feel only a slight negative connotation of "greed". Nothing to be ashamed of or to condemn. Maybe some meaning is lost in translation to my language.

Exactly, it's as if my pay cheque is "exploiting human greed".

I think this is a great idea and a great use of a BTC/LTC wallet. Breach notification often doesn't happen in a timely manner, and building a solution like this lets me, the consumer, get notified even when the vendor with the secrets doesn't send the notification.

Check out https://medium.com/@grantm/obtaining-instant-breach-transpar... for some more info about how this might be possible.

The MirageOS project has done this, they call it the "Bitcoin Piñata".

It's still unclaimed I believe: http://ownme.ipredator.se

Claimed today I believe. Possibly inspired by your post

Is there a source that it was definitely claimed? The BTC have moved, but the site notes "[i]n 2018 we will likely reuse most bitcoins for other projects", and the transaction (splitting into two amounts of 9BTC and ~1BTC) aligns with this Tweet from December:

> PSA: the bitcoin piñata will be reduced by a large amount, the owner who lend the 10 btc wants to spent 9 on useful projects


From their site:

"This challenge started in February 2015, and will run until the above address no longer contains the 10 bitcoins it started with, or until we lose interest. In 2018 we will likely reuse most bitcoins for other projects."

So I'm not sure what happened frankly.

You’ll notice someone gained access to your system (because the money will disappear) but you’ll have zero clue how they did it. Also, how do you prevent an employee from stealing the funds for themselves?

There’s a reason companies use bug bounty platforms instead of just having a bunch of bitcoins lying around.

But there has to be some way we can solve this with the blockchain. /sarcasm

I'm interested in understanding how the RIAA/MPAA canary works. Can you please elaborate or provide an explanatory link?

Similar idea to https://uriteller.io. This is a nice way to check whether your end-to-end encrypted chat is really secure or not.

Uri Teller is a perfect name, that's brilliant.

Also a lot better for simple use cases since it doesn't require an email address.

I created a similar idea a while back when Gmail started caching images. The main advantage is the image itself contains the log: http://cache-logger.herokuapp.com/hello-world

Code can be found here: https://github.com/kale/image-cache-logger

This is brilliant. They recommend using a url shortener, but I want to see if anyone is parsing comments and visiting urls from HN.


Edited to add:

Here's the view key if you're interested. Just append it to the end of uriteller.io:


A crawler on AWS hit it two minutes after I posted it.

Lots of Mac users with out of date OS :S https://uriteller.io/ZBt0gGoUHtIsyQ7KFwikYg

I clicked the link, but I'm not a bot, I swear!

FYI - I signed up and the email confirmation page showed me someone else's canary URL

That's not good! Taking a look.

EDIT: I believe it's because the CSPRNG state ( https://metacpan.org/pod/Bytes::Random::Secure::Tiny ) was created before the process forks, so they shared the initial state and generated the same token. I've reduced it to 1 worker pending an actual fix.

Sorry about that, and thanks for pointing it out.

There was a recent discussion about (not) sharing CSPRNG state with fork-ed processes: https://news.ycombinator.com/item?id=15759855

Given this kind of disaster potential, wouldn't it be a rather attractive option to use /dev/urandom?

Bytes::Random::Secure::Tiny seeds itself from /dev/urandom, I just need to make sure to initialise it on first use, instead of when my program first starts (which I've now done).

In general I suspect if anything you'd be more likely to mess it up by reading bytes from /dev/urandom manually than by using a library.
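To make the fork problem concrete, here is a minimal Python sketch (not the site's Perl code). It assumes `copy.deepcopy` is a fair stand-in for the memory duplication that `fork()` performs, and it uses `random.Random` only as a seeded generator for illustration; it is not a CSPRNG. `make_token` and `LazyTokenSource` are hypothetical names.

```python
import copy
import os
import random


def make_token(rng: random.Random) -> str:
    """Return a 128-bit hex token drawn from the given generator."""
    return f"{rng.getrandbits(128):032x}"


# The bug: one generator is seeded at startup, before any workers exist.
parent_rng = random.Random(os.urandom(32))

# fork() duplicates process memory; deepcopy stands in for that here.
worker_a = copy.deepcopy(parent_rng)
worker_b = copy.deepcopy(parent_rng)
shared_state_tokens = (make_token(worker_a), make_token(worker_b))


class LazyTokenSource:
    """Hypothetical helper: seeds on first use, and again if the PID changes."""

    def __init__(self) -> None:
        self._rng = None
        self._pid = None

    def token(self) -> str:
        # Seeding here, not in __init__, means each worker seeds itself.
        if self._rng is None or self._pid != os.getpid():
            self._rng = random.Random(os.urandom(32))
            self._pid = os.getpid()
        return make_token(self._rng)
```

The two entries of `shared_state_tokens` come out identical, which is the duplicate-canary symptom reported above. In Python specifically, `secrets.token_hex()` asks the OS for fresh bytes on every call and avoids holding seeded state at all.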

How exactly can one mess up reading bytes from `/dev/urandom`? Serious question.

Open the file. Read from it. If no failures on open or read, you have random bytes. In essence, there is already a library for this: `open` and `read`, which seems to be the same API surface area as this library.
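For reference, the "open and read" approach might look like the following in Python. This is only a sketch: `read_urandom` is a hypothetical name, and the loop on short reads is a defensive assumption, not something the comment above claims is required.

```python
def read_urandom(n: int, path: str = "/dev/urandom") -> bytes:
    """Read exactly n bytes from the kernel random device, or raise OSError."""
    buf = bytearray()
    # buffering=0 gives a raw file object, so each read() is one system call.
    with open(path, "rb", buffering=0) as device:
        while len(buf) < n:
            chunk = device.read(n - len(buf))
            if not chunk:
                # An empty read means EOF, which a random device should never hit.
                raise OSError(f"unexpected end of file reading {path}")
            buf += chunk
    return bytes(buf)
```

Any failure on `open` or `read` surfaces as an exception, so the caller never gets a short or empty result silently. Python also ships `os.urandom(n)`, which wraps the same source.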

Here's all the ways this can possibly go wrong:


Most of these are ridiculous.

First, the author mentions that a `read` from urandom can be interrupted. I am unaware of any system where this is actually possible. And even if it were, the author's original code (and my description of an implementation) already works! The `read` call will return an error, and that error is handled. His "improved" code is simply an optimization around retrying from this device, but it's not an improvement in safety.

His second argument is that /dev/urandom might not have enough randomness in it. This is, quite simply, not a concern for anyone not writing code for specific embedded devices or for extremely early in the kernel boot process. Anyone who is writing code for these environments is almost certainly already aware of these limitations. And even then, using a library like the one the GP is using doesn't actually help, since it's virtually guaranteed to just be reading bytes from `/dev/urandom` for its seed in the first place.

The rest go into situations that — quite frankly — border on ludicrous. If someone has replaced your `/dev/random` with `/dev/zero`, you have already lost and there is nothing you can or should reasonably do besides nuke the machine from orbit.

I thought the main issue of reading bytes from /dev/urandom was performance (ignoring the issue of ensuring /dev/urandom has been initialized since boot.)

So instead, we use them to seed CSPRNGs, and use those for speed.

This implies that you should probably invalidate all URLs created before you applied that mitigation, right? Or do you have some way (logs?) of knowing which ones were compromised?

+1 for using Perl

Care to share your stack?

Hey, sorry for the late reply. Not sure if you'll read this...

It's mostly http://mojolicious.org/

SQLite for the database (for now; that'll change if it gets too big), and nginx as a reverse proxy in front of Mojolicious. CentOS 7.

Cool thank you for response.

Well, did you go to the URL so that that user will know they used an insecure service?

They forgot to dogfood their own product

I don't understand this. Someone just visits the URL and I get an email notification? If it's supposed to be secret, how does the script know if the visit is OK or not, sending me the notification only when it's not OK? Are the good guys supposed to know this is a sort of "honeypot" URL and not visit it?

Put a URL in a firmware image that is never called by your device/app.

Monitor the URL for access. You then receive an alert that someone accessed the URL.

This gives you a real-time notification that a reverse engineer has looked at your firmware. It also gives you an IP address. So you now “know” that someone might try to hack your device. And you also have an IP address.

This is a billion dollar security play!

Jokes aside. It is a good concept for IoT firms who have a security advocate, but no budget. Would help persuade people that there are hackers targeting their devices/apps with quantifiable data of degrading value.

Huh. That’s...a good idea. The idea has to be developed further though. This has to be deployed on different domain names and with full content control on the web pages.

I guess that’s all doable with the “private server with root access” under enterprise pricing. What a great way to precisely measure cover time. You could inject arbitrary URLs into an application to see if your API has been reverse engineered.

I presume the idea is to make the URL appear to be part of the standard API (and maybe even does something useful to induce use?), but it is never actually called by a legitimate application?

Hackers worth their salt work in air-gapped environments.

This is a signal, but not a game changer for security pros.

I’m speaking specifically about the case where someone is trying to reverse engineer a private API from an application. Then interacting with an API endpoint will necessarily trigger the canary.

Having retrieved API secrets offensively, and overseen secret rotation defensively, I’d say it would be a game changer. It’s an excellent idea to automate this discovery with an alarm. The current discovery system is either an internally developed, half-baked version of this that comes from sophisticated logging, or manual oversight.

You won't share the URL in a public place, you'll store it alongside whatever you're trying to keep safe.

If the URL gets accessed at all, you will know your secret has been leaked.

Friendly feedback, coming from a reverse engineer: the commenter who also answered this question gave a really good, concrete example of how this could be used in the context of DRM. When I visited your page I was initially perplexed as to why you’d want any of this. You should include a specific example of how this service delivers value, and the reverse engineering example is a good one.

Thanks for the feedback :)

You could also put it in something like your 1Password Vault. Although, if that's been hacked/accessed, you may not have time to save yourself.

Semi-related, asking the wizards of HN: I recall reading about a similar trick whereby one can embed a GitHub key/token/API key/??? in a source file in one's repository, and then if your code is ever stolen and pushed to GitHub you will receive a notification because GitHub will revoke the token. Does this or similar ring a bell for anyone?

Great project! I just found out that one creepy Safari extension is crawling every URL I visit.

Also Yahoo Slurp is crawling my email URLs. Sigh.

What's the extension?

This is neat.

For added security, maybe better to hide the canary URL in a bit.ly link? Someone might know your 3 URLS.

You can preview bit.ly links by appending a + to it.

But if you are already suspicious enough that you want to do that, you were never going to click on the original URL anyway, so it's no worse than not using bit.ly.

Not true. I know bitly and googl have those info pages, so I'll check them out when I am curious but am not sure whether I want to alert wherever it leads. I know I'll get more info at best, and don't lose anything at worst. For random URLs, I guess I could try whois lookups, but I'd be much more likely to just check it out than with a short URL which is easily checked out.

I think you missed my point.

If seeing that the bit.ly URL redirects to a known-urlcanary domain would put you off visiting the URL, then seeing the raw known-urlcanary domain (not behind bit.ly) would also be enough to put you off visiting it.

Ah, yes, if that is the alternative, then of course. I thought it was between some shortened link and an unknown domain.

So, doesn't this mean you need to have a bunch of domains, none of which look like urlcanary domains? If you hide behind something like namecheap's whoisguard (customer, not affiliated with them), that would do the trick, right?

It's easy to integrate in a scraping script though, which I thought was what this was supposed to combat.

I usually curl shortened urls, the response is either a 301/302 or an html page with a redirect (usually the latter)

If you preview a link, wouldn't it then require the bit.ly to fetch it? Or does bit.ly serve a cached copy?

Doesn't bit.ly access the url? Either at creation time or at any point later on?

If you can use any domain you like, that is great.

MyURL.com/101/passwords /private /logins

Is the idea that you'd embed this in a way that it is automatically triggered? Or that you would leave it in plaintext somewhere and assume someone would eventually visit it if they were snooping around your stuff?

You can use any domain you like, although I don't think I've documented that. Just point any domain at and it'll work.

(But obviously it needs to be a hostname you're not already using for something else).

Sorry to offer unsolicited advice, but are you sure you wanna use a fixed IP for that? Seems more reasonable to ask people to CNAME to a domain you guarantee will always point to the correct servers.

This means you don't break the system when you move IP address. Moreover, should you ever need to, you can round-robin the domain for either reliability or load-balancing (though I doubt that would be necessary).

If you are serious about testing this, someone will just DDOS the ip and the monitoring is offline for everyone.


That doesn't fully obscure the original URL though, it's very easy to issue a HEAD request and examine the headers first.

Great idea! Might even be possible to sort that out for the user automatically...

The bit.ly approach might result in false positives as the namespace for bit.ly links is quite easy to iterate over (assuming you don't use a custom link).

Also, doesn't bit.ly access a URL to pull a title or generate a preview? This would send a click through to the canary as well.

who puts a bit.ly link alongside their secrets?

A quick test using Zapier's webhook functionality tells me I can duplicate this with their platform. It doesn't seem to work with a link shortener, but I have the sense that more time invested in that will yield a result. So, I think the concept has merit but you're off center of the target and much more value is required.

I imagine there's a lot of value to be found in a dashboard, or generally in managing these canaries. Which canary was deployed where, how often are they triggered (with the ability to mark some instances as false alarm) etc.

But of course that heavily depends on the use case

So what happens when the three offered domains become widely known to be fake? Can this service be federated and use custom domains or is it just another game of whack-a-mole?

From the landing page: "...and it's easy to setup a URL Canary on a custom domain name."

Not to be overly negative, but if you're going to the trouble of pointing a custom domain to this service, couldn't you just write a quick PHP script that mail()s out when it's accessed? As far as I can see, the only advantage of using this third-party site is convenience, and a custom domain makes it much less convenient.

Will the canary URLs always be to the same domain? If so, can't I circumvent this just by using custom dns locally to map that domain to somewhere else, like localhost? If it's not a domain, but an IP address, can't I just route that IP somewhere else?

There are currently 3 domains available. If this takes off I could do something like mailinator (offer the user a small selection, but never reveal the full list at once).

You can also register your own domain and point it at my server and your canary will work just fine on that domain.

If you're playing at a high enough level that you've specifically blackholed URL Canary traffic by IP address, then you're a worthy adversary. And additionally, that is a splendid problem for my project to have.

On the enterprise tier you even get your own server with its own IP address, so there should be nothing linking it to URL Canary at all. (Although the enterprise tier is extremely expensive, and if anyone buys it I will probably panic).

This is neat. It's currently a single type of token (i.e. a URL).

If you check out https://canarytokens.org you will notice the ability to create several others (be notified when someone resolves an IP address, be notified when someone opens a file, be notified when someone views a QR code, etc)

how does this differ from https://canarytokens.org/generate ?

Very little!

Although the generated URLs don't have "canary token" in them.

Canarytoken allows you to download a docker instance, so you can host the server on any domain you like.

I'd really like to see a generalisation of this idea to any personal/private data stored in any database. Any time a piece of 'your' data (e.g. a medical record) is accessed, you get an alert. There could be an industry of alert brokers that decide if the alert is important or not - you might employ one, or write your own, or choose to look at every alert. While it would require a big change to how we store data, I suspect that the changes required to be GDPR compliant start going in this sort of direction.

Is following a URL common practice for someone who accesses documents? I mean if I come across a repository, my first inclination isn't to find all of the hidden URL's and chase them.

It's common practice for robots and vulnerability analysis

How does a potential customer know how effective this is? How do you even know that it's effective for your personal use case as described https://urlcanary.com/about ?

How many attackers are going to click that link or any link for that matter? Seems the value prop of the product is based on the assumption that folks will click. Maybe it's a solid assumption. I just can't see the evidence for it.

The value lies in sometimes demonstrating a channel is insecure. It won't always do so. Obviously the converse--proving a channel is secure--is much harder.

I don't think it has commercial value really. But for social awareness, showing that mail, or notes, or storage providers aren't always as private as you'd hope, that's where the value is.

But it demonstrates nothing if no one clicks. The channel may be insecure, compromised and no one knows - not the author or the parties who are supposed to have the secrets.

I see what he's trying to do, but I don't think this is the way. Mathematical proofs that verify that a payload is observed, opened, or accessed work. They are deterministic. They are also way more complex. I think this is trying to solve a problem in a simple way but it's still just as nondeterministic as without this solution IMO.

Absolutely. But if you have the choice between a) not knowing 100 pct of the time, and b) knowing for certain it's not private X pct of the time, for unknown X probably not 0, you'd want b. You will have more information by trying.

This looks neat. What is a good use-case for your average techy? What about your average non-techy?

It can be useful anywhere you have sensitive information that you wouldn't want to fall into the wrong hands, or more specifically that you would want to find out if it did.

It could go in backups, in your git repository, bug tracker, internal wiki, etc.

For an average non-techy, I don't know... they might want to put one in their diary?

For any web/cloud service, I think this might be easily triggered when the service is trying to be helpful. I.e. when fetching previews, displaying in/active links, etc...

You're right, and this does happen.

If that's a problem for you then you can mitigate it a bit by checking the User-Agent and IP address in the alert email you receive.
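A rough idea of what that User-Agent check could look like if automated, as a Python sketch. The marker list and the `looks_automated` name are illustrative assumptions, not anything URL Canary itself does, and a User-Agent header is trivially spoofed, so treat the result as a hint only.

```python
# Illustrative substrings seen in crawler and link-preview User-Agents.
# This list is an assumption for the example, not an authoritative registry.
KNOWN_BOT_MARKERS = (
    "bot",
    "crawler",
    "spider",
    "slurp",
    "facebookexternalhit",
    "preview",
)


def looks_automated(user_agent: str) -> bool:
    """Guess whether a canary hit came from a crawler or preview fetcher."""
    ua = user_agent.lower()
    # An empty User-Agent is also treated as automated.
    return ua == "" or any(marker in ua for marker in KNOWN_BOT_MARKERS)
```

A hit that `looks_automated` flags is still worth a look: a service fetching a preview of your "private" URL is itself evidence that something read it.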

lol, just add a captcha to the trap page :D

Do you have domain diversity? Organizations will blackhole urlcanary.com

Nice! I've often considered implementing a service like this at the DNS level. DNS would be ideal because it would work for more services, not only HTTP.

Given that there are only 3 domains to choose from:

  1. emotionalrec.com
  2. factwisdom.com
  3. tdurl.uk

a smart thief simply won't follow the link.

Even then you're no worse off than if you didn't have it at all.

This is neat. Reminds me of the Mr. Robot episode in season 3 where he plants a link to track the FBI.

Maybe I don't understand the concept but why would you ever put a secret public?

You don't. You put it in a place thought to be private, and that way you'll find out if it becomes public at some point.

But why host it somewhere reachable at all?

It's a canary: ideally it should never be triggered, but in case it is, you want to know about it.

Another way would be to put a fake user in your users database and then watch password leaks and see if it shows up. Then you know you've been breached. It should never happen but if it does it's good to know.
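Checking a leaked dump for such a planted account can be very little code. A Python sketch, assuming the dump is available as text lines and the canary is an email address that exists nowhere else; `canaries_in_dump` and the addresses in the usage note are made up for illustration.

```python
from typing import Iterable, Set


def canaries_in_dump(dump_lines: Iterable[str], canary_emails: Set[str]) -> Set[str]:
    """Return the planted addresses that appear anywhere in a leaked dump."""
    wanted = {email.lower() for email in canary_emails}
    found: Set[str] = set()
    for line in dump_lines:
        lowered = line.lower()
        # Substring match so it works regardless of the dump's column layout.
        found.update(email for email in wanted if email in lowered)
    return found
```

For example, `canaries_in_dump(["Canary-7f3a@example.org:x"], {"canary-7f3a@example.org"})` returns a set containing that address, which would mean the users table has leaked.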

I like the simplicity of this.

fail2ban[1] can be set up to do this trivially.

[1] https://www.fail2ban.org/

ok I'm like 8 pages into the configuration wiki and still have no idea how to set this up

I'd recommend just installing it and turning on a couple of the built-in filters by editing /etc/fail2ban/jail.local to see how it goes together. See the comment at the top of /etc/fail2ban/jail.conf as well and check out 'man jail.conf'.

Essentially to do the same thing as URL Canary you'd set up an action that only emails and trigger that with a custom filter that scans your web server's access log for accesses to a particular URL.
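The log-scanning half of that setup, written out as a Python sketch instead of a fail2ban filter. It assumes the web server writes the common/combined access log format; the regex, the `canary_hits` name and the `/c/abc123` path in the usage note are assumptions for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches the start of a common/combined log line:
#   ip ident user [time] "METHOD path PROTO" status ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def canary_hits(lines: Iterable[str], canary_path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (ip, timestamp, method) for each request to the canary path."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not an access-log line we understand
        # Ignore any query string when comparing paths.
        if match.group("path").split("?", 1)[0] == canary_path:
            yield match.group("ip"), match.group("time"), match.group("method")
```

Feeding it the lines of an access log with `canary_path="/c/abc123"` yields one tuple per visit; each tuple could then be passed to whatever sends the email.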

this is a really interesting application. Basically exploiting human greed. Thank you for sharing it.

Hah, that enterprise pricing is insane.

Indeed, at £1k (€1136, $1403) one might as well run an internal application. At a very basic level, it can be achieved with three lines of bash:

    while echo -en "HTTP/1.1 200 OK\r\n..." | nc -l $IP $PORT; do
        cat $MESSAGE | sendmail -i -t
    done

More complex configurations could still be worked out in <50 lines of Python.
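As a rough illustration of the Python route, here is a standard-library sketch of a self-hosted canary endpoint. It is not URL Canary's implementation: `alert`, `HITS`, `CanaryHandler` and `start_canary` are hypothetical names, and `alert` only records the hit in memory where a real deployment would send an email.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

HITS = []  # one (client_ip, method, path, user_agent) tuple per request


def alert(hit):
    """Placeholder alert: a real deployment would send an email here."""
    HITS.append(hit)


class CanaryHandler(BaseHTTPRequestHandler):
    def _record_and_reply(self, send_body):
        # Record the hit first, so the alert fires even if the client hangs up.
        alert((self.client_address[0], self.command, self.path,
               self.headers.get("User-Agent", "")))
        body = b"OK\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if send_body:
            self.wfile.write(body)

    def do_GET(self):
        self._record_and_reply(send_body=True)

    def do_HEAD(self):
        # HEAD requests count too: peeking at headers still trips the canary.
        self._record_and_reply(send_body=False)

    def log_message(self, format, *args):
        pass  # keep stderr quiet; hits are recorded via alert()


def start_canary(host="127.0.0.1", port=0):
    """Start the canary in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CanaryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_canary()` returns the server object; `server.server_address[1]` gives the port it bound to, and `server.shutdown()` stops it.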

Why would anybody buy this (dropbox), I can recreate this with a simple ftp server and 30 minutes of coding!

Dropbox is not a devs-only product, given that eg. the CEO might want a personal Dropbox folder, or one might need a shared folder between the marketing team and the developer team (and even then, at the enterprise level it is not unheard of to use simpler alternatives like SFTP/NTFS shares).

Canaries on the other hand are exclusively used by technical people, who won't mind developing and spinning up a tiny honeypot.

Given the custom feature development, potentially not.

Some organizations have really high in-house costs...
