Coinbase Merchant Data Leak? (bitcointalk.org)
132 points by dirtyaura 1715 days ago | 117 comments

I think we should all calm down and look at this in a little more detail.

A simple look at https://coinbase.com/merchants will show you a screenshot of a merchant page that looks exactly the same as those 'exposed' by google (https://encrypted.google.com/search?q=site:https://coinbase....)

Until proved otherwise, I believe these pages to be merchant pages actually selling the items, as the copy also suggests ("Send 1.00BTC to...", "Comfirm payment"). The confusion must come, I suppose, from the ambiguous urls that contain /checkouts/... and from people not really liking Coinbase?

Edit: Funny how this is a perfect example of the 'URLs are for people, not computers' argument that is number 2 on HN right now.

Wow, this company went through YC? I hope they only invested bitcoins...

It's one thing to lose people's bitcoins or randomly delay/cancel transactions (both of which Coinbase has been accused of). People know that bitcoin is still young and the companies supporting it are inexperienced, so they expect that. But exposing personal info and purchase history goes beyond any definition of 'unacceptable' or 'incompetent'. Over in the Reddit thread, they're already linking Facebook accounts with illicit transactions.

Users from Bitcointalk told Coinbase a week ago that they were starting to get phishing emails, which means someone has been mining this data for a while now. Yet there it is, still available through a simple Google search.

Coinbase CEO here.

Just updated with a blog post: http://blog.coinbase.com/post/47198421272/data-on-public-mer...

These are merchant checkout pages. Your information is not going to be shown on one of these pages unless you created a "buy now" button, donate button, or checkout page and posted a public link to it somewhere as a merchant. Order pages are designed to be public so customers can reach them, but we messed up by making them publicly indexable and including merchant contact info there without being more explicit. The email in particular should not have been included. More details in the blog post. Very sorry for the trouble on this!

It scares me that you're just going for "difficult to scrape" and "not make them easily indexible by Google." We don't want it to be hard to get our personal information, we want it to be impossible.

Did you read the entire comment? Those are merchant listings, which means that the vast majority of them are probably intended to be public. You absolutely would not want that information to be impossible to obtain.

But why would you want them to be "difficult to scrape" then? It should either be "not available" or "easily available". If I'm a merchant, wouldn't I want my info to be as easily available as possible?

These are seller pages, not user pages. Sellers are web indexed by default. You could argue that Coinbase shouldn't index these pages and that they should only be accessible from seller websites, but sellers have already made them public on the WWW. There's no "data leak".

Giving out the emails associated with their accounts is a really bad idea. Having them indexed by Google is even worse, hence the phishing attempts.

These are sellers and their contact information. The whole point is that they are public.

Having official public contact information, well, public, is fine - that's the point. But having the email addresses associated with merchant accounts isn't. Hence the phishing concern.

I'm still wondering why they have to publicly display those emails in relation to the accounts associated with it.

Why can't those stay anonymous?

>"Wow, this company went through YC?"

And was actively defended and supported the last time a thread came up questioning the issues that Coinbase was having:


Turns out those questions were justified. They've outed people selling bath salts and god knows what else.

So much for anonymity. Between this, MtGox, Instawallet, do we still believe Bitcoin is taking over the world?

This is not transaction data, anonymity has not been lost.

Your problem is with YC supporting a company that violates the privacy of dangerous criminals, and not with YC supporting a company that provides a marketplace for selling illegal hazardous products?

> dangerous criminals

You don't really believe this, do you?

Of course, I believe that the DPRK is switching to BitCoin soon since illicit nuclear providers prefer it.

Sorry, I thought you were referring to people selling weed on Silk Road. Do you have any sources for this?

It's really, really easy to see "data leak", see an email address, and assume that this is really bad.

On the other hand, if you actually look at what these pages are, they're Checkout pages, not Transaction pages. Coinbase sellers generated these pages and intentionally linked to them. Someone even mentions this in the fourth comment on the linked page.

I feel like this "data" could be embedded into an image and then no bots would ever get a hold of mass amounts of personal data...

That wouldn't do anything. If it's readable, it could be OCR'd.

This is NOT a data leak. The guys building Coinbase are incredibly savvy, dedicated founders. Please don't overreact without knowing what you are talking about. These pages are supposed to be searchable because they're like mini-store checkout pages.

It is a data leak. The CEO says so right in this thread, when he says the email should not have been on the page.

"Yet there it is, still available through a simple Google search."

Let's say, for some reason, you have some information from a site you own that you want to remove from Google search. How long does it take to remove it?

I've had to do it before (business reasons) and it did not take that long -- on the order of 24-36 hours.

There's a link in the webmaster tool asking for speedy deletion. I'm assuming that means gone within hours.

Right, but only the owners/admins for that domain can use it. In other words, I couldn't request that a page on coinbase.com be removed -- only the Coinbase folks can.

Jesus H Christ, this is quite a fuck up. Over on /r/bitcoin there's a comment linking to a transaction involving 229BTC worth of "Avalanche Spa Powder", which is one of those synthetic drugs of ambiguous legality. That's quite a violation of trust on Coinbase's part. Their reputation is nuked. (Edit: Please note that I am 100% wrong about this. Why don't I just shut the fuck up for once?).


It is someone SELLING Avalanche Spa Powder. These all are checkout pages of sellers, not transactions. This particular checkout was probably indexed from legalhighaz.com which was run by https://twitter.com/legalhighsaz

I see. Now I feel bad for overreacting. In my defense, these pages look a lot more like receipts than purchase pages. I would normally expect a purchase page to have some visually emphasised call to action. I still genuinely feel a little guilty for contributing to any hysteria about this with my comment above though.

Well, it seems that almost everyone in this thread and on Reddit is wrong on this one.

Exactly, I don't see what the big issue is here. It is not the buyer's info and it is not specific to a single transaction. It is the equivalent of doing this search for PayPal: http://www.google.com/search?q=https%3A%2F%2Fwww.paypal.com%... You can find lots of freely available e-mail addresses and/or information by doing so.

Similarly, search for "Tube Ace" in the dataset. Not something you want your personal e-mail address forever associated with.

"Their reputation is nuked."

If someone was looking to leave coinbase, who else provides a similar service?

IMO, all Wallet web-apps are doomed to failure.

The POINT of bitcoin is to be decentralized. Just carry the bitcoin key in your phone directly. https://play.google.com/store/apps/details?id=de.schildbach....

Why do people think they need a web app for this sort of thing?

Because since iPhone took over the world, and people realized that they could get around the GPL with SaaS, it makes a certain type of developer/businessman physically ill to imagine someone loading a local file with a local program.

Due to the success of that no-install, no-config business model, a certain type of consumer can't understand files or folders, or understand why you would have something on your own computer, other than to break syncing.

Also it makes it easier to dabble?

Long ago, I put together some early release tools/plugins for shopping carts to connect directly to a bitcoind RPC JSON instance. No one could figure out really simple issues, like port forwarding on their routers, setting up bitcoind with authentication and allowing connections from specific IPs rather than %.%.%.%, or even the idea of hosting the web service for their "business".
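For anyone who hasn't seen this kind of setup, here is a minimal Python sketch of how a shopping-cart plugin might prepare an authenticated JSON-RPC call to a bitcoind instance. The endpoint, credentials, and the helper name `build_rpc_request` are placeholders invented for illustration, not code from the tools mentioned above; `getblockcount` is a standard bitcoind RPC method. The request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_rpc_request(url, user, password, method, params=None):
    """Build (but do not send) an HTTP Basic-authenticated JSON-RPC request."""
    payload = json.dumps({
        "jsonrpc": "1.0",
        "id": "cart-plugin",
        "method": method,
        "params": params or [],
    }).encode("utf-8")
    # HTTP Basic auth: base64 of "user:password"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder endpoint and credentials (assumptions for illustration only).
request = build_rpc_request(
    "http://127.0.0.1:8332/", "rpcuser_example", "rpcpassword_example", "getblockcount"
)
# Sending it would be: urllib.request.urlopen(request).read()
```

On the bitcoind side, the matching pieces would be the `rpcuser`, `rpcpassword`, and `rpcallowip` settings in bitcoin.conf, with `rpcallowip` kept as narrow as possible.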

It gained little to no interest.

Very few people really care about decentralised, and few people really care about privacy. And bitcoin wallet apps live on.

I think this is a common misconception about bitcoin. I see no reason why there shouldn't eventually be voluntary third party financial services for bitcoin. There could be bitcoin credit agencies, bitcoin small business loans, bitcoin savings accounts, etc. All these services have value, even if they technically carry a higher risk than storing all your bitcoins yourself (provided that you are knowledgeable and competent enough to do so safely).

Correct me if I'm wrong, but didn't most financial services exist with other currencies long before there was government backing (e.g. FDIC for savings accounts)? It's still really early in the bitcoin game, and I wouldn't be surprised if many of the bitcoin services around today are shady or incompetent, but it takes a long time for reputation-based industries to get up and running.

They have value for sure. But how are they going to make money?

In an inflationary fiat currency (i.e., USD), banks that hold on to your savings loan them out and make money. However, given the economics of bitcoins, it seems unlikely that anyone could make a profit by loaning out BTC right now (possibly ever, due to the deflationary nature of BTC).

Bitcoin financial services are going to look very different from your typical fiat-based banks. Not only because of Bitcoin's unique benefits and challenges... but because its economics are vastly different from other currencies.

Bitcoin Wallet Services will gain a reputation, but will then fail to earn a profit. They make no business sense.

Why isn't there a version for iPhone? This looks really useful. There's probably an opportunity there.

I didn't realize Apple was full of jerks.


About a week later, someone from Apple called me on the phone to let me know that BitPak had been removed from the App Store. The guy on the phone sounded like a nervous teenager. I asked him why this had happened, and he said “Because that Bitcoin thing is not legal in all jurisdictions for which BitPak is for sale”. I inquired as to which jurisdictions Bitcoin was deemed to be illegal in, and he told me “that is up to you to figure out”. I asked him which laws Bitcoin violated, and again, he replied that “that is up to you to determine”. I told the kid on the phone that he has in fact told me nothing and was most unhelpful.

Putting bitcoins on your phone is very insecure. If your phone is stolen/lost you've lost your money!

>Why do people think they need a web app for this sort of thing?

They're faster, easier to use, don't require installation and don't require a lot of hard-drive space. 8 GiB database is a bit too much to be wasted on my small SSD, when I can use blockchain.info's wallet.

Bitcoin Wallet stores only block headers, using the simplified payment verification (SPV) scheme from Satoshi's paper. It only needs ~1MB of storage right now.

Don't store your private keys on your phone. That's only marginally better than a web wallet. Encrypt them and put them on a USB stick or in the cloud.

If you want to spend bitcoins from your phone, then the private key must be on the phone somehow.

If you're downloading it "from the cloud" every time, your compromised phone can keylog your password, and the attacker can then download your private key "from the cloud" himself.

Sure, not using your bitcoins anywhere is extremely secure. But let's be honest here: you need to move that private key around with you. The true security measure is to have two wallets, two private keys. Transfer money over to the wallet on your phone only for temporary measures. If it looks like you have had any funky transactions, get a new phone, and a new wallet.

Your "offline wallet" remains secure and encrypted at home. But you need to put some money on your phone somehow. The most secure way of doing that is just leaving it on the phone, and making sure that private key never touches the internet.

> one of those synthetic drugs of ambiguous legality

Out of curiosity, do you know its chemical name?

Amateur hour over there. I originally signed up because they were a YC backed startup. Thankfully I never got around to doing any actual transactions.

YC is an investment fund and business networking program. It provides no technical base.

Which is a bit silly, since (1) almost every YC-backed company uses the same technical infrastructure as many others, and (2) Paul Graham made his fortune building a self-service platform to build ecommerce websites just like YC business websites.

Free YC startup idea: Build an ecommerce platform and technical management for hosting startups. There's no reason YC should be promoting this legend of "CS whiz kids" as "technical founders". Just set up a solid ecommerce platform, and take on YC founders with business ideas to run their business on the YC stack.

The result will be much better websites, and a bunch of high-paying jobs for the engineers who can build quality sites. YC can be the next Yahoo Stores. Heck, Paul Graham can probably buy Yahoo Stores division that bought ViaWeb in the first place.

Oh my god, someone found merchant pages offering stuff for sale.

They shouldn't be indexed, but on the 1-10 scale of security vulnerabilities, this is about a 1.05.

OTOH finding it is not very far off what Weev got 3.5 years in federal prison for, though, under CFAA.

I don't understand why there's a big issue with being indexed. In fact, they should be indexed if they remove the e-mail address from the page. Isn't the goal of these pages to sell more? Don't people use search engines to find stuff to buy?

The whole thing is just a big misunderstanding.

I think people are right to be on a "hair trigger" w.r.t. vulnerabilities at wallet providers, given the atrocious track record of almost everyone involved in the bitcoin industry for security.

OTOH, Coinbase and Coinlab (the new Mt. Gox) are the entities I'd trust the most not to be outright fraudulent, since they're venture funded. The founders stand to gain far more by being honest than running off with BTC, and the reputation of investors (including YC) would be harmed far more by fraud, so the only real risk is outside compromise, employee compromise, etc.

Coinbase has done a better job on security than any other BTC entity I've seen (although I've looked at them more closely than all but a few other providers).

These are buy it now / donation pages. These are NOT checkout pages for coinbase users.

Why are the checkout pages even public? No robots.txt, a lot of private information listed and public.

Shameful. I know little about web development but this seems rather obvious, even to me.


Phone no., names, addresses, e-mails, etc. all out. This is indeed pretty bad. A lot of people I know who use BTC use it foremost for privacy reasons, it is tremendously ironic how this has worked out.

If you are using BTC for privacy, then using a third party hosted wallet is not a very good plan.

Presumably you can take your bitcoins out whenever you want if you need to use them anonymously.

This is seller information to tell you who you are paying. There is no data leak here.

Using a robots.txt-file to hide data that shouldn't be public in the first place is a rather bad idea. Because the robots.txt itself is public, it actually highlights the location of the "private" data.

And, Google had to find a link to this content somehow, which means it's publicly accessible from some Coinbase page.

That is what I am wondering. Here I am building a web site and wondering how to make Google index sites behind authentication and such, and here is a site that has got everything indexed.

After reading Coinbase's response, it seems that the pages were not linked to by Coinbase, but by users posting links to their checkout page on the Internet.

What's a good alternative to robots.txt?

Proper access control on your web site.

Simply not showing the transaction if you're not logged in and not the user belonging to the transaction?
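As a rough illustration of that kind of check, for pages that really are private (unlike the public merchant checkout pages discussed in this thread), a handler could compare the logged-in user with the record's owner before rendering anything. This is only a sketch: `Transaction`, `can_view`, and `status_for` are hypothetical names, not anything from Coinbase's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    id: str
    owner_id: str


def can_view(transaction: Transaction, current_user_id: Optional[str]) -> bool:
    """Allow access only to a logged-in user who owns the transaction."""
    if current_user_id is None:  # not logged in
        return False
    return current_user_id == transaction.owner_id


def status_for(transaction: Transaction, current_user_id: Optional[str]) -> int:
    """Return the HTTP status a handler might send: 200 for the owner, else 404."""
    return 200 if can_view(transaction, current_user_id) else 404


tx = Transaction(id="tx-1", owner_id="alice")
print(status_for(tx, "alice"), status_for(tx, "bob"), status_for(tx, None))
```

Returning 404 rather than 403 for everyone else is a design choice in this sketch: it avoids confirming that the transaction exists at all.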

You can also specify a meta robots tag inside the page HTML. If you want to block a lot of pages, your best bet would be to add it to your master layout or template. You get the same effect of blocking on robots.txt but without exposing a list of blocked pages.

The downside is that Google will still crawl the page and use your bandwidth, but the page won't be indexed.
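A minimal sketch of the template approach, assuming the layout has a plain `<head>` tag; the helper name `add_noindex` and the sample page are made up for illustration. The `<meta name="robots" content="noindex">` tag itself is the standard one.

```python
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def add_noindex(html: str) -> str:
    """Insert a robots noindex meta tag right after the opening <head> tag."""
    marker = "<head>"
    position = html.find(marker)
    if position == -1:
        return html  # no plain <head> tag found; leave the page unchanged
    insert_at = position + len(marker)
    return html[:insert_at] + NOINDEX_TAG + html[insert_at:]


page = "<html><head><title>Checkout</title></head><body>Pay here</body></html>"
print(add_noindex(page))
```

The same directive can also be sent as an `X-Robots-Tag: noindex` response header. Either way, the crawler has to be allowed to fetch the page in order to see the directive, so it should not be combined with a robots.txt block on the same URLs.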

I would suggest both these two: Check the user agent for bots and if it is a bot send a 404 header and exit before page needs to load. Also add a meta noindex just in case. Robots.txt DOES NOT prevent indexing, just crawling.

Blocking access to googlebot for those pages is the easiest. But robots.txt would work, if you just did

User-agent: *
Disallow: /checkouts/
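A rule like this is normally written with the standard `User-agent` and `Disallow` directives. As a quick sanity check, Python's standard-library `urllib.robotparser` can evaluate such a file against sample URLs. The example.com URLs below are placeholders, not real Coinbase paths.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /checkouts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch() returns True when the given user agent may crawl the URL.
checkout_ok = parser.can_fetch("Googlebot", "https://example.com/checkouts/abc123")
merchants_ok = parser.can_fetch("Googlebot", "https://example.com/merchants")
print(checkout_ok, merchants_ok)
```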

If the first 6 or so SERPs are representative of Bitcoin as a whole, it appears to be a currency that exists primarily to facilitate donations to blogs and websites. No wonder YC funded Coinbase; it let them take another whack at Tipjoy!

I'm fascinated at the data revealed by leaks.

I really hope someone is scraping this to create some nice graphs and charts.

Damn.. These are not transactions! These are public anyway on the merchant's site.

Just try out https://coinbase.com/docs/merchant_tools/payment_pages and press the button. It goes to the checkout page similar to these.

20 reddit.com upvotes: 0.20BTC


So that's currently $1.34 per upvote. Seems like a lot.
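For reference, the arithmetic behind that figure, using only the numbers quoted above (20 upvotes for 0.20 BTC, $1.34 per upvote). The exchange rate is not stated in the comment, so the last value is just what those numbers imply.

```python
upvotes = 20
price_btc = 0.20        # listed price for the bundle, from the comment above
usd_per_upvote = 1.34   # figure quoted in the reply

btc_per_upvote = price_btc / upvotes
implied_usd_per_btc = usd_per_upvote / btc_per_upvote

print(f"{btc_per_upvote:.2f} BTC per upvote")
print(f"implied rate: ${implied_usd_per_btc:.0f} per BTC")
```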

It's not when the price of coins is 5 USD per coin...

Coinbase started in June 2012, the lowest it has been since then is $10USD.

I'm quite surprised that over an hour after this was posted, these checkout pages are STILL public!

If I were running Coinbase I'd have put the site into some kind of 'down for maintenance' state immediately, and then put all my effort into plugging the leak.

Of course the Google et al indexes are a more difficult problem, but at least stop any more from leaking.

Edit: It has been pointed out that these are seller pages, with sellers details only, so not a data leak at all. I retract my previous statement :)

That's because it's not a leak. These are checkout pages that sellers have chosen to make public.

If they turn off the checkout pages nobody can check out using their service. These are the SELLER's pages with the sellers info.

This is shamefully bad. There is no excuse for this.

I was going to add on to this, but I think that's all that needs to be said.

This is bad.

Especially more so if people used it to buy illegal drugs and now have their checkout info available in Google..

The cryptocurrency company that's never heard of cryptography. Bringing you your world in plain text.

I trusted coinbase to cashout two years worth of bitcoin paid to my online t-shirt business. First they ignored me for two weeks[0], then they promised the funds would be deposited yesterday. They're still not deposited today[1].

[0]: http://www.reddit.com/r/Bitcoin/comments/1bdd8p/iama_bitcoin...

[1]: http://i.imgur.com/fNoXvMH.png and http://i.imgur.com/brlY2Ry.png

Added a response on Reddit also: http://www.reddit.com/r/Bitcoin/comments/1bdd8p/iama_bitcoin...

Should now be resolved with all funds paid out - but the delayed response was definitely our fault as we ramp up support. Thank you for bearing with us!

(YC S12)

I wouldn't be surprised if that were edited out. It has happened in the past with other criticisms of coinbase.

*it has happened in the past with all sorts of submissions (both positive and negative) about all sorts of companies because it is part of the rules of HN.

Is the rule applied 100% of the time so that nobody will be able to find any exceptions? No. We're talking about humans here. But they are pretty consistent. Especially for stuff that hits the front page.

"But they are pretty consistent."

Discussed last time there was a front-page article: https://news.ycombinator.com/item?id=5428402

I pointed out that "Coinbase (YC S12) hires first engineer" http://news.ycombinator.com/item?id=5011361 didn't conform to the standards.

As I said, there are exceptions. Humans do this, not algorithms. It's thus understandable how you found a two point submission that nobody ever saw which was overlooked.

Searching google for coinbase checkouts: https://encrypted.google.com/search?q=site:https://coinbase....


Link to close your account:


Can somebody explain how Google was able to index all these checkout pages? Presumably they were only sent over email.

That's a damn good question. I'd also like to know why these pages are still public? It's like, PULL THE PLUG, THEN FIX THIS, THEN PLUG BACK IN.

Google uses the "completion" feature of Google Chrome to collect new URLs to scrape. If you have that on, they crawl after your visit.

Do you have any links that go into detail on this? I was intrigued by your comment and I want to read more about it but ironically I couldn't find anything on Google!

I can't find the paper I read on it either, but I can confirm that it happens anecdotally with Google, and oddly enough AIM Messenger. I've had URLs that have never had an inbound link, and magically GoogleBot rocks up when I show a Chrome user. I'll keep looking for it.

I'm also convinced they use Google Analytics to find new URLs. I've seen URLs that I only had in AJAX calls indexed before (and I fired events to GA on these URLs).

They have expressly denied that in the past. (Where "that" is "using user data for Google Analytics to expand the crawl set". They're also on the record as saying "no use of toolbar data.")

The more likely thing you are experiencing is Google reading your AJAX URLs, either by evaluating JS or by using heuristics. Google is known to do both of these, but a lot of HNers get surprised when I mention it, so FYI.

I once had a spammer hit one of my contact forms a few hundred times on a page set up to capture traffic from South Dakota. There was a corresponding goal set up in Google Analytics that triggered, and a week or so later the South Dakota page popped up as a sitelink in the SERPs. This certainly doesn't prove anything, but the page got essentially zero traffic, had no external inbound links, and wasn't weighted very heavily in terms of site architecture. Makes me wonder if there isn't some careful parsing of words in their claims. /removestinfoilhat

Are you implying that private URLs typed in the Chrome address bar might end up in the crawler queue?

People using Chrome with those settings enabled should probably read up on its privacy policy [1]. Its features such as, "use a web service to help resolve navigation errors", "use a prediction service to help complete searches and URLs" send data to the default search provider.

Also, these features are enabled by default.

[1] https://www.google.com/chrome/intl/en/privacy.html

These are checkout pages, not transactions. You put up a link to the checkout page if you want to sell something.

There's a Google chart on the page. It wouldn't surprise me if they collect the referrer address and index it.

I want to like coinbase, but 100% of the time that I try to buy bitcoins it says that it has run through its daily allotment and to try again in 24 hours.

EDIT: I wrote this before people suggested that this 'leak' is just a list of people selling stuff, and not people buying stuff. Oh well. I leave my comment here, mostly because of bath-salts-guy - selling a large quantity of stuff of dubious legality should probably be done more carefully.

Sorry to Coinbase people for jumping onto a pile-on before getting correct information.

---

Regular people are hopeless when it comes to privacy and anonymity. Just look at something simple like "Don't choose a ridiculously easy password", and then look at any leaked password list.

When users fail so hard at the trivial stuff (where we've had advice on best practice for years) how are they expected to succeed at tricky stuff like crypto currencies?

This lack of user knowledge makes any coinbase[1] failures particularly bad. It's bad because you're supposed to protect your users. It's also bad because it's a failed business opportunity - 'hand hold naive users through a complex crypto process' is an unfilled niche.

I was excited about Coinbase. I really wanted them to do well. But this? It's going to take some work to recover from this.

Please read past the headline. There is a lot of uneducated, sensationalist criticism going on. The data leak has exposed info that is already public and basically harmless. Granted, someone with enough time and effort could turn this public info into a seedy crime, like using the contact list for phishing, but the average Coinbase user is far removed from this so-called 'data leak.'

I considered posting this, but wasn't sure how the HN community would react. Glad someone else did. Here's something scary:




Oh dear.

He's the person selling, not the person buying. These pages are essentially merchant 'buy' buttons.

Well, someone using his email address sold / bought it; it's possible someone else did it. (Reputation attack; etc).

Is there a date attached to that? (To work out rough quantities). It seems like a significant quantity. While we might not agree with what law enforcement do about drug and borderline-legal substances we know that law enforcement does take vigorous action.

As do the US tax people. I guess his yearly audits suddenly got worse.

These threads are full of misinformation and knee-jerk reactions to a problem so minor that it is barely worth noticing. And the title ("Coinbase User Data Leak?") is misleading - where's the zealous title editing now? The Reddit thread is even worse. This is more revealing of the HN community than it is of Coinbase, who hasn't done anything wrong. Disgusting.

these are merchant pages, not actual transaction invoices. checkout - a poor choice of name of the resource for the job it intends to do ;]

While this is bad, I do feel like anyone using a hosted wallet like Coinbase where you directly link a bank account (which I think is rare among bitcoin exchanges, etc) can't be expecting full anonymity. Coinbase is a registered company and if they have bank accounts linked then their identities are compromised anyway if the company were to get subpoenaed, I'd imagine.

If they want anonymity they should be personally holding their own wallet. Most exchanges only allow some sort of cash order deposit, for this reason exactly.

The irony is that bitcoin is the perfect technology for preventing these kinds of data leaks, if only some more capable developers would start opening bitcoin businesses.

My Twitter bot, @dumpmon, found a leak of these here: http://pastebin.com/raw.php?i=b34a2X3b

This has been said again and again- the main thing that Coinbase needs to do right now is to get better about their communications with the public.

It's understandable that a fast-growing startup in a new field, doing transaction-based work, will hit some bumps along the way. But they need to keep the community in the loop better. Twitter, blog, posting to threads like this (they know HN exists!)

But...but...but it's a beta!!

Sorry, had to do it ;)

Haters gonna hate.

You know, when people were posting every blog rant about Haskell, at least a small bit of knowledge was being circulated. This bitcoin fad is dredging the bottom of the barrel -- dozens of upvoted comments written by people who can't tell the difference between an advertisement and a transaction confirmation.
