About 10 years ago I met the head of IT for B&H cameras in NYC. Among many things, he was in charge of the hosting for their online store. After he complained about dealing with physical servers, I asked him if he had ever considered using AWS EC2 for the website, and he replied that his boss refused because he believed that Amazon would pull data on B&H products and use it to compete more effectively.

I'm not sure that Amazon would be able to pierce the veil of the hypervisor like that, but his instincts were in the correct direction.




There is absolutely no veil between the hypervisor and the guest virtual machines. Not in the EBS either.

If they say they won't read your data, better trust them. If you don't, stay away from their datacenters.

EDIT: fix typo.


> There is absolutely no veil between the hypervisor and the guest virtual machines. Not in the EBS either.

This is 100% true. To do any useful computation on your data (read: what you're using AWS for), they have to have 100% visibility into your data.

> If they say they won't read your data, better trust them. If you don't, stay away from their datacenters.

That's it, right there. All of this is based on Trust in Amazon, not some technology that provides any assurances, much less proof, they're not looking at your data.

They can pull the curtain off anything you're running in their cloud, at any time they feel like it. It has to work this way for AWS to be of any use, and by using AWS you're implicitly trusting Amazon with your data.


This is a similar level of trust that you give to banks not to seize your money, or to your bodyguard not to do you physical harm. Stealing data from a customer paying for hosting would be _very_ different, and much more scandalous, than identifying trends on a competitive marketplace and taking advantage of them by launching competing products.


If a bank were to seize your money, you'd notice, because you wouldn't have that money anymore. And it would be very well documented, leaving a clear paper trail to a criminal conviction and a civil suit. If your bodyguard did you physical harm, you'd notice, because your knees would hurt. And there would be ample evidence for a criminal case. If Amazon copied all your proprietary data, you would almost certainly never notice, no criminal law would apply, and you'd have a hell of a time proving it in a civil suit.

It's the difference between breaking into a Walmart with a ski mask and assault rifle and stealing a bunch of Blu-rays vs recording the HDMI out from whatever device you stream Netflix from. They're not the same thing at all, either in terms of harm done, applicable criminal law, or ability to build a compelling civil lawsuit.


> If Amazon copied all your proprietary data, you would almost certainly never notice, no criminal law would apply, and you'd have a hell of a time proving it in a civil suit.

For a thought exercise, let's play this out.

Amazon copies data running through VMs (or grabs it from storage).

Let's assume it isn't on hardware certified for capital-letter processing [1], most of which require regular third party audits.

So they have your illegally-obtained data [2], which presumably they want to use to make money.

Except they can't leave any record of its source, in any documented form. This includes server logs, data transfers, emails about data, meeting minutes about data.

So they create some isolated network, run by a third party contractor, that transfers encrypted data from the taps to a store, then decrypts. All of which brings us to the most difficult part.

Who does... what with it?

The source data itself is radioactive. Who knows when "pricing strategy for company X" or obvious equivalent might pop up in the stream?

So you... what? Exclusively touch it via algorithm that outputs only aggregate information? How do you possibly code and maintain that pipeline, sight unseen?

All while risking an incredibly profitable business.

Or, you know, you just operate as an honest IaaS provider and make $10B in revenue / quarter with a 25% growth rate...

[1] https://aws.amazon.com/compliance/programs/

[2] https://www.law.cornell.edu/uscode/text/18/2511 (?)


Sometimes you can learn a lot from the metadata without actually looking into the actual data stream. For example, if B&H was hosted on AWS, Amazon could deduce the effectiveness of their holiday sale tactics by looking at the overall page traffic, DB writes, etc. These metrics are already recorded by Amazon for billing purposes and someone stealing a glance at them would likely leave zero evidence.
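To make the metadata point concrete, here is a minimal sketch of how little is needed. It assumes nothing more than daily request counts of the kind a host would already keep for billing; the numbers and the `traffic_lift` helper are hypothetical and made up for illustration, not anything Amazon is known to run.

```python
from statistics import mean


def traffic_lift(baseline_counts, sale_counts):
    """Relative change in mean request volume during a sale vs. a baseline window."""
    base = mean(baseline_counts)
    if base == 0:
        raise ValueError("baseline traffic must be non-zero")
    return mean(sale_counts) / base - 1.0


# Hypothetical daily request counts (in thousands) for one hosted store.
baseline_week = [100, 110, 90, 100]
sale_week = [150, 160, 140, 150]

lift = traffic_lift(baseline_week, sale_week)
print(f"traffic lift during sale: {lift:.0%}")  # prints "traffic lift during sale: 50%"
```

No payload is inspected here, only aggregate counts, which is why this kind of inference is hard to detect or rule out.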


I thought Amazon was already organizationally constructed in very small functional units, each of which is encouraged to export its "interface" in a formal way. Is the source data traceable if it becomes anonymized product sales samples exported to APIs that mix into a pile of legit data, all fed into some sales analysis engine?

The unit could be the "open sales modeling unit" that just supplies one data feed among thousands.


You overestimate the competency of short-sighted individuals anxiously striving for a seat closer to Bezos, i.e. thinking for themselves versus the organization. Ironically, I wonder if such news actually motivates some PMs to ask around...


> Except they can't leave any record of its source, in any documented form. This includes server logs, data transfers, emails about data, meeting minutes about data.

They can certainly take the risk. If crimes only happened when there was a 0% chance of getting caught there would be no crime.


You're right that they are different, but maybe not as different as you think they are.

> If Amazon copied all your proprietary data, you would almost certainly never notice, no criminal law would apply, and you'd have a hell of a time proving it in a civil suit.

If Amazon were doing this and profiting from it, that would essentially be a criminal conspiracy that reaches to the leadership of the company. Is it possible? Sure. Is it likely? I tend to think conspiracy theories are rarely true. Would it be caught? I believe it would likely be caught.

Companies get things done by having meetings, informing their hierarchy, and following executive decisions. In what meeting do you imagine this being discussed? Who floats this idea, and who signs off on it? I just don't see it happening. And if it does, I expect whistleblowers to put a stop to it.


Criminal conspiracies by corporate execs are not at all uncommon in the history of business, and presuming that you can't possibly run into one because you personally haven't is taking an unnecessary risk. One thing due diligence is supposed to look for is criminal behavior. This is not because they never find it.


> Criminal conspiracies by corporate execs are not at all uncommon in the history of business

Actually, they are quite uncommon, which is why they make headlines when discovered.

I'm not taking a side here, just pointing out a fallacy.


This is even more fallacious: the only thing that unsourced opinion proves is that certain types of criminal conspiracies that are uncovered are deemed sensational enough to sell news services. It says nothing about how common successfully covert conspiracies are, nor about the frequency of uncovered ones that are hard for the general public to understand or care about.


It is time to shut the computer, take a deep breath and see if you can do a long walk outside.

Bezos is making the most money of anyone living. Many of the scandals happen when the founder is retired or dead.


Let's see...

Boeing:

See 737 MAX, and other 737 boondoggles like the vertical stabilizer reversal back in '94-ish.

Monsanto, hell, what chemical company hasn't hidden information they damn well shouldn't have:

Dicamba, Roundup... Take your pick. The stellar behavior of this corporate citizen cements the stereotype of an entire industry.

https://thecounter.org/dicamba-trial-monsanto-basf-pesticide...

https://www.phillyvoice.com/new-york-times-dupont-hid-decade...

Special mention goes to a certain German pharma company who brought you Thalidomide:

https://en.m.wikipedia.org/wiki/Gr%C3%BCnenthal_GmbH

The lovely folks at Insys:

https://www.nytimes.com/2019/05/02/health/insys-trial-verdic...

Believe there was a fraudulent implant thing a bit ago... Where'd I put that?

https://www.desertsun.com/story/news/health/2014/07/09/south...

Someone beat me to Dieselgate.

Arthur Andersen LLP.

PG&E deliberately skimped on maintenance, leading to fires in California, and if I recall natural gas lines overpressuring in Massachusetts?

https://en.wikipedia.org/wiki/Massachusetts_gas_explosions

Excuse me, the natural gas one was Columbia Gas.

Big Tobacco...

Nestle, I think, got caught using child labor in their supply chain at one point.

https://www.theguardian.com/global-development-professionals...

Oh what else can I think of off the top of my head? Uhhh...

That's all I can think of for right now. I mean we can hit the history books or case law to get a solid count I suppose, but to be frank, once a company hits a certain revenue point, it is pretty much guaranteed they've had to do something to get dirty/avoid getting outed as dirty.

So it really isn't that unusual. Throw in stuff that happened back before the rise of the unions in the last century, and since their decline, and you also end up with some decent stories of workforce abuse. Though admittedly there's slant depending on who is telling it.

Like the Pinkertons as a matter of fact.

https://en.wikipedia.org/wiki/Pinkerton_%28detective_agency%...

Or the original incarnation of Equifax, who were tasked with vetting prospective executive promotees.

Just because it's organized doesn't mean it's doing anyone any favors.


Several dozen companies doing bad things, compared with the hundreds of thousands of companies operating in the United States.

I stand by my statement -- it is rare.


Rare, that you ever hear about it.

I know of a case of fraud in oil well lease payouts: someone was stealing a small amount from a large number of leases and had been doing so for years.

A company auditor caught it. Did they go to the police? No. They paid the guy to leave the company and never talk about it again. The guy might have stolen hundreds of thousands in the process, but the company knew they'd lose millions, just from clients demanding audits going decades back. It was easier and cheaper to cover up and never mention again.


Much like typical crimes, only a subsection of company malpractice comes into public view. There's a few major scandals per year from the largest and most publicly known companies.

The very least we can say is that company malpractice is more common than it appears, unless 100% of it is reported on.


You misunderstand. That is just what I keep in my head and have been accurately tracking and committing to memory over the last 5 or so years.

As has been mentioned as well is that governmental/regulatory apparata are typically starved of funding, so must limit their investigation/scrutiny to likely the most obvious cases.

Furthermore, if you've just entered into white collar circles these last few years, you may have been surprised at a tendency to not write things down. This isn't just people not realizing it is a good idea to do so, but a conscious decision in many cases due to eDiscovery, and the effects it has on provability in a court of law.

Pay attention on HN, and you'll get little snippets of other cases of "tribal skeletons" every now and again.

Anyway, by all means, I'm not necessarily arguing against your point; merely stating that given the sample size, and keeping in mind that regulators/the media can only dig up so much muck given limited manpower, it is not prudent to assume there isn't wrongdoing where no one has looked yet. I used to hold the same view you espouse; then I A) started cataloging things and B) noticed how often settlements seem to be applied with no admission of wrongdoing.

Absence of evidence does not imply evidence of the non-existence thereof. You just haven't found it yet.

Can't believe I forgot about Wells Fargo, btw. That whole mess.

https://en.wikipedia.org/wiki/Wells_Fargo_account_fraud_scan...

ISPs have been known to falsify their Form 477 data, fabricating coverage stats, and to overcharge customers:

https://www.cbsnews.com/news/complaints-att-directv-bundled-...

https://www.ripoffreport.com/reports/verizon-wireless/nation...

There's plenty more where that came from with every ISP to be honest.

FTC keeps stats on all enforcement actions apparently. Might be a decent place to start looking to get some solid numbers.

https://www.ftc.gov/enforcement/cases-proceedings/

Mind that that's only the ones the FTC pursued. I assume CFPB and other commissions have similar, but do keep in mind they can't be everywhere or investigate everyone. So without stats on how many actions are dropped by prosecutorial/investigator's discretion, it is actually difficult to make really solid claims as to the actual frequency of malfeasance. Further, from my social circle's anecdata, it seems to be a safe bet that just about every organization at least has something in the way of "muck they've cleaned up after" without getting authorities involved.

Anyway... I've rambled enough.


I used to think a lot more like you and then Dieselgate[0] happened.

[0] https://en.wikipedia.org/wiki/Volkswagen_emissions_scandal


As the sibling comment's author mentioned, look at Dieselgate. It was a huge conspiracy against emissions regulations, and they pulled it off relatively well for multiple years. And it’s not like it was a simple software hack. This solution required manufacturing additional special-purpose devices, adjusting the assembly line, engineering, and so on. It definitely must have gone through design stages, testing, and actual implementation.

The main thing here is that in big corps you can divide a big (evil) task into smaller steps, each of which could be defined as non-evil in isolation, and nobody among the people doing the actual implementation would understand the big picture.


> breaking into a Walmart with a ski mask and assault rifle and stealing a bunch of blu rays vs recording

I'm ready to watch that movie


For me it's not at a similar level.

For one, banks are far more regulated than Amazon is. If governments funded departments with 10s or 100s of thousands of employees monitoring and regulating cloud computing services, then it might be similar.

But the most significant difference is that if the bank seizes my money, I'll know about it pretty quickly and can respond. If Amazon sniffs through my commercial data, I'm unlikely to ever know. Most people are far more tempted to do wrong if they know the chances of getting caught are minuscule.


Banks mightn't seize your money. They certainly take the data from your bank accounts and monetize/resell it. This is a dirty secret, and pervasive.

How else do you think "closed-loop" measurement of marketing effectiveness, and retargeting based on purchase behavior are done? How else do you think suppliers can pull a D&B report on your company showing your bank account balances?


Banks definitely seize your money. When I was a young teenager my parents encouraged me to put my lawn mowing money in a bank account. I had a total of $100.00! We went over to Bank of America and I opened up an account and deposited my hard earned cash. A month or two later I tried to withdraw some cash and was told I had no money. My full $100.00 had been consumed by insufficient balance fees.

A valuable if painful lesson to learn. I still do all my personal banking with a credit union and consider my relationship with banks to be adversarial. They only own my debt, never my cash.


> A month or two later I tried to withdraw some cash and was told I had no money. My full $100.00 had been consumed by insufficient balance fees.

Is that an exaggeration? It amounts to $100 or $50 a month in "low balance fee"!

All the banks I've looked at had a fee under $10.


I think it may have been more than a couple of months, IIRC the fee was $20.00. This was a very long time ago.


This is like saying Amazon seizes your money because you have to pay for their monthly service fees you agreed to when signing up for the service.


Actually it's not _that_ uncommon for guards to be involved in the business of breaking into high-profit buildings. At least in countries with partially undermined police/law systems. Which sadly applies to most countries of the world, even first world countries where people normally don't think about it.


Or your commercial landlord to not send the cleaning staff to rummage around in your filing cabinets. Which, while it could happen, is something that people don't really seem to get concerned about.


I have been chastised for not locking my desk for this exact concern. It does happen


Don’t let anyone chastise you for this. Most desk locks are easy to pick. Also, there are like 3 keys to have on your keychain to open like 80% of all manufactured locks like the ones in furniture. Deviant Ollam, a pen tester, gives a lot of talks on this topic.


I pick my battles; I’m not going to complain about a policy unless I think it could really hurt people. If I complained about everything I think is dumb, I’d never be able to keep a job, because most of it seems dumb to me.


So true, so often.


No, the reason for these types of structures is simply to prevent passive leaks of information, which are a far more common occurrence. Any large business is frequently visited by vendors and agencies who also work with others in the industry.

Similarly, if you're presenting externally, it's a good idea to close open applications that are not relevant to prevent info leaks from Alt-Tabbing.

Actually having a competitor pay someone to come into your office to pick locks etc. is rare, comes with criminal liability and is easily detectable on security cameras.


Most "crimes" of this sort would be stopped by simply locking the drawer. Nobody believes that a simple desk lock would keep out a determined attacker.


Not really wrong, actually. A friend picked a desk lock for another when they left their charger in there.


> This is a similar level of trust that you give to banks not to seize your money, or to your bodyguard not to do you physical harm.

That's not true. I surely don't trust banks, but at least they're regulated to the point that they have to come up with some legal pretense for seizing my funds. A bodyguard is ostensibly a person who I've incentivized more than the competition to not harm me, and who I probably form a relationship with over time. None of these things are true of Amazon.

> Stealing data from a customer paying for hosting would be _very_ different, and much more scandalous, than identifying trends on a competitive marketplace and taking advantage of them by launching competing products.

What part of using data that you have on your competitors but they don't have on you, to sell competing products on a platform where you don't have to pay fees but they do, sounds like a competitive marketplace?


I don't disagree with any of what you've said. I just think that many people are ignorant of that being the case with Amazon, Facebook, Google, etc because they assume 'Well Technology must have solved that'.

Then again, compared to the average bear, maybe I'm unusually circumspect when it comes to all of those things.


Technology alone cannot solve the use of technology to promote interests of parties in a zero sum game.


The promise of homomorphic encryption is to allow cloud computing without giving your data away.


Without giving what data away, exactly?

If, for example, I'm fully on AWS for everything (DNS/DB/web), then no matter how encrypted my data is, Amazon still has a very good idea of the effectiveness of my campaign. You can't hide the number of DNS queries. You can't hide the number of TCP SYNs. Hell, there is just a huge amount of things that encryption does not cover up, especially involving time for particular transactions to occur.


Don’t be obtuse. Observing some encrypted traffic going in and out gives away some info, but it’s nothing like the email addresses, addresses, names, and order history of all of your customers.

Amazon, if they wanted, could read stats from Netflix’s database about which movies drive the most engagement and use that to determine what to license for Prime video.

It’s the difference between root on the server and capturing encrypted packets on a network.


>This is a similar level of trust that you give to banks not to seize your money

How many PayPal horror stories have there been?


Bad analogy, I can tell when the bank seizes my money.


How about snooping the traffic through a load balancer service managed by AWS? That's exactly 'identifying trends on a competitive marketplace and taking advantage of them by launching competing product', except that instead of looking at sales data of products on your shelves, you look at URL access patterns for sites hosted on your platform.


For about half a billion they will build you an aws on site(s) you control: https://cloudcheckr.com/cloud-security/understanding-aws-gov...


If AWS used Intel SGX, then it would be possible for them to offer VMs that ran inside of a secure enclave that AWS could not peer into as long as Intel didn't give them a backdoor.

(Well, it seems like SGX is insecure right now with all of the CPU vulnerabilities, but in principle it may be fixed in a future generation and be well-suited for this.)

The fact that you wouldn't have to trust your host specifically could have a real decentralizing effect for cloud hosting: people would be able to run stuff on any cloud host without needing to trust them much. If you just wanted compute power and didn't care about strong uptime/connectivity, you could even safely rent cheap VMs on computers of random individuals.


SGX has no syscalls. You cannot run VMs or any regular application in SGX.

AMD SEV, on the other hand, is exactly that.


> To do any useful computation on your data (read: what you're using AWS for), they have to have 100% visibility into your data.

This is true, but it doesn't have to be this way [1].

[1] https://en.wikipedia.org/wiki/Homomorphic_encryption
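For anyone who hasn't seen the idea in action, below is a toy sketch of computing on data the host can't read. Big caveats: it uses the Paillier scheme, which is only additively homomorphic (it is not one of the fully homomorphic schemes), the hard-coded primes are tiny and completely insecure, and all function names are my own illustrative choices rather than any library's API.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Toy Paillier key generation. The default primes are far too small for real use."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n (with g = n + 1)."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c using the private key (lam, mu)."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """What the untrusted host runs: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


n, priv = keygen()
total = add_encrypted(n, encrypt(n, 1200), encrypt(n, 34))
print(decrypt(n, priv, total))  # prints 1234
```

The host only ever sees `n` and the ciphertexts; the sum is readable only by whoever holds `priv`. Supporting arbitrary computation, not just addition, is where the heavy overhead comes in.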


I'm aware, but thanks for posting nevertheless. I've actually read Gentry's thesis. Last I looked into FHE though it was something like 14 times to 100 times as inefficient (either in time or space depending on the scheme) as operating on unencrypted data.

Now things may have changed since then, but I'd imagine it's not yet gotten down to 1.X inefficiency multiplier regardless of the FHE scheme you're using.


That would increase their computation costs by a fair bit: it would be more expensive to run the same amount of computation on their cloud using fully homomorphic encryption, even without taking the engineering costs on your side into account.


Which is why a company operating cloud computing should just do that and nothing else. (And a company producing phones should also just do that and not start competing on the app market, etc.)


Well it's a b2b business and YOU could vote with your wallet to take your business elsewhere.

As an aside, Amazon competitors like Walmart typically require their suppliers to host data on a platform other than AWS if they want access.


Except huge tech companies are incentivized to vertically integrate into selling their extra datacenter capacity. From there it's just an economies of scale game that a pure cloud company will have difficulty keeping up with (not saying it's impossible).


It is things like this that will make people lose trust in Amazon. If they start reading data, it is bye-bye AWS for Amazon.


Throwaway account for obvious reasons.

In the past, AWS has used the data from third party hosted services on AWS to build a similar service and in fact start poaching their customers.

Source: I used to be at AWS and know the PM & his manager who built a service this way. I was hired on that team.



As for talking to journalists, I didn't leave with any ill will and don't want to complicate my life. I personally know a friend who got involved with journalists... his past employer came to know about it, sued him... and he became almost unemployable in the valley.

Edit: fixed a typo


> ... his past employer came to know about it, sued him... and he became almost unemployable in the valley.

You might have a family to protect. A home to maintain, etc. I understand. It's scary. But the world doesn't and cannot change for the better if we let corporations bully us into silence. The world will and does change when brave individuals, with the support of society, stand up and blow the whistle.


Lol exactly what support will you be providing? Will you contribute to paying this person's salary for the next 10 years? I wish people would quit with the empty platitudes and the rhetoric.


Your comment reminds me of the kind seen from Russian bots. Everything about what you said matches the algos I've seen. Very interesting.

But yes, I would be happy to contribute to a support fund to support such individuals.


lol yea right now people that call you out on your meaningless blather are russian bots haha

>But yes, I would be happy to contribute to a support fund to support such individuals.

cool you can start by donating to absolutely any charity in need right now.


You should consider going to Project Veritas. They stand by their sources. Maybe it's not as likely to be published if you don't have video footage to back up your story, I don't know... but still it's worth trying. I understand wanting to play it safe though...


Have you ever been tempted to tell people (journalists, government) about it? If not, are there any particular reasons?


Cynical curiosity: did the services they poached from have such abysmal UX/DX or is that an Amazon touch?


They poached because it was a lucrative business. AWS’ sales pitch was that their service was so much better integrated with the underlying infra. Also, they priced out the competition and offered generous discount for bundling with additional AWS services.


Funny (or sad)... I described those very tactics in When AWS, Azure, or GCP Become the Competition: https://www.gkogan.co/blog/big-cloud/


This amazes me. It's so much easier at that scale to deal with 6-10 boxes vs all the crap that comes with AWS. Don't want to deal with managing them? There are companies that will do it for you, and you will have actual on-call people who are accountable to you. Unless you are doing 6+ figures a month in AWS spend, have fun trying to get the same level of service.


I’m a big believer in use cases that fit on-prem solutions like that. But you’re dreaming if you think a 6-10 box operation is going to come close to the same service levels as AWS, and if you want to replicate the developer experiences that you can achieve on AWS, you’re going to have to devote a lot of resource to it. Whether scaling works well on-prem depends entirely on your scaling requirements. If you have bursty loads, or sudden increases in utilization, then scaling is going to be painful, because it will require hardware procurement, which is a slow process. There’s situations where it makes sense, but there’s way more factors than you’re considering in this comment, and you’ve completely misrepresented what the trade offs are.


Not only am I not dreaming, I've been running workloads like that for a long time, both in cloud and colo. If it's my money, sanity, or ass on the line, colo is my strong preference. I definitely do not want to replicate the developer experience of AWS in colo, as AWS has a ton more moving pieces which are black boxes and have arbitrary limits. Scaling is a disingenuous point for most e-commerce apps, as you generally have an RDBMS that does not scale horizontally, so cloud or no cloud your bottleneck is the same. The price point at which, say, Spanner would outperform a cluster of RDBMS on high-end boxes is way south of 100K/month, and no off-the-shelf e-commerce software would support it anyway.


Say you completely ignore scaling. The two things you simply cannot replicate at that scale are redundancy and operational resource. AWS has their entire operations team working at all hours of the day and night supporting their infrastructure. They also offer some of the most highly redundant services in the world. There is simply no way you could ever dream of replicating those service levels with such a small operation, and if you were to even attempt it, it would require an absurd level of over provisioning. As I said, you’re completely misrepresenting what the actual trade offs are, and there’s no possible way your claims about replicating AWS service levels are even remotely plausible.


> AWS has their entire operations team working at all hours of the day and night supporting their infrastructure. They also offer some of the most highly redundant services in the world.

And yet, a couple times a year perhaps, we have discussions right here on HN about the latest AWS outage that took down half the Internet.


No group is infallible. If I thought about a world where cloud providers didn't exist (AWS or otherwise), where every company had to build and maintain all of their infrastructure themselves, and had to make a guess, I'd wager the combined occurrences of issues around availability, durability, etc. would far outpace what we have had.

That's not even considering the potential impact to software development and innovation that we get with commodity cloud services. This is hand-wavy of course but I'd stick to it.


AWS haven’t had a major incident since 2017. Their last one before that was in 2015. If you’re not counting managed services, it was 2014.


The May 31, 2018 outage, no?


You mean the incident where a small percentage of EC2 instances were unavailable for 30 minutes in a single AZ in US East 1? I see your definition of major incident is pretty loose. I remember that incident. I had services running there. It was so minor that my auto-scaling picked it up and my service impact was nothing.


AWS has amazing marketing, but the truth of the matter is that an AWS region has worse downtime than a single top-tier DC, mainly due to the nightmarish complexity of their control layer. They have had outages that lasted many hours in a row, multiple times. You need to carefully separate marketing claims from operational reality and actual track record. When US East has major issues there is not enough spare capacity to spin up everything that was running there in other regions.


with high availability you don’t wait for an outage to spin up new resources, at that point it’s too late. it’s by definition not highly available and if you build infrastructure this way then you can’t blame AWS for an outage


So say you have a 3-region deployment: are you saying that you are running 50% more instances than you need in the 2 regions that are not US East, to make sure you will have capacity when US East goes down :) ? I somehow seriously doubt that.
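For reference, the 50% figure falls straight out of the arithmetic. A minimal sketch, assuming load is split evenly across regions and that the surviving regions must absorb all of it (`headroom_needed` is just an illustrative helper name):

```python
from fractions import Fraction


def headroom_needed(regions, tolerated_failures=1):
    """Extra capacity each region must carry, as a fraction of its normal share,
    so that the survivors can absorb the full load after `tolerated_failures`
    regions are lost. Assumes load is evenly split across regions."""
    if not 0 <= tolerated_failures < regions:
        raise ValueError("need at least one surviving region")
    survivors = regions - tolerated_failures
    return Fraction(regions, survivors) - 1


print(headroom_needed(3))  # prints 1/2, i.e. 50% more instances per region
print(headroom_needed(4))  # prints 1/3
print(headroom_needed(2))  # prints 1, i.e. double capacity in each region
```

The overhead shrinks as the region count grows, which is the usual argument for spreading wider rather than fatter.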


so you’re suggesting entire regions go down at once or one of the AZs? An entire region doesn’t go down. So again, you are not building for high availability.


US East as a whole went down several times plus a number of incidents when S3 or other critical services went down for the whole region.


i don’t remember seeing us-east go down in its entirety. show me supporting evidence or this is FUD. multiple DCs, physically separated and on different flood plains, make up a region. it’s not easy to down an entire region. the biggest event they had, the S3 one you’re talking about, affected only 2 AZs and didn’t allow new EC2 instances to start and iirc some EC2 instances failed as well. this is a far cry from the entirety of us-east having an outage.


Amazon has some of the best uptime in the world, especially for basic services like EC2, even if you’re only considering the least reliable regions like US East. Their last major event was in 2017. There are few providers in the world that can compete with them in that respect, and there’s nothing that you or I or anybody else could build with 6-10 servers that could come close. If you were planning to try to exceed their service levels yourself, there is no conceivable use case where a small to medium sized company could justify the expenses required to provide the necessary redundancy and operational coverage. What you are actually saying is that you can meet your own needs without AWS, which is entirely plausible, but completely different from the absurd claim that you can build a low-budget infrastructure that exceeds their service levels. That claim is so ridiculous, you might as well be saying that you can make a car faster than Toyota can, or that you can run a two-minute mile.


Our AWS monthly spend is deep 7 figures, and over the last 10 years our colo projects have had better uptime than AWS US East. You keep living in the marketing bubble for AWS.


I would love to see some data here. What service outages? What’s your infrastructure like? What’s your DC’s uptime?


There’s a few confounding factors to address before you get to the believability of the claims (which are remarkably dubious). For starters they’ve only spoken about the AWS region which is least reliable by design (as it is the first to receive new features), and they’re talking about a 10 year time span (AWS in 2010 was much less reliable than it is today). It’s also not really clear what they’re talking about when they say AWS. If it’s just the core features you’d need to forklift an on-prem service into the cloud, then the claims are especially frivolous, but the SLA difference between EC2 and AWS Ground Station is more than an order of magnitude.

Even if their claims are true (which I certainly don’t believe they are), you’d be more likely to get better uptime than EC2 with a small on-prem setup through dumb luck rather than through deliberate planning. Something still has to go wrong for you to have an outage, and you’re more likely to get an incredible lucky streak than you are to outperform their entire AWS infrastructure capability with a few people and half a rack of servers.
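To make "an order of magnitude" concrete, here is a quick sketch that converts an availability percentage into allowed downtime per year. The two percentages below are illustrative only, not quoted from any AWS SLA, and the function name is my own.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year permitted at a given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - availability_pct) / 100


if __name__ == "__main__":
    # Hypothetical targets, chosen only to show the scale of the gap.
    for pct in (99.99, 99.9):
        print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} min/year")
```

Each "nine" you drop multiplies the permitted downtime by ten, so comparing services with different targets as if they were one "AWS uptime" number doesn't tell you much.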


Are you from AWS marketing :) ? AWS SLA is meaningless as there are no meaningful penalties.


AWS outages (you can google the details):

2011 April 21 Outage

2011 August 8 Outage

2012 June 29 Service disruption

2012 October 22 Outage

2012 December 24 Outage

2013 September 13 Outage

2014 November 26 Service disruption

2015 September 20 Outage

2016 June 5 Outage

2017 February 28 Outage

2018 March 2 Service degradation

2018 May 31 Outage


There’s a terrible amount of information missing here. Which services were you using that went down? (Not which services went down, but how were you affected?) What is the availability of your DC? It seems you’re being light on the details, and perhaps there’s a reason why.


The CDN will be fronting most of the load, behind that 10 decently specced servers running sanely architected code can scale to millions, if not tens of millions of requests per second.

Drop the servers in HA sets of 2-3 nodes across 3-4 regions, anycast your service endpoint from each cluster. The hardest thing to replicate without AWS is the 6-7 figure bills.


What you’re describing is “good enough service levels for what I need” not “the same level of service as AWS” (or a superior level, as the parent comment implied).

If some sanely architected code was all you needed, then you’d expect at least other cloud/IaaS providers to be able to match AWS service levels. Which they can’t, and which some little software shop most certainly cannot either.


Look at actual downtime of US East over the years.


So what do you do when you are featured on CNN or whatever and you need to scale up massively in a matter of minutes? Do you just let all those sales go?


I genuinely want to understand the point you are presumably making here, but I’m honestly having a tough time with understanding what it is.


"If you use their boxes, they own your data. Don't use their boxes."


It's my understanding that some of the large brick-and-mortar retailers also stay away from hosting on AWS for these same reasons.


I think that’s less about being afraid that Amazon will steal their data, and more that they don’t want to give any money to an entity already steamrolling them


Some are legitimately afraid that AWS will deprive them of the ability to scale during peak times, like holiday shopping seasons. I've heard claims of this happening to more than one retailer.

Personally, I wonder if that isn't an emergent property of a lot of people trying to scale at once.


Walmart won't even allow their suppliers to use AWS.


Home Depot refuses to use AWS and partners with Azure for this reason.


Home Depot uses GCP.


That's absolutely not the same thing lol. What Amazon did is unethical. What you are describing is illegal.


I'm honestly curious what crime this would be. If I rent time on someone else's server, and they look at what I'm doing on that server, what illegal thing has happened?


Seems like a pretty clear violation of https://en.wikipedia.org/wiki/Computer_Fraud_and_Abuse_Act.


I'm not so sure about that.

AWS terms do not assign their customers any rights to any physical computer. And the AWS customer agreement gives Amazon the authority to access your data for certain purposes.

I'm not sure I've ever heard of anyone prosecuted under the CFAA for accessing a computer that they physically own and physically control. AWS is a service, not a computer rental.


https://aws.amazon.com/agreement/

> We will not access or use Your Content except as necessary to maintain or provide the Service Offerings, or as necessary to comply with the law or a binding order of a governmental body.

The CFAA uses wording like "exceeds authorized access", which Amazon would absolutely be guilty of if they went into your database to spy on your product listings.

If they could go after Aaron Swartz for using authorized access in an unauthorized way, it seems likely it could be applied here.


"One reason we could charge the price we did for the service is that we were treating the data we had access to as an investment. Thus the data we accessed was done so to ensure the service could be maintained."

Would a judge accept that argument? From me? No. From the lawyers Amazon can afford? I wouldn't be comfortable betting either way.


A reminder that the legal system is designed to serve the wealthy, and few are wealthier than Amazon. It's not absolute, but the little guy isn't going to walk away with Bezos's fortune in damages.


The CFAA doesn't protect "content", though. It protects "protected computers".

In this case, Amazon fully owns, possesses, and operates the "protected computer".

You'd have to successfully argue that Amazon fraudulently accessed their own computer. It might be possible, but I'm guessing it'd be a first.

The difference in Aaron's case is huge: he didn't own the computers that hosted JSTOR.


The Amazon employee accessing the data would be "exceeding authorized access".

> The difference in Aaron's case is huge: he didn't own the computers that hosted JSTOR.

His access was authorized, though. They still threw CFAA at him.


"exceeding authorized access" is not enough to violate the CFAA.

You have to "exceed authorized access to a protected computer"

The CFAA is not a data protection law. It is a computer protection law.


https://en.wikipedia.org/wiki/Computer_Fraud_and_Abuse_Act

> In practice, any ordinary computer has come under the jurisdiction of the law, including cellphones, due to the interstate nature of most Internet communication.


Sure. The question I am alluding to is: can someone defraud their own computer?

Maybe it is possible, but the consequences of answering 'yes' to this are pretty scary.


If I buy my spouse a phone, and secretly bug it, I'm still violating wiretap laws, even if it's technically mine.

If I'm renting an apartment, my landlord can't install a camera in the bathroom, even if they're the owner of the building.

Ownership doesn't change the fact that the law says "exceeds authorized access". Amazon agrees to only access the computer I'm renting from them in very specific scenarios. If they violate that, it looks like a pretty clear CFAA violation.


Neither of your two examples have anything to do with the CFAA.

> Amazon agrees to only access the computer I'm renting from them in very specific scenarios.

AWS provides compute services, they do not rent computers. They make this clear in their terms.


> Neither of your two examples have anything to do with the CFAA.

They demonstrate that legal ownership is not the same as the legal right to do whatever you want with what you own.

> AWS provides compute services, they do not rent computers. They make this clear in their terms.

Good luck hoodwinking a judge with that argument.


Okay, you think you rent AWS servers?

Which one do you rent?

Where is your rental agreement?

When did you first take possession?


Huh, WTF?! Your FBI used to railroad random kiddies for messing around with poorly programmed dynamic pages and now you’re arguing there’s nothing wrong if a hosting provider trespasses and mines your private property?!


The rules the FBI/DoJ applies to kids on irc are not the same rules the FBI/DoJ applies to multibillion-dollar infrastructure companies and/or trusted military defense contractors (Amazon is both).

Equal protection or application of computer crime law (perhaps, any law) in the USA is a fiction. It would be practically illegal to invent and run a web spider today, for instance, if they didn’t already exist as a concept. (France recently decided this was true for news link aggregation; Google must pay the newspapers for reproducing their headlines. I’m glad hosted RSS readers aren’t outlawed so far, but under these sorts of restrictive legal interpretations you could see how they might be. Google doing AMP, of course, gets a free pass.)

If you don’t believe me about the web spider thing, try making a complete download of Twitter for the purpose of making a tweet search index and see if you get to continue owning your house. (My theory is that Clearview is allowed to do it for Instagram because they’re using the database to provide services to law enforcement/military, so those groups want it to continue to exist free of prosecution.)

Bummer that actively collaborating with violent types like pigs and military seems to be the only way to avoid jail if you want to build large novel data systems with interesting public datasets today. This sort of freedom to experiment with new/neat algorithms over published documents got us Google; today these same companies will get you raided if you dare download/index their data. (Facebook’s idea famously started out scraping public yearbook photos. Try scraping Facebook now.)

one small counterpoint: https://www.eff.org/deeplinks/2019/09/victory-ruling-hiq-v-l...

RIP aaronsw


Amazon owns the computer and grants you limited rights to use it, in exchange for the money you pay them. It's basically the opposite of a script kiddie hacking into someone else's web server.

Now, indiscriminate access to your content might violate whatever commitments Amazon made to you in their terms of service; I have not read them for a long time and can't remember what the language is specifically. But that would not be a matter for the FBI.


I read the parent comment as less of an argument against it than a question of which laws do we have in place to prevent it.


This could fall under Unlawful Access to Computers.


Assuming that the information is behind at least a password that the user had set up, Amazon breaking through that would be considered illegal unless they had a court order or something. They can peer into metadata that your machine creates, but I think looking at private information on a server that they lease out would be illegal. Maybe I'm just hopeful?


Why do you feel it's unethical?


I love B&H, we planned family holidays from the UK specifically around buying from this shop. When the $/£ exchange was healthier we got some real bargains!


Couldn't they use Azure?

Of course there are other reasons to use physical servers.


At the very least, they own your IP traffic. From there, every single value-add service you use gives them an opportunity to eavesdrop on your data. Take, for example, https://aws.amazon.com/elasticloadbalancing. All of a sudden your URL traffic is 'fair' game.


Hold up one second. Is this something that they're actually open to doing? Surely part of their ToS isn't about stealing data on their physical infrastructure to enable other aspects of their business. Right? Has any data centre ever done this?


Imagine if Amazon had as much ad presence as Google or FB. If they're able to identify you as a business of interest and identify your key employees, you're pretty much f'd unless they have adblockers, and even that might not be enough.


Christ, this thought is terrifying. If any company would be this shady it would be Amazon, and they have the greatest market share. And other commenters here make it sound like it's not even clearly illegal.

The one point of solace is that there's a lot of competition out there for web hosting.


I know the B&H execs too. All Hasidic Jews, super nice, smart people. They had a big contract with some software I wrote ages ago.


It’s just as likely one of your own administrators could steal it and sell it to a competitor. A lot of espionage is inside jobs.


Wonder if they are on Microsoft's Azure now?


Just did a tracert to their website. After hitting b-h-photo-v.ear1.newark1.level3.net it goes through a couple routers on an IP block they own before hitting their IP.

Safe to say they are not on Azure.


Sounds like this head of IT isn't very good at his job if he can't explain the difference between EC2 access, databases, and web requests over TLS

There are ways that you can use AWS that Amazon would have no way to access any of your data even if they wanted to.


They have 100% hypervisor access. To give them zero knowledge, you need full homomorphic encryption which is impractical at this point (and likely for a while).

You may trust them not to abuse hypervisor access, but they still have network “meta” data. It could tell them how many transactions clear against credit processors (though not the actual amounts, if encrypted), a good idea of the general distribution of page views with respect to time and user IP (though not the exact pages), times of day, and demographics of users (geolocations and ISPs, for example).

If you don’t trust them not to peek at what they can, don’t use them. He is perfectly right.

There are other cloud providers who aren’t competing with B&H and would be a better choice. But amazon is a direct competitor to B&H, even if they do have an IT barrier - they cross subsidize; any $ paid to Amazon helps it against B&H.


Even if homomorphic encryption were practical, you would need hardware to decrypt, which would have to be either in the cloud or on premise.


If any decryption of your data occurs on AWS hardware (i.e. if your software in AWS has access to your unencrypted data), then wouldn't AWS also have access to it if they wanted? Even with encrypted volumes, etc, the decrypted data is present in memory, AWS controls the box with the memory in it.


Yep, this is how computers work. Not saying this to be snarky, just... it's surprising how many people don't know this. And when I say 'people' I mean 'Professional Software Engineers with Years of Experience in the Industry'


> There are ways that you can use AWS that Amazon would have no way to access any of your data even if they wanted to.

Is it worth the extra effort and moving already functional servers to do so?


> There are ways that you can use AWS that Amazon would have no way to access any of your data even if they wanted to.

Please explain, as I'd like to know how.



