The rise of embarrassingly parallel serverless compute (davidwells.io)
132 points by prostoalex on June 13, 2020 | hide | past | favorite | 131 comments

Except that none of these compute resources are free and the cost of cloud-based super computing is a complete rip-off. It's actually thousands of dollars cheaper to build your own super computer from used rack mount servers than it is to use AWS, Azure, or Google cloud for lengthy computations.

As an example: you can buy a Dell PowerEdge M915 from ebay with 128 cores for ~$500 USD and a rack costs around the same. Five of them is 640 cores for a total cost of just 3k USD. That's 640 cores that you now own forever. Guess how much it would cost to use that many cores for a month on AWS? Well over 10k... and next month it's another 10k... and the month after that...

With this option you only pay for power and still have the resources for all future experiments. I think the M1000e rack can fit 8 blades in total so you could upgrade to a max 1024 cores in the future! The downside with this particular rack is that it's very, very loud. But I've run the numbers here and it's hard to beat high-core, used rack servers on a $/GHz basis.
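As a back-of-envelope check on the claim above (all prices are the rough figures quoted in this comment, not live quotes, and power, hosting, and labor are deliberately ignored):

```python
# Break-even sketch: used blades vs. renting an equivalent core count.
blades = 5
cores_per_blade = 128
blade_cost = 500          # USD, used M915 on eBay (as claimed above)
rack_cost = 500           # USD (as claimed above)
cloud_monthly = 10_000    # USD/month for a comparable core count (as claimed above)

upfront = blades * blade_cost + rack_cost
total_cores = blades * cores_per_blade
break_even_months = upfront / cloud_monthly

print(total_cores)        # 640
print(upfront)            # 3000
print(break_even_months)  # 0.3 -> pays for itself in under two weeks of cloud bills
```

On these (generous) assumptions the hardware pays for itself in well under a month; the real question is how much the ignored costs add.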

Are any compute resources free?

It's telling that you've neglected your own labor in this assessment, and that of the other specialists required to get such ad-hoc compute working at scale (dynamic software networks, on-demand compute, provisioning/billing/monitoring apparatuses, etc...)

What plenty have realized in cloud-native approaches is that the TCO is pretty compelling, sometimes because it's easier to distribute procurement/provisioning/billing to teams for their own resource needs, or just in sidestepping the inevitable bureaucracy when you try to centralize that decision-making.

Of course it's different if you're talking hobby kit setups in your own home, but if you're aiming at real scale for absurd parallelisms then that ship has sailed.

I'm in the midst of returning from the cloud to coloc.

It's looking like for about $50K in hardware I get 10x the RAM/compute/storage/(internal) network I was getting from the cloud at around $10K/mo. However, it is taking me about $25K in labor to set up. Hosting costs about $1K for a rack and a decent pipe to the server.
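Using the figures above (and assuming the $25K of labor is one-time and the ~$1K hosting is a monthly recurring cost), the break-even point is quick to estimate:

```python
# When does the colo move pay off? Figures from the comment above.
hardware = 50_000       # USD, one-time
labor = 25_000          # USD, one-time setup
colo_monthly = 1_000    # USD/month, rack + pipe (assumed monthly rate)
cloud_monthly = 10_000  # USD/month for roughly 1/10th the capacity

months = (hardware + labor) / (cloud_monthly - colo_monthly)
print(round(months, 1))  # ~8.3 months to break even, before counting the 10x capacity
```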

So more than a hobby but not approaching real scale.

The main barrier to using a lambda model is if you have anything that smacks of a DB. It would take me $100K+ to transition away from SQL. If you can do lambda w/o a big retool cost, then it is probably viable. If you are just running a bunch of VMs on the cloud, it is pretty expensive.


Interestingly, my main client (a very large company) just went the other direction and moved all its compute to a system that is Spark underneath running on Azure. They are trying to decommission some expensive Teradata instances. So far, it is a mixed bag -- it is a big step forward (for them) on anything that is 'batch-oriented' but 'interactive' performance is dismal.

Price appears to be a big motivation. I always forget that most large enterprises run on exotic stuff with crazy service contracts that makes the cloud look cheap.

> Price appears to be a big motivation. I always forget that most large enterprises run on exotic stuff with crazy service contracts that makes the cloud look cheap.

Don't forget the second factor of (not having) the workforce. Physical servers require a bunch of suckers who know physical infrastructure and accept being on call 24/7 and dealing with Dell/HP/colo full time, on top of periodic travel to the datacenter.

Those "suckers" are easy to contract out to, and a lot of colo providers provide managed services or "remote hands" themselves as well.

In practice, when I was responsible for racks in two separate data centres, I spent 3-4 days a year in the data centres (combined); everything else was handled easily via tickets or remotely. Overall I've generally spent less time on devops with hardware in colo's than with cloud setups.

But specifically the amount of on-call work tends to be down to code quality and higher level architecture not whether you host in cloud or colo or rent managed servers - the "low level" problems become part of the noise floor very quickly in any system with reasonable failover.

As opposed to cloud ops suckers who accept 24/7 on call and just sit around panicking when cloud providers are down and the status page is full of lies?

Colo providers have massive outages too so it's the exact same thing in that regard.

If we're talking regular maintenance, like a RAID controller or a power supply going bonkers: AWS is always accessible, it allows you to realize something is broken and create a new volume or instance in 5 minutes. Whereas with dedicated hardware you might be toast, with no remote access and/or no spare parts.

There have been several AWS outages where EBS and EC2 instantiation was outright down, and you could not create new resources for a period of time. AWS is not “always accessible“ unless you’re in their marketing department.

Sure, you know you’re down, you simply can’t do anything about it. Not so with your own hardware, which is why many orgs continue to run their own gear.

> Sure, you know you’re down, you simply can’t do anything about it. Not so with your own hardware

I don't buy it. Everywhere I've worked that colo'd or owned their DC had wider-reaching outages (fewer than cloud, but affecting more systems). Usually to do with power delivery (turns out multi tiered backup power systems are hard to get right) or networking misconfiguration (crash carts and lights-out tools are great, but not if 400 systems need manual attention due to a horrible misconfiguration).

I think folks underestimate the non-core competencies of running a data center. Also often underestimated is the value of running in an environment designed to treat tenants as potential attackers; unlike AWS's fault isolation, when running your own gear it's really easy to accidentally make the whole system so interconnected as to be vulnerable--even if you make only good decisions while setting it up

Where are you going to store this thing? Pay for the power? Cool it? Replace failing hardware?

As someone who works for a company with over 150 data centers around the world, I know space and power is always one of our biggest expenses.

You can just store it at the datacenter.

Many of them offer a service which allows you to drop off your hardware and they will manage the power, cooling, even repairs and replacement for a monthly fee.

Surely you need to consider that monthly fee in your cost comparisons too then?

Sure, but that fee is very low from the options I have seen.

> You can just store it at the datacenter.

What do you think is ‘the’ data centre? There’s no single universal data centre. And most people don’t have a data centre, or access to one.

Actually everyone has access to managed data centre services.

It’s called co-location where you buy rack space in one of the core data centre hubs in each city. Been available for many years and it’s very affordable e.g. a few hundred a month.

Not sure why you said 'the data centre' rather than 'a data centre' though. Not sure if you think there's one massive data centre for the whole world?

I think GP said "the datacenter" rather than "a datacenter" in the same way they would say they have a computer at "the office" and not at "an office". Even though there's no "massive office for the whole world".

Aren't you a native English speaker? At "the X" is a common idiom.

E.g. people say "you can buy carrots at the grocery store" despite there being multiple such stores.

"At the datacentre" doesn't imply one universal datacentre. It just informs you of which kind of facility you can use.

Isn't 'the' a definite article? You use it when you mean a single definite instance of something. If you mean something in general, you use an indefinite article, like 'a'.



"you can buy carrots at a grocery store"

"you can buy carrots at the grocery store over there"

From the same page you've linked:

The definite article can also be used in English to indicate a specific class among other classes: The cabbage white butterfly lays its eggs on members of the Brassica genus.

And from other sources:

We also use the definite article:

to say something about all the things referred to by a noun: The wolf is not really a dangerous animal. (= Wolves are not really dangerous animals.) The kangaroo is found only in Australia. (= Kangaroos are found only in Australia.) The heart pumps blood around the body. (= Hearts pump blood around bodies.)

We use the definite article in this way to talk about musical instruments:

Joe plays the piano really well. She is learning the guitar.

to refer to a system or service: How long does it take on the train? I heard it on the radio. You should tell the police.

It's pretty easy to get access to "a" datacenter. Just look up one that's around you and give them a call. I've done just that for the company I work for a few years ago. It's a pretty big DC provider, even AWS has a presence there. For a flat monthly fee, we get a full rack with cooling and power.

Almost everybody has access to a datacenter. There are countless datacenters that will host any machine you drop them, or you can rent their machines...

> Where are you going to store this thing? Pay for the power? Cool it? Replace failing hardware?

There are plenty of cloud providers who sell colocation services or even rent bare metal hardware.

Take for instance Hetzner. They rent 32-core AMD EPYC boxes with 128GB of RAM for about 130€/month. If you rent half a dozen of those boxes you get more computational bang for a fraction of the buck.

If you reach a scale where your needs span tens of data centers across the world then the economies of scale and operational expenditure are quite different and peculiar, but they are still a couple of orders of magnitude cheaper than AWS.
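Taking the quoted rental figures at face value (prices are approximate and change over time; the €130/month box is the example above), the per-core arithmetic is:

```python
# Cores-per-euro for the rented bare-metal option described above.
boxes = 6
cores_per_box = 32
monthly_per_box = 130  # EUR, approximate list price quoted in the comment

total_cores = boxes * cores_per_box
monthly = boxes * monthly_per_box
print(total_cores)                      # 192 physical cores
print(monthly)                          # 780 EUR/month
print(round(monthly / total_cores, 2))  # ~4.06 EUR per core per month
```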

You are right. And if you buy metal and rent some space in a DC, it is still more cost-effective.

You need a few people who understand a little bit of hardware and IPMI and you are set. That's not a big deal.

The cloud gaslighting is nuts.

Tons of companies now exist to give you insight into your cloud bill, because that's the last thing cloud providers want you to have.

So in the end you must decide what kind of pain you want: cost pain or tech pain. And that all depends on your company's particular circumstances.

None of this math is correct. You absolutely cannot get 128 cores for $500. Please link to a single valid-looking eBay offer at that amount (you may have seen 128GB and thought it was 128 cores). I see 32 cores for $500, at best.

Second, 640 vCPUs' worth of m4.large instances for 3 years upfront (the equivalent purchase) is $8755/mo, not well over 10k.

Third, you really underestimate electricity costs if you're running a server 24/7 at all cores.

Last, and most importantly, most people simply don't run servers 24/7. They run batch computations for five hours a week, or a day, and then spend a week or two crunching the numbers.

There's some case to make that for certain institutions with very particular compute needs, running on-premise might make short-term financial sense (let's not even get started on labor costs). But it's really inaccurate to call cloud computing a "rip-off" for anything but the niche-est customers.
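For what it's worth, the $8755/mo figure can be sanity-checked by backing out the implied per-instance rate (m4.large has 2 vCPUs; the per-instance number below is derived from the quoted total, not from a price list):

```python
# Back out the per-instance rate implied by the $8755/mo figure above.
target_vcpus = 640
vcpus_per_instance = 2           # m4.large
instances = target_vcpus // vcpus_per_instance
quoted_monthly_total = 8755      # USD, from the comment above

per_instance = quoted_monthly_total / instances
print(instances)                 # 320 instances needed
print(round(per_instance, 2))    # ~27.36 USD/month each, amortized
```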

> I see 32 cores for $500, at best. Second, 640 cores on an m4.large for 3 years upfront (the equivalent purchase) is $8755/mo,

You can beat that cores-per-dollar handily if you're willing to use sufficiently odd hardware.

E.g. my Intel cluster here is mostly systems with 4x E7-8880v4 (22c@2.2GHz). I paid $150/chip + $450/host (thanks ebay), excluding ram.

(I have 8 such hosts now, so 704 cores)

So essentially one month's cost at your AWS price w/ a 3-year commit pays for an equivalent core count in systems for me... Throw in another month for the ram. Round up another month to account for the (quite) non-trivial power usage.

You do not have to be anywhere near 100% utilization for AWS to be an extremely poor value for large computing jobs compared to thrifty spending on the surplus market.
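The arithmetic behind that cluster, using the prices stated above (RAM and power excluded, as noted):

```python
# Cost per core of the surplus-market quad-socket build described above.
hosts = 8
cpus_per_host = 4
cores_per_cpu = 22       # E7-8880v4
cpu_price = 150          # USD each, eBay
host_price = 450         # USD each, eBay

total = hosts * (host_price + cpus_per_host * cpu_price)
cores = hosts * cpus_per_host * cores_per_cpu
print(total)                    # 8400 USD up front
print(cores)                    # 704 cores
print(round(total / cores, 2))  # ~11.93 USD per core
```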


>I paid $150/chip + $450 host (thanks ebay) excluding ram.

E7-8880v4 are $600 used each. Quad-socket mobos are not cheap, ranked ram is not cheap. Even if you pulled these from a vendor going out of business, the bitcoin mining crews would suck these up in a heartbeat.

Where are you getting your prices?

Ebay. Hunt harder (and best offer harder). :) E7 CPUs sell absurdly cheap at times because it can be a bit hard to find compatible systems.

(similarly, quad socket systems with slow or no cpus go cheap from people that don't know much about them)

> ranked ram is not cheap.

Yes, ram for quad socket hosts adds up, since you need at least 2-8 dimms per socket for full memory bandwidth (which is why I explicitly pointed out that I left it out).

On the plus side, those systems tend to have a lot of dimm sockets so you can use lower capacity dimms.

> bitcoin mining crews

nah, not useful for that.

On a side note, you would expect people to be super excited to explore possibilities of new infrastructure models. Instead, most posts related to lambda, etc... always come with this sort of reply.

For my use case (infrequent, easily separable jobs) FaaS stuff is a godsend. In the last year I've been doing things that would be orders of magnitude harder and more expensive thanks to it.

> Third, you really underestimate electricity costs if you're running a server 24/7 at all cores.

not to mention the cooling, or the per square foot cost of renting the space for this hardware. anything close to what's being described here is going to need some sound deadening (be it distance or material) from humans.

Frankly, I think many people here are overestimating the cost of electricity and cooling. My company leases several 42U cabinets with 208V 30A circuits in a modern SOC 2 Type 2 audited datacenter. We pay $600 per month per cabinet. Excludes IP transit costs. Granted this is in the Pacific Northwest where we have plentiful cheap hydro power.

> We pay $600 per month per cabinet.

Still a world of difference between that and AWS's 10k/month cost to access that many cores.

Comparing a Dell r740 (56 cores, 192GB ram, 1.6 kw psu), you need 11 of these for 640 cores (for a comparison with [0]). They're roughly 2500 each, so your up front hardware costs are 27,500.

Assuming the $600 covers power, rack costs and cooling, your data center bill is 6600/month.

For three years, the AWS bill is $315k, and the self-hosted option is $265,100 (best case). I'm not up to speed with the pricing trends to say whether the gap will change over those 3 years based on data center changes or aws changes, but the difference is ~50k over three years, or roughly 3 months of an engineer's salary spread over those three years.

[0] https://news.ycombinator.com/item?id=23514698

It's $600 per cabinet, not $600 per server. If you go by the PSU, that's 3 cabinets. But that 1.6kW PSU is quite overspecced for multiple reasons, like efficiency and the fact that it derates to half on 110v power. 28 core xeons have a TDP in the 165-255 watt range, after all. So even 1kW per server is overkill and that's only 2 cabinets. $1200 per month, not $6600.
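Sketching that cabinet math (1 kW per server is the estimate from this comment, not a measured draw, and the 208V/30A circuit is loaded to its nominal rating rather than derated):

```python
# How many 208V/30A cabinets do 11 dual-Xeon servers need?
import math

servers = 11
watts_per_server = 1000    # realistic draw estimate, not the 1.6 kW PSU rating
circuit_watts = 208 * 30   # 6240 W available per cabinet circuit
cabinet_monthly = 600      # USD, colo rate quoted upthread

cabinets = math.ceil(servers * watts_per_server / circuit_watts)
monthly = cabinets * cabinet_monthly
print(cabinets)  # 2 cabinets
print(monthly)   # 1200 USD/month, not 6600
```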

With your numbers, that puts the total running cost at 71k, or a saving of $245k over 3 years, which is pretty much the salary of one person to look after that system for the lifetime. I'm not arguing that AWS is good value, I'm arguing that the upfront cost of purchasing the hardware isn't comparable to reserving EC2 instances up front.

You still need someone to look after your systems if you're using AWS. That's not something special to buying your own server.

And management of the hardware itself doesn't take an extra employee. It probably averages to less than one workday per month.

> which is pretty much the salary of one person to look after that system for the lifetime.

My average time spent in data centers when handling racks in two different locations was around 3-4 days a year. Buying on-call, out-of-hours support is going to cost you a tiny fraction of a person's salary per year, either on retainer and/or based on hefty day rates.

Hardware failures simply do not occur often enough for a rack or two worth of hardware to typically require enough maintenance to make it that expensive, and remote hands (people who act based on a ticket) are available in pretty much any data centre at low costs. If you get servers with IPMI etc. you typically pop in to wire up or retire equipment and dealing with the occasional failure.

Source: Have provided those kind of services for years.

> With your numbers, that puts the total running cost at 71k, or a saving of $245k over 3 years

To compare apples to apples, that hypothetical pricetag buys you the system's max capacity for the entire period. In AWS, the invoice from using any of their serverless offerings grows wildly even when utilization stays way below the resources of that hardware bundle.

For instance, let's keep in mind that AWS charges:

* API Gateway per million of requests,

* Lambdas per time × memory (GB-seconds) used by a single call, including the time wasted on hot and cold starts,

* SQS per queued message,

* WAF per request through the firewall,

* Etc.
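To make the duration-billing point concrete, here's a rough Lambda-style bill estimator. The rates are ballpark us-east-1 list prices at the time and should be treated as illustrative, not authoritative:

```python
# Illustrative serverless duration-billing math.
PER_MILLION_REQUESTS = 0.20      # USD (ballpark list price; verify current rates)
PER_GB_SECOND = 0.0000166667     # USD (ballpark list price; verify current rates)

def lambda_monthly_cost(invocations, avg_ms, memory_mb):
    """Estimate a month's Lambda bill: requests + (time x memory)."""
    request_cost = invocations / 1e6 * PER_MILLION_REQUESTS
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * PER_GB_SECOND

# 100M invocations/month at 120 ms average in 512 MB functions:
print(round(lambda_monthly_cost(100e6, 120, 512), 2))  # ~120 USD
```

Note that, unlike the hardware bundle, this grows linearly with traffic: 10x the invocations is 10x the bill, regardless of how idle the equivalent hardware would have been.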

And if you don't need them all the time, instead of 3-year RIs you can use spot fleets or turn off half of the servers at non-peak hours.

> so your up front hardware costs are 27,500.

That's about 3 months worth of AWS to access a similar amount of computational resources.

Did you consider that AWS counts threads, not cores?

No, I did napkin math that compared an unverified number farther up the thread to the first server I found on Google. If you're considering buying hardware you should do your own numbers; I was merely pointing out that the difference isn't 10k/month vs 27k up front.

The way I look at this is you still need 3 people minimum on call to have around the clock ops for this rack. Three IT salaries is more than $10k a month. Eventually you'll find the balancing point but it's higher than you think.

Or you contract one of the vast number of consultancies that will do that for you at a fraction of that price.

And you still need someone available for a cloud hosted system. Over ~25 years of handling this kind of work, hardware failures have made up a vanishingly small proportion of the outages I've had to deal with, be it a colo'd setup or a cloud provider.

> you still need 3 people minimum on call to have around the clock ops

Q: Don't you still need at least 3 people on call for around the clock ops, regardless of how and where your servers or instances live?

You already have 3 people in on-call rotation anyway, regardless of the cloud or metal.

Yes. This is the reason cloud is popular. You don’t have to pay for computer janitors.

We have a title and don’t carry mops, my dear condescending developer who ostensibly doesn’t respect the entire ecosystem of people holding the Internet, the public cloud, and your entire CI/CD pipeline for whatever bullshit you’re cooking in NPM together. Do you have any idea how many “janitors,” many of them essentially volunteers and most not employed by Amazon, are involved in the pedestrian aspect of merely making amazon.com land at the right place?

Yes, this is a good future: who would ever want to touch the computers they operate? Better to rent computers and position the core competency of your business with a third party, right? Because you can shed some of those pesky janitorial salaries? I’ll be waiting by my phone when you’re on the verge of bankruptcy once the clouds have their teeth in your books, and suddenly I get promoted from “janitor” to “computer operator who has my interests in mind, even though I not-so-subtly malign his existence”.

All you’ve done with this mindset is drive the same janitors to work for the clouds instead and contributed to the downfall of computing as a discipline that any player has any semblance of agency within, as the people who have actually touched datacenter equipment all work for them now or sit around in horror watching as a generation sympathetic to FLOSS arguments willingly hands over the reins of owning a computer to massive corporations.

Reading the comments in this thread already shows how many of those 'software janitors'[1] don't understand what is required (and how little) to run on metal.

It's not that cloud is not the right answer. But people have started to forget that running your own metal is still an option. Or with current prices and performance: an even more viable option than it was in the past, because you can do so much more with so much less.

[1]: cranking out useless features nobody asked for while looking down on those (dev)ops people.

> software janitors don't understand what is required (and how little) to run on metal

Quite. In the middle of lockdown a client needed to spin up some virtual machine instances to demo a product to a potential client. Previous boss had been pushing a cloud-only strategy using Azure and was itching to retire all the physical servers.

Problem: can't spin up any kind of VM due to lack of Azure capacity.

me: "Well, we still have that [physical] dev server, we could spin up the demo stuff on that..."

new boss: "Oh, cool. Great lateral thinking!"

me: <wtf?>

Demo done. Potential client happy. Boss happy.

Thanks for sharing this example. I got a call from Azure years ago telling me I could not get 10 extra VMs of capacity at that time. Insane.

Nice attitude.

> you can buy a Dell PowerEdge M915 from ebay with 128 cores for ~$500 USD

Where? Was not able to find one. Found a 32 core for $750. It was a pre-Epyc AMD server, which has its own problems.

There were only 9 options listed for the exact term you gave, 'Dell PowerEdge M915', for the USA before I got to pricey international sellers. The few options make me skeptical about this.

This is of course before I even consider security and performance issues of used Intel servers.

Every time someone on the Internet says "you can buy X on eBay for Y price", multiply the price by 3.

You forget a few things:

1. AWS and other cloud vendors don't have just one data centre per region; they usually have three redundant locations for availability reasons, connected via direct fibre links.

2. Cost of time: time to find hardware, prep hardware, maintain hardware.

3. Cost of your agility. If you need more compute capacity, it will take you weeks to get the hardware required and installed; otherwise you need to have unused hardware sitting around "just in case".

4. Cost of your availability. What if you have a sudden spike of traffic within minutes and the current available hardware cannot service it? At least in the cloud you can spin up short-lived resources to manage that load before it throttles back down again.

5. Permanent running costs of fixed hardware. A lot of implementations do not need permanent hardware running and can throttle down to a base of almost nothing, so the average running cost on a monthly basis is actually very low.

You can't really compare value just by looking at core count, I imagine an old 128 core server off of eBay is going to perform much differently from something running on a more modern architecture.

However, AWS Lambda stuff runs on a pile of abstraction layers that have an impact on performance, not to mention its need to do cold starts.

You also have to account for burstiness. If you have nightly/weekly workflows, you might see some wild swings in resource usage, where an average usage and max usage differ by a lot. It’s nice to pay for average needs rather than max needs (especially when you can pay for on demand pricing).

>Except that none of these compute resources are free and the cost of cloud-based super computing is a complete rip-off. It's actually thousands of dollars cheaper to build your own super computer

Only if your time is worthless

" It's actually thousands of dollars cheaper to build your own super computer from used rack mount servers than it is to use AWS, Azure, or Google cloud for lengthy computations."

The issue is 'total cost of ownership' - not 'unit cost of equipment'.

The operational cost and overhead of running your own gear can be prohibitive.

AWS is successful because it is in fact much cheaper than the alternative in many enterprise situations.

AWS is successful because there is no capital tied up in hardware for those leasing. For some reason that is what executives prefer. A $1000 server is in practice impossible for me to order at work, but I can use multiple $1000s a year in leased servers...

"AWS is successful because there is no capital tied up in hardware for those leasing. For some reason that is what executives prefer."

Your execs are probably right.

Capital outlays are expensive and risky and imply a structural lock-in - but that's just the tip of the iceberg.

"A 1000USD server is in practice impossible for me to order at work"

And what about hosting? Networking gear? Support? Repairs and upgrades? Network security? And how will those servers integrate with the rest of your infrastructure?

AWS is ridiculously cheap compared to the alternatives in most scenarios and that's why it's so successful.

AWS/Cloud enables so much more, far more dynamically - the value is considerable.

In some situations, if you have a fairly big need for computing, and it's predictable over long time-frames, and those services don't need to be tightly integrated with other cloud services, and you have the internal know-how to keep them running, then cost savings can obviously be achieved. But this is an optimization.

Put on your Eng/Ops/Business Hat for a minute and consider why services like AWS are exploding and growing to be one of the biggest segments in tech? Because 'stupid executives'? No - it's because the value add is fairly immense.

> AWS is ridiculously cheap compared to the alternatives in most scenarios and that's why it's so successful.

It's so successful because a lot of engineers selecting what to rent never see the invoices and don't understand the costs of colo or managed servers. You see it all over this discussion e.g. with people assuming a rack or two requires full time staff when you can get sufficient on-call resources on retainer for a few hundred a month or so depending on complexity of the setup most places.

Having ordered, configured, set up, hired staff for, and run colo setups, managed servers, and AWS setups many times, I've yet to see AWS be remotely competitive on price, ever. To the point that when I did contracting, I used to offer clients to transition their systems to managed hosting for a percentage of their first 3-6 months of reduced cost. That was costs including their devops contracts and monitoring etc. (Whether managed or colo comes out cheapest depends on scale and location. E.g. I'm in London and real-estate prices here are too high for colo to typically beat managed servers, if you can deal with ~8ms round-trip to providers in Germany or France; for others, managed is necessary to be close to customers.)

I use AWS. It has lots of great features, and sometimes those features are worth the cost, but it's the expensive luxury option of hosting.

> Put on your Eng/Ops/Business Hat for a minute and consider why services like AWS are exploding and growing to be one of the biggest segments in tech? Because 'stupid executives'? No - it's because the value add is fairly immense.

I've done this for 25 years, including private cloud setups from before AWS was a thing, and I've had to yell at execs that wanted to triple our monthly costs because AWS sales had whispered buzzwords into their ears. That was the all in costs. It took a massive effort to explain it to them even with the numbers in black and white in front of their faces.

So, yeah, a lot of the time (not always), it is "stupid" executives. Sometimes talked into it by engineers working around planning processes that make paying by use an easy end-run around budgeting.

The biggest achievement of AWS is to sell the idea that it is cheap because Amazon.

Triple the cost of on-prem is pretty good. Whenever I’ve run numbers for storage it has been in the 10-50x range.

This is for cost optimized on-prem storage vs ebs or s3. I check the actual workload and space utilization, and give amazon every benefit of the doubt to get the ratio that low.

It probably doesn’t help that a $100 Samsung EVO (500GB) can do 500K IOPS, but that the storage would cost $62.50 per month on a provisioned IOPS EBS volume. At a five year amortization, that's 37.5 times more expensive than buying. (EBS has poor durability by design, so you end up keeping the same number of copies on prem or in EBS.)

At that point, it’s basically game over, so it doesn’t matter that the IOPS would cost $32,500 per month.

Note: I always include the cost of the scale out infrastructure, etc in the comparisons, but with that, on prem is competitive even if the cluster is 10% full on average and does vintage stuff like triplicating data instead of erasure coding.
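The 37.5x figure above checks out with the numbers given (rent for the equivalent capacity vs. a one-time SSD purchase):

```python
# Provisioned-IOPS EBS rent vs. buying a consumer SSD outright.
ssd_price = 100              # USD, one-time, 500 GB consumer SSD (quoted above)
ebs_monthly = 62.50          # USD/month for the equivalent capacity (quoted above)
amortization_months = 5 * 12 # five-year amortization

ratio = ebs_monthly * amortization_months / ssd_price
print(ratio)  # 37.5
```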

Not the person you’re replying to, but here’s another way of thinking of it: A $1000 server can sit under my desk, no problem. The TCO comes down to the cost of my productivity with the AWS bill as a second order effect.

If the ops team can’t manage >>10 machines for the price of me managing one, there’s something horribly wrong.

They should just let the developer buy the $1000 machine if it will help productivity.

That doesn’t mean production should run on a pile of machines under someone’s desk. That’s a different scenario.

Hilariously, in 2008 Google was hugely resistant to this idea. DARPA had just put out an RFP for a new way of running computationally dependent tasks, which currently ran on supercomputers, on a 'shared nothing' architecture (which is what Google ran at the time, and I believe they still do). I had done some research in that space when I was at NetApp, looking at decomposing network attached storage services into a shared nothing architecture, so I had some idea of the kinds of problems that were the "hard bits" in getting it right.

I recall pointing out to Platform's management that if Google could provide an infrastructure that solved these sorts of problems with massive parallelism that currently required specialized switching fabrics and massive memory sharing we would have something very special. But at the time it was a non-starter, way too much money to be made in search ads to bother with building a system for something like the 200 customers in the world total.

I didn't care one way or the other if Google did it so after running at the wall of "under 2s" a couple of times I just said "fine, your loss."

I strongly believe that the author never tried out his examples.

One time, I wanted to process a lot of images stored on Amazon S3. So I used 3 Xeon quad-core nodes from my render farm together with highly optimized C++ software. We peaked at 800 Mbit/s downstream before the entire S3 bucket went offline.

Similarly, when I was doing molecular dynamics simulations, the initial state was 20 GB, and so were the results.

The issue with these computing workloads is usually IO, not the raw compute power. That's why Hadoop, for example, moves the calculations onto the storage nodes, if possible.
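A quick illustration of why IO dominates, using the figures from the examples above (20 GB of state, an 800 Mbit/s peak link):

```python
# Time just to move the input data, before any computation happens.
state_gb = 20
link_mbit = 800

seconds = state_gb * 8 * 1000 / link_mbit   # GB -> Gbit -> Mbit, divided by Mbit/s
print(seconds)  # 200.0 seconds per transfer, each direction
```

Three minutes of pure data movement per run dwarfs many per-invocation compute budgets, which is exactly why moving compute to the data (as Hadoop does) wins.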

> That's why Hadoop, for example, moves the calculations onto the storage nodes, if possible

You make a good point about I/O and I actually wanted to comment something along the lines of "why not Hadoop?" since the programming model looks very similar but with less mature tooling.

However, now I think about it, the big win of serverless is that it is not always on. With Hadoop, you build and administer a cluster which will only be efficient if you constantly use it. This Serverless setup would suit jobs that only run occasionally.

In my experience, the cloud is so slow and expensive for these tasks that even if your job only runs once per day, you're better off getting a few affordable bare metal servers.

Plus, most tasks that only run occasionally tend not to be urgent, so instead of parallelizing to 3000 concurrent executions, as the article suggests, you could just wait an hour instead.

Serverless is only useful if you have high load spikes that are rare but super urgent. In my opinion, that combination almost never happens.

This is us exactly. We pay around $2k for one of our analytic clusters. Hundreds of cores, over 1 TB of RAM, many TB of NVMe. Some days, when the data science team is letting a model train (on a different GPU cluster) or doing something else, the cluster sits there with a load of zero. But it's still an order of magnitude cheaper than anything else we've spec'd out.

Are you potentially interested in renting out idle capacity for batch jobs? If so, what kind of interconnect do you have? Feel free to contact me (info in my profile).

Sadly, given how cheap the infra is, it's not worth it to us to have to share with someone else. Let's say we could cost-share 50%, saving us $12k/yr... we would spend a lot more than $12k setting up such a system and dealing with all the headaches that arise from sharing the infra.

But thanks for the offer! The natural market forces will drive cloud computing prices down the same way they've driven everything else down. But until then, roll-your-own can save loads.

I figured it might be unreasonable, but thanks for responding.

Yeah, I was particularly curious because I was unable to find better public offers than AWS (whose homebrew 100 Gbit/s MPI drops Infiniband's hardware-guaranteed delivery to prevent statically-allocated-buffer issues in many-node setups, allowing quite impressive scalability) or Azure (with their 200 Gbit/s Infiniband clusters), at least for occasional batch jobs.

I wouldn't ask if I could DIY for less than using AWS, but owning ram is expensive. And for development purposes it would be quite enticing to just co-locate storage with compute, and rent some space on those NVMe drives for the hours/days you're running e.g. individual iterations on a large dataset to do accurate profile-guided optimizations (by hand or by compiler). Iterations only take a few minutes each, but loading what's essentially a good fraction (minus scratch space, and some compression is typically possible) of the ram over network causes setup to take quite a long time (compared to a single iteration).

I think you are missing the point of the article. I read it as "this desktop software could make use of serverless to provide me a re-encoded 4 GB video file in seconds by running 3000 tasks" (provided my bandwidth could handle that). My gripe with that is privacy (I do not want my data processed elsewhere).

Still I would not be opposed to such client-server(less) architecture being used where I could have slower devices seamlessly integrating with my personal server for faster processing of compute heavy tasks.

It's not that this hasn't been done before (thin clients anyone? Even X server model is exactly like that), but a similar approach could make a come back at some point.

No, your example actually solidifies my argument.

For most people, uploading that 4 GB file for cloud processing will take an hour. But re-encoding 2 h of video with GPU acceleration only takes 15-20 minutes. So no matter how fast serverless is, it'll always need to wait for upload and download, which may take longer than all the computation combined.

As for X server, using it over the internet is a pain. It is optimized for a low latency connection, meaning the opposite of putting calculations in a cloud hundreds of ms of ping away.

Oh, I am not saying we are there yet as far as bandwidth is concerned, but even a 4G connection from a slow device like a phone is going to look better today. Uploading a 4 GiB file at 50 Mbps will take less than 15 minutes, or about 5 minutes at 150 Mbps, and no phone would re-encode it in less time. 5G goes up to 1 Gbps, or roughly 32 s for a 4 GiB file, and there's your case. Wouldn't you find it nice if your phone could do this in 30 s without burning like a stove as CPU usage spikes?
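A quick sanity check of those numbers (ideal transfer times only, ignoring protocol overhead and congestion):

```python
# Back-of-the-envelope upload times for a 4 GiB file at various link speeds.
FILE_BITS = 4 * 1024**3 * 8  # 4 GiB expressed in bits

def upload_seconds(link_mbps: float) -> float:
    """Ideal transfer time at a given link speed, ignoring overhead."""
    return FILE_BITS / (link_mbps * 1_000_000)

for label, mbps in [("50 Mbps (4G)", 50), ("150 Mbps (good 4G)", 150), ("1 Gbps (5G peak)", 1000)]:
    print(f"{label}: {upload_seconds(mbps) / 60:.1f} min")
```

The 1 Gbps case comes out slightly above 32 s because this uses GiB rather than GB, but the conclusion is the same.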

Again, we are not there yet, but we are not that far off either.

My mention of X was to highlight how this is just old technology within new constraints (move things that do not need small latency from the thin client onto the fast server), but how it's applied is going to make it or break it.

In that case, I agree. If we had WiFi-speed internet, Lambda would be amazing for mobile apps.

But my lived reality is that I have to go to the upper floor in my parent's house if I want to have 2G reception. And they only live a 10 minute drive from the town hall.

There is a subset of embarrassingly parallel problems that are heavily data-intensive.

E.g. suppose you have 100 TB of data files and you want to run some kind of keyword search over them. If the data can be broken into 1000 × 100 GB chunks, then you can do some map-reduce-ish thing where each 100 GB chunk is searched independently and the search results from the 1000 chunks are aggregated. 1000x speedup! Serverless!
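In local Python terms that map-reduce-ish search is just this (a toy sketch: the chunks here are in-memory strings, not 100 GB files, and the thread pool stands in for a fleet of serverless invocations):

```python
from concurrent.futures import ThreadPoolExecutor

def search_chunk(chunk: str, keyword: str) -> list[int]:
    """Map step: return the offsets of `keyword` within one chunk."""
    hits, start = [], 0
    while (i := chunk.find(keyword, start)) != -1:
        hits.append(i)
        start = i + 1
    return hits

def parallel_search(chunks: list[str], keyword: str) -> dict[int, list[int]]:
    """Fan one search out per chunk, then reduce by merging the hits."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(search_chunk, chunks, [keyword] * len(chunks))
        return {i: hits for i, hits in enumerate(results) if hits}
```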

However, if you want to execute this across some fleet of rented "serverless" servers, the key factors that will influence cost and running time are (1) where the 100 TB of data is right now, (2) how you are going to copy each 100 GB chunk to each serverless server, and (3) how much time and money that copying will cost.

I.e. in examples like this, where the time required to read the data and send it over the network is much larger than the time required to compute on it once it's in memory, it is going to be more efficient to move the code and the compute to where the data already is, rather than moving the data and the code to some other physical compute device behind a bunch of abstraction layers and network pipes.

> There is a subset of embarrassingly parallel problems that are heavily data-intensive.

There's an even smaller subset which is one-shot data access.

> in examples like this where the time required to read the data and send the data over the network is much larger than the time required to compute the data once the data is in memory

The annoying thing about lambda and other functional alternatives is that data-access patterns tend to be repetitive in somewhat predictable ways & there is no way to take advantage of that fact easily.

However, if you don't have that, and say you were reading from S3 on every pass, then Lambda does look attractive because the container lifetime management is outsourced. But if you do have even temporal stickiness of data, then it helps to do your own container management and direct queries closer to previous access, rather than to entirely cold instances.
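A toy sketch of that "direct queries closer to previous access" idea: hash the data key so repeated queries for the same data land on the same (hypothetical) warm worker, whose cache stays hot, instead of on an arbitrary cold instance:

```python
import hashlib

# Hypothetical pool of warm containers you manage yourself.
WORKERS = ["worker-a", "worker-b", "worker-c"]

def route(data_key: str, workers=WORKERS) -> str:
    """Deterministically pin queries for `data_key` to one worker."""
    digest = hashlib.sha256(data_key.encode()).digest()
    return workers[int.from_bytes(digest[:8], "big") % len(workers)]
```

Real systems use consistent hashing so the mapping survives workers joining and leaving, but the affinity idea is the same.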

If there's a thing that hadoop missed out on building into itself, it was a distributed work queue with functions with slight side-effects (i.e. memoization).

> If there's a thing that hadoop missed out on building into itself, it was a distributed work queue with functions with slight side-effects (i.e. memoization).

Is that not named Spark? :)

If you stored the data in a distributed fashion initially then it's not a problem.

Redshift, Bigquery etc implement it this way and then have various schemes for computation on top. Redshift bundles individual compute with storage whereas other implementations scale the compute independent of the distributed storage.

But this has allowed very cheap scale for querying large datasets and in practice, I imagine you very rarely have to worry about implementing the data transport yourself beyond initial loading with tools like those available.

Edit: Also, in most clouds, moving data within their networks is free, so really it's just transfer time that indirectly influences cost in terms of run time.

> If you stored the data in a distributed fashion initially then it's not a problem.

Distributed over what? Putting all the data in one service and transferring it into another is still going to look an awful lot like a big centralized transfer.

If you're permanently storing chunks of data next to chunks of the compute that will run on your data, that sounds an awful lot like a server.

Distributed over the storage mediums you intend to pull from for the computation.

It's literally one of the basic building blocks for modern data warehouses, which have solved this problem. By distributing the data to different nodes over some key in your dataset when you ingest it (or, in Google's case, with some intelligent chunking method [0]), you wind up at query time with hopefully evenly distributed chunks already. Doing things this way minimizes the amount of data you actually need to move or copy at query time before the data warehouse essentially runs a map-reduce job (with some really cool query planning [1] [2]) over your query to get you your results.

As to your second point, of course it sounds like a server. Serverless just means I don't have to maintain the hardware resources, which is a nightmare at any real scale; AWS Athena and Redshift Spectrum are great examples of not having to scale hardware, though there are tradeoffs. The point I'm arguing against is that it's expensive (it's not), because most clouds allow free data transfer within their networks, and because modern data techniques minimize the amount of transfer needed, so run times are limited by how badly you've written a SQL query or chosen to initially distribute your data.

Combine that with instance reservations for more regular workloads and modern Big Data can be pretty cheap.

On the chance you have to move 100TB through a raw Spark job, if you have an idea that you know what you'll need the data for you can take a page or two from systems that were built to solve these problems and organize your data in such a way that lends itself to that fact.

Storing your 100 TB of data as one contiguous block and then having to chunk and transfer it at query time, as in the OP's rudimentary search example, is probably about the worst position you could have put yourself in, and would have been a naive thing to do.

[0] https://research.google/pubs/pub51/ [1] https://docs.aws.amazon.com/redshift/latest/dg/c-query-plann... [2] https://cloud.google.com/files/BigQueryTechnicalWP.pdf

Not to mention cases where processing a huge dataset is not about getting a short answer but about transforming the data into another form of the same, or still large, size. Even if the data is massively distributed from the start, the communication and aggregation of distributed data between processing or client nodes can be a considerable bottleneck. Different aspects of a task may call for different distribution schemes, otherwise performance suffers, but rearranging the data for the sake of one processing step might ruin the overall performance of the task.

Stateless serverless platforms like Lambda force a data-shipping architecture, which hinders any workflow that requires state bigger than a webpage (as in the included tweet) or coordination between functions. The functions are short-lived and not reusable, have limited network bandwidth to S3, and lack P2P communication, which does not fit the efficient distributed programming models we know of today.

Emerging stateful serverless runtimes[1] have been shown to support even big data applications whilst keeping the scalability and multi-tenancy benefits of FaaS. Combined with scalable fast stores[2][3], I believe we have here the stateful serverless platforms of tomorrow.

[1] https://github.com/lsds/faasm (can run on KNative, includes demos) [2] https://github.com/hydro-project/anna (KVS) [3] https://github.com/stanford-mast/pocket (Multi-tiered storage DRAM+Flash)

It reminds me of this quote: "a supercomputer is a device that turns a compute-bound problem into an IO-bound problem" (Ken Batcher).

This is a very apt quote here.

"Serverless" is basically equivalent to a supercomputer in that context, but then it goes on to exhibit latency characteristics that would be considered a non-starter for a supercomputer.

Latency is one of the most important aspects of IO and is the ultimate resource underlying all of this. The lower your latency, the faster you can get the work done. When you shard your work in a latency domain measured in milliseconds-to-seconds, you have to operate with far different semantics than when you are working in a domain where a direct method call can be expected to return within nanoseconds-to-microseconds. We are talking 6 orders of magnitude or more difference in latency between local execution and an AWS Lambda. It is literally more than a million times faster to run a method that lives in warm L1 than it is to politely ask Amazon's computer to run the same method over the internet.
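The arithmetic behind that "more than a million times" claim, with ballpark figures (both numbers are assumptions for illustration, not measurements):

```python
# Ballpark latencies, assumed for illustration:
LOCAL_CALL_NS = 100        # a method call on data warm in cache: ~100 ns
LAMBDA_RTT_NS = 100e6      # invoking a remote function over the internet: ~100 ms

ratio = LAMBDA_RTT_NS / LOCAL_CALL_NS
print(f"the remote call is ~{ratio:,.0f}x slower")  # six orders of magnitude
```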

This stuff really matters and I feel like no one is paying attention to it anymore. Your CPU can do an incredible amount of work if you stop abusing it and treating it like some worthless thing that is incapable of handling any sizeable work effort. Pay attention to the NUMA model and how cache works. Even high level languages can leverage these aspects if you focus on them. You can process tens of millions of client transactions per second on a single x86 thread if you are careful.

Furthermore, the various cloud vendors have done an exceptional job at making their vanilla compute facility seem like a piece of shit too. These days, a $200/m EC2 instance feels like a bag of sand compared to a very low-end Ryzen 3300G desktop I recently built for basic lab duty. I'm not quite sure how they accomplished this, but something about cloud instances has always felt off to me. I can see how others would develop a perception that simply hosting things on one big EC2 instance would mean their application runs like shit. I am unsurprised that everyone is reaching for other options now. On-prem might be the best option if you have already optimized your stack and are now struggling with the cloud vendors' various layers of hardware indirection. Simply going from EC2 to on-prem could buy you an order of magnitude or more in speedup just by virtue of having current gen bare metal 100% dedicated to the task at hand. Obviously, this brings with it other operational and capital costs which must be justified by the business.

Even going to managed servers in a datacentre tends to have the effect of letting you spec machines far closer to what you need.

I've cut number of servers by large factors on several instances when moving off cloud to managed servers in a data centre as well because I've been able to configure the right mix of RAM, NVMe and CPU for a given problem instead of picking a closest match that often isn't very close.

It's going to take a new kind of software engineer to build these fully distributed systems. You can imagine calls for "Senior Serverless Engineers". Will conventional serverful engineers be left in the dust, or will the serverless engineers just break away and pioneer the apps on a new scale?

Serverless has been around for many years now.

It doesn't require a new kind of software engineer. It's just another software architecture to go alongside microservices, containerisation, etc.

And it hasn't changed the world because (a) it's the ultimate form of vendor lock in and (b) it makes even simple apps much more complex to reason about and manage.

> much more complex to reason about and manage

I really dislike the local dev experience and deployment for serverless. But otherwise the model is pretty clear: a file is a function, data goes in, data comes out, just like any other server function. If one instance is busy, spin off a new one.

What’s hard to reason about?

Yep. Inversion of control has always been the norm with whatever server/framework you choose.

If you have a /myroute handler defined in Express or Flask, or even form.php or form.cgi in Apache, you never had to write the code that makes user requests trigger your handler, even in the old days. That's the entire point of using a server instead of listening on a socket yourself.

With serverless the same thing still happens with someone else managing the path from a request to a handler and back.

In fact, if you ever used a cpanel host with PHP like in the good old days (110mb.com anyone?), you already used 'serverless'. You just uploaded .php files to a directory and your website just 'magically' worked.

Frankly, there isn't much engineering to be done. This only works for embarrassingly parallel tasks, so you can just start a loop, distribute the data, start the compute, and collect the results. Done. For anything else, this model breaks down.
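The whole "loop, distribute, compute, collect" pattern fits in a few lines. Here `invoke` is a local stand-in for a real serverless call (e.g. via boto3's Lambda client), so the sketch runs offline:

```python
from concurrent.futures import ThreadPoolExecutor

def invoke(payload: list[int]) -> int:
    """Stand-in for one remote function invocation: sum of squares."""
    return sum(x * x for x in payload)

def scatter_gather(data: list[int], n_workers: int = 4) -> int:
    """Split the input, fan the pieces out, collect and merge the results."""
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(invoke, chunks))
```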

Need aggregation of results? Communication among nodes? Computation subdivision that is not strictly predeterminable? Sorry, not embarrassingly parallel, won't be doable like this.

You may be able to extract some embarrassingly parallel part, like compilation of independent object files, but very often you still have a long, complex, and time-consuming serial step, like linking those object files. Recognising these different parts of a program is already state of the art; no need to invent a new field...

Speed and parallelism are good features but the killer app of serverless is the cost savings. You only pay for the time your function is running.

Traditional server applications can be rewritten “serverless” so long as your pipeline is pretty functional, ie you’re not saving critical state in memory on the server process itself.

> Traditional server applications can be rewritten “serverless” so long as your pipeline is pretty functional, ie you’re not saving critical state in memory on the server process itself.

And if you're storing things in a DB, how long does it take for your lambda to start up now? In a project I was on, it could easily take 3 s for a new request to be served, because we had a lambda doing an auth check against the DB, and then a different lambda doing the actual application work with another DB. So not only would we incur the cold-start cost twice, but, since our lambdas needed a NIC inside a VPC to talk to the DB, the cold-start cost was huge. And of course, you pay that cost for each additional concurrent connection, not just once.

Of course, if we had stored everything in some other managed service, and maybe used some managed auth scheme, this would have not been a problem. AWS likes it when they get you hook, line and sinker.

Was the first lambda waiting on the second's result before returning? If so, you really shouldn't rely on synchronously invoking a lambda from within another lambda.

Also, things stored in the global scope (in Node.js functions) are kept between different invocations of the same container. If that helps...
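The same trick works in Python handlers: anything at module scope survives while the container stays warm, so expensive setup runs once per container instead of once per request. A sketch, with `connect_to_db` as a hypothetical stand-in for the slow part:

```python
import time

_db = None  # module-level: reused across invocations of a warm container

def connect_to_db():
    """Hypothetical slow setup (DB connection, auth token, etc.)."""
    time.sleep(0.1)  # pretend this is the expensive connection handshake
    return {"connected": True}

def handler(event, context=None):
    global _db
    if _db is None:          # only the cold start pays the setup cost
        _db = connect_to_db()
    return {"db": _db, "echo": event}
```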

Not explicitly, this was orchestrated through the API Gateway authentication support.

And yes, different invocations of the same container helped significantly (most importantly, the interface was already deployed in the private network), that is why I said that the problem was concurrent access. You could serve a pretty good number of users/second, but each user that happened to send a request while the already deployed containers were all busy would have to wait for up to 3s before we could give them any data from the backend. And of course, if two new users came in parallel, they would both have to wait, etc.

Ok, I see. And the VPC explains the high cold start. The only "solution" I see would be to keep your functions warm (there's a feature for that now), but that's more of an inconvenient, if useful, anti-pattern to me. Quite annoying.

Sorry I can't be of any more help :/

I am still wondering if there is any way to get good lambda performance with a DB that is neither managed by AWS nor publicly accessible.

For some use cases the cost saving is a big deal e.g. resizing an incoming stream of images.

But for the majority of apps it doesn't save that much. And it still pales in comparison to the cost of the engineers, who often spend significantly more time building, debugging, and testing a serverless app than one they just throw on an EC2 server.

“Apps” is a pretty vague description of an application, but if you’re writing a server it’s just as easy to “just throw” the app on lambda as it is on EC2.

In general I ask myself if my new server can run on a lambda and if it can’t then I reach for something else. Most of the time it can.

> It's going to take a new kind of software engineer to build these fully distributed systems.

What you are selling here is nothing other than a new proprietary scheduler/runtime to run embarrassingly parallel jobs (the easiest kind of parallelism).

There were already plenty of solutions to do that; the only difference here is that you run on AWS Lambda.

Why would you need an entire new type of engineer to do that ?

There is nothing new here except buzzwords.

Are engineers nowadays script kiddies bound to one technology for their entire career? (Tip: of course not.)

The scraping example seems poorly chosen. The original blog post describing that example is no longer online, but archive.org has a copy: https://web.archive.org/web/20180822034920/https://blog.sean...

If the author just wanted to fetch pages in parallel, they could have done better than 8 hours even on their own laptop (you can run more than one chromium process at a time). The real benefit they got from using AWS Lambda is that the requests weren't throttled or ghosted by Redfin, probably because the processes were running on enough different machines, with different IP addresses.

> One of the challenging parts about this today is that most software is designed to run on single machines, and parallelization may be limited to the number of machine cores or threads available locally.

Depending how you look at it, I don't think most software is designed to take advantage of multiple cores, let alone multiple machines.

Both points are true, but the author is talking about writing new software specifically for serverless compute.

But how efficient is serverless, really?

Has anyone benchmarked the speed of running (let's say, on AWS) 1000x a lambda function vs. running the same function on regular AWS instances?

What about all the overhead (for example, k8s overhead, in both CPU and disk, etc.)?

I'm afraid it would be very easy to get a repeat of this https://adamdrake.com/command-line-tools-can-be-235x-faster-...

Such a scheme depends heavily on whether the cloud providers can efficiently multiplex their bare-metal machines to run these jobs concurrently. Ultimately, a particular computing job takes a fixed number of CPU-hours, so there are definitely no savings in such a scheme in terms of energy consumption or CPU-hours. At the same time, overhead arises when a job can't be perfectly parallelized: e.g. the same memory content being replicated across all executing machines, synchronization, the cost of starting a ton of short-lived processes, etc. This overhead all adds to the CPU-hours and energy consumption.

So, does serverless computing reduce job completion time? Yes, if the job is somewhat parallelizable. Does it save energy, money, etc.? Definitely not. The question is whether you want to make the tradeoff: how much more energy would you be willing to pay for to cut the job completion time in half? It's like batch processing vs. realtime operation: the former provides higher throughput, while the latter gives the user lower latency. Better cloud infrastructure (VMs, schedulers, etc.) makes this tradeoff more favorable, but the research community has only just started looking at this problem.
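That tradeoff is easy to see in a toy model where each extra worker adds some fixed overhead for startup, duplicated state, and synchronization (the 0.01 h figure is an arbitrary assumption): wall-clock time falls with more workers while total CPU-hours only rise.

```python
# Toy model: wall-clock time vs. total CPU-time as worker count grows.

def wall_clock(work_hours: float, workers: int, overhead_hours: float = 0.01) -> float:
    """Elapsed time: the work divides across workers, overhead does not."""
    return work_hours / workers + overhead_hours

def cpu_hours(work_hours: float, workers: int, overhead_hours: float = 0.01) -> float:
    """Total billed/energy-consuming CPU-time: overhead is paid per worker."""
    return work_hours + workers * overhead_hours
```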

Lambda compute only seems to have been rising lately to those who never used it before. We’ve been running huge amounts of “serverless” style workloads in a Celery + RabbitMQ setup. Our workloads are fairly stable for that so bursting in a public cloud has no real value, but we regularly do batch jobs that burst capacity. And we spun up more workers as such.

The author seems to think the paradigm is new (it isn’t) and claims that it hasn’t taken off massively (it has) because he incorrectly points to a number of workloads that aren’t embarrassingly parallel. On the other hand, in theory having a common runtime for these operations from a public cloud provider should enable them to keep their utilization of resources extremely high, such that it would be cheaper for us to use AWS/GCP/etc instead of rolling our own on OVH/Hetzner. But if anything, the per compute cost of FaaS is higher than it is for other compute models, which means the economics really only work for small workloads where the fixed overhead of EC2 is larger than the variable overhead of Lambda.

Don't overcomplicate it. Xargs and curl is often enough to drive big, ad hoc jobs.

These things only have a handful of customers in the world.

Datasets that are tens of gigabytes, or maybe 100mil records or so...this really covers most things.

And for every 1 thing it doesn't, there's 20 more claimed that a single machine using simple tools could handle just fine.

Being able to detect when things have been processed, having a way to set dirty flags, prioritizing things, having regions of interest, supporting re-entrant processing, caching parts of the pipeline with nuanced rules for invalidating it: these, in my mind, are kinda basic things here.

When they aren't done, sure, someone will need giant resources because they're doing foolish things. But that's literally the only reason. Substituting money for sense is an old hack.

Opening your mirrorless camera's SD card full of images, and the thumbnailer takes... forever?

Doing a facial search on it?

Matching a rhythm picked up by the mic to your local music collection?

Hashing and/or encryption of data.

There's plenty of desktop-like use cases that would benefit from massively parallel computation, but network (or even IO) bandwidth is currently going to be the limiting factor.

IOW, we are not there yet!

Currently, we can parallelize tasks which are low on data and high on computation.

So how can we expand the IO bandwidth for everyone, even desktop or mobile users?

Why is your image and music collection not in the cloud? Definitely an xargs + curl to serverless job in a modern setup (if auth was easier)

Privacy and control. My email is not in the cloud either.

Still, as you note yourself with "if auth was easier", we'd need custom applications even for the cloud — it's just that you'd hope they provide unbounded bandwidth for each user, but I am not even sure that's the case for the biggest of players (dropbox, google drive...).

This is like an infoworld quality article in LaTeX. I don't know why this is beneficial. The LaTeX format scares me away most of the time. I think "oh god, I don't have all afternoon to study an academic document!"

This thing is a 3 minute read

There's no way to distribute really lightweight thunks of arbitrary code. Maybe WASM can work here, especially if you shape the standard APIs in the right way.

You'd also need built-in support in tooling and compilers, where you can compile specific functions or modules into something that can run separately without actually doing that manually.

This depends on how lightweight those chunks are.

If your goal is <0.1 s startup, yeah, then you'll need WASM.

If you are OK with 1-5 seconds of startup, you have a ton of options. Apache Spark uses JVM magic to send out the raw bytecode. You can start up a Docker container. If you are willing to rewrite stdio, you can exec machine code under seccomp/pledge.

There are even full-blown VM solutions -- Amazon Firecracker, which claims that: "Firecracker initiates user space or application code in as little as 125 ms and supports microVM creation rates of up to 150 microVMs per second per host."

And for people who don't know: AWS Lambda uses Firecracker under the hood.

I don't really understand your claim. You can send code to another computer, and it can execute it and send the result back. What is the problem?

This seems to come up a lot in these discussions but Moore’s Law (actually just an observation) says nothing about single thread performance.

It only tells you that the number of transistors on a silicon die will double every 18 months.

If we’re still able to add parallel threads of execution at the same rate then Moore’s Law still holds.

Yeah, but how expensive will it be? Some numbers would be nice for something like hyperparameter tuning.

> Software compilation

Well, no. Software compilation is not massively parallel. Maybe parts of the optimization pipeline are, but compiling a 1000-unit program (assuming your language of choice even has separate compilation) normally requires putting the units into a dependency graph (see OCaml, for instance), or puts most of the effort into inherently serial tasks like preprocessing and linking (C++).
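The dependency graph is exactly what caps the parallelism: only units whose dependencies are already built can compile at once. Python's stdlib `graphlib` makes this easy to see with a toy module graph (the module names are made up):

```python
from graphlib import TopologicalSorter

# Toy dependency graph: each unit maps to the units it depends on.
deps = {"util": set(), "io": set(), "parser": {"util"}, "main": {"parser", "io"}}

def build_waves(graph: dict[str, set[str]]) -> list[set[str]]:
    """Group units into waves; each wave could compile in parallel,
    but the waves themselves must run one after another."""
    ts = TopologicalSorter(graph)
    ts.prepare()
    waves = []
    while ts.is_active():
        ready = set(ts.get_ready())
        waves.append(ready)
        ts.done(*ready)
    return waves
```

For this graph only the first wave offers any parallelism; "main" (think: the link step) runs alone at the end.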

> Note: If you are comfortable with Kubernetes and scaling out clusters for big data jobs & the parallel workloads described below, godspeed!

Probably my pedantic side showing through but I find reading text where ampersand is used in place of “and” really jarring (same for capitalised regular nouns). It seems somewhat common now so I guess I’ll have to get used to it.

Kubernetes is a proper noun, and thus always to be capitalized in English, while the "&" is a ligature of the Latin "et", the word for "and". Not sure why that would be jarring, it's all fine and correct English.

I wasn’t talking about Kubernetes being capitalised. I know that’s a proper noun and deserves capitalisation. I didn’t intend to confuse on this point.

Although & means “and” they are generally used differently. & is used in places like company names where it’s part of a noun (e.g. B&Q, Smith & Wesson). “And” joins parts of a sentence. I find it jarring to use & because a) it looks like a punctuation mark and I naturally pause when reading, b) I expect to have read a noun, not a join in a sentence, and it takes some cognitive effort to re-parse the sentence using & in a way I didn’t first expect. Reading, especially quickly, relies a lot on expectation and pattern matching and I find this disrupts it. If you don’t, good for you.

Obviously in informal speech people write whatever they want and it’s true that language evolves over time. But I’d argue that using & instead of and isn’t “correct”, at least by current standards — if it was we’d see this used in newspapers, books, and so on.

We can't even design systems to properly take advantage of multiple cpu cores yet.

I'm a huge fan of using Lambda to perform hundreds of thousands of discrete tasks in a fraction of the time it'd take to perform those same tasks locally. A while back I used Lambda and SQS to cross check 168,000 addresses with my ISP's gigabit availability tool.[1] If I recall correctly each check took about three seconds, but running all 168,000 checks on Lambda only took a handful of minutes. I believe the scraper was written in Python, so I shudder to think about how long it would have taken to run on a single machine.

[1] https://dstaley.com/2017/04/30/gigabit-in-baton-rouge/

> I believe the scraper was written in Python, so I shudder to think about how long it would have taken to run on a single machine.

Scraping is an embarrassingly perfect scenario for coroutines. Most asynchronous frameworks even use scraping as one of the examples.

In short, it would probably be done in 15 minutes, assuming you don’t get throttled quickly. If the tool wasn’t already async capable, another 15 minutes to wrap some scraping in gevent/eventlet.

Even without async, it's pretty easy to slap a concurrent.futures ThreadPoolExecutor onto something normally single-threaded and get massive performance gains.
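For example (a sketch where `check_address` is an offline stand-in for the real ~3 s web check; with I/O-bound work the threads overlap the waiting, even under the GIL):

```python
from concurrent.futures import ThreadPoolExecutor

def check_address(addr: str) -> tuple[str, bool]:
    """Stand-in for the slow availability lookup for one address."""
    return addr, addr.endswith("LA")

def check_all(addresses: list[str], workers: int = 64) -> dict[str, bool]:
    """Run the per-address checks concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(check_address, addresses))
```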

> Basically grep for the internet.

So is this a workaround for "censorship" by Google etc?

And where would the crawl archives come from?

Also, I wonder how this could be made usable and affordable for random individuals.

But Azure Functions existed for years, AWS only now got to that?
