
AWS Data Pipeline - ing33k
http://aws.amazon.com/datapipeline/
======
23david
AWS is slowly becoming the Oracle of our generation, in the sense that they
have found a way to lock startups and large companies into a software/services
ecosystem that is really really hard to stop using once you get started.

You start with regular open-source instances, but that's just the hook. Once
you have EC2, it's really easy to get started with AWS 'magic' services like
Elasticache and RDS. It's easier than setting up a memcache cluster or mysql,
right? But once you get comfortable with those services, it's just so easy to
keep going down that road and making your software reliant on proprietary
services like SimpleDB, S3 and AWS Data Pipeline. And then you wake up at some
point and find that you're 100% dependent on AWS.

By that point, if you're lucky your monthly AWS bill gets you an invite to
speak at the next AWS conference. :-) You might even get a personal customer
support rep that calls you when your servers go down.

A website/service cannot by definition be HA if it's reliant on one service or
infrastructure provider. AWS has so many proprietary parts now that you really
need to be careful which ones to use so that you don't wake up one day and
realize that you're completely dependent on AWS.

I'd stay away from this with a 30-foot pole, but if we really did need to use
it, I would only use the features that I felt comfortable building internally
at some future point if we chose to move off of AWS.

It's important to keep your software stack as flexible and open as possible,
and for risk-management you should plan on using (or at least having the option
of using) multiple vendors and service providers.

~~~
donavanm
"A website/service cannot by definition be HA if it's reliant on one service
or infrastructure provider." you seem to be conflating highly available with a
diverse supply chain. A _lot_ of highly available systems are "locked" in to
one provider, whether it's broadcom/citrix/intel/etc.

~~~
mapt
Does all Citrix hardware for a region go down simultaneously?

~~~
donavanm
Actually, yeah. How about worldwide nxos crashes due to the leap second bug?
Or the various poison bgp updates that've made the rounds? Or overrunning an
ospf domain? Anyways my point, if you read the sentence after my quote, was
that there's a distinction between sole source provider and "ha". Multi source
supply is due diligence. But it's not a prerequisite for or solution to high
availability systems.

------
balakk
A whole lot of glue-job VMs just became unnecessary.

------
mcos
Just this week I was looking for a better solution that would back up my RDS
database to S3. I'm currently using mysqldump, but the RDS instance size has
grown extremely large, and so it has become unwieldy. Hopefully this will
help with that.

~~~
mseebach
It might not be appropriate for you, but a good way to handle MySQL backups is
to maintain a mirror. This has the added benefit of being available as a
fail-over and as a secondary instance where you can run reports or test
long-running queries on current data without the risk of taking prod down.

~~~
jacques_chester
> _It might not be appropriate for you, but a good way to handle MySQL backups
> is to maintain a mirror._
    
    
        DELETE * FROM business_critical_data; WHERE obsolete = true;
    

You were saying? :D

~~~
iamjustlooking
You can run your daily/hourly backups on the mirror and not impact performance
on your main database.

~~~
jacques_chester
Right. But the grandparent comment was suggestive of the possibility that he
or she wanted the mirror to fulfil multiple roles, including being the backup.

------
jacques_chester
It's a mainframe in the cloud.

------
gourneau
Dear AWS, hire a designer. Thanks.

~~~
Raphael
The AWS Management Console was recently redesigned with Bootstrap.

------
alexpopescu
ETL-as-a-Service

------
ucee054
You shouldn't really be trusting Amazon with your datawarehouse or paying that
much for the storage, but from a technical convenience standpoint AWS is
probably the best solution for some of the horrid little inept kinds of
organizations that I have encountered.

~~~
donavanm
Totally. I know I create lots of business value when I spend a day dicking
around with mysqldump and rsync and inotify and scp and hdfs. Who would want
to use this kind of janitorial service when they could do it themselves?

