
AWS Transfer for SFTP – Fully Managed SFTP Service for Amazon S3 - el_duderino
https://aws.amazon.com/blogs/aws/new-aws-transfer-for-sftp-fully-managed-sftp-service-for-amazon-s3/
======
dested
There are a handful of services out there that do this, I know because I've
needed it on multiple occasions. It's nice that Amazon is providing it in
house now, but it just reminds me of the last time I went to re:invent and
walked through the vendor area and thought about how many of these companies
are four dev cycles away from Amazon producing a baked in competitor.

Hard to make a B2B Amazon tool these days.

~~~
ryanSrich
That’s quite literally AWS’s playbook. Fill the vendor hall with tech, learn
it, see what sticks and then crush it.

This is why I’m bullish on cloud agnostic tech. These practices don’t
typically fare well in the enterprise space. This is why companies like MSFT
are interesting to me. They partner and rarely kill. Amazon is the complete
opposite.

~~~
benologist
You can't even sell merchandise on Amazon without significant risk they will
like your numbers enough to muscle you out:

[https://www.amazon.com/AmazonBasics/b?node=10112675011](https://www.amazon.com/AmazonBasics/b?node=10112675011)

~~~
clubm8
I've found anything more complex than an adapter or cable to be of inferior
quality with Amazon Basics, and rarely buy those options anymore.

~~~
degenerate
The mini condenser microphones are actually REALLY good and on par with Yeti
and other $100+ mics.

[https://smile.amazon.com/AmazonBasics-Desktop-Mini-
Condenser...](https://smile.amazon.com/AmazonBasics-Desktop-Mini-Condenser-
Microphone/dp/B076ZSRVFQ/)

~~~
copperx
According to reviews that's a lowly electret microphone capsule ($1) in a big
housing. For podcasting you want a directional microphone. In defense of
electrets, they sound great, but pick up everything.

The shape of the microphone suggests it's a directional microphone (and if you
read the reviews, people think it is directional). That's perhaps not a scam,
but certainly deceptive.

------
dagss
An alternative to doing file based SFTP is to just treat SFTP like an API.

A company I work for implemented an SFTP service where every operation simply
translates to some SQL DB lookup. And a file download kicks off a larger SQL
query and generates the report on the fly, streaming the result straight
through to the SFTP client.

Works great! SFTP can be an API just like HTTP. Under the hood the protocol is
reasonably contained and doesn't require a filesystem backend at all.

Depends a lot on the usecase of course.

See [https://github.com/pkg/sftp](https://github.com/pkg/sftp)
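
To make the idea concrete, here is a minimal sketch of such a backend in Python. It models only the layer an SFTP server library would call into, not the wire protocol; the `ReportBackend` class, its method names, and the `sales` table are all hypothetical and not taken from pkg/sftp or any real service.

```python
import sqlite3
from typing import Iterator


class ReportBackend:
    """Hypothetical read-only backend: 'files' are query results, not files on disk."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def list_dir(self, path: str) -> list[str]:
        # A directory listing is just a lookup of the available report names.
        if path != "/":
            raise FileNotFoundError(path)
        rows = self.conn.execute("SELECT DISTINCT report FROM sales ORDER BY report")
        return [f"{name}.csv" for (name,) in rows]

    def open_read(self, path: str) -> Iterator[bytes]:
        # A download runs the query and yields CSV lines as rows arrive,
        # so the report never has to exist as a file anywhere.
        name = path.lstrip("/").removesuffix(".csv")
        cursor = self.conn.execute(
            "SELECT item, amount FROM sales WHERE report = ? ORDER BY item", (name,)
        )
        yield b"item,amount\n"
        for item, amount in cursor:
            yield f"{item},{amount}\n".encode()

    def open_write(self, path: str) -> None:
        # Writes have no meaning for generated reports, so refuse them.
        raise PermissionError("read-only backend")


def demo() -> bytes:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (report TEXT, item TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("2018-11", "widget", 3), ("2018-11", "gadget", 5), ("2018-12", "widget", 1)],
    )
    backend = ReportBackend(conn)
    assert backend.list_dir("/") == ["2018-11.csv", "2018-12.csv"]
    return b"".join(backend.open_read("/2018-11.csv"))
```

A real server would wire these three methods to the SFTP requests for listing, opening for read, and opening for write.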

~~~
hemancuso
What if you open an SFTP handle, and then write 5 bytes halfway through a 20
GB file and close the handle? How do you translate that?

~~~
rrix2
From parent: "Depends a lot on the usecase of course."

The usecase that I see most often of SFTP (and hinted at in the parent's
problem description) is generating one-off reports for third parties, or
passing data to vendors who are stuck in the 90s, like financial services
companies.

It's almost always read only (or read and delete), in which case implementing
an API like this is pretty straightforward. Log unsupported commands perhaps
and decide if you want to implement them later.

------
koolba
Enterprises will love this. There are so many legacy app flows kicked off via
sftp/scp file drops. Being able to hook into those via lambda events on the
associated S3 bucket will create a whole ecosystem of enterprise spaghetti for
years to come.

~~~
brazzledazzle
The only downside is whitelisting, though not on the SFTP server side. Many
enterprises restrict egress SFTP (usually for security reasons) so you need to
provide IPs and they can’t frequently change because it can take enterprise
network admins quite some time to deal with all of the bureaucracy and change
control.

That said, I wouldn’t be surprised if modern networking gear can handle CNAMEs
but there’s no guarantee that they’re using modern gear or if they are that
the questionable outsourced team even knows how to deal with the modern
capabilities.

This will certainly help a lot of use cases though.

~~~
blincoln
It's less whether the gear is modern and more about the layer that it operates
at.

A network firewall doesn't see the DNS name that an internal system looked up
in order to make an outbound connection. It just sees the source/destination
IP/port. Processing a rule based on source/destination IP or CIDR and port is
very fast, and all happens locally. Trying to make that device handle rules by
DNS name is pretty tricky. Does it do a reverse lookup on the destination
IP? That may not give a result that's even remotely like what the client used,
especially for cloud-hosted destinations.
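
As a rough illustration of why IP/CIDR rules are cheap to evaluate, the check needs nothing but the packet's destination address and port. The rule list and the `egress_allowed` helper below are made up for the example (the addresses come from the documentation ranges), not any vendor's rule format.

```python
import ipaddress

# Hypothetical egress allowlist: (destination network, destination port).
ALLOW_RULES = [
    (ipaddress.ip_network("203.0.113.0/24"), 22),
    (ipaddress.ip_network("198.51.100.17/32"), 22),
]


def egress_allowed(dst_ip: str, dst_port: int) -> bool:
    """Return True if the destination matches any allow rule."""
    addr = ipaddress.ip_address(dst_ip)
    # Purely local work: a membership test per rule, no DNS lookup involved.
    return any(addr in net and dst_port == port for net, port in ALLOW_RULES)
```

Nothing in this check knows which hostname the client resolved, which is the gap described above.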

For a lot of applications (probably including this one), a proxy is a good
approach, because DNS resolution can be delegated to the proxy, and therefore
the proxy can easily apply DNS-based rules as well as IP/CIDR-based rules.
However, proxies tend to make people unhappy because they generally require at
least some configuration on the client side. Microsoft used to sell a
product[1] that made this transparent for Windows clients[2], but obviously
that doesn't help for most modern shops where a lot of the systems are Linux,
MacOS, etc.

[1] Internet Security and Acceleration Server ("ISA"), later renamed to Threat
Management Gateway ("TMG"), now deprecated and approaching EOL.

[2] It hooked into the network stack and rerouted requests based on a proxy
routing rule table. Imagine a centrally-managed proxychains, but with the
system configured to default to check the proxychains config file for every
outbound TCP connection.

~~~
icebraining
I wonder if you could use the DNS resolution cache itself to do the reverse
lookup. As long as the DNS cache lasted at least as long as the TTL, it should
work.
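
A sketch of that idea, assuming the device can observe the DNS answers that clients receive: keep a map from each returned IP to the queried name, and trust it only until the TTL runs out. The `ReverseDnsCache` class is hypothetical, and it glosses over the case where several names resolve to the same IP (the last answer wins here).

```python
import time
from typing import Optional


class ReverseDnsCache:
    """Maps an IP back to the name a client looked up, bounded by the record's TTL."""

    def __init__(self) -> None:
        self._by_ip: dict[str, tuple[str, float]] = {}

    def record_answer(self, name: str, ip: str, ttl: float,
                      now: Optional[float] = None) -> None:
        # Called whenever a DNS response (name -> ip, ttl) is observed.
        now = time.monotonic() if now is None else now
        self._by_ip[ip] = (name, now + ttl)

    def name_for(self, ip: str, now: Optional[float] = None) -> Optional[str]:
        # Returns the looked-up name, or None if unknown or expired.
        now = time.monotonic() if now is None else now
        entry = self._by_ip.get(ip)
        if entry is None:
            return None
        name, expires = entry
        if now >= expires:
            del self._by_ip[ip]
            return None
        return name
```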

------
hemancuso
I'd be curious how this handles all the posix cases not well suited to object
storage.

Renaming a folder that has a million files/folders inside is a single
operation in SFTP, but 2 million operations on S3.
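
To illustrate where the 2 million comes from: with a flat key space, a "folder" rename has to be emulated as one copy plus one delete per object. The sketch below uses a plain dict as a stand-in for a bucket; `rename_prefix` is a hypothetical helper, not an S3 API.

```python
def rename_prefix(store: dict[str, bytes], old: str, new: str) -> int:
    """Emulate a directory rename over a flat key space; return the operation count."""
    ops = 0
    # Snapshot the matching keys first so the dict is not mutated while iterating.
    for key in [k for k in store if k.startswith(old)]:
        store[new + key[len(old):]] = store[key]  # one copy per object
        del store[key]                            # one delete per object
        ops += 2
    return ops
```

The loop is also not atomic: a reader can observe a state where some keys have moved and others have not.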

Does it handle writing at arbitrary offsets within a file? Does it download
the file first then let you start writing?

What about just writing a few bytes at the beginning of a large existing file
and then closing your SFTP handle?

How about 2 users accessing same file via SFTP at the same time?

~~~
NathanKP
I don't work on the S3 team so I can't answer all these questions but relative
to the "2 million operations on S3" question I can point out that S3 now has
batch operation support: [https://aws.amazon.com/about-aws/whats-
new/2018/11/s3-batch-...](https://aws.amazon.com/about-aws/whats-
new/2018/11/s3-batch-operations/)

~~~
burtonator
I think the point is that on a filesystem the mv is atomic and just
updates an inode, but on S3 those operations can take place on thousands of
different machines.

~~~
icebraining
SFTP is not a filesystem, though; to get a rename to be atomic, you must pass
SSH_FXF_RENAME_ATOMIC, and the server can return SSH_FX_OP_UNSUPPORTED.

~~~
hemancuso
Nearly any SFTP client would assume a rename would be a near-instant operation
on the server, and would probably fall over if it took an hour.

------
jopsen
Charged by the hour... pay for an instance.. Tsk tsk..

The thing I love about S3 and cloud services in general is when I pay per
request and can scale through the roof.

Whenever a service is metered by number of instances my interest fades, and I
look for other solutions..

S3 has this very hands-off feeling to it :)

~~~
wmf
AWS should investigate a new concept called serverless.

~~~
abraae
That would never catch on, it's obvious there has to be a server somewhere,
people ain't stupid.

------
viggity
We currently pay $250/month through some small vendor for hipaa compliant sftp
hosting (that we transfer a whopping 50kb on a weekly basis). I always felt
like it was a rip off, but azure/aws didn't have their own version. And I'm
loath to manage a VM. PaaS is my sugar bear.

My eyes lit up when I saw this. We're an azure shop, but I'm not afraid to use
AWS for limited cases. Then I saw - $.30/hr (so, $214/mo). Really? REALLY?

Wouldn't it be comically easy to just add SFTP as a protocol option for
S3? Why does this need a dedicated VM to run it? (Yes, I know this is PaaS and
you don't manage the VM, but they're essentially pricing it that way)

~~~
rhacker
HIPAA compliance, even on AWS is extremely expensive. I believe the best
vendor to get HIPAA (someone correct me if I'm wrong) is to go with Google
Cloud. Last time I checked, they did not charge any extra for HIPAA BAA signing.

Edit: I stand corrected on this, AWS no longer requires dedicated hardware for
BAA HIPAA: Sorry I didn't look this up, I had old information.

[https://aws.amazon.com/blogs/apn/aws-hipaa-program-update-
re...](https://aws.amazon.com/blogs/apn/aws-hipaa-program-update-removal-of-
dedicated-instance-requirement/)

~~~
viggity
We're on Azure and they definitely don't charge for a BAA. And it doesn't
appear that AWS does either.

~~~
rhacker
AWS requires you to use single tenancy hardware to be covered by that BAA.

OH I stand corrected:

[https://aws.amazon.com/blogs/apn/aws-hipaa-program-update-
re...](https://aws.amazon.com/blogs/apn/aws-hipaa-program-update-removal-of-
dedicated-instance-requirement/)

------
bmilleare
A tonne of our (enterprise-ey) customers had such trouble trying to integrate
into our S3 flow that we started launching VPS for each that abstracted it
away into simple SFTP upload/download, which they were used to.

Although this is much more expensive than Lightsail, the man hours saved will
make it worthwhile.

~~~
Spivak
Can you elaborate? I mean a plain CentOS server running SFTP, S3FS seems about
as set and forget as it gets.

And each? Surely chrooting users would let you consolidate all of those
servers into one (or one cluster for HA I suppose).

~~~
acdha
> Can you elaborate? I mean a plain CentOS server running SFTP, S3FS seems
> about as set and forget as it gets.

Think about the operational costs: someone needs to manage keys, logging,
security updates, when S3FS coughs a lung and hangs you need to catch that
problem and remount it to restore service, etc. This service reuses the
existing authentication systems so you don't need to spend time configuring
and managing integration with your customers’ LDAP/AD infrastructure, etc. If
you deal with anything which hits PCI, HIPAA, etc. you need to be able to
certify that your custom design meets those requirements as well.

That's not to say you can't do it yourself but for many places there's a
fairly significant amount of work where the cost of doing it yourself is
greater than 5+ years of managed service costs.

~~~
bmilleare
Exactly this. If sticker cost is your leading factor then these kinds of
services can seem crazy, but when you factor in the real cost of self-hosting
then it quickly becomes a no-brainer.

We're more interested in what happens when things break (and whose
responsibility it is, and whose it is not) than minor cost savings in calm waters.

~~~
acdha
One other area which tends to get ignored is opportunity cost: if it's the
only thing you do there are many things which aren't that hard to operate but
if they're not a primary function the cost of having to pull someone off of
other projects to handle problems, security updates, etc. is more than the
direct service costs.

------
lsh
We use Cloudgates for our FTP/SFTP to S3 interface:
[https://cloudgates.net/pricing](https://cloudgates.net/pricing)

They're cheap, stable and dead simple to set up. This offering from AWS looks
attractive, but at $.30/hour for the server makes it $219/mo vs $25/mo.

edit: just a satisfied customer

------
hrez
So, how do I make sure my connection isn't MITM-ed? There is no server host
key anywhere to compare. No CA certificate support. Doesn't look like ed25519
is supported either.

Somehow people don't use self-signed certificates all over the web but for
sftp it's "fine" apparently.

~~~
sk5t
For SSH (+SFTP) you are expected/obligated/etc. to have some way to verify the
correct host key. There is no relationship to the clusterfudge of public CAs.
Nor are there x509 certs.
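
One way to do that comparison, assuming you have obtained the server's public host key line through some channel you trust: compute its fingerprint yourself and check it against what your client prints on first connect. OpenSSH's SHA256 fingerprint is the unpadded base64 of the SHA-256 digest of the raw key blob. The `sha256_fingerprint` helper below is a sketch, and the key in the usage note is a placeholder.

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line: str) -> str:
    """Fingerprint a public key line of the form '<type> <base64-blob> [comment]'."""
    _key_type, b64_blob = public_key_line.split()[:2]
    blob = base64.b64decode(b64_blob)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 with the trailing '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
```

Calling it on a line such as `"ssh-rsa <base64-blob> comment"` returns a string of the form `SHA256:...` that can be compared with the client's prompt.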

~~~
hrez
But you can't verify it since there is no host key published for sftp service
(at least in AWS console).

------
manigandham
This is why AWS is so far ahead: survey the landscape, find the things they
don't already cover, and come up with a managed service for it. It's usually
not perfect, but it almost always just works.

------
burtonator
This would mean that rsync now works with S3 as it has an sftp target...
correct?

~~~
LinuxBender
You should try the mirror sub-system of lftp [1]. It can replicate rsync
behavior on a chroot sftp server. No idea if that works on Amazon, but I use
it all the time on my own chroot sftp servers.

[1] - [https://tinyvpn.org/sftp/#lftp](https://tinyvpn.org/sftp/#lftp)

~~~
lathiat
lftp is fantastic. The mirror function has a “reverse mode” too

For regular tasks you could also look at “rclone” which is like rsync in many
ways but can upload to S3, Backblaze B2, SFTP and many more directly, without
needing any special support on the remote end.

------
twodave
So, I spent a couple days a few months ago building exactly this on an EC2
instance. I have an SFTP service running on an Ubuntu box, it has jailed homes
for users, it's ssh-key-only, uses s3fs to persist things to the correct
buckets, etc.

My only problem with the managed service (which I'd LOVE to switch to tbh) is
I can't for the life of me get it to actually connect and upload a file. I
suspect I'm doing something wrong in IAM, but the tutorials suck and it looks
like IAM isn't even ready for this service yet. I can get a user
authenticated, but it's like it's trying to figure out where "home" is and
crapping out, connection closed. Nothing helpful in the verbose output,
either. Bummer.

~~~
twodave
And to emphasize, the process of simply adding a user to this thing SUCKS. In
my homebrew instance, it's just a matter of generating the key pair and
dropping the public key into a folder on S3. Cron job reads the bucket,
creates new users/homes/etc for anything new, all pasted together using bash
scripts basically. But at the end of the day it's ridiculously simple. I'd
hoped a fully managed solution would actually be simpler (instead of simply
more stable because it's managed, after all).

------
atonse
Ah this would've been so useful 18 months ago. I had to spend MONTHS to
convince a vendor (government) to use S3 to upload (keybase-encrypted) files
instead of SFTP.

And they finally budged. This would've been so much easier.

~~~
dagss
One can also implement SFTP on top of anything just like HTTP APIs if you make
your own backend, e.g.
[https://github.com/pkg/sftp](https://github.com/pkg/sftp)

~~~
crazysim
Here's an example:

[https://github.com/moriyoshi/s3-sftp-
proxy](https://github.com/moriyoshi/s3-sftp-proxy)

No FUSE. Pure Go so it's low on resource usage and high in platform
compatibility. No OpenSSH. No screwing around with Linux users or whatever.
Just a single declarative configuration file. You can run this baby in a
Docker container with some adjustments to the host if you want this on port
22.

I had to sourcegraph GitHub a bit to find this thing. SEO is so bad on this
implementation. I don't know why.

------
narsil
We ended up implementing a REST API endpoint for SFTP to provide an easy way
for web apps to transfer content without having to speak the FTP protocol:
[https://kloudless.com/products/file-
storage/](https://kloudless.com/products/file-storage/)

I can see this being valuable for apps to get user content into S3 more
efficiently from the server-side rather than funneling it through hosted
servers. The one caveat is programmatic user management, which I'm sure is
possible.

------
qwerty456127
> SFTP (Secure File Transfer Protocol)

It's SSH File Transfer Protocol. When you say Secure File Transfer Protocol
many people think about FTP over SSL if you don't emphasize it's about SSH.

~~~
floatboth
> many people think about FTP over SSL if you don't emphasize it's about SSH

Huh? Sure there's always potential for confusion but every time I heard
anything about FTP over SSL (which no one seems to actually use) it's been
called "FTPS"

~~~
qwerty456127
I agree FTPS is the right acronym for this but I had to correct people about
this all the time. So many people actually have no idea SSH does more than
just letting you execute command line programs on a remote server and FTP is
not the only/best protocol to access remote file systems over the Internet.

------
DenisM
It seems there are no web-hooks / callbacks, so you don't get notified when a
new file is uploaded (or someone downloads a file).

Another issue is that if you have to support a partner with SFTP data
transfer requirements you may have to support one with FTP/FTPS requirements
as well. At this point you will have to go to a dedicated FTP server (or
outsource it to another company) anyway, and AWS SFTP service will be
redundant in this scheme.

~~~
vorticalbox
You can set a lambda to be triggered on a file upload, at work we do this for
creating reports.

Lambda dumps mongo data to an s3 bucket which triggers another lambda to
create a csv.
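
For reference, a minimal sketch of what the receiving end of such a trigger can look like. It only unpacks the S3 notification payload; the handler name and the returned list are illustrative, and the actual report generation is left out.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect (bucket, key) for every object named in an S3 notification event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event payload.
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    # A real handler would kick off the report job for each upload here.
    return uploaded
```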

~~~
DenisM
I see. Lambdas are more fussy though, I can’t just trigger an action in my
server, I have to write code that does that.

------
yawz
Unless I'm missing something, this functionality has been on the AWS
Marketplace for a while. We've already used an SFTP Gateway straight out of
the marketplace. This is tough news for these folks, and generally speaking,
if you're making good enough money off the marketplace, then you're possibly
on the collision course with Amazon's "new" roadmap.

------
blaisio
OMG this will save me so much time if it works. I wish they had this feature a
few years ago!

------
matchagaucho
We use a number of AWS LightSail servers for SFTP today, which mostly sit
idle.

Will definitely adopt this!

~~~
markstos
The AWS pricing page for this service says it costs about $225/month for a
lightly used instance. I implemented the same kind of thing on AWS using a
nano-sized instance for about $10/month. The instance is managed with an
Ansible Role for automated SFTP server management. I connected it with an
off-the-shelf AWS Lambda function which listens for S3 PUT events and copies
files to the SFTP server as needed.

My solution took a little more human-time to set up than the AWS service might,
but once set up, it saves about $200/month.

~~~
scarface74
$200 a month is nothing for a business. Anything that we don't have to manage
ourselves or worry about reliability, scalability, and we can just use our AWS
business support plan is a win.

~~~
secabeen
In large business, sure. In small business and education, a $200/mo commitment
could easily require approval by the owner or the department chair.

~~~
scarface74
The alternative is developer time. Nothing about managed services is ever less
expensive if you don’t account for developer/Devops/netops time saved.

A small company has even more of reason to want as many managed services as
possible. You can avoid hiring netops if you both have a third party managed
service provider to manage your network and you have developers/architects who
know enough to fill in the gaps.

~~~
abraae
> Nothing about managed services is ever less expensive if you don’t account
> for developer/Devops/netops time saved.

I literally can't parse this.

~~~
scarface74
Let me try again.

Baremetal vs Cloud hosting -> resource for resource baremetal will almost
always end up being cheaper.

The only way you save money on managed services is the cost of _management_.
Meaning every hour that someone doesn't have to spend maintaining
infrastructure is a cost savings to the business. Every minute saved by
allowing someone else to do the "undifferentiated heavy lifting" is money
saved.

------
CloudBuddy
If you're into plain vanilla SFTP and don't have large storage needs,
[https://cloudbuddy.cloud](https://cloudbuddy.cloud) takes less than 1 minute
to set up.

------
agopaul
Very interesting, considering that it basically lets legacy applications work
with S3.

The price is quite high for small projects though: $0.30/hour > $216/m >
$2592/y
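
The arithmetic behind those figures, as a quick sketch. It covers only the $0.30 hourly charge quoted in this thread and ignores any data transfer or storage fees; the slightly different monthly totals people quote mostly come down to how many hours they count in a month.

```python
HOURLY_RATE = 0.30  # USD per hour, the figure quoted in this thread


def cost(hours: float) -> float:
    """Endpoint charge for a number of hours, rounded to cents."""
    return round(HOURLY_RATE * hours, 2)


thirty_day_month = cost(24 * 30)       # 216.0
average_month = cost(24 * 365 / 12)    # 219.0
twelve_30_day_months = cost(24 * 360)  # 2592.0
```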

------
sneak
This incurs per-hour charges to run the VM that runs sshd, same as running a
micro instance with FUSE S3 would, although with slightly less admin attention
required.

------
jonstewart
Presumably this will handle large file uploads with aplomb? Multipart upload
with s3 can be a pain (when you want someone else to be doing the uploading).

------
martyhu
I'd love to see something for plain vanilla FTP.

Many enterprises still use it, would love to see AWS support that as well.

~~~
merb
please no. ftp is a security nightmare.

~~~
jessaustin
Yes that sounds kind of like "loaded guns for young children as a service"...

------
tonylemesmer
somewhat late to the commenting party but apparently WinSCP can communicate
with S3:

[https://winscp.net/eng/docs/guide_amazon_s3](https://winscp.net/eng/docs/guide_amazon_s3)

------
retromario
Anyone know if this can be used directly from CloudFormation?

~~~
syntheticcdo
New services rarely launch with CF support, if you want to programmatically
create SFTP servers TODAY you could write a Lambda that uses the SDK and
reference that Lambda with a CF Custom Resource.

~~~
retromario
> New services rarely launch with CF support

Yeah I'm slowly learning that.

Thanks for the tip, at least it's a step in the right direction.

~~~
syntheticcdo
FWIW I talked with one of the CF devs at re:invent and he said their team's
goal is to have day-one CF coverage of new major offerings going forward, so
we'll see. Maybe next year.

------
xaduha
Not a single mention of WebDAV here, sad.

------
emilfihlman
Has anyone else become extremely annoyed by the constant "breathing" of Amazon Polly?

------
sigi45
1 TB for $40?

What? Did I just become stupid?

------
pnutjam
Sftp is natively supported by Linux. I'm surprised this is a thing.

~~~
AnaniasAnanas
it isn't to my knowledge. You need an ssh client that supports sftp.

~~~
pnutjam
Unless you disable it in the sshd_config, it's supported by most Linux
distributions. Yes, you'll need a client, but any modern client supports sftp.
The only tricky part is chrooting the users.

