Do you chunk the upload so if I lose internet for a second, it can just upload the chunks that failed?
This is an issue I have with one of my employees. They only get consumer Comcast, and the connection is crap. I ended up writing a bash script that will chunk and scp the files to a central server, and if it fails a chunk, it leaves it in the directory for her to manually re-run.
That said, I pay $20/mo for the file server (which I would pay for no matter what), and it has 3TB of outbound bandwidth and unlimited inbound, and it's less than 5% of your cost.
Although we do chunk the upload and send the chunks in parallel to the server, we do not have the recovery functionality implemented at this point. We are able to recover from a disruption in internet connectivity with retry logic in JavaScript, but if your computer crashed completely we would not recover. That is on the roadmap and expected to be released in the next couple of months, though.
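To give a rough sense of what that retry logic looks like, here is a generic sketch of a browser-side chunk upload with backoff (illustrative only, not our production code; the endpoint and retry numbers are placeholders):

```typescript
// Generic browser-side chunk upload with retry and exponential backoff.
// Survives a brief connectivity drop, but state lives only in the page,
// so a full browser or computer crash still loses the transfer.
async function putChunkWithRetry(
  url: string,
  chunk: Blob,
  maxAttempts = 5,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const res = await fetch(url, { method: "PUT", body: chunk });
      if (res.ok) return; // chunk accepted, done
      throw new Error(`HTTP ${res.status}`);
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err; // give up, surface to caller
      // wait 1s, 2s, 4s, ... before retrying
      await new Promise((r) => setTimeout(r, 2 ** attempt * 1000));
    }
  }
}
```

Because the state lives only in the page, this kind of logic survives a dropped connection but not a full crash, which is why the recovery feature mentioned above is still on the roadmap.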
In terms of cost, there are certainly cheaper ways to accomplish transfers. MASV is designed to be very fast, and we have 9 servers across the world to enable transfers anywhere. If you're doing local transfers, your own file server is likely good enough, but if you're sending hundreds of GBs across the country or across the world, you will likely run into major performance issues. It really comes down to how much your time is worth and whether file transfers are a consistent requirement for your business. This is certainly intended to be a B2B tool, less for consumer use.
> I ended up writing a bash script that will chunk and scp the files to a central server, and if it fails a chunk, it leaves it in the directory for her to manually re-run.
Use Syncthing, it traverses NATs, it is trivial to set up, it is seemingly very secure by its nature, and crucially for you, it transfers in blocks. You can administer it through a web interface with credentials (or with none, if it's just listening on loopback).
A lot of why our users enjoy MASV is that it's dead simple to use for non-technical users. Not that Syncthing is very hard, but as a video pro dealing with clients, the easier it is for them to just click a download button, the more likely it is to happen. Also, because it's pay-as-you-go, it's easy to charge back the transfer cost as part of the project bill.
Most certainly. Another major benefit is that you folks provide storage, which inherently simplifies administering something like this. One benefit of Syncthing/BT Sync that I have not seen emulated by other vendors is peer-to-peer transfers, which can be considerably faster if both peers are on the same network or an adjacent one (including one on the same provider, or connected to the same exchange, or if the central service has no nodes on the same continent as the peer). The performance is just unmatched for such transfers, in my experience. A peer-based protocol could be a considerable help even if you don't intend on allowing direct peer-to-peer transfers, because (if implemented with sufficient care) it can allow you to choose better nodes to maximize throughput and integrity.
I think there is a very legitimate need for centralized administration. I have all of my Syncthing node configurations synchronized over Syncthing itself (which is a bit of a risk, but not overly so), but I would consider paying for something more integrated (if I could have similar confidence and flexibility with the client software), with some safeguards against accidentally disabling configuration sync to a specific node (which would require me to directly configure that node), and some integrated ability to have nodes self-report and self-allocate.
Absolutely agree. We really built our network to be independent of our file transfer tool. It creates routes across any cloud provider and can integrate with any cloud storage provider. It also uses machine learning to determine which routes are optimal based on previous performance and time of day. I see a future where you could use our network for many different products, such as an accelerated VPN, streaming live content, or, as you pointed out, as a pass-through network for a P2P connection: if both endpoints are available, it would route across our network to avoid congestion, potentially even making a copy of the transfer in the cloud for archive. Even P2P relies on routing across the public internet, which is prone to congestion and is set up to do least-cost routing for the ISPs. Usually it's less of an issue, though, because the protocols help push through that congestion. Lots of cool possibilities for this in the future and nowhere near enough time!
I'll have to check that out. I was just desperate one afternoon after dropbox was failing to sync because of connectivity issues, and I knew bash just well enough to split, scp, rm, repeat.
You may also want to look into using rsync as an alternative to scp. rsync is much smarter about how it syncs data, so it should be better for your use case.
Is your $20/mo server 1Gbit? At a fairly common 100Mbit, these file transfers would be 1/10 the speed, and the files would also need to be downloaded. Even slower if you need to send to multiple people.
Running BitTorrent lets peers share the data you upload and gets the same net effect as this. But that's not, as far as I know, available as a browser plugin.
PS: I don't think it's a big deal, but executed well it might be a valuable niche.
It averages 400Mbit upload and 350Mbit download, not that it matters to me personally, as my employee can rarely upload faster than 4Mbit.
That said, their 1Gbit connection will probably average around the same speed if they have more than a few users uploading 50GB files during normal business hours.
Just to clarify, we will fill a 1 Gbps pipe, but our servers are provisioned to do much more than 1 Gbps. If many users are uploading at the same time, we scale up beyond 1 Gbps; that's not our network bandwidth maximum.
That's good to know. 1Gbit is our "max" but that works just fine for the company's "network share". If we ever were to start hitting our bandwidth caps, I can pay to increase those, but the speed is pretty much fixed without changing providers. If speed were ever to be an issue, I would probably look at a different service like yours. It would be a massive hit at your current pricing, though.
For starters, it's a proper replacement for "a bash script that will chunk and scp the files to a central server, and if it fails a chunk, it leaves it in the directory for her to manually re-run". It's installed on almost all Linux and Mac systems, so if your customer is using a script already, they might as well use rsync.
> Would you mind explaining why rsync is a better solution?
rsync basically chunks the files and sends only the missing (or differing) chunks over ssh, and it leaves the files in the destination directory if something fails. It is more or less an automated tool that already does what the GP is talking about.
But that 'someone' is someone who is already running a Bash script and re-running it each time an error occurs. So I don't see the problem with using a CLI here.
For technically capable people there are certainly lots of good options for sending lots of data fast. We are just trying to make a service that makes this more accessible and requires no setup. As a side note, we are currently working on an API, and we intend to integrate with rsync so it can be used to upload content to our network.
I read it as "I'm running a bash script for someone else so they can pick it up from a server", not "my employee who has this issue created a bash script"...maybe I misinterpreted.
Because we have to deal with people as they are, and not as how we want them to be.
This is for sending large assets from party A to party B, where either the parties do not have access to a shared rsync-capable server, or they do not know how to use rsync. On a deadline, you can't afford to become a CLI trainer or require special software.
A common use case would involve service bureaus and creators. Trust me when I say that there are a hell of a lot of creators out there who would never be able to manage rsync. You want to make stuff like this drop-dead simple for people. That means a service that works in-browser.
Couldn't have summarized it better myself. I like to think we are just making cloud services more accessible for non-technical people and adding a tax to that. There are always ways to get something cheaper by rigging up a setup yourself, but for creators that is time that could be spent on creating instead of infrastructure or tooling.
I use a javascript library that works in the browser and supports S3 file copy chunking and resuming. Host it in a simple S3 bucket, create a target bucket with Transfer Acceleration enabled, and Bob's your uncle.
It's very efficient. I've never seen it fail to consume all of the available upstream bandwidth from the client. If this is of interest, I can go dig up the library I used.
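This isn't necessarily the exact library, but AWS's own SDK v3 gives the general idea today: the @aws-sdk/lib-storage Upload helper chunks a file into parts and uploads them in parallel from the browser. A minimal sketch, where the bucket name, region, and credential wiring are placeholders:

```typescript
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";

async function uploadToS3(file: File): Promise<void> {
  const client = new S3Client({
    region: "us-east-1",
    useAccelerateEndpoint: true, // target bucket needs Transfer Acceleration enabled
    // credentials: supply via Cognito / temporary keys in a real browser app
  });

  const upload = new Upload({
    client,
    params: { Bucket: "my-target-bucket", Key: file.name, Body: file },
    partSize: 8 * 1024 * 1024, // 8 MB parts
    queueSize: 4,              // parts uploaded in parallel
    leavePartsOnError: true,   // keep finished parts so the multipart upload can be resumed
  });

  upload.on("httpUploadProgress", (p) => {
    console.log(`${p.loaded} / ${p.total} bytes`);
  });

  await upload.done();
}
```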
How is this better than CloudBerry + AWS S3? CloudBerry allows you to map a drive and makes an AWS S3 bucket look like a standard Windows drive. It also supports multipart upload and retrying failed chunks.
It's intuitive, built on top of AWS, and you can take advantage of all of the other AWS features -- e.g. sending an email notification when a file is received, versioning, fine-grained access control, as much storage space as your budget will allow, etc.
Is there a way to transfer ownership of an S3 bucket? Like, can I upload a file in my AWS account, and by some means transfer the bucket and files to your AWS account? Without the actual data bits moving and incurring a bandwidth charge? That would be ideal for some scenarios.
From the CLI you can copy files to another bucket, in another account owned by someone else, within the same region without any bandwidth costs. You only pay for the copy requests, at $.01 per 1,000 requests.
You could also create a Lambda function that is triggered any time a file lands under a certain key prefix (like a directory, but not really) and copies it to the other account. Of course, both of you have to set up permissions.
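A rough sketch of that Lambda approach, assuming the AWS SDK v3 for JavaScript; the DEST_BUCKET name and the "outgoing/" prefix are placeholders, and both accounts still need the bucket policy / IAM permissions set up for the cross-account copy:

```typescript
import { S3Client, CopyObjectCommand } from "@aws-sdk/client-s3";
import type { S3Event } from "aws-lambda";

const s3 = new S3Client({});
const DEST_BUCKET = process.env.DEST_BUCKET!; // bucket in the other account

export const handler = async (event: S3Event): Promise<void> => {
  for (const record of event.Records) {
    const srcBucket = record.s3.bucket.name;
    // S3 event keys arrive URL-encoded
    const srcKey = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    if (!srcKey.startsWith("outgoing/")) continue; // only copy the chosen "directory"

    await s3.send(
      new CopyObjectCommand({
        CopySource: `${srcBucket}/${srcKey}`, // server-side copy, no re-upload
        Bucket: DEST_BUCKET,
        Key: srcKey,
        ACL: "bucket-owner-full-control", // let the destination account own the copy
      }),
    );
  }
};
```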
If you're working for a company whose business is to transfer massive files, the IT department should be more than willing to pay for CloudBerry. They would need to set up the AWS account and S3 permissions.
An AWS + CloudBerry solution should be a lot easier to sell to an IT department than an unknown company's solution.
My point was that a browser-only solution is more tenable than installing an application in many cases.
I get files from customers in Fortune 100 companies and many of their IT departments absolutely prohibit the installation of something like Cloudberry. It was explained to me that getting IT to approve the installation of a particular application was a multi-month endeavor.
How does this compare against existing services? Wondering since we use box for file sharing at our company, and file upload speed has never really been an issue for us.
It depends on which service you compare us to, but you could almost think of us as the WeTransfer for large video sharing. We don't have any file size limits, we retain full folder structures and create dynamic zips designed for each target OS, there are no plugins, it's pay-as-you-go so it fits nicely with project-based businesses, custom branding sets us apart from some of the cloud sharing tools, and the interface is intentionally simple to use. We plan to integrate with video-specific tools in the near future as well, such as Adobe Premiere, Final Cut Pro, etc., which should make our positioning clearer. When you're dealing with sending terabytes of data a month, small improvements in performance become much more important to your overall business scalability.
That was my first thought as well. There are tons of file sharing services that already exist, but I think their advantage here is allowing people to send files larger than 20GB. With Dropbox, files have to be 20GB or smaller.
Absolutely, we don't try to hide that information; you can try out our file transfer calculator here: https://www.masv.io/file-transfer-calculator/ - This compares MASV to shipping hard drives and uses real rates from the FedEx API. The way I like to think of it: it's much more convenient to transfer online instead of shipping hard drives, since there is just less to deal with and shipping is harder to scale. So if MASV is faster than, or not much slower than, shipping a drive or hand delivering, then it's time you can spend on taking on more projects.
You got it. In general, you can send as much or as little as you need with MASV because it's pay-as-you-go, which means you don't have to manage fixed storage and it scales up and down with your work. If you don't use it the next month, it costs nothing.
It seems to be targeted at content creators that routinely have to send very large (~1TB) files. If Masv can do this in a fairly sane way, without issues like "whoops, failed at 99%, try again!" then that's quite a good niche to be in.
Especially since there are plenty of post-production companies that host their own 'customer portal' where files can be uploaded/downloaded, which probably has plenty of security issues and so on. This can then be replaced by a branded Masv page, I guess?
Tried to make sense of one of their patents: patents.google.com/patent/US8548003B2/en
Data transmission units (data units) from the source network are received at an encoding component logically located between the endpoints. These first data units are subdivided into second data units and are transmitted to the destination network over the transport network. Also transmitted are encoded or extra second data units that allow the original first data units to be recreated even if some of the second data units are lost. These encoded second data units may be merely copies of the second data units transmitted, parity second data units, or second data units which have been encoded using erasure correcting coding. At the receiving endpoint, the second data units are received and are used to recreate the original first data units.
Got brain damage. What, ffs, does it mean? Help :)
Sounds like they cache your upload on a server physically near you, and use something like BitTorrent to haul it to a server physically near your destination. Sort of like a CDN for file transfers.
This way they can reduce the number of hops (and potential bottlenecks) between their server and destination, maximizing bandwidth.
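In plainer terms, the "encoded second data units" are redundancy chunks: the sender adds extra chunks so the receiver can rebuild the originals even if some go missing in transit. A toy sketch of the simplest variant, single XOR parity (the patent also covers plain copies and full erasure-correcting codes), assuming equal-length chunks:

```typescript
// XOR all chunks together; with k data chunks plus this parity chunk,
// any single lost chunk can be rebuilt from the survivors.
function xorParity(chunks: Uint8Array[]): Uint8Array {
  const parity = new Uint8Array(chunks[0].length);
  for (const chunk of chunks) {
    for (let i = 0; i < chunk.length; i++) parity[i] ^= chunk[i];
  }
  return parity;
}

// Rebuild the one missing chunk: XOR of the surviving chunks and the parity chunk.
function recoverMissingChunk(survivors: Uint8Array[], parity: Uint8Array): Uint8Array {
  return xorParity([...survivors, parity]);
}

// Example: 3 data chunks, lose chunk b, recover it from a, c, and the parity.
const a = Uint8Array.from([1, 2, 3]);
const b = Uint8Array.from([4, 5, 6]);
const c = Uint8Array.from([7, 8, 9]);
const p = xorParity([a, b, c]);
console.log(recoverMissingChunk([a, c], p)); // -> Uint8Array [4, 5, 6]
```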
Exactly correct. We are running on Azure as well, so it's slightly more expensive. The idea is that eventually, as we load the service with users, our cost will come down and we can pass some of those savings on to our users. We have built out a network that can run on any cloud, so in the future we will be able to further optimize cost by finding less expensive routes that have the same level of performance.
AWS/GCP/Azure bandwidth is already incredibly overpriced. You can easily find $0.005/GB and less from other providers, if you’re only interested in server hosting.
For many applications, you can host servers elsewhere while using AWS for S3, DynamoDB, etc.
We wanted to stand up a high-performance service to start and took on extra cost to do so. We intend to continuously optimize our system to maintain that performance while accessing it through more affordable options. Even just being smarter about how we spin servers up and down could reduce costs, beyond adding other providers to the mix. Ideally, once we have some mass of users, we will be able to get better pricing and optimize our cost further, which should enable us to pass on savings.
The pricing includes 10 days of storage and the cost of the bandwidth. Today an upload can be downloaded many times, but in the future we will be deploying download-based billing, which will charge you for the data downloaded. We are also working on a pay-as-you-go storage option so you can extend the expiry beyond 10 days and pay some low per-GB rate for the data stored.
Aspera uses UDP to work around issues sending large files over long distances on big pipes, which TCP usually isn't great for (1). How does MASV handle this over TCP? Do you have some 'TCP accelerators' in between, or is it just heavily chunked content?
Yes to the TCP acceleration. Our parent company LiveQoS is a networking technology company. We use our own TCP acceleration software to enable faster downloads across the clouds. Upload acceleration is handled through chunking and by having many servers in major locations globally reducing latency.
Can you talk a little bit about the implementation details?
I assume that it's using WebRTC data channels to upload the file. WebRTC includes STUN / TURN to bypass firewalls, which makes it a great fit.
My guess is that the performance comes from chunking the data and sending it through multiple data channels. The other performance gain is bringing the receiving end close to the sender to reduce ACK latencies. I don't think that WebRTC allows you to develop custom UDP protocols.
Happy to. We are not using WebRTC. Uploads are accomplished by geolocating the user to whichever of our 9 global servers is closest to them, then chunking the data and sending it through multiple data channels using JavaScript in the browser. On the download side, when a user initiates a download, we geolocate the downloader and calculate the optimal route across our cloud network. Our network then uses our own in-house TCP acceleration technology to stream the data quickly across the cloud and to push it fast from our exit edge server to the end user's location. We use a combination of parallel connections, TCP acceleration, premium middle-mile networks, and reduced latency to maximize performance.
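In code terms, the browser-side piece boils down to something like this sketch (illustrative only, not our actual client; chunk size, concurrency, and the endpoint URL are placeholders, and per-chunk retry, as in the earlier sketch, is left out for brevity):

```typescript
const CHUNK_SIZE = 16 * 1024 * 1024; // 16 MB parts
const PARALLELISM = 4;               // concurrent connections

async function uploadFile(file: File, endpoint: string): Promise<void> {
  // Slice the file into fixed-size chunks without reading it all into memory.
  const parts: { index: number; blob: Blob }[] = [];
  for (let offset = 0, i = 0; offset < file.size; offset += CHUNK_SIZE, i++) {
    parts.push({ index: i, blob: file.slice(offset, offset + CHUNK_SIZE) });
  }

  // Simple worker pool: keep PARALLELISM chunks in flight at any time.
  let next = 0;
  const workers = Array.from({ length: PARALLELISM }, async () => {
    while (next < parts.length) {
      const { index, blob } = parts[next++];
      const res = await fetch(`${endpoint}?part=${index}`, {
        method: "PUT",
        body: blob,
      });
      if (!res.ok) throw new Error(`part ${index} failed: HTTP ${res.status}`);
    }
  });
  await Promise.all(workers);
}
```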
How come we don't have a good P2P file transfer system that everyone can use now? Having a server sit in between just passing it along seems like a waste.
I want as few inbound ports open as possible, and I'm assuming so does every person. So the simplest solution is to instead have two outbound connections that talk to a middleman.
Are there better ways? Of course. But this is simple to wrap your head around and can be accomplished by anyone with even the most basic technical understanding.
This looks really well designed -- but the performance numbers seem slightly suspicious to me.
Both Dropbox and Aspera do UDP + compression + retry, so I'm very curious to see how you are able to get such a drastic improvement -- are you using a special form of compression?
Hey! In terms of maxing out your bandwidth, there are a few options. Aspera uses UDP, which adds some overhead to the transfers and requires you to install plugins or software to connect both ends. The way we max out your bandwidth is by reducing the latency between you and the server closest to you, then using JavaScript to push multiple TCP flows at the same time, which can also effectively fill your pipe. On the download side of things, we route the data across premium cloud networks and use TCP acceleration in the cloud to enable fast downloads. TCP acceleration tech only needs to be on the sending device, so it requires no plugins on the download side, versus UDP. Not having plugins is a big benefit because there are fewer firewall concerns, and it means you can use our service in more restrictive IT environments.
Just to be clear, we are not just using the default Azure setup. We have our own proprietary TCP acceleration technology in use on the Azure network and have 9 servers globally enabling acceleration. This test just shows a regular upload to Azure, not a routed upload through our network.
Sorry about that, we will look into it right away. The engineers building the web app are much better than my website coding. We will try to reproduce the issue, get it fixed, and let you know.
Torrenting requires software and seeders, which means you're throttled by the seeding parties' internet connections. Torrenting can be a good option, but you can run into port issues, or it can be blocked under restrictive IT policies. ISPs frequently throttle torrents as well. It all depends on what environment you have and the network of the recipient. MASV is intended to require nothing but a browser and enable transfers in a clean, easy-to-use interface. In other words, it's client-proof, which many freelancers or service companies would understand. It's not meant to replace major repeatable workflows happening from the same controlled networks; it's meant for ad-hoc deliveries from on-set locations or project-based work.