I wonder why they didn't include Let's Encrypt integration - it's completely painless using the acme library, and that would prevent the whole "HTTP or HTTPS?" discussion around HTTP/2
It's in progress. Algernon is an open source project where I am the main contributor, and I develop Algernon in my spare time. Pull requests are welcome.
Compression technologies, including gzip, obviously have the goal of making things smaller by predicting later data based on earlier data. If the later data looks more like the earlier data, the result is smaller than if it was random gibberish. Compression!
If an attacker controls /some/ of this data and would like to read /other parts/, they can abuse compression to measure whether the parts they don't know are "like" the part they control: if they are, compression makes the result shorter than it would otherwise be, and the attacker can passively measure that.
It's not a problem to move a compressed object over a secure channel on its own. The problem arises if you either compress a channel that is carrying objects from different origins (e.g. a cookie set by a random advertising web site and your Facebook password) or compress a composite object, e.g. your backups mixed with a file you downloaded from a dodgy "pirate" video site.
In a scenario where an attacker can see the encrypted message (e.g. by monitoring network traffic) and can affect part of the message being encrypted, they can use the compression to their advantage. For example, they can try different inputs and observe the length of the encrypted message: if the length drops for a certain input, it means that input contains a string similar to another part of the plaintext, and the compression algorithm did its job and used that to reduce the size.
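To make that concrete, here is a toy Go sketch (the secret and the guesses are made-up values) that simply compresses attacker-controlled text next to a secret and prints the resulting lengths; the guess that shares more with the secret compresses better:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// compressedLen returns the gzip-compressed size of data.
func compressedLen(data []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data)
	zw.Close()
	return buf.Len()
}

func main() {
	// The part of the message the attacker cannot read directly.
	secret := []byte("session=7f3a9c1b2d")

	// Attacker-controlled guesses that end up in the same compressed stream.
	for _, guess := range []string{"session=0000", "session=7f3a"} {
		payload := append([]byte(guess), secret...)
		fmt.Printf("%-14s -> %d bytes compressed\n", guess, compressedLen(payload))
	}
	// The second guess shares a longer prefix with the secret, so the
	// compressed output is shorter - an attacker who can only observe
	// lengths still learns something about the secret.
}
```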
This was mentioned in 2012 with the CRIME[1] vulnerability (from the researchers behind the BEAST[2] exploit) and later with the BREACH[2] vulnerability (back when it was considered cool to come up with a catchy name, a logo and a website for a specific vulnerability).
Also, according to the spec HTTP servers may not always honour the value in the `Accept-Encoding` header[0].
> Even if both the client and the server supports the same compression algorithms, the server may choose not to compress the body of a response, if the identity value is also acceptable.
I've actually run into this twice in my career, and it has been a surprise to those around me in both cases. Both times it was in the context of small payloads where the server applies some heuristic about whether to encode or not (e.g. a status page stops sending gzipped output when the server is becoming "unhealthy").
Makes more sense to use the verb "negotiating" with Accept-* headers rather than "honoring".
This makes obvious sense once you consider that the client tells the server which compression formats it supports in every request, yet not every data format is compressible, nor does the server necessarily support any candidate compression format.
For example, the server wouldn't gzip a jpeg since it's already compressed.
All Accept-* headers are like this. e.g. the server doesn't necessarily support any of the languages requested in the Accept-Language header, but it doesn't hurt to ask. You always have to inspect the response headers to see the result of negotiation.
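For illustration, a minimal Go client sketch (the URL is a placeholder) that asks for gzip and then inspects the response headers to see what the server actually chose:

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	req, err := http.NewRequest("GET", "https://example.com/", nil)
	if err != nil {
		panic(err)
	}
	// Ask for gzip explicitly. When you set Accept-Encoding yourself,
	// Go's transport no longer decompresses transparently, so the
	// Content-Encoding header reflects what the server actually chose.
	req.Header.Set("Accept-Encoding", "gzip")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The server may answer with "gzip" or with nothing at all (identity).
	fmt.Println("negotiated encoding:", resp.Header.Get("Content-Encoding"))
}
```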
"Accept-Encoding" means only that the client also understands specific encoding (in this case compression) it is still up to the server to chose what to dot. There was a time when browsers didn't support any compression. This header was introduced to signal to server what is acceptable by the client, that's why the header allows specifying multiple compression algorithms.
A similar thing happens with a header that's quite useful but that, for some reason, very few sites honor: with "Accept-Language" the browser can specify which languages are preferred, but it is up to the server whether to honor it (for example, the given language version may not be available).
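A rough sketch of what that negotiation can look like on the server side in Go (the available languages and the naive header parsing are just assumptions for illustration; a real implementation would also honour q-values):

```go
package main

import (
	"net/http"
	"strings"
)

// availableLang is the set of translations this hypothetical site has.
var availableLang = map[string]bool{"en": true, "de": true}

// pickLanguage walks the Accept-Language preference list in order and
// returns the first language we can actually serve, falling back to "en".
func pickLanguage(acceptLanguage string) string {
	for _, part := range strings.Split(acceptLanguage, ",") {
		lang := strings.TrimSpace(strings.SplitN(part, ";", 2)[0])
		if len(lang) >= 2 && availableLang[lang[:2]] {
			return lang[:2]
		}
	}
	return "en"
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		lang := pickLanguage(r.Header.Get("Accept-Language"))
		w.Header().Set("Content-Language", lang)
		w.Write([]byte("served in: " + lang + "\n"))
	})
	http.ListenAndServe(":8080", nil)
}
```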
Tomcat, one of the most used Java application servers and the default server behind Spring Boot applications, defaults to a "compressionMinSize" value of 2048.
Not a direct answer, but also interesting to consider:
As everything will end up in a packet when sent through the network stack, you might want to choose your minimum input size in such a way that the gzip-compressed output you generate is big enough. Why big enough? Nagle's algorithm [1]
So yet another reason to think about 'what to gzip'.
Applications that always know exactly what they want to send disable this algorithm, as that article explains, by setting TCP_NODELAY or its moral equivalents in their framework. A web server will almost invariably set TCP_NODELAY.
More sophisticated applications can either decide exactly which packets to send, or use TCP_CORK to shove part of a packet into a buffer before adding the rest of the data, e.g. preparing HTTP headers and then adding the static document that goes after them.
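For what it's worth, in Go the corresponding knob is TCPConn.SetNoDelay; a minimal sketch (Go already disables Nagle's algorithm on TCP connections by default, so the explicit call is mostly there for documentation):

```go
package main

import (
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		// Go sets TCP_NODELAY by default; SetNoDelay makes it explicit.
		if tcp, ok := conn.(*net.TCPConn); ok {
			tcp.SetNoDelay(true)
		}
		go func(c net.Conn) {
			defer c.Close()
			// Reply with a tiny fixed response and close the connection.
			c.Write([]byte("HTTP/1.1 204 No Content\r\nConnection: close\r\n\r\n"))
		}(conn)
	}
}
```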
I don't know about "under 4096 bytes" but I have heard of not compressing data that is under ~1500 bytes. Part of the thinking is this - if your result data (plus HTTP overhead) is already smaller than the data payload of an IP packet (MTU settings come into play here), then you are spending CPU time that will not save you any network I/O time.
For self-contained architecture-specific server binaries, there is no practical difference between 240KB, or 2.4MB, or 24.4MB, or even, at a stretch, 244MB. It's not worth mentioning or optimizing for, except as novelty. I wish people would stop golfing with these numbers.
You have it backwards: the apache or nginx size is not about novelty. Go is just a pig, and its size grows with every new release; the size isn't really down to static linking or debugging symbols, because it's huge even when those options are disabled.
Right now a "hello world" application in Go has comparable size to an OS with full GUI.
You would need to find an application where the binary image size is a problem. In an age where 1TB SSDs are 300 USD, this will be ... challenging. You could fit about a thousand gigabyte-sized images on that one SSD, and I suspect you will hit CPU and I/O limits well before you have a thousand different binaries running for real.
Media storage would go on spinning rust disks anyways separate from the SSD(s).
- it increases the amount of time needed to fetch and run a container (this is quite noticeable when you have an app that scales out and you are updating it)
- it increases the amount of storage needed to keep multiple versions of containers (when you have an internal app and do frequent releases, this adds up quickly)
- it increases the amount of data transferred on every deployment
- it increases the amount of memory used (the whole point of containers was to use hardware efficiently (Borg), although a lot of people today miss that reason and run containers on VMs)
True. Luckily, most Go binaries can be upx'd ( https://upx.github.io/ ) down to a fraction of their original size. Just put it into your Dockerfile as a part of the build process.
This works and helps with storage/transit. A word of warning though: many AV products like to flag UPX'd executables (only important if it's not an internal tool), and the binary will take even longer to start and will use more memory. My understanding is that UPX essentially applies standard compression to the executable and prepends the decompression routine to the front of it.
You have to ask yourself what is that space used for??
They're all compiled languages doing (roughly) the same operations. There is an order of magnitude difference in the number of instructions in one compared to the other.
Apache is more likely to be entirely cached whereas the others aren't.
Size matters for performance.... if you're not CPU bound, fine, but to say it's immaterial is naive.
Even if we don't know which particular problem it might cause right now, being wasteful when you can avoid it is never a good idea.
Image size is currently not the most important metric, but - judging by how your average node.js package already looks today - demanding that people ignore it completely will probably set you on the road to multi-TB images that also contain the developer's favorite desktop environment in the medium term.
Very civilized comment from mr. Caddy himself.
Heartening, not least considering the kind of vendettas some projects - Caddy among them - have to put up with from competing developers.
Well yeah, it's definitely an interesting approach. I guess it's much leaner than running a whole Ubuntu container or VM that has all of these things installed, or running a gigantic bloated JavaScript toolchain just to convert the scss or md files to css or html.
As someone else commented, the huge size of Go executables is down to a design decision to include a map of functions for panic reporting. There was a whole discussion on this recently on HN.
I don't know why the grandparent was downvoted. Go binaries are not small and the claim that this is a "small" single executable is untrue.
Hopefully the Go team will give us a flag to decide for ourselves whether to optimise for executable size or initialisation time. I know I'm fed up with uploading 50MB files over dodgy wifi+VPN connections to update my server.
I don't think it is fair that you are being downvoted.
Size does matter, and not just in the sense that it uses resources. The largest part of the 24 MB probably never gets executed, but it adds unnecessary complexity that may hide bugs and security flaws.
On my FreeBSD machine, all files in the entire apache package (including modules, manpages, headers, default pages, graphics for displaying directories in gif and png formats, and tools) take 4.3MB.
Good question. I'm not sure if it excels in any scenario. There are specialized web servers that excel at caching or at raw performance. There are dedicated backends for popular front-end toolkits like Vue or React. There are dedicated editors that excel at editing and previewing Markdown, or HTML.
I guess the main benefit is that Algernon covers a lot of ground, with a minimum of configuration, while being powerful enough to have a plugin system and support for programming in Lua. There is an auto-refresh feature that uses Server-Sent Events when editing Markdown or web pages. There is also support for the latest in web technologies, like HTTP/2, QUIC and TLS 1.3. The caching system is decent. And the use of Go ensures that smaller platforms like NetBSD and systems like the Raspberry Pi are also covered. There are no external dependencies, so Algernon can run on any system that Go supports.
The main benefit is that it is versatile, fresh, and covers many platforms and use cases.
For a more specific description of a potential benefit, a more specific use case would be needed.