
Show HN: XPress Compress v1.0 – A config-less compression algorithm - zelon88
https://github.com/zelon88/xPress
======
kstenerud
"What makes xPress stand out from the crowd is that no configuration data gets
embedded into an .xpr archive. So it doesn't matter what version, OS, or
architecture, you used to compress your file."

I don't understand this. For any archive format in modern use, it doesn't
matter what version, OS, or architecture was used to compress the file. I can
decompress a zip or lha file created on an Amiga 500 no problem, 30 years
after the fact. Is there something I'm missing?

~~~
zelon88
Sorry, the wording on the repo could probably be revised.

The zip header is a required part of the archive, and without it the rest of
the data could be lost forever. .xpr archives have no header and (currently)
no offsets. The compression settings can be inferred from the data inside the
archive itself, making headers that describe the data unnecessary.

And while it's true that you can decompress old archives with new hardware,
you can't always decompress a new archive with outdated software. Even with
up-to-date software, it's sometimes possible to create an archive with 7z on
Linux that WinRAR on Windows will not be able to open.

------
ComputerGuru
I don't get it - this says it's new, but MS has had an LZ77-based compression
algorithm named Xpress since forever, easily found by googling. Is this really
just a very bad name clash?

[https://docs.microsoft.com/en-us/openspecs/windows_protocols...](https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-xca/a8b7cb0a-92a6-4187-a23b-5e14273b96f8)

~~~
zelon88
This is just an unfortunate coincidence. I should have been more diligent.
This started as a learning experiment that I honestly never thought would make
it this far.

------
gliptic
"The xPress algorithm is very similar to LZW, but currently not as efficient."

LZW is not a good compression algorithm in this day and age. Why was this
chosen?

~~~
zelon88
Honestly the xPress algo came before the comparison to LZW. There was a
comment by someone else that made me notice the similarities.

I made this to learn about compression. Partly to see if it was possible and
partly to learn more about what makes compression work. I wasn't specifically
trying to "beat" anything currently on the market. Just learn and see if
there's any potential here.

------
aaaaaaaaaaab
>Decompression requires nothing special configuration-wise. The dictLength is
inferred during decompression. This means any config settings are
decompressible without knowing anything about how the file was compressed.

I don't understand this. Can you show an example where an XPress compressed
file is better in terms of portability than a ZIP file?

~~~
zelon88
Some zip clients may write archives using invalid settings that render the
file unrecoverable by other utilities.

For example, check out this cheat sheet...
([http://kb.winzip.com/kb/entry/313/](http://kb.winzip.com/kb/entry/313/))

None of that data goes into an .xpr archive. So there's no ambiguity between
different clients about how to compress or extract a file. There is only one
way to decompress ANY .xpr archive: search the file for instances of the
dictIndex and replace them with the corresponding data. When you run out of
matches, your file is rebuilt.
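
As a rough illustration of that search-and-replace idea, here is a toy sketch
in Python. It is not the actual xPress code or the real .xpr layout: the
function name `expand`, the two-byte index tokens, the `max_passes` guard, and
passing the dictionary in as a separate Python dict are all assumptions made
for the example.

```python
from typing import Dict


def expand(data: bytes, dictionary: Dict[bytes, bytes], max_passes: int = 1000) -> bytes:
    """Replace dictionary indexes in data with their entries until none remain.

    Hypothetical helper: assumes index tokens never occur in the original
    (uncompressed) data, otherwise they would be expanded by mistake.
    """
    for _ in range(max_passes):
        changed = False
        for index, original in dictionary.items():
            if index in data:
                # Swap every occurrence of this index for the bytes it stands for.
                data = data.replace(index, original)
                changed = True
        if not changed:
            # No index matched during a full pass: the data is rebuilt.
            return data
    raise RuntimeError("dictionary entries keep producing new matches")


# Toy dictionary: two-byte indexes mapped to the byte strings they replaced.
toy_dictionary = {b"\x00\x01": b"hello ", b"\x00\x02": b"world"}
packed = b"\x00\x01\x00\x02 and \x00\x01again"
print(expand(packed, toy_dictionary))
```

The loop keeps making passes because one replacement can expose another index;
it stops as soon as a full pass finds nothing left to replace.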

------
benj111
Look up option parsing libraries (not sure what the state of the art is in
Python at the moment).

Options shouldn't be order-dependent like that.

Have you got any comparisons to other algorithms?

~~~
zelon88
I cringe every time I look at that code block.

I eventually want to put that code into a loop. That will probably come at the
same time as relative path support.

Just know that I'm not proud of argument handling in its current form and
improvements are on my radar.

~~~
benj111
Yes I thought it was an interesting code block :)

Good on you for releasing it though.

I would seriously look at using a library rather than rolling your own.
Sorting out all the edge cases is a pain, and a library can automate help
messages, GNU-style long options, etc. Unless of course you want to delve into
the weeds of option parsing.

~~~
WorldMaker
Since Python 3.2, argparse has been in the Python Standard Library:
[https://docs.python.org/3/library/argparse.html](https://docs.python.org/3/library/argparse.html)
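
A minimal sketch of what order-independent options could look like with
argparse. The program name and the flag names (`--compress`, `--decompress`,
`--verbose`) are made up for illustration and are not xPress's actual command
line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: the names below are illustrative, not taken from xPress.
    parser = argparse.ArgumentParser(
        prog="xpress-demo",
        description="Toy CLI showing order-independent options.",
    )
    parser.add_argument("input", help="file to read")
    parser.add_argument("output", help="file to write")
    # Exactly one of the two modes must be given, in any position.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-c", "--compress", action="store_true", help="compress input")
    mode.add_argument("-d", "--decompress", action="store_true", help="decompress input")
    parser.add_argument("-v", "--verbose", action="store_true", help="print progress")
    return parser


# Passing an explicit list instead of reading sys.argv keeps the demo self-contained.
args = build_parser().parse_args(["-v", "in.txt", "--compress", "out.xpr"])
print(args.compress, args.input, args.output, args.verbose)
```

The flags can appear before, between, or after the two file names, and
`--help` output is generated automatically.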

