
Apt 2.0 - jcastro
https://blog.jak-linux.org/2020/03/07/apt-2.0/
======
Aloha
This item: "The apt(8) command no longer accepts regular expressions or
wildcards as package arguments, use patterns (see New Features)."

Seems like a pretty big breaking change.

~~~
systemvoltage
Also feels like a regression. What's wrong with keeping that option? Regex is
a universal way of matching strings.

~~~
tssva
Apt patterns allow matching on criteria other than the package name, such as
searching for all packages with broken dependencies. They also allow combining
different search criteria in ways that would be difficult or impossible using
just a regex search. If you just want to match package names against a regex,
that is still possible using the name apt pattern.

I would suggest reading the apt-patterns man page referenced before passing
judgement.
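
For a flavour of what the patterns look like, here is a rough sketch based on
the forms documented in apt-patterns(7); exact behaviour may vary by apt
version:

    apt list '?broken'                        # packages with broken dependencies
    apt list '?and(?installed, ?name(^vim))'  # installed packages matching a name regex
    apt remove '?garbage'                     # automatically installed, no longer needed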

~~~
systemvoltage
> difficult or impossible using just a regex search

I am sorry, but this isn't true, and I can prove it.

Regex is used to parse the very program that runs APT; this is called lexical
analysis [1]. GNU Flex and Bison [2], for example, are the lexer and parser
generators behind a lot of the code we write today. So whatever
pattern-matching code was written in C as part of the APT 2.0 package, that
code was itself read and compiled with the help of regexes. So it provably
cannot be impossible to express a match with this new feature you're referring
to that could not be matched using plain old regex. Difficulty is subjective,
depending on who the target demographic is.

All these things are actually moot - it doesn't cost you anything to keep the
regex option for searching package names, perhaps a bit of additional code to
maintain. Also, other means of pattern matching can be added _in addition_ to
the regex search feature. Why not both?

[1]
[https://en.wikipedia.org/wiki/Lexical_analysis](https://en.wikipedia.org/wiki/Lexical_analysis)
[2]
[https://aquamentus.com/flex_bison.html](https://aquamentus.com/flex_bison.html)

~~~
saagarjha
You're missing the point and being patronizing while doing it. Regex search
can only match package names. APT's patterns let it match on other attributes
as well, which a simple regex cannot interact with.

------
Avamander
I wonder when apt will start supporting specifying packages per repository.
It's not safe that any package can come from any repository by default, unless
annoying effort is spent manually pinning packages.
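
For what it's worth, apt pinning can approximate this today, though it is
exactly the "annoying effort" described above. A rough sketch of an
apt_preferences(5) snippet that blocks everything from a third-party origin
except one package (the host and package names here are placeholders):

    # /etc/apt/preferences.d/example-repo  (hypothetical origin and package)
    Package: *
    Pin: origin "repo.example.com"
    Pin-Priority: -1

    Package: examplepkg
    Pin: origin "repo.example.com"
    Pin-Priority: 500

A negative priority prevents installation from that origin; 500 restores the
default priority for the single allowed package.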

~~~
JoshTriplett
If you install a package, that package can do arbitrarily bad things to your
system; a repository you don't trust is never "safe".

~~~
asdfasgasdgasdg
I can think of at least one vector. Suppose a particular library announces one
generic repository as their approved distribution channel. Suppose the repo
does not do any vetting on what gets uploaded. In this circumstance, you'd
only want to pull that particular package from the repo, not any arbitrary
package.

So I think the trust model bears scoping. It's not all or nothing.

~~~
geofft
Is there any apt repo software that permits multiple parties to upload to a
repo but also enforces that people can't upload new versions of other people's
packages?

As it happens, Debian itself kinda does this (Debian "maintainers" can only
upload to specific package names), but I don't think anyone else runs that
software. The usual third-party repo tools like reprepro and aptly don't
support it, as far as I know. And sites like Launchpad or OBS just set up a
separate apt repo per account (or even multiple apt repos per account),
because doing that is very easy.

In other words - yes, the trust model you propose is coherent, but I don't
think anyone actually does that, because there's a more straightforward option
already.

~~~
andrewshadura
Actually, both reprepro and aptly support this.

------
sneak
Does it still redownload 30MB and spin for 10s when you (or an installer
script) do “apt update” a few seconds after the last invocation of same
instead of comparing a root hash or some other sane method?

~~~
throwaway8941
Check your repository and/or apt settings.

apt has had support for pdiff for what, two decades now?

[https://debian-
administration.org/article/439/Avoiding_slow_...](https://debian-
administration.org/article/439/Avoiding_slow_package_updates_with_package_diffs)
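
The relevant knob, assuming the mirror actually publishes diffs, is the
Acquire::PDiffs option (whether it is enabled by default depends on the
distribution); for example:

    # /etc/apt/apt.conf.d/50pdiffs  (example file name)
    Acquire::PDiffs "true";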

I am no fan of it though, dnf is just so much better in all respects.

~~~
sneak
I’m using archive.ubuntu.com and whatever the default apt settings in Ubuntu
LTS are. I feel like a ton of debian-style packaging is simply “this is the
way we’ve always done it”. debmirror, for example, is hot garbage.

~~~
pas
Debian was never about ergonomics :(

I remember checking their packaging howto every few years to maybe
really/properly/truly understand it, but the sheer uselessness of the text
signaled each time that it's just not worth it. Just google whatever you want,
copy from Stack Overflow, and be done with it; don't try to understand it.
(E.g. when I wanted a virtual package that provides some package, so I could
fake it and it wouldn't get pulled in as a useless dependency for other
packages.)

~~~
sneak
I would absolutely love to maintain a metapackage and some tools on a personal
apt mirror that I can add to machines and update periodically. The
burden/overhead of learning the ridiculously tradition-based and
overcomplicated system in use has kept me from doing it for years.

~~~
julian-klode
It literally is so simple you could do it by hand.
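
For the metapackage-plus-personal-mirror case above, a rough by-hand sketch
might look like this (package name, dependencies, paths, and the maintainer
address are all placeholders):

    # 1. build a trivial metapackage
    mkdir -p my-meta/DEBIAN
    cat > my-meta/DEBIAN/control <<EOF
    Package: my-metapackage
    Version: 1.0
    Architecture: all
    Maintainer: Me <me@example.com>
    Depends: curl, htop, tmux
    Description: personal metapackage
    EOF
    dpkg-deb --build my-meta my-metapackage_1.0_all.deb

    # 2. generate a flat repo index (dpkg-scanpackages ships with dpkg-dev)
    dpkg-scanpackages --multiversion . > Packages
    gzip -kf Packages

    # 3. point a machine at it, e.g. in sources.list:
    #    deb [trusted=yes] file:/path/to/repo ./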

~~~
pas
It's simple to brute-force a .deb (after all, it's just an ar archive
containing two tar files, one of which holds the control files, and that much
is simple), but the process, the myriad of debhelpers, obscure traditions,
mandatory steps (changelog updates), and whatnot are not simple at all.
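
For reference, the container format really is that small; listing the members
of a typical .deb shows the three pieces (compression suffixes vary by distro
and dpkg version):

    $ ar t somepackage.deb
    debian-binary
    control.tar.xz
    data.tar.xz

debian-binary is a one-line format version, control.tar.* holds the control
file and maintainer scripts, and data.tar.* holds the files that get installed.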

~~~
CameronNemo
I would also point out that actually getting software into Debian is no fun
for newcomers either. The whole mentorship process (where you use some arcane
command to upload a package you built yourself to a special website, then ask
someone to look at it) is wacky. The project could really improve its
onboarding process, or at least make it easier for fly-by contributions.

[https://mentors.debian.net/](https://mentors.debian.net/)

------
techntoke
After working with Debian/Ubuntu packaging and then switching to Arch and
pacman, I find it so much easier to create and maintain packages on Arch and
Alpine than on the alternatives.

~~~
taeric
My age has given me too much cynicism. Green field is always easier. Usually
does a ton less, though.

In some cases, shedding requirements is good. In most, it is just a race to
feature parity. :(

~~~
linsomniac
Perhaps, but as a counter example I'd compare it to RPM.

RPM is a solid, mature, heavily used package format. And the .spec file I've
always found to be so much easier to work with than Debian packaging. I've
made many debs, and my current work uses debs for deploying all our software,
but I still find it so much harder to make them. I have a workflow for my
current packages, but if I want to take a new piece of software and turn it
into a .deb, it is always pretty painful. I usually have to ask a Debian
packager for help. They are always super nice and helpful, but I just wish I
could package things myself.

Seems like if you package .debs all the time, it becomes easier. But as an
infrequent packager, I find RPM's .spec to be much easier to manage than
Debian's system.

~~~
kelnos
I agree; I do .deb packaging _just_ infrequently enough that every time I go
to do it, I have to re-learn it nearly from scratch. And it's not simple at
all.

~~~
yjftsjthsd-h
I'm fond of FPM for this kind of thing. It provides a usable interface over a
bunch of package formats and explicitly aims to make the whole thing painless,
and in my experience it succeeds.
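
As a sketch of what that looks like (the name, version, and paths here are
made up):

    # package a directory tree as a .deb; swap -t deb for -t rpm to get an RPM
    fpm -s dir -t deb -n mytool -v 1.2.3 --prefix /opt/mytool ./build/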

------
CoffeeDregs
Argh. Was just upgrading a Debian machine and was wishing (again) for parallel
downloads. _apt-fast_ is a severely limited hack (I couldn't figure out what
it actually supports). I assumed any significant apt update would include
performance/DL improvement. No?

~~~
julian-klode
Non-parallel downloads from the same server are by design - turning that
restriction off would be an easy thing. Server resources are limited, and
people should not be cheating their way around the bandwidth limits the
server has. Put multiple mirrors in a mirrors.list, and then use `mirror+file://path
to mirror list` instead of the http source in sources.list so that apt
downloads from all these mirrors in parallel.
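
A sketch of that setup (hypothetical file names and mirrors; the suite and
components will differ on your system):

    # /etc/apt/mirrors.list - one mirror URI per line
    http://archive.ubuntu.com/ubuntu/
    http://mirror.example.org/ubuntu/

    # sources.list entry replacing the plain http one:
    deb mirror+file:/etc/apt/mirrors.list focal main universe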

Be aware that high parallelisation of downloads may reduce your throughput
when there are only a small number of things to fetch.

There should also be no need for it latency-wise - APT keeps up to 10 requests
in flight to the server, so there should be no latency overhead. Yes, I know,
this does not work for Google because their latency is crazy high, but their
bandwidth is super high too.

------
nrclark
Cool!

Does anybody know if there are plans to fix the .deb size limit? It's a bummer
that .debs can't be more than 10 GB. That number seems big, but I've hit it
before when packaging custom toolchains for internal use at my company.

~~~
teruakohatu
Can't you just create a dependency and split it up?

~~~
nrclark
Definitely possible in theory. Kind of a pain in practice though.

I was working on trying to package Xilinx Vivado for internal use on my
company's build servers. It's around a 20-25GB installation, mostly of
smallish files. I could have definitely manually split the package up. It's
the kind of thing that's hard to do algorithmically though. After a day or so
of trying to solve the packaging problem, I eventually gave up on the idea
altogether.

Large software installs are still fairly uncommon on Linux, but I see more and
more of them in the wild. Especially when I look at some games and stuff - I
have a bunch of software that's larger than 10 GB. It's kind of nuts that the
.deb package format is still so constrained, especially considering its
importance to so many distros.

------
m0zg
Missed opportunity to switch to zstd IMO. Much faster than gzip, especially
during decompression, and essentially the same compression ratio. Them apt
upgrades take too long.

~~~
anttisalmela
apt has supported zstd since 1.6 already.
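
That covers unpacking on the apt side; actually producing a zstd-compressed
.deb also needs a dpkg-deb new enough to accept it (Ubuntu patched this into
its dpkg before upstream picked it up), roughly:

    # requires a dpkg-deb that supports -Zzstd; the package dir is hypothetical
    dpkg-deb -Zzstd --build mypkg mypkg_1.0_amd64.deb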

~~~
m0zg
TIL. Ubuntu needs to get its act together then, it sounds like. I think
they're using lzma, which is basically the slowest format available.

~~~
julian-klode
The last evaluation showed that the size increase of zstd compared to xz
warrants introducing delta debs first, so that people don't have to download
more.

But then with deltas, mirror size grows even more, which is a hard sell.

Arch Linux gets away with using zstd -21, but that's not practical for general
purpose Linux distributions, as it has significant memory requirements
compared to xz.

Also, the speedup in practice is negligible - while zstd is much faster than
xz, most of the time installing packages is actually spent in fsync(). So you
only see huge speedups when running under eatmydata.

~~~
julian-klode
Like, zstd -21 --ultra requires a whopping 365MB of RAM to compress a 16 MiB
file, zstd -19 requires 12 MB, and xz requires 10 MB.

Decompressing -21 requires 20MB vs 10MB for -19 and xz.

~~~
m0zg
zstd is _way_ faster at decompressing though, even from --ultra: some 13x
faster in the case of Arch Linux.

[https://en.wikipedia.org/wiki/Zstandard](https://en.wikipedia.org/wiki/Zstandard)

~~~
julian-klode
I know that, but decompression time is not as relevant as you think it is. If
you look back at the measurements we did 2 years ago:

[https://lists.ubuntu.com/archives/ubuntu-
devel/2018-March/04...](https://lists.ubuntu.com/archives/ubuntu-
devel/2018-March/040211.html)

you can see that the performance gain from zstd under realistic scenarios is
about 10%: switching from xz to zstd only improved Firefox install time from
37s to 33s; running in eatmydata to avoid fsync from dpkg improved it to
12.5s (8.5s with zstd).

I don't know what Arch Linux does, but they likely do not correctly fsync()
data after it's been written to a temporary file, and then fsync the directory
after files have been renamed to their final names, as is required to achieve
consistent results after a crash.
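
To see the fsync effect yourself, a comparison along these lines works
(eatmydata ships in the libeatmydata package and turns fsync() into a no-op,
so only try it on a machine you don't mind losing to a crash):

    time sudo apt-get install --reinstall firefox
    time sudo eatmydata apt-get install --reinstall firefox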

------
cosmiccatnap
I know they are kind of just wrapping aptitude, but it's frustrating that I
can't just use wildcards for things. It's a package manager, not a development
framework; why would regex and wildcards not be enough for anyone?

~~~
julian-klode
We could reinstate wildcards (but not regexes - they are unsafe, as their
magic characters overlap with valid package names, so g++ can mean the package
g++, the package g+, or any package matching the regex g+).

I feel like once people get used to patterns, that will be sufficient, and the
error reporting gets nicer.

