Waiting for apt locks without hacky Bash scripts (sinjakli.co.uk)
41 points by Sinjo on Oct 25, 2021 | 27 comments



Those darned apt locks have to be the second biggest PITA of my Linux development experience. Who in their right mind thought it was a good idea for an automatic system maintenance process to entirely block the user from doing their using for the first couple minutes after startup? As if that wasn’t enough they have the nerve to log this line:

> Be aware that removing the lock file is not a solution and may break your system.

Guess what? My system is already broken! It’s refusing to do what I want it to do (install a package so that I can test it) in favor of doing what some remote agent (the update service) told it to do.

At this point I just immediately delete the .lock file when I see that message. Never has given me any problems. If it did, I’d just reimage the VM and probably be done sooner than if I had waited for that lock.


You can disable automatic updates and instead run them when you want.


Funny thing is I click “not now” or whatever it is when they pop up asking if I want updates, yet it still gets locked.


If this is an Ubuntu flavor, you can run the "software and updates" program and set "Updates -> Automatically check for updates" to Never.

https://media.howtofix.io/full/images/posts/2020/11/13/16052...

Other distros may vary :)
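
On a box without a desktop, a rough equivalent is to set APT's periodic options to "0". This is only a minimal sketch, assuming the settings live in /etc/apt/apt.conf.d/20auto-upgrades as on a typical Ubuntu install. `disable_apt_periodic` is a made-up helper name, and it accepts an alternate path so it can be tried without touching the real file:

```shell
#!/usr/bin/env bash
# Hypothetical helper: write APT periodic settings that turn off the
# automatic list refresh and unattended upgrades.
# $1 is the file to write; the default is an assumption about where
# Ubuntu usually keeps these settings (writing there needs root).
disable_apt_periodic() {
    local conf="${1:-/etc/apt/apt.conf.d/20auto-upgrades}"
    cat > "$conf" <<'EOF'
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";
EOF
}
```

With these off, updates only happen when you run them yourself, so remember to actually run them.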


Thanks, looks like it was set to auto-install security updates and prompt for the others.


Coming from Gentoo, encountering Debianesque systems only later, that all-out locking really surprised me. Portage, the Gentoo package manager, only takes a lock at the very end of the installation, when a file tree (result of compilation etc) is merged into the "live" filesystem - so that those merges are serialized and checked for collisions.


And of course, with Gentoo this problem needed to be solved from the start; otherwise we would've been locked out of using our package manager for a day in 2004 while we waited for XFree86 to finish compiling.


And Portage, if it can't get the lock, just waits for it to be free.

So whereas on Ubuntu, two competing apt commands cause one to fail, with no real recourse except a very nightmarishly long investigation resulting in the above article, in Gentoo, two competing commands … just … work.


I run large datacenters with thousands of boxes.

I have a little app (written in golang) installed on each box that effectively is a task runner. Tasks can be written to do anything, including apt-get installing software.

If apt-get fails to run, the task fails (context.WithTimeout) and is run again at a later date. No random hacks needed. Everything is built to be idempotent, self-healing and eventually consistent.
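
The same fail-then-try-again idea can be sketched in shell with coreutils `timeout`. To be clear, this is not the Go tool described above; `retry_task` and its arguments are hypothetical, and a real runner would reschedule the task rather than loop in place:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command under a time limit and retry on failure.
# Usage: retry_task ATTEMPTS DELAY_SECONDS LIMIT_SECONDS COMMAND [ARGS...]
# Note: `timeout` runs external commands only, not shell functions.
retry_task() {
    local attempts="$1" delay="$2" limit="$3"
    shift 3
    local i
    for ((i = 1; i <= attempts; i++)); do
        # A command that exceeds the limit is killed and counts as a failure.
        if timeout "$limit" "$@"; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}
```

Something like `retry_task 5 60 300 apt-get install -y some-package` would then give apt five chances, a minute apart, with each attempt capped at five minutes. This only makes sense if the command is safe to run more than once.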


Might be a dumb question, but would systemd not be able to do this? Do your boxes not run systemd?


Not a dumb question. My process runs from systemd. But the process itself needs to do all sorts of custom stuff, which is baked into it.


Would you be interested in sharing this tool with the world?


Thanks for asking. This is a huge amount of IP for my business. It is also very custom for our use case. I'm not trying to create a general purpose DSL or anything.


cfengine3 does that (and much more) for me.


Great! Happy it fulfills your use case.

I can assure you that my use case is a bit more involved than what cfengine can provide.

Cheers.


cron does that for me, and it's already installed


Cron is not a very good task runner on its own if you have lots of hosts, care that your tasks actually run, care that they succeed, need sequencing or need to ensure that multiple instances don't conflict.

cron is fine as part of a task scheduler, but even for very basic use cases you'll hit its limitations and will have to work around them.


Yea but, how can you just keep using the same tools for 30+ years? Won't someone think of the developers!?

/s


I don't think the suggestion in the article works, does it?

It checks the dpkg lock, but I believe apt has its own lock at /var/lib/apt/lists/lock, and the option name strongly implies that it only checks the dpkg lock, still leaving you with a race condition on the apt one.


I ran into this locking problem while using Packer to build AMIs. The process was flaky. For a long time I would google it, fail to find a solution, and just keep rerunning the builds.

Then one of my colleagues figured out that apt-get was probably being locked out by cloud-init, and removed the flakiness by making Packer wait[1] for cloud-init to complete before running the installation scripts that used apt-get.

We too wished that there were more docs to help us, especially explaining how apt-get worked.

[1] https://github.com/hashicorp/packer/issues/2639#issuecomment...


I had exactly the same issue in one of my deploy scripts about 6-7 hours ago and fixed it with the fuser solution you mentioned in the askubuntu thread. What a coincidence haha, I’ll try this out tomorrow, thanks so much :)
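
For anyone who hasn't seen the fuser approach, it looks roughly like this. It is a sketch rather than the exact snippet from that thread; `wait_for_locks` is a made-up name, and the lock paths in the usage line below are the usual Debian/Ubuntu ones, which is an assumption about your system:

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll until no process has any of the given files open.
# Usage: wait_for_locks TIMEOUT_SECONDS FILE [FILE...]
# Returns 1 if the files are still in use when the timeout is reached.
wait_for_locks() {
    local timeout_s="$1"
    shift
    local waited=0
    # fuser exits 0 while at least one of the files is held by some process.
    while fuser "$@" >/dev/null 2>&1; do
        if ((waited >= timeout_s)); then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Run as root (otherwise fuser cannot see other users' processes), e.g. `wait_for_locks 300 /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock /var/lib/apt/lists/lock && apt-get install -y some-package`. Something can still grab a lock between the check and the install, so this narrows the race rather than removing it.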


A side-note: instead of apt-get you can just type apt.


Wow. This would have been useful to know about. I have worked around that dumb apt lock too many times for it to have any meaning to me anymore.


> This all started when I was looking into why instances in an auto scaling group were sometimes failing to bootstrap correctly.

Please use Packer to build your images; don't do it on ASG instance deploy.

If somebody in the company tells you you're not allowed to build your own images, tell them to go fuck themselves, and write an e-mail to that guy's boss's boss explaining how much engineering time you're wasting (and how likely the products are to fail due to ASGs trying to bootstrap systems on the fly) because they won't let you cut an AMI.


> Please use Packer to build your images; don't do it on ASG instance deploy.

So you start using Packer to build your images. Your Packer script does "apt-get install" and fails because something is holding the apt lock, and the author ends up writing more or less the same article.

Additionally, I work in Azure, where VM images are a world of absolute nightmarish pain: a normal user literally cannot make the API call to bring up a VM with a custom image if the image is not in the same tenant. There is a way to do it with a service principal (SP), but it is so completely and thoroughly undocumented as to be black magic. (Yes, if you Google, you'll be able to find Azure documentation on this exact subject. No, the instructions do not work.)

Yeah, I agree that at the end of the day it's the right way to do things, but it is an absolute, utter PITA.

But then again, if you choose to not bring up a VM with a custom image, you get an unpinned image: "Ubuntu 20.04 LTS" is a moving target, and we once got one with a kernel that would BUG after ~5 minutes. Azure needed us to tell them what kernel we got from them.


In the Packer build you can have your provisioning script wait for the cloud-init to finish because you're not delaying a production scaleout; you've got all the time in the world. It's true that this article still gets written, but I don't think OP was suggesting otherwise. Building fully baked images rather than installing things on production startup avoids all sorts of flakiness.

Incidentally the author's suggestions are not that good, and not what I would suggest. He should be waiting for cloud-init to finish, not just apt:

    cloud-init status --wait

Before I knew about that, I used the following, which is still better than what the author suggests:

    while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Waiting for cloud-init...'; sleep 1; done

Other stuff happens in cloud-init besides locking apt for a while, and these will wait for all of it to finish.
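
One caveat with the polling loop is that it spins forever if cloud-init never finishes. A variant with a deadline might look like the sketch below; `wait_for_boot_finished` is a hypothetical name, and the marker path is simply the one from the loop above, taken as a default:

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for cloud-init's boot-finished marker, but give up
# after a deadline instead of looping forever.
# Usage: wait_for_boot_finished [TIMEOUT_SECONDS] [MARKER_FILE]
wait_for_boot_finished() {
    local timeout_s="${1:-600}"
    local marker="${2:-/var/lib/cloud/instance/boot-finished}"
    local waited=0
    until [ -f "$marker" ]; do
        if ((waited >= timeout_s)); then
            echo "Timed out waiting for $marker" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In a provisioning script, `wait_for_boot_finished 600 || exit 1` makes the build fail loudly instead of hanging. Where `cloud-init status --wait` is available it is still the simpler choice.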


Oooh, I did not know about that cloud-init command. I've always done something like your second one.

That just leaves unattended-upgrades, which is harder than the undead to kill, and Azure's WAAgent…



