bonyt's comments

If you want this to be a little safer, instead of just those guardrails to prevent semicolons and such, you can split the command into an array of arguments, and use subprocess.Popen. It won't execute through a shell, so you don't have to worry about shell injection[1]. Though I'm sure there are unsafe ways to invoke ffmpeg anyway.

[1]: https://docs.python.org/3/library/subprocess.html#security-c...
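A minimal sketch of that approach, using `echo` as a stand-in for ffmpeg and a deliberately hostile command string:

```python
import shlex
import subprocess

# A generated command containing a would-be injection; "echo" stands in
# for ffmpeg so the example is harmless and runnable anywhere.
generated = "echo hello; rm -rf /"

# Split into an argv list; with shell=False (the default) the ';' is
# passed to echo as literal text, never interpreted by a shell.
args = shlex.split(generated)
result = subprocess.run(args, capture_output=True, text=True)
print(result.stdout.strip())  # -> hello; rm -rf /  (nothing was deleted)
```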


I'd rather manually review all AI-generated commands before running them.


I'm pretty sure you can dump a stream without transcoding directly to a file, the stream can be sourced from a URL, and the destination file can be a user's SSH authorized_keys.


And the GPT-4 API would want to respond with an output to do that when that isn't what the user asked for?


I am almost certain one can find a seed+temperature pair that will result in such output given a non-malicious prompt.


And I am almost certain I could win $10M from a lottery ticket. I'm not worried about that actually happening, though, because it is statistically never going to happen.

https://i.imgflip.com/3bhvio.jpg


On the contrary, it happens every time.


I'll start investing in the lottery right away, now that I know it is a sure thing. It is the difference between the cumulative probability of all tickets sold (which would be 100%) and the discrete probability of a single ticket (0.000000003 for Powerball, basically zero).

The cumulative probability of every `ffmpeg-english` command ever sent to OpenAI will likely be <5%, if not <1%, of all of the possible responses GPT-4 could give.
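The single-ticket vs. many-tickets distinction can be sketched numerically (the per-ticket odds below are the quoted Powerball figure; the ticket count is made up for illustration):

```python
# Discrete vs. cumulative probability of winning.
p = 1 / 292_201_338                     # single Powerball ticket, ~3.4e-9

def at_least_one_win(n_tickets: int) -> float:
    # chance that at least one of n independent tickets hits
    return 1 - (1 - p) ** n_tickets

print(f"one ticket:          {p:.1e}")
print(f"ten million tickets: {at_least_one_win(10_000_000):.1%}")  # ~3.4%
```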


If you couldn't win why would people buy tickets? It's like saying god doesn't exist.

I was just wondering what hilarious personal programming lottery stories people are not telling us.

I rolled short 6 digit random unique orderNumbers one time and dropped the leading zeroes. There is an order without a number now.


Or do a second query to ask whether the command is safe to execute.


Then do a third query to determine whether the answer to the second query was trustworthy.


still awaiting the induction step.


I really wish there was a native way to provide a suggested command to run next, and then let your own shell deal with it, after the user’s Enter keypress.
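There's no universal native mechanism, but as a sketch, bash's `read -e -i` can pre-fill an editable readline buffer with the suggested command (bash-specific; zsh's `print -z` does the same idea). The ffmpeg command here is just a placeholder suggestion:

```shell
#!/usr/bin/env bash
# Offer a suggested command; the user can edit it in place and press
# Enter to run it, or Ctrl-C to bail out.
suggested='ffmpeg -i in.mp4 -c copy out.mkv'
read -e -p '> ' -i "$suggested" cmd || cmd=""
eval "$cmd"
```

Note that `-i` only takes effect when stdin is a terminal; with piped input, plain `read` semantics apply.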


http://openinterpreter.com will just run commands for you


Please, do not use subprocess.Popen. Use something like plumbum, way safer and more robust.


If you can make a python program which only uses stdlib, it becomes wonderfully portable and easy to work with. Also, significantly more people use stdlib, there is more knowledge on the internet, and xz-style supply chain attacks are significantly less likely.

This is why my advice to everyone is to use python's stdlib as much as possible, and avoid using Python's external libraries unless they significantly simplify code.

Plumbum seems nice (and is also packaged in Debian/Ubuntu, which is a plus), but it does not seem to be significantly safer than correctly written subprocess code, and it won't even save that many lines in this particular example.


I agree and disagree. Python's subprocess has been the cause of many unfortunate, time-consuming bugs among users who think they are properly executing external commands but in the end don't realize their logic is full of errors, with no indication that something went wrong.

I agree that a standard interface is always better, though not at the cost of productivity. A better interface than the current subprocess one is needed, and I think plumbum is the direction to go.


I am curious: which errors do you find most problematic? We have an internal codebase with hundreds of developers, and we haven't observed many subprocess-related bugs. And the ability to print the command being executed (via shlex.join) so it can be copied to a shell and debugged there is very nice.

That said, there are a bunch of rules in our internal style guide about this, such as: avoid shell=True unless you need shell functionality and know how shell quoting works; use Python instead of tools like grep/head/aws/etc. when performance permits; check returncode after Popen calls; correctly quote ssh args (they are tricky!)
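A minimal sketch of two of those rules combined — print the command via shlex.join, then check returncode (the failing command is illustrative):

```python
import shlex
import subprocess

args = ["ls", "/definitely-not-a-real-path"]   # illustrative failing command
print("running:", shlex.join(args))            # copy-pasteable into a shell

proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE, text=True)
out, err = proc.communicate()
if proc.returncode != 0:                       # never ignore the exit status
    print(f"command failed ({proc.returncode}): {err.strip()}")
```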


quoting is the enemy.


shlex.quote all the things. So many dumb problems solved trying to manually escape shell commands.
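For example (the filename is a deliberately hostile made-up value):

```python
import shlex

# shlex.quote wraps anything with shell metacharacters in single quotes,
# so it survives being embedded in a shell command line.
filename = "evil; rm -rf ~ .mp4"
cmd = "ffmpeg -i " + shlex.quote(filename) + " out.mkv"
print(cmd)  # -> ffmpeg -i 'evil; rm -rf ~ .mp4' out.mkv
```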


Is that actually an ethernet port, or is it just an RJ45 for something like I2C, UART or some other protocol? It's also got a little port for a Bluetooth controller, which came with my desk. I bet it's possible to reverse engineer whatever protocol that is using over bluetooth/BLE.


This. It's not Ethernet. This article has some dissection of the data going over those ports: https://hackaday.io/project/4173-uplift-desk-wifi-link


I've heard blame partially put on carriers - they initially resisted even carrying the Treo line without putting limits on what Handspring could do with it.

Apple had the iPod - and, crucially, customers - could bring these customers to the carriers, and so they could dictate more.

https://www.youtube.com/watch?v=b9_Vh9h3Ohw (the part I'm referencing is about 20 minutes in).


Wait, you're right, 0x20 is space and 0x09 is ht/tab... they've swapped spaces for tabs...


The payload is supposedly a corrupted xz file. An overzealous "replace spaces with tabs" formatter could easily cause this sort of corruption, so it seems like a perfectly reasonable test case.


For those panicking, here are some key things to look for, based on the writeup:

- A very recent version of liblzma5 - 5.6.0 or 5.6.1. This was added in the last month or so. If you're not on a rolling release distro, your version is probably older.

- A Debian- or RPM-based distro of Linux on x86_64. In an apparent attempt to make reverse engineering harder, it does not seem to apply when built outside of deb or rpm packaging. It is also specific to Linux.

- Running OpenSSH sshd from systemd. OpenSSH as patched by some distros only pulls in libsystemd for logging functionality, which pulls in the compromised liblzma5.

Debian testing already has a version called '5.6.1+really5.4.5-1' that is really the older version 5.4.5, repackaged with a newer version number to convince apt that it is in fact an upgrade.

It is possible there are other flaws or backdoors in liblzma5, though.
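A couple of quick checks for the first and third points (the `ldd` check matches what others in this thread suggest; exact package names vary by distro):

```shell
# which xz/liblzma version is installed; 5.6.0 and 5.6.1 are the bad ones
xz --version

# does your sshd even link liblzma? empty output here means it doesn't
ldd "$(command -v sshd)" 2>/dev/null | grep liblzma || true
```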


Focusing on sshd is the wrong approach. The backdoor was in liblzma5. It was discovered to attack sshd, but it very likely had other targets as well. The payload hasn't been analyzed yet, but _almost everything_ links to liblzma5. Firefox and Chromium do. Keepassxc does. And it might have made arbitrary changes to your system, so installing the security update might not remove the backdoor.


From what I'm understanding it's trying to patch itself into the symbol resolution step of ld.so specifically for libcrypto under systemd on x86_64. Am I misreading the report?

That's a strong indication it's targeting sshd specifically.


Lots of software links both liblzma and libcrypto. As I read Andres Freund's report, there is still a lot of uncertainty:

"There's lots of stuff I have not analyzed and most of what I observed is purely from observation rather than exhaustively analyzing the backdoor code."

"There are other checks I have not fully traced."


As mentioned many times in other places now, this account had control over xz code for 2 years. The discovered CVE might be just the tip of the iceberg.


It checks for argv[0] == "sshd"


Ubuntu still ships 5.4.5 on 24.04 (atm).

I did a quick diff of the source (.orig file from packages.ubuntu.com) and the content mostly matched the 5.4.5 github tag except for Changelog and some translation files. It does match the tarball content, though.

So for 5.4.5 the tagged release and download on github differ.

It does change format strings, e.g.

   +#: src/xz/args.c:735
   +#, fuzzy
   +#| msgid "%s: With --format=raw, --suffix=.SUF is required unless writing to stdout"
   +msgid "With --format=raw, --suffix=.SUF is required unless writing to stdout"
   +msgstr "%s: amb --format=raw, --suffix=.SUF és necessari si no s'escriu a la sortida estàndard"

There is no second argument to that printf, for example. I think there is at least a format string injection in the older tarballs.

[Edit] formatting


FYI, your formatting is broken. Hacker News doesn't support backtick code blocks, you have to indent code.

Anyway, so... the xz project has been compromised for a long time, at least since 5.4.5. I see that this JiaT75 guy has been the primary guy in charge of at least the GitHub releases for years. Should we view all releases after he got involved as probably compromised?


Thank you, formatting fixed.

My TLDR is that I would regard all commits by JiaT75 as potentially compromised.

Given the ability to manipulate git history, I am not sure a simple time-based revert is enough.

It would be great to compare old copies of the repo with the current state. There is no guarantee that the history wasn't tampered with.

Overall the only safe action would IMHO be to establish a new upstream from an assumed-good state, then fully audit it. At that point we should probably just abandon it and use zstd instead.


Zstd belongs to the class of speed-optimized compressors providing “tolerable” compression ratios. Their intended use case is wrapping some easily compressible data with negligible (in the grand scale) performance impact. So when you have a server which sends gigabits of text per second, or caches gigabytes of text, or processes a queue with millions of text protocol messages, you can add compression on one side and decompression on the other to shrink them without worrying too much about CPU usage.

Xz is an implant of 7zip's LZMA(2) compression into traditional Unix archiver skeleton. It trades long compression times and giant dictionaries (that need lots of memory) for better (“much-better-than-deflate”) compression ratios. Therefore, zstd, no matter how fashionable that name might be in some circles, is not a replacement for xz.

It should also be noted that those LZMA-based archive formats might not be considered state-of-the-art today. If you worry about data density, there are options for both faster compression at the same size and better compression in the same amount of time (provided that the data is generally compressible). 7zip and xz are widespread and well tested, though, and allow decompression to be fast, which might be important in some cases. Alternatives often decompress much more slowly. This is also a trade-off between total time spent on X nodes compressing data and Y nodes decompressing it. When X is 1 and Y is in the millions (say, software distribution), you can spend A LOT of time compressing, even for relatively minuscule gains, without affecting the scales.

It should also be noted that many (or most) decoders of top-compressing archivers are implemented as virtual machines executing chains of transform and unpack operations, defined in the archive file, over pieces of data also saved there. Or, looking from a different angle, as complex state machines initializing their state from complex data in the archive. The compressor tries to find the most suitable combination of basic steps based on the input data, and stores the result in the archive. (This is taken to its logical conclusion in neural network compression tools, which learn what to do with the data from the data itself.) As some people may know, implementing all that byte juggling safely and efficiently is a herculean task, and compression tools have had exploits in the past because of it. Switching to a better-compressing solution might introduce a lot more potentially exploitable bugs.


Arch Linux switched from xz to zstd, with a negligible increase in size (<1%) but a massive speedup in decompression. This is exactly the use case of many people downloading ($$$) and decompressing; it is the software distribution case. Other distributions are following that lead.

You should use ultra settings and >=19 as the compression level. E.g. Arch used 20, and higher compression levels do exist, but they were already at a <1% increase.

It does beat xz for these tasks. It's just not the default settings as those are indeed optimized for the lzo to gzip/bzip2 range.
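As a rough sketch of that kind of invocation (zstd CLI flags; -19 is the top normal level, 20-22 additionally need --ultra; the sample data is made up):

```shell
# make some compressible sample data
printf 'hello world %.0s' $(seq 1 2000) > sample.txt

# compress at a high level, keeping the original (-k)
zstd -19 -f -k sample.txt

# round-trip to make sure nothing was lost
zstd -d -f sample.txt.zst -o roundtrip.txt
cmp sample.txt roundtrip.txt && echo OK
```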


My bad, I was too focused on that class in general, imagining “lz4 and friends”.

Zstd does reach LZMA compression ratios at its high levels, but compression times also rise to LZMA levels. Which, obviously, was planned in advance to cover both high-speed online applications and slow offline compression (unlike, say, brotli). The official limit on levels can also be explained by the absence of gains on most inputs in development tests.

Distribution packages contain binary and mixed data, which might be less compressible. For text and mostly-text, I suppose that some old-style LZ-based tools can still produce an archive roughly 5% smaller (and still unpack fast); other compression algorithms can certainly squeeze it much better, but have symmetric time requirements. I was worried about the latter kind being introduced as a replacement solution.


> the lzo to gzip/bzip2 range

bzip2 is a pig that has no place being in the same sentence as lzo and gzip. Its niche was maximum compression no matter the speed, but it hasn't been relevant even there for a long time.

Yet tools still need to support bzip2 because bzip2 archives are still out there and are still being produced. So we can't get rid of libbz2 anytime soon - same for liblzma.


Note that the xz CLI does not expose all available compression options of the library. E.g. rust release tarballs are xz'd with custom compression settings. But yeah, zstd is good enough for many uses.


Looking forward to the time when Meta will make https://github.com/facebookincubator/zstrong.git public

found it mentioned in https://github.com/facebook/proxygen/blob/main/build/fbcode_..., looks like it's going to be cousin of zstd, but maybe for the stronger compression use cases


Not just Jia. There are some other accounts of concern with associated activity or short-term/bot-ish names.



Interesting, this would suggest exploits other than the known sshd one.


Note that zstd (the utility) currently links to liblzma since it can compress and decompress other formats.


Lol as if there weren't enough general archivers already.


> Given the ability to manipulate gitnhistory I am not sure if a simple time based revert is enough.

Rewritten history is not a real concern because it would have been immediately noticed by anyone updating an existing checkout.

> Overall the only safe action would IMHO to establish a new upstream from an assumed good state, then fully audit it. At that point we should probably just abandon it and use zstd instead.

This is absurd and also impossible without breaking backwards compatibility all over the place.


"#, fuzzy" means the translation is out-of-date and it will be discarded at compile time.


I tried to get the translation to trigger by switching to french and it does not show. You are right.

So it's just odd that the tags and release tarballs diverge.


RHEL9 is shipping 5.2.5; RHEL8 is on 5.2.4.


Thanks for the heads up.


> Debian testing already has a version called '5.6.1+really5.4.5-1' that is really an older version 5.4, repackaged with a newer version to convince apt that it is in fact an upgrade.

I'm surprised .deb doesn't have a better approach. RPM has epoch for this purpose http://novosial.org/rpm/epoch/index.html


Debian has epochs, but it's a bad idea to use them for this purpose.

Two reasons:

1. Once you bump the epoch, you have to use it forever.

2. The deb filename often doesn't contain the epoch (we use a colon, which isn't valid on many filesystems), so an epoch-revert will give the same file name as pre-epoch, which breaks your repository.

So, the current best practice is the "+really" thing.


Thanks for the info, the filename thing sounds like a problem, one aspect of the epoch system doesn't work for the purpose then.


Honestly, the Gentoo-style global blacklist (package.mask) to force a downgrade is probably a better approach for cases like this. Epochs only make sense if your upstream is insane and does not follow a consistent numbering system.


Gentoo also considers the repository (+overlays) to be the entire set of possible versions so simply removing the bad version will cause a downgrade, unlike debian and RPM systems where installing packages outside a repository is supported.


Stop the cap, your honor. There is not a single filesystem that prevents you from using colons in filenames except exFAT. I went ahead and checked, and ext4, xfs, btrfs, zfs, and even reiserfs let you use any characters you want except \0 and /.

And I fail to see why bumping the epoch would ever be a problem. Having to keep using the epoch is not a reason why it's bad.


Got this on OpenSUSE: `5.6.1.revertto5.4-3.2`


.deb has epochs too, but I think Debian developers avoid it where possible because 1:5.4.5 is interpreted as newer than anything without a colon, so it would break eg. packages that depend on liblzma >= 5.0, < 6. There may be more common cases that aren't coming to mind now.


Seems like Debian is mixing too many things into the package version; the version used for deciding on upgrades and the ABI version for dependencies should be decoupled, like in modern RPM distros.


If a binary library ABI is backwards-incompatible, they change the package name. I was just guessing at the reason epoch is avoided and that <6 is probably an awful example.

So now I actually bothered to look it up, and it turns out the actual reason is that the epoch changes what version is considered "greater", but it's not part of the .deb filename, so you still can't reuse version numbers used in the past. If you release 5.0, then 5.1, then you want to rollback and release 1:5.0, it's going to break things in the Debian archives. https://www.debian.org/doc/debian-policy/ch-binary.html#uniq...

Additionally, once you add an epoch you're stuck with it forever, while if you use 5.1+really5.0, you can get rid of the kludge when 5.2 is out. https://www.debian.org/doc/debian-policy/ch-controlfields.ht...
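Both behaviors can be checked with dpkg itself (on any Debian-ish system; the version strings are the ones from this thread):

```shell
# an epoch outranks any un-epoched version, no matter the number
dpkg --compare-versions '1:5.0' gt '5.6.1' && echo "epoch wins"

# the +really trick sorts after the version it pretends to supersede
dpkg --compare-versions '5.6.1+really5.4.5-1' gt '5.6.1-1' && echo "+really wins"
```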


I really like the XBPS way of the reverts keyword in the package template that forces a downgrade from said software version. It's simple but works without any of the troubles RPM epochs have with resolving dependencies as it's just literally a way to tell xbps-install that "yeah, this is a lower version number in the repository but you should update anyway".


Debian packages can have epochs too. I’m not sure why the maintainers haven’t just bumped the epoch here.

Maybe they’re expecting a 5.6.x release shortly that fixes all these issues & don’t want to add an epoch for a very short term packaging issue?


> If you're not on a rolling release distro, your version is probably older.

Ironic considering security is often advertised as a feature of rolling release distros. I suppose in most instances it does provide better security, but there are some advantages to Debian's approach (stable Debian, that is).


>Ironic considering security is often advertised as a feature of rolling release distros.

Security is a feature of rolling release. But supply-chain attacks like this are the exception to the rule.


Isn't that what security-updates-only is for?

This particular backdoor is not shipped inside of a security patch, right?


i mean, rolling implies rolling 0-days, too.


The article gives a link to a simple shell script that detects the signature of the compromised function.

> Running OpenSSH sshd from systemd

I think this is irrelevant.

From the article: "Initially starting sshd outside of systemd did not show the slowdown, despite the backdoor briefly getting invoked." If I understand correctly the whole section, the behavior of OpenSSH may have differed when launched from systemd, but the backdoor was there in both cases.

Maybe some distributions that don't use systemd strip the libxz code from the upstream OpenSSH release, but I wouldn't bet on it if a fix is available.


> From the article: "Initially starting sshd outside of systemd did not show the slowdown, despite the backdoor briefly getting invoked." If I understand correctly the whole section, the behavior of OpenSSH may have differed when launched from systemd, but the backdoor was there in both cases.

It looks like the backdoor "deactivates" itself when it detects being started interactively, as a security researcher might. I was eventually able to circumvent that, but unless you do so, it'll not be active when started interactively.

However, the backdoor would also be active if you started it with an shell script (as the traditional sys-v rc scripts did) outside the context of an interactive shell, as TERM wouldn't be set either in that context.

> Maybe some distributions that don't use systemd strip the libxz code from the upstream OpenSSH release, but I wouldn't bet on it if a fix is available.

There's no xz code in openssh.


> Maybe some distributions that don't use systemd strip the libxz code from the upstream OpenSSH release, but I wouldn't bet on it if a fix is available.

OpenSSH is developed by the OpenBSD project, and systemd is not compatible with OpenBSD. The upstream project has no systemd or liblzma code to strip. If your sshd binary links to liblzma, it's because the package maintainers for your distro have gone out of their way to add systemd's patch to your sshd binary.

> From the article: "Initially starting sshd outside of systemd did not show the slowdown, despite the backdoor briefly getting invoked." If I understand correctly the whole section, the behavior of OpenSSH may have differed when launched from systemd, but the backdoor was there in both cases.

From what I understand, the backdoor detects if it's in any of a handful of different debug environments. If it's in a debug environment or not launched by systemd, it won't hook itself up. ("nothing to see here folks...") But if sshd isn't linked to liblzma to begin with, none of the backdoor's code even exists in the processes' page maps.

I'm still downgrading to an unaffected version, of course, but it's nice to know I was never vulnerable just by typing 'ldd `which sshd`' and not seeing liblzma.so.


I think the distributions that do use systemd are the ones that add the libsystemd code, which in turn brings in the liblzma5 code. So it may not be entirely relevant how it is run, but it does need to be a patched version of OpenSSH.


I did notice that my debian-based system got noticeably slower and unresponsive at times the last two weeks, without obvious reasons. Could it be related?

I read through the report, but what wasn't directly clear to me was: what does the exploit actually do?

My normal internet connection has such an appalling upload that I don't think anything relevant could be uploaded. But I will change my ssh keys asap.


> I did notice that my debian-based system got noticeably slower and unresponsive at times the last two weeks, without obvious reasons. Could it be related?

Possible but unlikely.

> I read through the report, but what wasn't directly clear to me was: what does the exploit actually do?

It injects code that runs early during sshd connection establishment. Likely allowing remote code execution if you know the right magic to send to the server.


Thank you for the explanation.


Are you on stable/testing/unstable?

With our current knowledge, stable shouldn’t be affected by this.


Stable, luckily. Thank you for the information.


  $ dpkg-query -W liblzma5
  liblzma5:amd64 5.4.1-0.2


Tumbleweed has a package: liblzma5-5.6.1.revertto5.4-3.2.x86_64 FYI


revertto probably just means "revert to" but it does sound quite italian lol.


A couple of good videos about the business of dollar stores with similar takes:

https://www.youtube.com/watch?v=p4QGOHahiVM (Last Week Tonight with John Oliver)

https://www.youtube.com/watch?v=vQpUV--2Jao (Wendover Productions)


I'm sympathetic to this view, but ultimately I think it's likely they are providing some benefit (even if their practices aren't 100% ethical) or else consumers wouldn't prefer them so much. I don't like to assume that a huge % of the population is too stupid to make basic choices.


Honestly, I don't think that assumption is necessary, and I don't think either video suggests it in the slightest.


Interesting, they’re not disabling adb. It sounds like they’re just disabling the ability for apps on the device to connect to adb on localhost and run (e.g.) shell commands that way.


AIUI this is also the case in "upstream" Android but apps like Shizuku work around it, and those workarounds seem like they'd transfer fine to FireTV.


Yes I was worried when I started reading, was assuming they’d pulled ADB altogether. This isn’t quite such a big deal (to me at least)


Working here, in a subway station in NYC.


As others have said, use the app. Got a double cheeseburger and a free medium fries there today, and this was in Manhattan. $3.79 plus tax.

Even if you don’t use the app, if you just buy a double cheeseburger and a small fries it’s $3.99.

They're pushing their prices up, but there are still cheap meals to be had.


With the app, you're still paying, this time with your privacy.


How? On iPhone I only give it location permissions while I'm using the app, and I pay with Apple Pay to avoid giving them much that they would be able to correlate elsewhere.


An app that needs to know your location even though you're going to the store to pick up the order?


I believe they use the proximity to choose the closest store and to start the order when you get there but are still waiting in the drive-thru line to give them your name/code.

I haven't tried to disable it completely, but even if I can't, I don't find that limited location tracking for when I'm actually ordering food and driving there to be an issue.


The only thing required for that is a zip code. In any case, once you choose a location, it can simply save that preference instead of tapping your lat/long coordinates every single time. Just more data to sell to a broker down the line.


Exactly. Currently the iOS App Store has the McDonalds app as having...

Data Used to Track You : Contact Info & Identifiers

Data Linked to You : Purchases, Financial Info, Location, Contact Info, Search History, Browsing History, Identifiers, Usage Data and Diagnostics


> Data Used to Track You : Contact Info & Identifiers

That's email address and a device ID.

> Data Linked to You : Purchases, Financial Info, Location, Contact Info, Search History, Browsing History, Identifiers, Usage Data and Diagnostic

That's purchases made from the app, and info about how you paid for them. The search and browsing history are the search and browsing history from web views within the app. The usage data and diagnostics data are usage and diagnostics for that app.

Putting it all together, what they get are the email you gave them when you made the account, a device ID, location, and they can see whatever you actually do in the app.


Is the app giving away any more data than what was being given away by using a credit card?


Duckduckgo app to stop much of the invasion on Android


That doesn't sound right. The fries alone cost nearly that much now, and I'm in a much less expensive area.


For the $3.99 one, it just becomes a bundle when ordered together. Also works with six piece nuggets. I’ve seen it on the menu in the store but it’s not incredibly prominent. And, in the app it’s basically just undocumented and the price drops when they’re both in your cart. It also doesn’t count as a promo in the app, so you can order more than one bundle and even use a promo with it as well.

https://imgur.com/a/2OOF8Fm


I’ve heard stories of people putting fake QR codes linking to fake payment portals for parking. I don’t think an app solves this problem though.

https://www.schneier.com/blog/archives/2022/01/fake-qr-codes...

