
bonki · 2024-05-23T13:03:51

The lighting might indeed not be the best but I really like it, well done!

bonki · 2024-05-03T10:58:33

... in the US.

bonki · 2024-04-30T12:09:04

Do you have a source on the cash gifts?

cenamus · 2024-04-30T14:55:19

In German, but it wasn't cash, just gift bags as it turns out.

https://www.derstandard.at/story/3000000213247/wiener-polizi...

bonki · 2024-05-03T22:19:31

Yes, that's the article I remember. Was just curious whether I missed something or not. Thanks!

bonki · 2024-04-26T15:34:57

To my surprise, I have seen this more often than I would have expected (expectation = zero). It's not super common but not unheard of.

bonki · 2024-04-24T10:51:53

Wasn't Private Location removed from the store though? If it's the app that I'm thinking of, I used it before, but it stopped working because the map API changed, which broke the app, and it never got updated, so it was removed. There should be a GitHub issue about that somewhere.

johnisgood · 2024-04-24T10:56:32

You are right. I made the necessary corrections.

bonki · 2024-04-10T13:15:34

Someone posted this [1] here recently which I found extremely informative. Unless I've missed something zstd outperforms bzip2 in all cases there?

[1] https://insanity.industries/post/pareto-optimal-compression/

treffer · 2024-04-10T14:06:04

There is one thing you can't do with most algorithms: parallelize decompression. That's because most compression algorithms use sliding windows to remove repetitive sections.

And decompression speed also drops as compression ratio increases.

If you transfer over say a 1GBit link then transfer speed is likely the bottleneck as zstd decompression can reach >200MB/s. However if you have a 10GBit link then you are CPU bound on decompression. See e.g. decompression speed at [1].
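A rough sketch of that comparison, using the >200MB/s figure quoted above. The function name is made up for illustration, and it deliberately ignores protocol overhead and the compression ratio, so treat it as back-of-the-envelope only:

```python
def bottleneck(link_gbit_per_s: float, decompress_mb_per_s: float) -> str:
    """Return which side limits throughput: 'link' or 'cpu'.

    Assumption: raw rates are compared directly; protocol overhead and
    the compression ratio are ignored.
    """
    link_mb_per_s = link_gbit_per_s * 1000 / 8  # 1 Gbit/s = 125 MB/s
    return "link" if link_mb_per_s < decompress_mb_per_s else "cpu"


print(bottleneck(1, 200))   # 125 MB/s link vs. 200 MB/s decompression
print(bottleneck(10, 200))  # 1250 MB/s link vs. 200 MB/s decompression
```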

Bzip2 is not window but block based (level 1 == 100kb blocks, 9 == 900kb blocks iirc). This means that, given enough cores, both compression and decompression can parallelize, at something like 10-20MB/s per core. So somewhere >10 cores you will start to outperform zstd.
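The break-even point can be estimated from the numbers above (roughly 200MB/s for single-stream zstd, 10-20MB/s per core for bzip2). This is only a sketch: it assumes perfect scaling across cores, and the helper name is hypothetical:

```python
import math


def cores_to_match(single_stream_mb_per_s: float, per_core_mb_per_s: float) -> int:
    """Cores needed for a block-parallel decoder to match a single-stream one.

    Assumption: throughput scales linearly with the number of cores.
    """
    return math.ceil(single_stream_mb_per_s / per_core_mb_per_s)


print(cores_to_match(200, 20))  # optimistic per-core bzip2 speed
print(cores_to_match(200, 10))  # pessimistic per-core bzip2 speed
```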

Granted, that's a very very corner case. But one you might hit with servers. That's how I learned about it. But so far I've converged on zstd for everything. It is usually not worth the hassle to squeeze these last performance bits out.

[1] https://gregoryszorc.com/blog/2017/03/07/better-compression-...

dralley · 2024-04-10T14:21:18

That's possible with pzstd in any case. zstd upstream has a plan to eventually support parallel decompression natively but hasn't prioritized it given the complexity and lack of immediate need.

https://github.com/facebook/zstd/issues/2499#issuecomment-78...

treffer · 2024-04-10T15:23:15

The issue talks about one vs. multiple frames. That's exactly the issue. It's not a matter of complexity, it's a matter of bad compromises.

The issue can be easily played through. The most simplistic encoding where the issue happens is RLE (run length encoding).

Say we have 1MB of repeated 'a'. Originally 'aaa....a'. We now encode it as '(length,byte)', so the stream turns into (1048576,'a').

Now we would want to parallelize it over 16 cores. So we split the 1MB into 16 64k chunks and compress each chunk independently. This works but is ~16x larger.
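A toy version of this in Python, just to make the numbers concrete. The `rle_encode` helper is an illustrative sketch, not any real codec's format:

```python
from itertools import groupby


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Encode data as (length, byte) pairs, one pair per run."""
    return [(sum(1 for _ in run), byte) for byte, run in groupby(data)]


data = b"a" * (1 << 20)  # 1MB of repeated 'a'

# Single stream: the whole input is one run.
whole = rle_encode(data)

# 16 independent 64k chunks: each chunk needs its own (length, byte) pair.
chunk_size = len(data) // 16
parallel = [
    pair
    for start in range(0, len(data), chunk_size)
    for pair in rle_encode(data[start:start + chunk_size])
]

print(len(whole), len(parallel))  # number of pairs in each encoding
```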

Similar things happen for window based algorithms. We encode repeated content as (offset,length), referencing older occurrences. Now imagine 64k of random data, repeated 16 times. The parallel version can't compress anything (16x random data), the non-parallel version will compress it roughly 16:1.
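This is easy to try out. The sketch below uses `lzma` from the Python standard library as a stand-in for zstd (an assumption made only because it ships with Python and is also dictionary/window based); exact sizes will vary, but the gap between the two variants should be obvious:

```python
import lzma
import os

block = os.urandom(64 * 1024)  # 64k of random (incompressible) data
data = block * 16              # ... repeated 16 times

# One stream: later copies can reference the first one.
whole = len(lzma.compress(data))

# 16 independent streams: no chunk can reference another chunk.
chunked = sum(
    len(lzma.compress(data[start:start + len(block)]))
    for start in range(0, len(data), len(block))
)

print(f"input: {len(data)}  single stream: {whole}  16 chunks: {chunked}")
```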

There is a trick to avoid this downside. The lookup is not unlimited, there is a maximum window size to limit memory usage. For compatibility it's 8MB for zstd (at level 19), but you can go all the way to 2GB (ultra, 22, long=31). As you make chunks significantly larger than the window you are only losing out on the new ramp up. E.g. if you use 80MB chunks then you have a bit less than 10% of the file encoded worse. You could still double your encoded size with a well crafted file. If you don't care about parallel decompression then you are able to only parallelize parts like the lookup search. This gives good speedup, but only on compression. That's the current parallel compression approach in most cases (iirc) leading to a single frame, just faster. The problem is that back-references can only be resolved backwards.

The whole problem is not implementation complexity. It's something you algorithmically can't do with current window based approaches without significant tradeoffs on memory consumption, compression ratio and parallel execution.

For bzip2 the file is always chunked at 900kb boundaries at most. Each block is encoded independently and can be decoded independently. It avoids this whole tradeoff issue altogether.
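A minimal sketch of the idea with Python's `bz2` module. Note the simplification: each block is written as its own complete bzip2 stream here, whereas a real parallel decoder would have to locate the block boundaries inside a single stream, which is not shown. The helper names and the sample data are made up:

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

BLOCK = 900 * 1000  # nominal bzip2 -9 block size


def compress_blocks(data: bytes, block: int = BLOCK) -> list[bytes]:
    """Compress each block independently (one bzip2 stream per block)."""
    return [bz2.compress(data[i:i + block]) for i in range(0, len(data), block)]


def decompress_parallel(streams: list[bytes]) -> bytes:
    """Decompress the independent streams concurrently and reassemble them."""
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(bz2.decompress, streams))


# Sample input: a few MB of numbered lines.
data = b"".join(str(i).encode() + b"\n" for i in range(400_000))
streams = compress_blocks(data)
restored = decompress_parallel(streams)
print(len(streams), restored == data)
```

Because no block depends on another, the order in which they are decoded does not matter; only the reassembly has to be in order.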

I would also disagree with "no need". Zstd easily outperforms tar, but even my laptop SSD is faster than the zstd speed limits. I just don't have the _external_ connectivity to get something onto my disk fast enough. I've also worked with servers 10 years ago where the PCIe bus to the RAID card was the limiting factor. Again easily exceeding the speed limits.

Anyway, as mentioned a few times it's an odd corner case. And one can't go wrong by choosing zstd for compression. But it is real fun to dig into these issues and look at them, I hope this sparks some interest in it!

dralley · 2024-04-10T16:10:38

My point is, it's already possible to use multiple independently compressed (and decompressible) frames with zstd if you really want to.

It's even in the zstd repo, under a "contrib" implementation

https://github.com/facebook/zstd/blob/87af5fb2df7c68cc70c090...

That does, of course, require that you compress it into multiple frames to begin with, which could be a problem if you don't control the source of the compressed files, because the default is a single frame. In theory if everyone used pzstd to compress their files, it would be strictly superior to BZ2 in nearly every circumstance. As it is, you do have to go out of your way to do that.

But I don't think that necessarily means the single-frame choice by default is a bad tradeoff. It's better in most circumstances. And if they do eventually figure out a reasonably efficient way to handle intra-frame parallel decompression, then it's just gravy.

queuebert · 2024-04-10T14:02:39

There are three kinds of people in my experience:

1. bzip2 -1

2. bzip2 -9

3. What's bzip2?

A huge amount of time is spent optimizing for #3. Maybe instead we should offer descriptive commands that convey the goals. Say, "squash", "speedup", and "deflate", or some such.

Shish2k · 2024-04-10T16:19:51

I think I grok bzip2 fairly well, but I can’t figure out what your descriptive commands would actually do :S

bonki · 2024-04-10T11:20:13

In which universe are German k and ch phonetically similar?

vidarh · 2024-04-11T19:08:01

I'm not sure what they meant, but "k" in some Germanic languages is pronounced with a sound similar to German "ch" in some contexts.

E.g. Norwegian "kirke" and German "Kirche" have almost the same sounds, except that the order of the "k" and "ch" sounds is reversed:

Norwegian "kirke" is IPA /çɪrkə/ vs German "Kirche", IPA: /ˈkɪʁçə/

Notably, this is not the case for either Scots "kirk" (/kɪɾk/) or English "church" (which is /t͡ʃɜːt͡ʃ/)

Interestingly, the words for church in these languages all come from Byzantine Greek via Proto-West Germanic, and English is the odd one out (as usual-ish) of the Germanic languages by having gone to "cirice" in Old English while all the other Germanic languages retained k's, and usually at least one "k" sound as well, and even in cases like German where one of the k's has become a "ch", the "ch" sound is the IPA /ç/ that is often spelled "k" in other Germanic languages.

bonki · 2024-04-11T22:23:56

The first "k" in Norwegian "kirke" (Swedish "kyrka") and "ch" in German "Kirche" are entirely different sounds (with exception(s), one German dialect comes to mind). The "k" in "kirke" is somewhere between German "sch" and "ch" but is neither really, the sound doesn't exist in [almost all] German [dialects].

Source: My native tongue is German and I've lived in Scandinavia for over a decade.

vidarh · 2024-04-12T11:35:08

[Sorry for the dissertation; I needed to look some of these up because I wanted to make sure I wasn't messing it up, and I fell down a rabbit hole - it's very fascinating to look at the subtle differences here; at least it is to me, because it also helps my own pronunciation]

It's not nearly that simple. I won't say you're wrong, because you might very well - even probably - be right for some (many even) pairs of German and Scandinavian dialects. But the Scandinavian languages are not interchangeable in this respect, including the Norwegian dialects.

The "clear" ch- start is the "norm" for Bokmål in Norwegian, with the caveat that there is no official Norwegian pronunciation, but in Nynorsk, the IPA used by Wiktionary is /²çʏrçɑ/ (compare /çɪrkə/ for Bokmål) - again with the caveat that there is no official pronunciation and this is a "middle of the road" sort of choice. This is the voiceless palatal fricative, and it's a fairly rare sound and one that Norwegian definitely has in common with German, but the precise variant might not match in all German dialects with all Norwegian dialects.

In Swedish you don't use the voiceless palatal fricative (IPA /ç/ - that's the ch) at all in kyrka - instead /²ɕʏrka/ or /²ɕørka/. Finnish Swedish /²tɕʏrka/ - I don't know whether they use it in any other words. Danish doesn't get even close for kirke (IPA [ˈkʰiɐ̯ɡ̊ə]). So first of all, let's discount those, as I wrote about Norwegian, and while the pronunciation is certainly mutually intelligible it uses different sounds than both German and Norwegian for this example.

I'm also going to mostly ignore Nynorsk, which is also different, and to infuriate any Nynorsk speakers who come across this, when I write "Norwegian", assume bokmål, or to make it difficult "the dialects spoken where Bokmål is the main written language" as technically Bokmål is the written language.

But even within the set of those speakers, you'll find a range going from near Nynorsk to near Swedish or near-ish - but not quite as near - Danish (at least not in pronunciation; in terms of vocabulary we're often nearer Danish than Swedish), with variants in between.

I'm a native Norwegian speaker, and I speak German, and it'll sound perfectly fine in Norwegian if you use the "ch" from Kirche at the start of "kirke" - at least in the German dialects I'm used to. It will match plenty of dialects, but it might not match specifically your Scandinavian language or dialect. In mine moving towards German "sch" would be very out of place.

I checked some German videos just to make sure I wasn't imagining things or have been mispronouncing ch/sch all these years, but the ones I found used a "ch" sound that's perfectly fine at the start of "kirke". You might be pickier than I am with respect to what is the exact same sound, but to me at least, there's a very distinct difference in the position of the tip of your tongue for German "ch" and "sch", and I struggle to even form the word "kirke" if I start close to the "sch" position.

BUT, the drift between the "ch" (voiceless palatal fricative) and the "rs" in "norsk" or "skj" in "skjønn", which in Norwegian is the voiceless retroflex fricative, and that certainly can sound like something in between German "ch" and "sch" but would - as you describe it - be neither, is an ongoing "problem" in Norwegian (that it's "retroflex" means it's pronounced by curling the tongue up like the alveolar fricative used in German "schone" or the palato-alveolar "sch" in fleisch - but not with the tongue lifted towards the hard palate like in both the German "sch" variants; you can shift between those sounds by holding the tip of the tongue in the same position and shifting the back of your tongue up/down - if you speak Norwegian, try switching between "schone" and "skjønn" - your tongue should typically lift at least a tiny bit when pronouncing "schone" and drop for "skjønn" but the difference can be very minor depending on dialect).

My mother used to work in nurseries, and I remember her despairing over the shift towards the retroflex (not in those words...) going back to my own childhood in the early 1980's, as it sounds awful to many of us. She spoke of it as something new the kids were doing and that they struggled to speak "properly", but I'm not convinced it was actually a new thing.

Those sounds are historically very distinct in Norwegian but Norwegian children struggle with separating them. E.g. "kjøtt" is the archetypical example of a word that "should be" pronounced with a firmly palatal "ch" (IPA /çœt/), but people tend to despair over how the pronunciation is drifting towards the retroflex (e.g. closer to "skjøtt") because the mis-pronunciation has become more and more prevalent. It's also one that adults would get really annoyed about when kids got it wrong.

It's a less pronounced and common "error" with "kirke", but I certainly wouldn't be surprised if it's creeping into that as well in some dialects, so again, to be clear I'm not saying you're wrong; I'm sure you're right for some/your dialect. Germanic/Scandinavian dialects are just infuriatingly diverse.

bonki · 2024-04-08T22:51:23

The name confuses me profoundly.

defrost · 2024-04-08T23:11:10

Yeah, I also kept thinking of the 1988 debut studio album by Big Pig and the urban dictionary usage of the word upon which the album was named.

bonki · 2024-04-07T18:02:00

I am currently rewatching Taskmaster while waiting for the current episodes to drop and I watched this exact episode yesterday...best show ever!

bloopernova · 2024-04-07T18:26:04

I'm really liking everyone in the new season, Steve Pemberton is practically Bob Mortimer levels of hilarious. Unfortunately I find it difficult to look at Nick Mohammed's Dracula makeup because I have an issue with distorted faces.

bonki · 2024-03-24T23:57:06

Unfortunately the UI is broken for me on mobile Android/Mull (Firefox)