0x0's comments

This is particularly interesting as there seems to have been, for decades, a general consensus that the problem of text compression is the same as the problem of artificial intelligence; see, for example, https://en.wikipedia.org/wiki/Hutter_Prize


"It is well established that compression is essentially prediction, which effectively links compression and language models (Delétang et al., 2023). The source coding theory from Shannon’s information theory (Shannon, 1948) suggests that the number of bits required by an optimal entropy encoder to compress a message ... is equal to the NLL of the message given by a statistical model." (https://ar5iv.labs.arxiv.org/html//2402.00861)

I will say again that Li et al 2024, "Evaluating Large Language Models for Generalization and Robustness via Data Compression", which evaluates LLMs on their ability to predict future text, is amazing work that the field is currently sleeping on.
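
To make the "bits = NLL" link in the quote concrete, here is a minimal sketch. It assumes a deliberately crude unigram character model fitted to the message itself, and uses zlib only as a familiar point of comparison; neither choice comes from the paper.

```python
import math
import zlib
from collections import Counter

def nll_bits(message: str) -> float:
    """Negative log-likelihood of `message` in bits under a unigram
    character model fitted to the message itself (illustrative only)."""
    counts = Counter(message)
    total = len(message)
    # Each occurrence of a character with probability p costs -log2(p) bits.
    return -sum(n * math.log2(n / total) for n in counts.values())

message = "abracadabra " * 50
bits = nll_bits(message)
print(f"unigram NLL : {bits:.0f} bits ({bits / 8:.0f} bytes)")
print(f"zlib output : {len(zlib.compress(message.encode(), 9))} bytes")
```

An ideal entropy coder driven by this unigram model would need roughly the NLL in bits (ignoring the cost of sending the model). On such a repetitive input zlib should come out well below that, because it captures repetition the unigram model cannot see: a better predictor gives a shorter code.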


I’m not sure how this generalises to grammar-based compression, such as SEQUITUR for example. Incidentally, LZW is also grammar-based, though not advertised as such.

Devising the minimal grammar that generates the text is NP-hard (https://en.m.wikipedia.org/wiki/Smallest_grammar_problem)
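
Since the smallest grammar is out of reach, practical schemes settle for heuristics. Below is a minimal sketch of one such heuristic, a greedy pair replacement in the style of Re-Pair (not SEQUITUR itself, and not guaranteed to be anywhere near minimal). The rule names `R0`, `R1`, ... are made up for the illustration.

```python
from collections import Counter

def greedy_grammar(text: str):
    """Repeatedly replace the most frequent adjacent pair of symbols with a
    new nonterminal. Returns (start_sequence, rules)."""
    seq = list(text)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, nothing left to factor out
        name = f"R{len(rules)}"  # multi-character, so it cannot clash with a terminal
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(symbol: str, rules: dict) -> str:
    """Unfold a symbol back into the text it generates."""
    if symbol not in rules:
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)

start, rules = greedy_grammar("abcabcabcabc")
print(start, rules)
print("".join(expand(s, rules) for s in start))
```

The grammar generates exactly one string, the original text, so storing the start sequence plus the rules is a lossless encoding.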

Math seems very limited when it comes to reasoning about generative grammars and their unfolding into text. Had the apparatus been there, we’d probably have had grammar/Prolog-based AI long ago…


Grammars are not AI, it's just another formalism (like regular expressions, Turing machines etc.) - formalism alone doesn't solve anything.

In formal language theory, you have different classes of grammars. The most general ones correspond to Turing machines, i.e. they are a glorified assembler and you can do anything. The most restricted (in the Chomsky hierarchy), "Type 3" grammars, are basically another notation for regular expressions, and they describe regular languages.
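
As a small illustration of the Type 3 case, here is a sketch with a made-up right-linear grammar and the regular expression I believe it corresponds to. Both the grammar and the regex `a*b+` are hypothetical examples; the loop just checks that they agree on all short words.

```python
import re
from itertools import product

# Hypothetical right-linear (Type 3) grammar:
#   S -> a S | b A
#   A -> b A | (empty)
# Each rule is (terminal, next_nonterminal); None means the derivation stops.
RULES = {
    "S": [("a", "S"), ("b", "A")],
    "A": [("b", "A"), ("", None)],
}

def derives(word: str, nonterminal: str = "S") -> bool:
    """True if `word` can be derived from `nonterminal`."""
    for terminal, nxt in RULES[nonterminal]:
        if nxt is None:
            if word == terminal:
                return True
        elif terminal and word.startswith(terminal):
            if derives(word[len(terminal):], nxt):
                return True
    return False

REGEX = re.compile(r"a*b+")

# Compare grammar and regex on every word over {a, b} up to length 5.
for n in range(6):
    for letters in product("ab", repeat=n):
        word = "".join(letters)
        assert derives(word) == bool(REGEX.fullmatch(word)), word
print("grammar and regex agree on all words up to length 5")
```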

There are algorithms for learning grammars, but the issue with that is that the induced grammars may not resemble anything that a human may write (in the same way that a clustering algorithm often does not give you the clusters you want).

But to answer your question, we need to separate the discussion of the appropriate representation from the method used to solve a problem. I believe grammar-based compression - if you accept probabilistic grammars - is similar to LLM-based compression at some level in the sense that highly probable sequences of words get learned (whether by dictionary, grammar, neural network = LLM, could be just an implementation detail). Whichever you choose, you still need to solve the problem you are trying to solve (any grammar formalism still needs a parsing algorithm, and an actual grammar that does something useful - even after you develop a parser generator).

[Side rant, not responding specifically to the parent or OP: as a linguist, I'd also warn everybody against using "AI" with an article: *"an AI" (asterisk marks wrong use). It wrongly suggests human-like properties when it's actually just a matrix of numbers that encode a model. Here is a test whether you are using "AI" right: replace it by "Applied Statistics" in a sentence and see if you would still say it.]

AI is just an academic field (ill-named for historic reasons), subpart of computer science, and while it's fair to talk about useful representations for modeling human-like behaviors, we should focus on what intelligence is, and talk about the limits of concrete models and possibilities to extend them.

The thing about LLMs is they are a bit like the perfect snake oil salesman: extremely articulate, but knowing little to nothing about a lot and understanding nothing. (Whatever one criticises, they do the one thing that they are designed for very well: to generate text. Sadly that misleads a lot of people that they are just next-word/next-sentence predictors.)


You are very brave to call or not call something AI, but it is precisely generative grammars (stochastic ones) that were initially considered AI - as a linguist you should know this better than I do.


There's a general consensus that entropy is deeply spooky. It pops up in physics in black holes and the heat death of the universe. The physicist Erwin Schrodinger suggested that life itself consumes negative entropy, and others have proposed other definitions of life that are entropic. Some definitions of intelligence also centre on entropy.

What to make of all that, however, has anything but consensus.


To have entropy, you need to have a notion of information. To have information, you have to decide which differences matter, i.e. which states you classify as the same.

This isn't a problem for physics, or for computer science. But it is a problem for would-be philosophers (including a few physicists and computer scientists!) who thought information was a shortcut to avoid answering big questions about what matters, what we care about.


I liked the awe you shared -- it made me want to learn more about entropy.


[flagged]


> but on the internet you don't have to say anything and if you do it may as well have some substance

Seems like we're using different internets. Which i am glad about. I just wish mine had less of the negativity that's coming over from yours. Guess in the end, the people on your internet realize, it's more fun over here.

You could have expressed all of that with less maliciousness towards the person. Thank god, in my internet everyone can say whatever they want if they want. Because, and more people should remember this apparently, if i don't like it, i just turn off the internet, like grandma!

Wish all the best to you and everyone you care about in real life. I might be just a bot. You might be. We'll never know for certain. Don't let some bits mess with your feels.


I'm sorry for leaking negativity into your internet. I don't think negativity is inherently undesirable, but I don't think it's useful to express it towards people's selves. I meant only to criticize the comment without further implication.

In fact I went and got some references I really liked because I was hoping to add what I felt was missing from the discussion on entropy. My motivation in the end was to share my personal feeling of awe, and in a way that was accessible to the parent poster as well as other readers. How do you like that internet?


> My motivation in the end was to share my personal feeling of awe, and in a way that was accessible to the parent poster as well as other readers.

Then write it that way:

1. Remove the first paragraph, where you treat the OP like a child by telling them where it is and isn't appropriate to express their idea

2. Remove the first two sentences of the 2nd paragraph

3. Remove the clause "but you can't get that from a quip."

Now we've got the beginnings of a delicious comment! You could even garnish it at the beginning with something like "Not sure if we're talking about the same thing, but..." But you don't even really need it.

That's the difference between playing in a sandbox with others, and unwittingly kicking someone out of one.


I was trying to convey a subjective and emotional experience. Obviously I failed.

I hope that when you try to express awe it isn't dismissed as weasel words.

I give up. Delete my account please dang. This site isn't good for my mental health.


> I give up. Delete my account please dang. This site isn't good for my mental health.

While I cannot speak to your conclusion, I can humbly suggest to not put any credence in what some rando says on the Internet. Including myself. :-)

  Far better is it to dare mighty things, to win glorious 
  triumphs, even though checkered by failure... than to rank 
  with those poor spirits who neither enjoy nor suffer much, 
  because they live in a gray twilight that knows not victory 
  nor defeat.[0]
0 - https://www.brainyquote.com/quotes/theodore_roosevelt_103499


I didn't like your comment, that's all. I'm just one anonymous asshole, I can't invalidate your sense of awe.

FWIW, I didn't want or expect to harm your mental health.


>> This is all weasel words, and you've misspelled "Schroedinger"/"Schrödinger". That sort of comment might be fine for the pub, but on the internet you don't have to say anything and if you do it may as well have some substance.

> ... I can't invalidate your sense of awe.

Actually, yes. Yes, you can.

And so could I, or anyone really, given sufficiently focused vitriol.

For example, your sentence fragment "This is all weasel words" is incorrect English. "This is" should use the plural form "These are" as the subject is "words" and not "weasel", as well as the modifier "all" emphasizing plurality.

The irony of your subsequently pointing out a spelling error and then chastising the OP for same has not been lost.


At least 50% of posts that point out a spelling or grammatical error contain one as well.


> At least 50% of posts that point out a spelling or grammatical error contain one as well.

Quite true. While I do not generally claim to be a grammatical wizard, I do know when I hear from one (hello Zortech-C++, it's been too long!).

If you don't mind pointing out my mistake(s) above, I would appreciate it as my goal was to exemplify the social effect of pedantic critique. Being corrected when doing same could serve as an additional benefit.


It's nice to hear from a ZTC++ user!


What's the unconditional rate of errors in posts generally? Without the prior I don't know if whinginging about spelling or grammar makes my posts correcter or incorrecter.


> "This [comment] is all weasel words."

The subject was "this", referring to the comment.

By what standard of English did you reckon my post incorrect? I appreciate your effort to cheer up your parent post, and to improve my language skills, of course.

(I'm not the language usage police, though I am fussy about correctly rendering people's names.)

I didn't understand your gainsaying about invalidating awe. Whether or not the poster's awe was a real and worthwhile feeling seems to me entirely independent of my opinions.

I find your aims admirable. However, I regret to say that for me the irony, and purpose of this comment thread, have indeed been lost.


>> "This [comment] is all weasel words."

> The subject was "this", referring to the comment.

While I understand your clarification being the intent, in the original context "this" is in its determiner form and not pronoun form. Would the addition of "comment" have been included, then I believe most (if not all) readers would understand its use as the pronoun form it is often used as well as being associated with the noun form of "comment."

More important than my pedantry was an attempt to illustrate how corrections in this medium can be interpreted quite differently based on the person. As you intimate, my example did not affect you adversely (which is great BTW). How the OP responded to your original reply indicated a different effect unfortunately. I am not judging, only providing my observation.

A quote I wish I knew much earlier in my life is:

  A sharp tongue is the only edge tool that grows keener with 
  constant use.[0]
HTH

0 - https://www.brainyquote.com/quotes/washington_irving_384249


Thank you, especially for deft use of pedantry as a tool for good, and that quote which I'll retain.


Your comment is excellent, inspiring and quite true.

Please stay, otherwise the rest of us are stuck with the alternative (which is essentially someone saying "read this Wikipedia page and Schrödinger's original talks", with a perplexing pile of unhappiness, pretending to correct things that you didn't get wrong)


Inspiring and true

I could leave if you like


You wouldn't toss out your radio because it picks up a bit of static now and then, would you? That's all that posts like that one amount to... static.


I’m not sure this is strictly true. It seems more accurate to say there are deep connections between the two rather than they are theoretically equivalent problems. His work is really cool though no doubt.


In the sense I understand that comparison, or have usually seen it referred to, the compressed representation is the internal latent in a (V)AE. Still, I haven't seen many attempts at compression that would store the latent + a delta to form lossless compression, that an AI system could then maybe use natively at high performance. Or if I have... I have not understood them.


it is true, but i think it's only of philosophical interest. for example, in a sense our physical laws are just humans' attempt at compressing our universe.

the text model used here probably isn't going to be "intelligent" the same way those chat-oriented LLMs are. you can probably still sample text from it, but you can actually do the same with gzip[1].

[1]: https://github.com/Futrell/ziplm
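
For a feel of how sampling from a compressor can work, here is a minimal sketch. It is not the ziplm API, just my reading of the general idea: score each candidate next character by how many extra bytes zlib needs once it is appended, then turn those costs into sampling weights. The function names and the weighting rule are assumptions for illustration.

```python
import random
import zlib

def next_char_weights(training: str, context: str, alphabet: str) -> dict:
    """Weight each candidate by 2 ** -(extra compressed bits it costs)."""
    base = (training + context).encode()
    sizes = {c: len(zlib.compress(base + c.encode(), 9)) for c in alphabet}
    smallest = min(sizes.values())
    # Compressed sizes are whole bytes, so the costs are coarse: 8 bits per byte.
    return {c: 2.0 ** (-8 * (size - smallest)) for c, size in sizes.items()}

def sample(training: str, length: int, seed: int = 0) -> str:
    """Sample `length` characters; `training` must be non-empty."""
    rng = random.Random(seed)
    alphabet = "".join(sorted(set(training)))
    out = ""
    for _ in range(length):
        weights = next_char_weights(training, out, alphabet)
        out += rng.choices(list(weights), weights=list(weights.values()))[0]
    return out

print(sample("the cat sat on the mat. " * 20, 40))
```

Because zlib only reports whole bytes, many candidates tie and the output is noisy, which is roughly the point: you can sample from it, but it is not "intelligent" in the chat-LLM sense.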


Microsoft SQL Server is only available as an x86-64 docker container binary. They actually had a native(?) arm64 docker container under the name "azure-sql-edge", which was (and still is) super useful as you can run it "natively" in an arm64 qemu linux for example, but alas that version was not long lived, as Microsoft decided to stop developing it again, which feels like a huge step backwards.

https://techcommunity.microsoft.com/blog/sqlserver/azure-sql...

There are probably other closed-source Linux programs being distributed as amd64-only binaries (Rosetta 2 for Linux VMs isn't limited to Docker containers).


I guess if you use this, then the security of your key is only as strong as the number of minutes the brute force took (since anyone else could also run the tool and generate their own key matching the desired fingerprint in the same number of minutes you needed, or fewer).


I don't think the idea is to use the visual representation of the SSH key as a security mechanism but rather to have an SSH key that looks cool when you visualize it.


Isn't the whole point of VisualHostKey in ssh to act as a security mechanism, i.e. "yes this looks like the correct server key" on first use on a new client that doesn't already have the key in known_hosts?


That's not how randomness works. The expected duration of the attack is only determined by how close they want to get to your artwork.

For example, if you pick the first key you generate, it obviously doesn't mean the attacker can get the same art in one try.
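
A rough simulation of that point, as a sketch under loud assumptions: random 32-byte strings stand in for freshly generated keys, SHA-256 stands in for the fingerprint, and "matching the art" is simplified to matching the top `BITS` bits of the hash. None of this is the actual tool or the real randomart comparison.

```python
import hashlib
import random
import statistics

def tries_to_match(target_prefix: int, bits: int, rng: random.Random) -> int:
    """Count random candidates drawn until a hash's top `bits` bits match."""
    tries = 0
    while True:
        tries += 1
        candidate = rng.getrandbits(256).to_bytes(32, "big")
        digest = hashlib.sha256(candidate).digest()
        prefix = int.from_bytes(digest, "big") >> (256 - bits)
        if prefix == target_prefix:
            return tries

rng = random.Random(1)
BITS = 8  # how closely the attacker wants to match (hypothetical)
runs = [tries_to_match(rng.getrandbits(BITS), BITS, rng) for _ in range(200)]
print(f"mean tries for {BITS} bits: {statistics.mean(runs):.0f} (theory: {2 ** BITS})")
print(f"luckiest run: {min(runs)}, unluckiest run: {max(runs)}")
```

The average sits near 2^BITS regardless of how long any single run took, so the time the original owner happened to spend says nothing about the attacker's expected cost; only the required closeness of the match does.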


so the exact same as any other crypto key?


The number of minutes being greater than the heat death of the universe


Is the runtime of this application "a number of minutes greater than the heat death of the universe" to find something that could pass off as matching the target visualhostkey?


A large chunk of that php code could probably be replaced with a call to https://www.php.net/wordwrap


Odd, this is not happening here on macOS 15.0.0. Turning bluetooth off either via the system settings app or the menu bar icon shuts off bluetooth immediately with no prompt for me...


Laptop or desktop? The prompt is only for Macs without a built-in keyboard and pointing device.


Aha. This was on a laptop. I missed that distinction.


I guess you could format a diff patch file locally and paste it as a text comment in an issue in their repo...? :P


Closed-source applications or games, I'd say.


I'd be very surprised if ubuntu has backported the keystroke obfuscation feature (which was introduced in 9.5) to 8.9.

When the feature is missing, you're not safe from keystroke interception at all, which is what this bug is all about, so any 8.9 version would be actually considered "unsafe" until the whole feature is backported, which seems unlikely to happen?


Interesting, I started playing with spotlight and typing in (-20)^21 returns " = 0", which is obviously not correct.

And typing in "(-22)^21" gives "-71100888972574851072", but wolfram alpha insists it should be "-15519448971100888972574851072".

Looks like there are still bugs here.
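
For reference, the exact values are easy to check with arbitrary-precision integers. A minimal sketch in Python, whose `int` never overflows; the last check is only an observation about the digits reported above, not a claim about what Spotlight does internally.

```python
exact_20 = (-20) ** 21
exact_22 = (-22) ** 21
print(exact_20)
print(exact_22)  # -15519448971100888972574851072

# 20 = 2 * 10, so (-20)^21 is -(2^21) followed by 21 zeros: clearly not 0.
print(exact_20 == -(2 ** 21) * 10 ** 21)  # True

# Observation: the value Spotlight showed for (-22)^21 equals the
# low 20 decimal digits of the exact result.
reported = -71100888972574851072
low_20_digits = abs(exact_22) % 10 ** 20
print(low_20_digits == abs(reported))  # True
```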


Wow that's bizarre.

At first I thought it was just an overflow error but no it's nothing like that. The math is indeed very clearly broken, as I play around with it on Sonoma on my M1.

I'm genuinely shocked. I thought this kind of floating-point math was rock-solid, tested thoroughly over the decades.


No bug in Big Sur spotlight:

"(-20)^21" = -2.097152e27 and "(-22)^21" = -1.551944897e28


Spotlight on Sequoia looks correct, though it limits precision more than wolfram alpha.

(-20)^21 = -2.097152x10^27 and (-22)^21 = -1.5519448971*10^28


Hah,

macOS Sonoma 14.6.1 on M1 = 0

iOS 17.6.1 = -0

WTF.


Yes, it looks like spotlight math is broken on both.


The screenshot in the article shows MD5() is returned as part of the error message from the web server, so it is probably also a part of the original server-side query.

