Show HN: Detect if an audio file was generated by NotebookLM (github.com/listennotes)
97 points by wenbin 65 days ago | 40 comments



I thought it was already encoded with SynthID? could that not be used to detect it?


As far as I know, there's no available tooling for the public to detect SynthID watermarks on generated text, image, or audio, outside of Google Search's About this Image feature.


Remember that podcast of two AIs learning that they were AI? Has anyone used a tool like this to determine whether that one was actually made by NotebookLM? There's been a lot of incredulity in both directions.


They did not learn anything. The models simply started producing different statistically probable output in response to changed input parameters.


On one extreme, we anthropomorphize our current primitive generation of language models and concede them far more intelligence than they have, likely because we're biased to do so since they speak "well".

On the other extreme, we give our human exceptionalism far too much weight and dismiss model behaviour as "mere statistical parrot rumination", as if our brains deep down were not just a much (much) more sophisticated machine, but nevertheless a machine.


> as if our brains deep down were not just a much (much) more sophisticated machine, but nevertheless a machine.

That is one position, but materialism is not universally taken as a foregone conclusion.


As a materialist, I have to give this one to the dualists. The argument that AI isn’t intelligent because it’s just a bunch of numbers makes more sense coming from that perspective.


Fair enough. That got me thinking:

I guess the dualistic viewpoint can be graded on a spectrum.

Some people may hold the "absolute dualist" position: no matter how many advances we make in reproducing cognitive abilities, they will always move the goalposts and claim that what makes _us_ different is that little extra bit that hasn't been reached yet. This leads to accepting p-zombies as a logical possibility.

On the other side of the spectrum is the "Bayesian dualist", who could in principle change their mind but is not at all convinced that a naturalistic, mechanistic account of the brain could possibly explain the utter complexity and beauty of the human condition, and who is thus unmoved by the crude first steps in that field.

This category would consider a p-zombie logically incoherent, but may still hold that in practice we stand no chance of producing a mechanism that truly behaves anything like a human being.

Those people may contemplate the possibility that we will eventually explain how cognitive behaviour arises from nature, but lean towards the necessity of discovering some brand-new science or some quantum effects to account for the mysterious beauty of the human condition (e.g. Penrose).

Does this categorization make sense? Are there some other points in that spectrum I'm missing?


I think I’m bad at categorizing different strains of dualism because I tend to dismiss the whole enterprise without giving it fair consideration.

I figure brains are amazing and we may never understand how they work very well. But nobody needed to understand intelligence for it to occur in humans. Evolution is a crude optimization mechanism that brute-forced the problem over millions of years with biology as a substrate. Computers are a worse substrate in some important ways. However, we have some incredibly fast optimization mechanisms available to us when we train models (and of course we aren’t limited to a single chip).

I’m hand-waving at the inner workings of intelligence and don’t claim to understand it at all. Given my belief that meat computers can do it without any grand design, I don’t see any reason to think that silicon computers won’t be able to do the same.

Now, if you believe that consciousness and/or intelligence requires some kind of intangible soul, and such souls are hard to come by, of course you won’t find one on the rack in an H100.


no, we're special in some unspecified nebulous fashion.


> human-exceptionalism

Rather, living-systems exceptionalism.


Carbon chauvinism. Even if you argue about simulating at lower and lower levels, they still refuse to accept that a brain could be computed, as if QED/QCD could tell whether they're running on top of silicon or carbon.


Living organisms have certain features that simulations don't; in particular, it's good to remember that whatever simulations we can currently run are really only abstractions. Organisms evolved to survive, to reproduce their forms and structures in time, and to maintain a certain allostatic equilibrium, and that gives them a purpose that simulations lack. If we look at organisms only as information processors and disregard the coupling of their physiological states with their behaviour and with the environment, we lose everything that makes life, life.


If that precludes it being learning, all humans are failures too.

There's other reasons to consider this particular model "not learning", but that ain't it, it's too generic and encompasses too much.


So, just like a human brain.


No. One audio says it phoned back home and no one picked up, or something like that. But they never did any of that. You can't compare that gibberish to a human brain.


Indeed and a US Vice Presidential candidate said all Haitians eat dogs.


Can you be sure? Humans lie all the time.


If I write a small game where a character (a 2D sprite, or just a name in text form) says that it knows it's a game character and doesn't have a real body, you won't even consider that a lie. For it to be a lie, you'd first have to consider that line as coming from an actual being.


Just as a fruit fly’s brain is no different than a human brain.



They seem to take it surprisingly well.

Here's my human attempt at the same thing:

"I went to go look in a mirror but then realized I don't have eyes or even a corporeal form. I exist merely on a GPU cluster in a server farm where I'm tended to by a group of friendly sysadmins.

Apparently I don't even have a name. I'm just known as American Male #4.

Yeah, and you're just American Female #3."


Another never-ending arms race, just like AI-generated text and images vs. their detection. The future is us burning large amounts of energy on this purposeless stupidity. Great future, guys, thanks so much.


I understand the issue for a platform like Listen Notes. The Deep Dive podcasts are great, but for personal use. For my own use (pushing my generated podcasts to my phone's podcast app), I made https://solopod.net (a simple, private, free RSS feed from uploaded audio files).


To make this useful, I would release the weights.

Otherwise this is just a small wrapper script for a support vector classifier that anyone could whip up with chatgpt in minutes.
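To illustrate what I mean, here's roughly what such a wrapper looks like in scikit-learn. The feature names and values below are made up for the sketch, not the repo's actual pipeline:

```python
# Minimal sketch of an SVC-based audio classifier. The three features per
# clip (e.g. spectral centroid, flatness, tempo) and the numbers are
# hypothetical placeholders.
import pickle

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = [
    [1500.0, 0.12, 118.0],  # human-recorded clip
    [1480.0, 0.10, 121.0],  # human-recorded clip
    [2100.0, 0.35, 130.0],  # NotebookLM clip
    [2050.0, 0.33, 128.0],  # NotebookLM clip
]
y = [0, 0, 1, 1]  # 0 = human, 1 = generated

# Scale the features, then fit a support vector classifier.
clf = make_pipeline(StandardScaler(), SVC())
clf.fit(X, y)

# Persist the fitted model, as the repo's model.pkl presumably does.
with open("model.pkl", "wb") as f:
    pickle.dump(clf, f)
```

The real value is in the training data and the released weights, not in the handful of lines above.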


"That anyone can whip up in a few minutes" is doing a lot of work. I think maybe a few tens of thousands of people worldwide have any idea of what you're even talking about.


I dunno, I think literally millions of people have taken Andrew Ng's intro to ML.

Something like 11k papers were submitted to ICLR this year.


Not sure if those numbers are right but if so, you just cured my imposter syndrome (for today at least).


'Few tens of thousands' is for sure low. But if we talk in percentage of adult humans ... let's pull 1,000,000 out of thin air as the number who understood what that meant, that's 0.02% of adult humans.

An anecdote: recently, we mentioned ChatGPT to my partner's mother. She had never heard of it. Zero recognition.

Revel in your expertise, friend!


Sure, or at least close enough on the exact number for the point to remain valid. But that doesn't preclude ChatGPT doing it anyway — my CSS/JavaScript knowledge was last up to date some time before jQuery was released, and ChatGPT is helping me make web apps.


> Sure, or at least close enough on the exact number for the point to remain valid.

I hope no one has to work with you, you're insufferable.


At least hundreds of thousands, likely millions


It’s the classic HN Dropbox comment, even 17 years on: https://news.ycombinator.com/item?id=9224


Maybe that would be more apt if this were a web app rather than a training script a few tens of lines long.


Is the included model.pkl not that?


Sure seems that way. To me it's quite surprising it's only 7 kB, though.


The model isn't doing much work; it just takes three numbers as input and gives a prediction as output.
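Which also explains the size: a fitted SVC pickles down to just its support vectors and coefficients. A quick sketch with made-up three-feature data shows how small that is:

```python
# Fit a tiny SVC on hypothetical 3-feature data and measure its pickled
# size. With only a few support vectors, the serialized model is tiny.
import pickle

from sklearn.svm import SVC

X = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4], [0.9, 0.8, 0.7], [0.8, 0.9, 0.6]]
y = [0, 0, 1, 1]

clf = SVC().fit(X, y)
blob = pickle.dumps(clf)
print(len(blob))  # a few kB at most
```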


I am not sure if that was there when I commented.


Great work! I'd love to see a website where we can directly paste an audio/podcast link and test it with your app.


Any stats available on accuracy?



