Hacker News | mixtureoftakes's comments

this is crazy

when trying to run on a mac it only plays in a very small window, how could this be configured?


fuzzing lore: https://threadreaderapp.com/thread/1799457232607985698

great read if you wanna waste 11 minutes


Google turns up a CNET article from 2007 (probably because eEye was "pumping press releases left and right"):[1]

> Researchers at eEye used a standard process of code auditing in discovering the vulnerabilities, [eEye CEO Ross] Brown added. He noted that Microsoft either did not do a 'good job' with its code auditing, or it may not have had enough people working on such a task.

I don't really get this culture of racing to find a bug in another company's product, then strutting about finding one (in Microsoft Publisher of all things) and throwing shade. I guess we should all be so lucky to have a company whose "standard process" is to pull a week of all nighters testing our product.

[1] https://www.cnet.com/news/privacy/flaw-found-in-office-2007/


The style of writing certainly added a lot to it.

Edit: I just checked the author, I might actually know him from IRC. The "Mantis" and "infosec" checks out.


small world we live in...


Fantastic read. Funny, relatable, all the technical details, and so much heart. Thank you for sharing it!


Is this SOTA? How does RAG quality compare to that in, say, LM Studio?

Also, is there a "RAG quality benchmark" yet?


   Most of us will read this and continue living our life exactly the same way as before

          …wake up


Reminds me of my favorite copypasta:

>If you're reading this, you've been in a coma for almost 20 years because of a car accident. We're trying a new technique. We don't know where this message will end up in your dream, but we hope we're getting through. Please wake up.


Even if people wake up and "do something", governments will just tire us out, much as the online protests against Reddit (and on-the-ground protests like Occupy [X], and so on) failed. We have no option but to accept what is handed to us.


Wake up and... do what exactly? Tell others to "wake up" ad nauseam? The whole "wake up, sheeple, you're being manipulated" is both correct and amusingly self-terminating.

Metacognition, for all its benefits, comes with the newfound sisyphean task of being unable to intentionally avoid thinking about a white elephant for an entire minute. "Don't be influenced by the ads/media/propaganda" works about as well.

So perhaps the best way to reduce manipulation is to find a way back to sleep sometimes. A sort of meta-meta-cognition, if you will. It's self-awareness all the way down.


TLDR - 2x cheaper, slightly smarter, and they only compare those new models to their own old ones. Does google have moat?


The math score exceeds o1-preview (though not mini or o1 full) fwiw.


Moat could be things like direct integration into Gmail (ask it to find your last 5 receipts from Amazon), Drive (chat with PDF), Slides (create images / flow charts), etc.

Not sure if their models are the moat. But they definitely have an opportunity from the productization perspective.

But so does Microsoft.


Have you tried the Gemini Gmail integration? I have that enabled in my GSuite account.

It's incredible how bad it is. I've seen it claim I've never received mail from a certain person, while the email was open right next to the chat widget. I've seen it tell me to use the standard search tool, when that wasn't suitable for the query. I've literally never had it find anything that wouldn't have been easier to find with the regular search.

I mean, it's a really obvious thing for them to do, I'm genuinely confused why they released it like that.


> I'm genuinely confused why they released it like that.

I agree. Right now it's not very useful, but has the potential to be if they keep investing in it. Maybe.

I think Google, Microsoft, etc are all pressured to release something for fear of appearing to be behind the curve.

Apple is clearly taking the opposite approach re: speed to market.


Yeah - The thing though is, you could build the same thing better in a day's work by using OpenAI's API, or Gemini's for that matter.

I wonder if there isn't a deeper, more worrying (for Google) reason behind that - that AI is killing their margin.

Google has always been about delivering top notch services, and winning by being able to do that cheaper than the competition.

It's "in their DNA" - everyone knows that using links to a website as a quality signal was a really good idea in the early days of Google, but what's a little less well known is that the true stroke of genius was the algorithmic efficiency of PageRank.

Similarly for GMail. Remember when it launched, 1 GB of free storage was just completely out of every competitor's league?

It may be that this recipe of being smarter than everyone on algorithms and on datacenter operations just doesn't work anymore in the age of modern machine learning.


The problem with the current crop of LLMs is that they make for a great demo. I am also confident that you can build a working prototype for GMail, Outlook or any other surface. But I am equally confident it will be a massively different ballgame to roll it out to a billion users. You'll run into a lot of edge cases and have to take care of a lot of adversarial scenarios as well. Pretty sure that's the same issue Apple is running into, and why they have had to postpone rollouts.


I don't buy that at all. They've literally shipped a broken, useless product that this amateur could do better (yes, as a demo).

All the hard scalability stuff, they've already done before. Gmail exists, the Gemini API exists.

If they're not getting it to work, there must be another reason. They just can't afford to provide it at a price point that users accept.


Doesn't Microsoft also get OpenAI IP, if they run out of money?


> Does google have moat?

Potentially (depends if the EU cares)...

E.g. integration with Google search (instead of ChatGPT's Bing search), providing map data, android integration, etc...


Their Android integration certainly isn't on track to earn them any moats... https://hachyderm.io/@ianbicking/113099247306589777


Honestly I would have typed commands in a shell if a "captcha" asked me to. Just to see how awful the outcome would be.

I'm almost bored enough to just start installing weird malware for research and funsies


all while some people spend their entire lives obsessing about privacy without even doing anything illegal...


wrong angle - most people aren't trying to cover their crime sprees, they know losing privacy is more likely to make you a victim. Could be robbery, but it could also be high hotel prices.


it was just deleted? Page not found


nope. maybe hugged, but it's there now


Good release but the annoying part is they're very unclear about which types of models they are comparing. They provide benchmark comparisons for the base models only and arena comparisons for instruct only? Was that intentional? Why would you ever do that? This makes things unnecessarily complicated imo and the only payoff is a short term win for google on paper.

Guess I'll just fully test it for my own tasks to know for sure


very cool website! I did sheep 4000 and my pc immediately exploded

