
Windows are legally required for safety purposes, not for sunlight

It’s prudish tendencies like outlawing apartments with no windows that force folks like me, who don't care for safety anyway, to compete for housing with families who do need safety.

but unironically; it'd be much safer than the street


It’s preferable to have safety standards on general housing stock, instead of asking individuals to figure out safety levels.

I imagine there are plenty of people out there that aren’t aware that windows are for safety and might find out in the worst way possible.


> Windows are legally required for safety purposes

They are not used for egress in skyscrapers—no ladder goes that high. And commercial buildings have fantastic internal ventilation. (Skyscraper apartment windows often can’t be opened for ventilation anyway, and are not trivially shattered.)


What safety? This makes no sense. Am I supposed to jump out of the 20th floor or what?

If ventilation fails, an openable window allows for emergency manual ventilation. Besides that, being exposed to a bit of sunlight is needed for good health; artificial lights can't do the same.

Many flats here in Europe have windows that don't open. Don't tell me I can open the windows in a NYC skyscraper

Well, at least in Italy (where I was born) and France it's not allowed to have residential accommodation without openable windows. In Italy with no exceptions at all; in France windowless rooms are allowed for non-living areas (essentially bathrooms and kitchens) as long as another form of ventilation is present.

The limits I know are: openable windows must cover at least 1/8 of the walkable floor area, and there must be at least one complete air change per hour (if you have a 50m³ room, you must exchange at least 50m³ of air per hour). There is some tolerance for historic buildings, down to 1/12 of the floor area; below that the accommodation can't be used to live in.

Offices, on the other hand, have only a natural-illumination requirement, and their ventilation is NOT necessarily bound to openable windows. I do not know the rules for the rest of the EU.


Boox is the best by far

It's normal Android under the hood, so you can just install your cloud storage app on it


This is very cool. I would immediately buy it if someone ends up making an Obsidian plugin


This would be very effective


For what reason would you want to use this instead of open-source alternatives like Mistral?


Mistral opened their weights only for a very small LLaMA-like model.


I'm pretty sure Mixtral outperforms Grok-1 and uses much less memory to do it


One of the interesting things when weights are open sourced is the community can often improve the results. See all the bugs fixed in Gemma for an example.


Doubtful, for purely information theoretic and memory capacity reasons. It may outperform on some synthetic metrics, but in practice, to a human, larger models just feel “smarter” because they have a lot more density in their long tail where metrics never go


I'm a little out of touch, is there a way to see how Grok measures up to other models?


Benchmarks here https://x.ai/blog/grok


And to compare, you can sort by MMLU on here: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb....

Edit: to include my own summary after review: there are a good 100 models better than it, a couple of them even 1x7B. Mixtral stomps it; half the Mixtral variants are universally better, and one is close to the same.


This benchmark is mostly worthless, some of the top models there were trained on benchmark data, which is a known fact in the community.

The only reliable benchmark: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...


No, it's not "mostly worthless", and yes, some of the top models were removed a few months back for being trained on benchmark data.

I urge you to at least think through what alternative you propose before posting so aggressively in these situations. Lmsys doesn't have Grok, or I would have included it. And having _some_ data is better than none.

I also had someone arguing with me 6 months back that we can't trust any benchmarks at all from vendors, which would exclude the blog post. Instead of just repeating that back vehemently, I filled a gap. It's important we don't self-peasantize as a species, all data has its issues, that doesn't mean we throw it all out.


Quantifiable metrics are useful if they're credible, certainly.

But does it seem likely, to you, that a 7B-parameter model would outperform a 314B-parameter model? Given that we can look at the chatbot arena leaderboard and it's dominated by proprietary, 70B and 8x7B models?

A well regarded and modern model like Mixtral 8x7B, which is ranked 13th on the chatbot arena leaderboard, scores 72.7 'Average' on the open LLM leaderboard - and yet 'pastiche-crown-clown-7b-dare-dpo' scores 76.5.

To me, that sounds too good to be true.


Yup, 100%. Grok isn't very good and it was rushed.

Rest re: pastiche model, etc. are proposing things I'm not claiming, or close to what I'm claiming.

n.b. you don't multiply the parameters by the number of experts to get an effective parameter count. Why? Think of it this way: every expert needs to learn how to speak English, so there's a nontrivial amount of duplication among all the experts.
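A rough back-of-envelope sketch of the counting (plain Python; dimensions assumed from Mixtral-8x7B's published config, so treat the numbers as approximate): most of the parameters sit in the per-expert FFNs, the attention and embedding weights are shared, and only 2 of 8 experts run per token.

    # Back-of-envelope MoE parameter count, assuming Mixtral-8x7B-like dimensions:
    # d_model=4096, 32 layers, FFN hidden 14336, 8 experts, top-2 routing, GQA with 8 KV heads.
    d, layers, ffn, experts, topk, vocab = 4096, 32, 14336, 8, 2, 32000
    kv = 8 * 128  # grouped-query attention: 8 KV heads of width 128

    attn_per_layer = d*d + d*kv + d*kv + d*d          # Wq, Wk, Wv, Wo
    expert_ffn = 3 * d * ffn                          # gate, up, down projections per expert
    shared = layers * attn_per_layer + 2 * vocab * d  # attention + input/output embeddings

    total = shared + layers * experts * expert_ffn    # every expert counted once
    active = shared + layers * topk * expert_ffn      # only the top-2 experts run per token

    print(f"total  ~ {total / 1e9:.1f}B")   # ~46.7B, not 8 x 7B = 56B
    print(f"active ~ {active / 1e9:.1f}B")  # ~12.9B per token

So "8x7B" nets out around 47B total because the experts only replace the FFN blocks, and the effective capacity is lower still because of exactly that cross-expert duplication.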


> n.b. you don't multiply the parameters by experts to get an effective parameter count.

I actually took the 314B from Grok's HF page [1] which describes the model as "314B parameters" when explaining why it needs a multi-GPU machine.

I certainly agree that parameter count isn't everything, though; clearly things like training data quality and fine tuning count for a lot.

[1] https://huggingface.co/xai-org/grok-1


Well if nothing else, this one might be significantly less nerfed. Very interesting to compare to the others.


It's not, and I mean it, specifically in Grok's case.

Generally, it's a boring boneheaded talking point that the 1% of us actually working in AI use as a sorting hat for who else is.


I’ve been known to get snippy on HN from time to time myself :) So please know that I’m only offering a gentle nudge that I’d want from a fellow long-timer myself regarding a line of discussion that’s liable to age poorly.

Talking about sorting hats for those who do and don’t have the one-percenter AI badge isn’t a super hot look my guy (and I’ve veered dangerously close to that sort of thing myself, this is painful experience talking): while there is no shortage of uninformed editorializing about fairly cutting edge stuff, the image of a small cabal of robed insiders chucking in their cashews while swiping left and right on who gets to be part of the discussion serves neither experts nor their employers nor enthusiastic laypeople. This is especially true for “alignment” stuff, which is probably the single most electrified rail in the whole discussion.

And as a Google employee in the diffuser game by way of color theory, you guys have a “days since we over-aligned an image generation model right into a PR catastrophe” sign on the wall in the micro kitchen right? That looked “control vector” whacky, not DPO with pretty extreme negative prompt whacky, and substantially undermined the public’s trust in the secretive mega labs.

So as one long-time HN user and FAANG ML person to another, maybe ixnay with the atekeepinggay on the contentious AI #1 thread a bit?


Every discipline has its bellwether topics. They’re useful for filtering out people who want to chip in without picking up the tools.


regardless of whether they say it out loud, it is what many of us think - might be good for people to know why their opinions are getting immediately dismissed by insiders


Letting people know why their opinions are getting dismissed in a productive way is done by citing well-known sources in a low-effort way, or by explaining things thoughtfully in a high-effort way. Karpathy has chosen the highest-effort way of most anyone; it seems unlikely that anyone is at a higher rung of "insiderness" than he is, having been at Toronto with (IIRC) Hinton and Alex and those folks since this was called "deep learning", and having worked at this point at most of the best-respected labs.

But even if folks don't find that argument persuasive, I'd remind everyone that the "insiders" have a tendency to get run over by the commons/maker/hacker/technical public in this business. Linux destroying basically the entire elite Unix vendor ecosystem and ending up on well over half of mobile is a signal example, and it came about (among many other reasons) because plenty of good hackers weren't part of the establishment, or were sick of the bullshit they were doing at work all day and went home and worked on the open stuff (bringing all their expertise with them). And what e.g. the Sun people were doing in the 90s was every bit as impressive given the hardware they had as anything coming out of a big lab today. I think LeCun did the original MNIST stuff on a Sun box.

The hard-core DRM stuff during the Napster Wars getting hacked, leaked, reverse engineered, and otherwise rendered irrelevant until a workable compromise was brokered would be another example of how that mentality destroyed the old guard.

I guess I sort of agree that it's good people are saying this out loud, because it's probably a conversation we should have, but yikes, someone is going to end up on the wrong side of history here, and realizing how closely scrutinized all of this is going to be by that history has really motivated me to watch my snark on the topic and apologize pretty quickly when I land in that place.

When I was in Menlo Park, Mark and Sheryl had intentionally left a ton of Sun Microsystems iconography all over the place and the message was pretty clear: if you get complacent in this business, start thinking you're too smart to be challenged, someone else is going to be working in your office faster than you ever thought possible.


I have no idea how you've wandered all the way to Napster, Sun, hackers, etc. Really incredible work.

Well, I kind of know, you're still rolling with "this dude's a google employee", so the guy foaming at his mouth about Google makes sense to you, and now you have to reach to ancient lore to provide grounding for it.

I don't work for Google.


Then don't link to an "About Me" page [1] that says you do? How is confusion on that subject any reader or commenter's fault?

I don't care if you personally work at Google or not, Google got itself in quite a jam as concerns public perception of their product in particular and the AI topic in general by going overboard with over-alignment, everyone knows that so one assumes that insiders know it, which is one of a great many examples of how strongly-forced models are a real problem for arbitrarily prestigious insider-laden labs.

Framing the debate about whether large, proprietary models are over-aligned or mis-aligned as an acid test for whether or not someone is worth paying attention to is a really weird hill to stand on.

[1] https://www.jpohhhh.com/about


Yes, you do care, in fact, you care a lot! You made it the centerpiece of your argument and went to a lot of trouble to do so.

Flag away, my friend.


You're making up a person and being extremely creepy while doing a poor job of it.

It's at least funny, because you're doubling down on OP's bad takes, and embarrassing yourself by trying to justify it with what you thought was brilliant research and a witty person-based argument. But you messed up. So it's funny.

Punchline? Even if you weren't wrong, it would have been trivial while doing your research to find out half of Deep Mind followed me this week. Why? I crapped all over Gemini this week and went viral for it.

I guess, given that, I should find it utterly unsurprising you're also getting personal, and clinging to 1% as a class distinction thing and making mental images of cloistered councils in robes, instead of, well, people who know what they're talking about, as the other repliers to you point out.

"1%ers are when the Home Depot elites make fun of me for screaming about how a hammer is a nerfed screwdriver!"


I've been around here a pretty long time, but I could still be off base here: as far as I understood people generally posted links to their own blog [1] in their HN profile because they want people to read them? I read your blog and particularly the posts about Gigadiffusion because I wanted to reply from a position of having put some effort into understanding where the poster I was replying to was coming from before popping off with what could be taken as a criticism. If that offends you or creeps you out I'm more than happy to steer clear of it with the parting remark that I really like Material and had hoped that any follow up would give me the opportunity to compliment you on some nice work.

If that's not your blog, you should probably take it off your profile?

[1] https://www.jpohhhh.com/


I'm not doing a faux-nice thing with you. You made up an elaborate argument, to justify rank fact-free ranting, based on false information. Thanks for your time.


The safety crap makes the tools unusable. I used to have a test for it that I thought was decent, but Claude failed that test and it is way better than ChatGPT-4 for code, which means my test was bogus. The people actually working in AI are kind of irrelevant to me. It's whether or not the model will solve problems for me reliably.

People "actually working in AI" have all sorts of nonsense takes.


Another day, another fairly good comment going grey on an AI #1. The over-alignment is really starting to be the dominant term in model utility; Opus and even Sonnet are outperforming both the 1106-preview and the 0125-preview on many coding tasks, both subjectively and on certain coding metrics, and we are seeing an ever-escalating set of kinda ridiculous hot takes from people with the credentials to know better.

Please stop karma bombing comments saying reasonable things on important topics. The parent is maybe a little spicy, but the GP bought a ticket to that and plenty more.

edit: fixed typo.


What if they're wrong, and most people know what a "system message" is a year after ChatGPT launch, so they're willing to downvote?

Is there any chance that could be happening, instead of a complex drama play with OP buying tickets to spice that's 100% obviously true?


I was trying to be helpful. I've made elitist remarks on HN that were dubious in at least two ways: it was dubious if I was actually all that elite, and it was dubious if any amount of being elite justifies or makes useful a posture of elitism. My internal jury is still out, but as of writing I think I probably underestimated how unique my first-hand knowledge and contributions were, but more than made up for that by the claims exceeding the reality by a huge margin, for a massive net loss that made me wish I could take the remarks back.

I click on every HN username I reply to, because I've been hanging out here for like 16 years and more than once I've mouthed off only to later realize it was about C++ to Walter Bright or something, and looked a fool as a result. I've since apologized to Walter for disrespecting a legend and he was very gracious about it, to cite just one example.

Your initial remark wasn't even that bad, certainly others talk that way, and I tried to frame it accurately as one guy who tends to FAANG-flex carelessly rather than thoughtfully to another guy who probably doesn't talk to people like that face to face and is probably a pretty good guy having a tough day. I was trying to say: "been there, maybe cool it man you're probably going to have the same bad time I've had on this sort of thing".

But this is getting to where I'm starting to lose my temper a bit, I've been pretty cool about this. I even went and read the Dart/`llama.cpp`/`ONNX` stuff because I've also messed around with binding to `llama.cpp` and `whisper.cpp` and stuff just to make sure I'm not mouthing off to Jeff Dean's alt or something. I'm not talking to Jeff Dean.

I surf with `showdead` on, and I don't know the current meta so I don't know if you know that you've been flagged dead 3 times on this subthread already and as much as I'd like to, I can't really argue with any of the 3.

But given that you've clearly got similar interests, and therefore probably things that you could teach me if I were willing to listen, I'm going to propose a do-over.

If you'd like to start this over from a place of mutual interest and write this thread off to "a pair of people had bad vibes on an Internet forum once", email me at `b7r6@b7r6.net`.

If not, no hard feelings, but in that case, let's just give one another a wide berth and call it a day.


> The safety crap makes the tools unusable

For you that may be the case.

But the widespread popularity of ChatGPT and similar models shows that it isn't a serious impediment to adoption. And erring on the side of safety comes with significant benefits, e.g. less negative media coverage, fewer investigations by regulators, etc.


Seems like marketing and brand recognition might be some confounding variables when asserting ChatGPT's dominance is due to technical and performance superiority.


The 1% who actually work on AI don't use terms as generic as "AI". Way to reveal yourself as a college undergrad who read a couple of popular-science books, downloaded the MNIST data, and thinks they're an "expert".


lol, okay


[flagged]


(not sure you're going to edit again, but in the current one, I'm not sure what Google's silly stock image warning has to do with anything, and I have generally chosen to avoid engaging people doing their politics hobby via AI discussion, since it became okay to across the ideological spectrum of my peers. So, mu is my answer.)

And you're right, I was really surprised to see the harder right people throwing up their hands after the Gemini stuff.


[flagged]


Wouldn't have even noticed had you not pointed it out.


Feel free to explain! You caught my attention now, I'm very curious why it's on topic. Preregistering MD5 of my guess: 7bfcce475114d7696cd1d6a67756761a
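(For anyone unfamiliar with the trick, preregistering a hash is a tiny commitment scheme. A sketch, with a hypothetical file standing in for the guess text revealed later:)

    import hashlib

    # Hypothetical: the guess text, revealed later (e.g. the pastebin contents), saved verbatim.
    guess = open("guess.txt", "rb").read()
    print(hashlib.md5(guess).hexdigest() == "7bfcce475114d7696cd1d6a67756761a")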


[flagged]


No I didn't, at least, I don't think it did but it does sound exactly like me. But then again, I don't know what it'd have to do with anything you said specifically.

https://pastebin.com/yfUWZMmc, idk if it's right because you kinda just went for more free association.


Curious why you're so dismissive of something that's pretty important?


Isn't this Apache licensed? Regardless, you can run multiple models concurrently on the same input using well-known ensemble techniques. (Not to be confused with mixture-of-experts, which is more like training a single model where only a few blocks are chosen to be active at any given time - a kind of sparsity.)
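For illustration, a minimal sketch of one well-known ensemble technique using Hugging Face transformers: average the two models' next-token distributions and take the argmax. The checkpoint names are placeholders, and it assumes both models share a tokenizer/vocabulary.

    # Minimal logit-averaging ensemble sketch; "model-a"/"model-b" are placeholder checkpoints
    # and both are assumed to share the same tokenizer/vocabulary.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("model-a")
    m1 = AutoModelForCausalLM.from_pretrained("model-a")
    m2 = AutoModelForCausalLM.from_pretrained("model-b")

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    with torch.no_grad():
        p1 = torch.softmax(m1(ids).logits[:, -1, :], dim=-1)
        p2 = torch.softmax(m2(ids).logits[:, -1, :], dim=-1)

    next_id = ((p1 + p2) / 2).argmax(dim=-1)  # uniform-weight average of the two distributions
    print(tok.decode(next_id))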


Not super easy if they have different tokenizers.


Yeah that 7% boost also includes every sub-average employee of the company.

What did the top quartile employees get?


The company beat cash flow projections by $400M and he got $4M.

Doesn't seem that crazy.


You dare disagree with the “all ceo pay is bad” narrative? Shame on you.

If split among its employees, they could have given a massive pay raise of at least…(checks calculations)… 0.01 cents per hour. I don’t see how employees can eat with a greedy CEO taking a fraction of a cent like that.

Sarcasm aside, it is beyond me how people are unable to understand that the need to show constant growth and profit is the most detrimental thing to both customer and worker, and not bonuses or golden parachutes.


You mention game devs as potential users. Why would I need NURBS for game models?


This was more likely a 0-0.1x return.

There's no chance the investors were happy with this; if the company had years of runway, they almost certainly advised them to tough it out. Even a down round would be preferable to a total write-off.


Probably 0x for everyone except the last round investors who would get basically all the cash in the bank due to liquidation preferences, which is probably 0.66-0.84x return IMO.


I am so so so sick of these.


Of what, specifically, and why?


I'm guessing it's the creepy "notetaker" that these companies have join and listen in on you. It used to be fun to join a meeting on time and have a random moment with someone else, and discuss something interesting, before the late stakeholders arrive.


So that's a pain point that Circleback is now aware of: you want to be able to go off the record.


FWIW, by default we don't store meeting recordings–only the transcript. Saving meeting recordings is something you can enable from Settings → Account.


The number of my AI notetaker summaries which start with "Participant 1 asked Participant 2 about their weekend"...


I've just seen so many equivalent products that have never caught on in the market, are overpriced for what they offer, and ultimately amount to bloat


seems like there is money to be made


What are the weirdest/most interesting ones in here?


