Hacker News | kaoD's comments

> but why permit people to run it themselves?

I wouldn't worry about that if I were them: it's been shown again and again that people will pay for convenience.

What I'd worry about is Amazon/Cloudflare repackaging my model and outcompeting my platform.


> What I'd worry about is Amazon/Cloudflare repackaging my model and outcompeting my platform.

Why let Amazon/Cloudflare repackage it?


How would you stop them?

The license is Apache 2.


That's my question -- why license as Apache 2?

What license would allow complete freedom for everyone else, but constrain Amazon and Cloudflare?

They could just create a custom license based on Apache 2.0 that allows sharing but constrains some specific behavior. It won't be formally Open Source, but it will have enough open source spirit that academics or normal people will be happy to use it.

The LLaMa license is a good start.

I suspect you might be confusing the numbers: 12B (which is the very first number they give) is not context length, it's parameter count.

The reason to use parameter count is that the final size in GB depends on quantization. A 12B model at 8-bit parameter width would be 12 GB (plus some % overhead), while at 16-bit it would be 24 GB.

Context length here is 128k, which is orthogonal to model size. Notice that they specify both parameter count and context size because you need both to characterize an LLM.

It's also interesting to know what parameter width it was trained on because you cannot get more information by "quantizing wider" -- it only makes sense to quantize into a narrower parameter width to save space.
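To make the arithmetic concrete, here's a rough sketch. The helper name `weightSizeGB` is made up for illustration; it counts only the raw weights (none of the overhead mentioned above) and assumes 1 GB = 1e9 bytes.

```typescript
// Hypothetical helper: raw weight size only, ignoring runtime overhead.
// Assumes 1 GB = 1e9 bytes.
function weightSizeGB(paramCount: number, bitsPerParam: number): number {
  const bytes = paramCount * (bitsPerParam / 8);
  return bytes / 1e9;
}

const params = 12e9; // 12B parameters

console.log(weightSizeGB(params, 16)); // 24
console.log(weightSizeGB(params, 8));  // 12
console.log(weightSizeGB(params, 4));  // 6
```

Context length doesn't appear anywhere in that function, which is the sense in which the two numbers are orthogonal.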


Ah, yes.

Thanks, I confused those numbers!


How long does it take you to make a demo? Might be cool to do a live stream, or just record the process even if not live.


Sometimes I do it in one sitting, and sometimes that's quite long, like 12 hours or more! Other times I will revisit it over days or months. You'd be surprised how much thought and transformation can go into 192 characters. It depends on how much potential room for improvement it "feels" like there is left. You get a sense for it: sometimes you just know it's done, other times you're never quite sure when to stop trying to push harder.

I've described my experience live streaming in a sibling comment; it was interesting. Another demoscener who has given this far more concerted effort on Twitch is KilledByAPixel. I think the streams are recorded somewhere, maybe on YouTube. My two attempts are lost to time.

As an example (yes shameless plug) I wrote this one recently over a couple of long evenings, I'd estimate 8 hours maybe:

https://www.dwitter.net/d/31805 (runs fastest in Firefox)

But there are some familiar micro-raymarching techniques I've already developed and reused in this. The bit that took most of the time in this one was figuring out how to fit in the binary search marching that was required to support large differences in marching distances and textures without producing excessive moiré effects.


> no one building this software wants to “steal from creators”

> It’s why things like the recent deal[s ...] are becoming so important

Sorry but I don't follow. Is it one or the other?

If they didn't want to steal from the original authors, why do they not-steal Reddit now? What happens with the smaller creators that are not Reddit? When is OpenAI meeting with me to discuss compensation?

To me your post felt something like "I'm not robbing you, Small State Without Defense that I just invaded, I just want to have your petroleum, but I'm paying Big State for theirs cause they can kick my ass".

Aren't the recent deals actually implying that everything so far has actually been done with the intent of not compensating their source data creators? If that was not the case, they wouldn't need any deals now, they'd just continue happily doing whatever they've been doing which is oh so clearly lawful.

What did I miss?


The law is slow and is always playing catch-up in terms of prosecution. It’s not clear today because this kind of copyright question has never been an issue before. Usually it’s just outright stealing of content that was protected; no one ever imagined “training” to be a protected use case. Humans “train” on copyrighted works all the time, ideally copyrighted works they purchased for said purpose. The same will start to apply to AI: you have to have rights to the data for that purpose, hence these deals getting made. In the meantime it’s ask for forgiveness, not permission, and companies like Google (less so OpenAI) are ready to go with data governance that lets them remove data subject to copyright requests and keep the rest of the model working fine.

Let’s also be clear that making deals with Reddit isn’t stealing from creators. It’s not a platform where you own what you type in; same on here, this is all public domain with no assumed rights to the text. If you write a book and OpenAI trains on it and starts telling it to kids at bedtime, you 100% will have a legal claim in the future, but the companies already have protections in place to prevent exactly that. For example, if you own your website you can request that the data not be crawled, but ultimately if your text is publicly available anyone is allowed to read it. Whether anyone is allowed to train AI on it is an open question that companies are trying to get ahead of.


That seems even worse: they had intent to steal and now they're trying to make sure it is properly legislated so nobody else can do it, thus reducing competition.

GPT can't get retroactively untrained on stolen data.


Google actually can “untrain” afaik. My limited understanding is that they have good controls over their data and its sources, because they know it could be important in the future. GPT, not sure.

I’m not sure what you mean by “steal” because it’s a relative term now, me reading your book isn’t stealing if I paid for it and it inspires me to write my own novel about a totally new story. And if you posted your book online, as of right now the legal precedent is you didn’t make any claims to it (anyone could read it for free) so that’s fair game to train on, just like the text I’m writing now also has no protections.

Nearly all Reddit history up to a certain date is available for download online now; only after they changed their policies did they start having tighter controls over how their data could be used.


This is why I always use `Math.random().toString(16)` for my examples :D People often get lost on the details, but they see `Math.random()` and they instantly get it's... well, just a random thing.


> I also don't really get what this branded type adds beyond the typical way of doing it

Your example is a (non-working) tagged union, not a branded type.

Not sure about op's specific code, but good branded types [0]:

1. Unlike your example, they actually work (playground [1]):

  type Hash = string & { tag: "hash" }
  
  const doSomething = (hash: Hash) => true
  
  doSomething('someHash') // how can I even build the type !?!?

2. Cannot be built except by using that branded type -- they're actually nominal, unlike your example where I can literally just add a `{ tag: 'hash' }` prop (or even worse, have it in an existing type and pass it by mistake)

3. Can have multiple brands without risk of overlap (this is also why your "wrap the type" comment missed the point, branded types are not meant to simulate inheritance)

4. Are compile-time only (your `tag` is also there at runtime)

5. Can be composed, like this:

  type Url = Tagged<string, 'URL'>;
  type SpecialCacheKey = Tagged<Url, 'SpecialCacheKey'>;

See my other comment for more on what a complete branded type offers: https://news.ycombinator.com/item?id=40368052
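Since "how can I even build the type" is the obvious follow-up, here's a minimal sketch, assuming a hand-rolled brand rather than type-fest's `Tagged`. The names `Brand` and `toHash` and the hex check are mine, purely for illustration. The idea is that the only way in is a function that validates and then casts.

```typescript
// Hand-rolled brand (assumption: NOT type-fest's actual implementation).
// The `unique symbol` key exists only in the type system, never at runtime.
declare const brand: unique symbol;
type Brand<T, Name extends string> = T & { readonly [brand]: Name };

type Hash = Brand<string, 'Hash'>;

// Hypothetical smart constructor: the single place where the cast happens.
function toHash(value: string): Hash {
  if (!/^[0-9a-f]+$/.test(value)) {
    throw new Error(`not a hex hash: ${value}`);
  }
  return value as Hash;
}

const doSomething = (hash: Hash): boolean => hash.length > 0;

const h = toHash('deadbeef');
doSomething(h);             // compiles: `h` carries the brand
// doSomething('deadbeef'); // compile error: a plain string has no brand

console.log(typeof h);      // "string" -- the brand leaves no trace at runtime
```

Note this simplified `Brand` does not compose the way point 5 describes; stacking two of these brands would make the marker property's type collapse, which is one reason to reach for a library version instead.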

[0] https://github.com/sindresorhus/type-fest/blob/main/source/o...

[1] https://www.typescriptlang.org/play/?#code/C4TwDgpgBAEghgZwB...


This is a far better summary of branded types than the top-level comment. Most people commenting should read it before weighing in with their "why not just" solutions.


> In all scenarios [...] there is no reason for wanting nominal typing.

Hard disagree.

It's very useful to e.g. make a `PasswordResetToken` be different from a `CsrfToken`.

Prepending a template literal changes the underlying value and you can no longer do stuff like `Buffer.from(token, 'base64')`. It's just a poor-man's version of branding with all the disadvantages and none of the advantages.

You can still `hash.toUpperCase()` a branded type. It just stops being branded (as it should) just like `toUpperCase` with `hashed_` prepended would stop working... except `toLowerCase()` would completely pass your template literal check while messing with the uppercase characters in the token (thus it should no longer be a token, i.e. your program is now wrong).

Additionally branded types can have multiple brands[0] that will work as you expect.

So a user id from your DB can be a `UserId`, a `ModeratorId`, an `AdminId` and a plain string (when actually sending it to a raw DB method) as needed.

Try doing this (playground in [1]) with template literals:

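  import type { Tagged } from 'type-fest' // assumption: `Tagged` as provided by the type-fest package linked in [0]

  type UserId = Tagged<string, 'UserId'>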
  
  type ModeratorId = Tagged<UserId, 'ModeratorId'>                     // notice we composed with UserId here
  
  type AdminId = Tagged<UserId, 'AdminId'>                             // and here
  
  const banUser = (banned: UserId, banner: AdminId) => {
    console.log(`${banner} just banned ${banned.toUpperCase()}`)
  }

  const notifyUser = (banned: UserId, notifier: ModeratorId) => {
    console.log(`${notifier} just notified ${banned.toUpperCase()}`)   // notice toUpperCase here
  }

  const banUserAndNotify = (banned: UserId, banner: ModeratorId & AdminId) => {
    banUser(banned, banner)
    notifyUser(banned, banner)
  }

  const getUserId = () =>
    `${Math.random().toString(16)}` as UserId

  const getModeratorId = () =>
    // moderators are also users!
    // but we didn't need to tell it explicitly here with `as UserId & ModeratorId` (we could have though)
    `${Math.random().toString(16)}` as ModeratorId

  const getAdminId = () =>
    // just like admins are also users
    `${Math.random().toString(16)}` as AdminId
  
  const getModeratorAndAdminId = () =>
    // this user is BOTH moderator AND admin (and a regular user, of course)
    // note here we did use the `&` type intersection
    `${Math.random().toString(16)}` as ModeratorId & AdminId
  
  banUser(getUserId(), getAdminId())
  banUserAndNotify(getUserId(), getAdminId())             // this fails
  banUserAndNotify(getUserId(), getModeratorId())         // this fails too
  banUserAndNotify(getUserId(), getModeratorAndAdminId()) // but this works
  banUser(getAdminId(), getAdminId())                     // you can even ban admins, because they're also users

  console.log(getAdminId().toUpperCase())                 // this also works
  getAdminId().toUpperCase() satisfies string             // because of this

  banUser(getUserId(), getAdminId().toUpperCase())        // but this fails (as it should)
  getAdminId().toUpperCase() satisfies AdminId            // because this also fails

You can also do stuff like:

  const superBan = <T extends UserId>(banned: Exclude<T, AdminId>, banner: AdminId) => {
    console.log(`${banner} just super-banned ${banned.toUpperCase()}`)
  }

  superBan(getUserId(), getAdminId())                     // this works
  superBan(getModeratorId(), getAdminId())                // this works too
  superBan(getAdminId(), getAdminId())                    // you cannot super-ban admins, even though they're also users!

[0] https://github.com/sindresorhus/type-fest/blob/main/source/o...

[1] https://www.typescriptlang.org/play/?#code/CYUwxgNghgTiAEYD2...


Finally someone writing practical examples instead of Animal / Dog / Cat.


Thanks for the examples, I'm working on a TypeScript code base at the moment and this is a fantastic way of adding compile-time typing across many of the basic types I'm using!


This was a bit confusing for me. Is this a simplified Core War[0]?

[0] https://en.m.wikipedia.org/wiki/Core_War


Seems like it, the footer has a note that it’s “inspired by Core War”


Ouch, completely missed the footer.


As a counterpoint, here's the story of how for me it wasn't TCP_NODELAY: for some reason my Node.js TCP service was taking a few seconds to reply to my requests on localhost (Windows machine). After the connection was established everything was pretty normal, but it consistently took a few seconds to establish the connection.

I even downloaded netcat for Windows to go as bare-bones as possible... and the exact same thing happened.

I rewrote a POC service in Rust and... oh wow, the same thing happens.

It took me a very long time of not finding anything on the internet (and getting yelled at on Stack Overflow, or rather one of its sister sites) and painstakingly debugging (including writing my own tiny client with tons of debug statements) until I realized "localhost" was resolving first to the IPv6 loopback on Windows and, only after quietly timing out there (because I was only listening on the IPv4 loopback), did it try IPv4 and connect instantly.
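For anyone hitting the same thing, a minimal sketch of the setup, assuming Node.js and its built-in `net` module. The function names are hypothetical and this is not my original service. The server binds only the IPv4 loopback, and the client sidesteps the `localhost` lookup entirely by dialing `127.0.0.1`.

```typescript
import * as net from 'node:net';

// Hypothetical helper: connect options that skip the `localhost` lookup,
// so the client can never try the IPv6 loopback (::1) first.
function ipv4LoopbackTarget(port: number): net.TcpSocketConnectOpts {
  return { host: '127.0.0.1', port };
}

// Server bound to the IPv4 loopback only, like the service in the story.
function startIpv4OnlyServer(): Promise<net.Server> {
  return new Promise((resolve, reject) => {
    const server = net.createServer((socket) => socket.end('hello\n'));
    server.on('error', reject);
    server.listen(0, '127.0.0.1', () => resolve(server)); // port 0 = any free port
  });
}

async function main(): Promise<void> {
  const server = await startIpv4OnlyServer();
  const { port } = server.address() as net.AddressInfo;

  const client = net.connect(ipv4LoopbackTarget(port));
  client.on('data', (chunk) => console.log(chunk.toString().trim()));
  client.on('error', console.error);
  client.on('close', () => server.close());
}

main().catch(console.error);
```

Other options worth considering: keep `host: 'localhost'` but pass `family: 4` in the connect options, or have the server listen on `::`, which on most systems accepts IPv4 connections as well.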


I've seen this too, but luckily someone on the internet gave me a pointer to the exact problem so I didn't have to go deep to figure it out.


Anyone interested in synthesizers knows TE.


Anyone interested in demystifying TE should look at the PC case they designed. They sell polished turds, they just didn't polish that one enough for it not to stand out as a complete piece of crap.


This is pretty unfair. They sometimes sell pretty exclusively priced stuff, like their field table (more than the case). But with these you are buying aesthetics, and I would much rather have a case this cool-looking if I had it in a visible place. They are one of the few companies recognized for their design, which is something, AND their audio stuff might also be expensive but is really good.

