Hacker News | maccam912's comments

The idea is to serve models that would normally be considered too large for GPU memory (70 billion parameters at 16 bits, or 2 bytes, each comes to about 140 GB of memory). Some people figured out you can offload the model and keep only parts of it loaded, so a 24 GB GPU like the 4090 can still serve it, but it goes a lot slower. They have a new way to serve the same model on the same GPU with 8x better throughput. Something about decoding tokens on a smaller model maybe, then just checking multiple tokens on the larger model in a single batch. Magic, but ultimately it's the same model, same GPU, same output as before, with much better throughput.
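To make the arithmetic and the "draft on a small model, check on the big one" idea concrete, here is a rough Python sketch. It is not the method from the linked work. The models are hypothetical toy functions that map a token list to the next token, and decoding is greedy. Names such as `weights_gb`, `speculative_decode`, `toy_target`, and `toy_draft` are made up for illustration.

```python
from typing import Callable, List

# A "model" here is just a function from a token list to the next token (assumption).
NextToken = Callable[[List[int]], int]


def weights_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def greedy_decode(model: NextToken, prompt: List[int], num_new: int) -> List[int]:
    """Baseline: one call to the model per generated token."""
    out = list(prompt)
    for _ in range(num_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], num_new: int, k: int = 4) -> List[int]:
    """Toy greedy draft-and-verify loop; output matches greedy_decode(target, ...)."""
    out = list(prompt)
    goal = len(prompt) + num_new
    while len(out) < goal:
        # 1. The cheap draft model guesses k tokens ahead.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The large model checks every guessed position. A real system would
        #    do these checks in one batched forward pass; here it is a plain loop.
        for i in range(k):
            expected = target(out + proposal[:i])
            out.append(expected)  # always keep what the large model wants
            if expected != proposal[i]:
                break  # first mismatch: discard the rest of the draft
    return out[:goal]


def toy_target(ctx: List[int]) -> int:
    """Hypothetical 'large' model: a fixed deterministic rule."""
    return (ctx[-1] * 3 + 1) % 7


def toy_draft(ctx: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except after token 2."""
    guess = toy_target(ctx)
    return (guess + 1) % 7 if ctx[-1] == 2 else guess


if __name__ == "__main__":
    print(weights_gb(70e9, 2))  # 70B parameters at 2 bytes each
    print(speculative_decode(toy_target, toy_draft, [1], 10))
```

Every token that ends up in the output is one the large model would have picked anyway, so the output is unchanged. The speedup comes from the large model confirming several tokens per pass instead of one.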


It depends on the task, I imagine. Writing a novel was mentioned: there, keeping important story lines in memory for a long time will be necessary, or at least certainly more important than remembering what the characters were eating for lunch on page 10. But if you need to find that one loophole in a contract, you will probably benefit from perfect recall.


I think kelseyfrog meant that the state of a Mamba model is supposed to "remember" stuff even if it doesn't have the actual tokens to reference any more. It might not be guaranteed to hang on to information about tokens from a long time ago, but at least in theory it's possible, whereas tokens from before the context window in a traditional LLM may as well never have existed.
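A toy way to see the difference, as a Python sketch only. This is not Mamba's actual selective state space update, just a scalar linear recurrence compared against a fixed-size window. The names `window_view` and `recurrent_state` and the `decay` value are assumptions for illustration.

```python
from collections import deque
from typing import List


def window_view(tokens: List[float], window: int) -> List[float]:
    """What a fixed context window can still see: only the last `window` tokens."""
    buf = deque(maxlen=window)
    for t in tokens:
        buf.append(t)  # older tokens fall off the left end and are gone
    return list(buf)


def recurrent_state(tokens: List[float], decay: float = 0.9) -> float:
    """A single running state: every token leaves a trace that fades over time."""
    h = 0.0
    for x in tokens:
        h = decay * h + x
    return h


if __name__ == "__main__":
    seq = [5.0, 0.0, 0.0, 0.0]       # one "important" token, then filler
    print(window_view(seq, 2))        # the 5.0 is no longer visible at all
    print(recurrent_state(seq))       # still nonzero: a faded trace of the 5.0 remains
```

The window forgets the early token completely once it slides out, while the state keeps a weakened trace of it. Whether that trace is still useful is the "not guaranteed" part.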


Yes, you said it better than I did :)


Not the person you are responding to, but I'm aware that the cloud is just other people's hardware. That's usually exactly what I mean when I say it. My own hardware I have to power, maintain, reboot, dust, cool, etc. When I talk about "the cloud" I am usually talking about an environment I can maintain fully through software, with none of the messy hardware failures and temperature management I might need to think about at home.


Preview

Justin Trudeau is not the president of any country. He is, however, the Prime Minister of Canada, which he has been since November 4, 2015. The prime minister is the head of government in Canada's federal parliamentary democracy and constitutional monarchy.


Since you haven't had a response to the second part, here it is:

PreviewJSON

Justin Trudeau is not the president of any country. He is, however, the Prime Minister of Canada, which he has been since November 4, 2015. The prime minister is the head of government in Canada's federal parliamentary democracy and constitutional monarchy.


This right here is why there is a divide the size of the Grand Canyon in America. Some will see it as obvious satire, some are 100% convinced this is the truth. When did we reach the point that the same text can be read as fact or satire with such strong conviction that the dress being blue and black (no, it's white and gold!) would seem like compromise?


I always thought this would make a good movie. Parody both sides so hard each thinks you are "representing their views" while the opposing side thinks it's a comedy.



I'll check it out thanks!


Where can I learn more about these phrases to say to capture as many sounds as possible? They listed two in the article, but now I'm curious how they decided which sentences to say, how efficient they are, if they account for the same sounds but with different qualities, like going up at the end if it's a question? It's a rabbit hole I need to find the entrance to.


This doesn’t answer all of your questions, but Apple has an accessibility feature which lets you generate a voice based on your own. It’ll prompt you to say various phrases, and at the end you’ll have a synthetic voice to use. Info on it here: https://support.apple.com/en-us/104993


A good starting point is speech synthesis, particularly diphone synthesis


Same! As soon as a new cert is registered for a new subdomain, I get a small burst of traffic. It threw me off at first; I assumed I had some tool running that was scanning it.


This seems about as close as we might get to another fun one, https://cratebeforeattack.com/, for more traditional worms-style gameplay. It's fun but crashes randomly, and hasn't seen much activity since 2020. If anyone knows more details about whether the game is looking for a new maintainer, please reach out! There is so much potential here.

