I've been a redis / elasticsearch user since the early days. Seeing these companies grow has been a source of inspiration for me.
I'm glad you're back and look forward to redis's future. There is still a lot of work to be done.
In terms of licensing there is no right answer: large companies evidently abuse open source, but at the same time a community is required to make projects like redis / elasticsearch grow.
This is a good move, and I think redis has a long future ahead of it. So much to explore and so much more that can be done.
I wonder if the same sentiment applies to products using .ai TLDs.
The thing about AI is it reminds me of what Steve Jobs said about speeds and feeds.
People care about "1,000 songs in your pocket", not "5 GB hard drive". AI seems to be the "5 GB hard drive", and people don't always relate AI to how it's going to help them.
We keep saying "Any sufficiently advanced technology is indistinguishable from magic" while forgetting that technology only becomes good when we don't think about the technology anymore. It works like magic.
How many users know the difference between MFM and RLL hard drives? Now follow that progression forward: people only know and care that drives got bigger and faster. Do users care whether the file system defragments itself?
The thing about fly.io / heroku / vercel is you are gaining convenience by giving up control over the underlying infrastructure.
When you give up control of the infrastructure you also give up the ability to decide what's right and wrong.
If you want to work with something that allows you to retain that control but still have the convenience of a nice deployment workflow, check out my profile.
Whilst this is true in theory, the experience of being a data centre customer, taking transit from one or two providers and peering with a handful of others, is very different to being a user of a bulk cloud infrastructure provider.
I think the difference is automation, or rather lack of it. You build personal relationships with real people. Senior network engineers at the supplier don't consider it beneath them to ring up similarly competent people at the customer and chat directly.
In the past I've hosted kit on a larger scale for my businesses, but it seems you get the same personal experience as a tiny individual customer. Even with just a couple of racks of kit for hobby projects and friends' stuff, I know the network engineers at the DC and transit providers and can email or call direct to discuss things when I need to.
More importantly, if (for example) one of my friends starts gobbling 10Gbps of $$$ transit because of a typo in a script, they ring my mobile to chuckle about it while we fix in realtime, rather than some automated process pulling the plug with the real engineers hiding behind support tickets and first-line support staff that can't string a sentence together.
I think it's important to realize that each level of abstraction has its pros and cons. Concluding that you have to go all the way down the stack and run your own data center and fiber is extreme.
In life it's always a balance between keeping control where it's feasible and giving up control where it isn't.
In this case the author's problem is obviously high up the stack; the best solution would be to take back enough control to keep his business going.
Even if you build your own datacenter and your own fiber, other ISPs might still either depeer from you or threaten to do so if you don't boot customer x off your service.
If they've trained LLMs on LUMI, which has a lot of Instinct GPUs, there is a high chance they've had to work through and solve a lot of the gaps in AMD's software support.
They may have already figured out a lot of stuff and kept it all proprietary and AMD buying them out is a quick way to get access to all the solutions.
I suspect AMD is trying to fast track their software stack and this acquisition allows them to do just that.
Poro (reindeer in Finnish) is specifically developed to be used in Finnish. General-purpose models like GPT struggle with less widely used languages. Unfortunately this sale likely means this development will cease.
Reindeer is a great name, and gives me an idea - next time I create an Azure OpenAI resource (depending on model availability and data residency requirements, sometimes you need to create more than one) I'm going to start going through Santa's reindeer names.
I think the problem here is that 'understanding' is not the same as curve fitting.
If all one is doing is giving a model lots of data and fitting curves, it's not really 'understanding'; it's brute forcing its way there (with gradient descent), storing the weights, and then approximating the solution when a query is passed in.
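To make that concrete, here's a toy sketch of the "fit with gradient descent, store the weights, approximate at query time" loop (a single weight, plain Python, obviously nothing like a real network):

    # Toy sketch: "learn" y = 3x by gradient descent on one weight,
    # then answer queries by reusing that stored weight.
    data = [(x, 3.0 * x) for x in range(1, 11)]   # the "training set"

    w = 0.0        # the stored weight
    lr = 0.01      # learning rate
    for _ in range(200):
        # gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad

    def query(x):
        # at query time we only approximate with the fitted weight;
        # the relationship itself is never represented as a rule
        return w * x

    print(round(w, 3), round(query(23), 3))   # w converges to ~3.0, query(23) ~ 69.0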
This is not the same as understanding. Human intelligence can operate deterministically as well as non-deterministically. We can listen to language, which is by its nature non-deterministic, and convert that into deterministic operations and vice versa, i.e. we can operate on some logic and explain it in multiple ways to other people.
Understanding requires much less data than brute forcing your way into pattern recognition.
When you see a simple expression like 2 * 4, you are able to understand that it's equivalent to 2 + 2 + 2 + 2, which in turn means 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1: count that and you've got your answer.
Because you 'understand' this basic concept and all the operations in between, you are able to compute more examples, and you only need to understand it once. Once you understand multiplication and addition and all the tricks in between, you are able to compute 23 * 10 without being fed 23 * 10 as prior data. Understanding is very different from fitting a curve. You can reach conclusions and understanding through pattern recognition, but it's important to differentiate 'approximation' from 'calculation': if you understand something in its entirety, you should be able to calculate an outcome deterministically.
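A minimal sketch of what I mean by "understand it once, compute any instance" (the multiply helper here is just an illustration, not a claim about how models work):

    # Once multiplication is "understood" as repeated addition, any new
    # product can be calculated deterministically, without that particular
    # pair ever appearing as prior data.
    def multiply(a, b):
        total = 0
        for _ in range(b):   # add a to itself b times
            total += a
        return total

    print(multiply(2, 4))    # 8, the same as 2 + 2 + 2 + 2
    print(multiply(23, 10))  # 230, no prior example of 23 * 10 needed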
Right now LLMs lack 'understanding' and seem to only 'approximate', which may look like 'understanding' but is actually not.
I think you are mixing layers of abstraction. To make a crude but I think not unhelpful analogy: 'understanding' is a natural language concept that is our way to describe what's happening in our heads, and like most other such concepts it is resistant to any clear definition and will exhibit sorites-type paradoxes when one is attempted. It belongs to the presentation layer of the stack. Meanwhile the process of curve fitting, however it is implemented, with whatever NN structure (like transformers) or maybe something else entirely, belongs to the physical layer of the stack, akin to frequency modulation.
While I am unsure whether LLMs are really understanding, whatever that means, I think it is not difficult to believe that any form of understanding we implement will involve 'curve fitting' as a central part.
This seems like it's confusing how we conceptualize the training/learning process with what the system is actually doing. We conceptualize tuning parameters as curve fitting, and we conceptualize predicting the next token as maximizing probability. But that doesn't mean there is anything like curve fitting or probability maxxing happening as the system's parameters converge.
The core feature of curve fitting is learning explicit examples and then interpolating (in an uninformative manner) between unlearned examples. But there's no reason to think this completely describes what the system is doing, in the sense that there are no more informative descriptions of its behavior. Take an example that LLMs are surprisingly good at, creating poetry given arbitrary constraints. Imagine the ratio of the poems it has seen during its training over the number of unique poems it could create in principle. This number would be vanishingly small. Interpolating between two strings representing well-formed poems in an uninformative manner (i.e. some finite polynomial) will not generate well-formed poems. The only way you could move between two examples of well-formed poems while staying on the manifold of well-formed poems is if you captured all relevant features of the manifold. But I fail to see a difference between capturing all relevant features of the poetry-manifold and understanding poetry.
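To see what interpolating "in an uninformative manner" between two well-formed strings actually produces, here's a deliberately crude toy (character-wise averaging, nothing like what an LLM does internally):

    # Averaging two well-formed lines character by character does not
    # stay anywhere near the manifold of well-formed text.
    a = "Shall I compare thee to a summer's day"
    b = "The woods are lovely, dark and deep"
    # zip() simply truncates to the shorter line
    mid = "".join(chr((ord(x) + ord(y)) // 2) for x, y in zip(a, b))
    print(mid)   # gibberish, nothing like a third well-formed line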
What LLMs do can be described as curve fitting only in the most uninformative description possible. What they do is discover features of the structures referred to by the training text and competently deploy these features in predicting the next token. A human that could do this would be considered to understand said structure.