
If Denmark is anything like Sweden there's also:

- An SSN from a different ID space gets assigned to immigrants; when they become citizens they are assigned a new permanent SSN.

- SSNs have a long and a short form; the short form, which cuts off century information, can be the same for someone who is 5 years old and someone who is 105 years old.

- When an unconscious patient comes into the E.R. you don't know their SSN, so a temporary one is assigned for use in patient records. Such temporary SSNs are not coordinated nation-wide, so multiple patients may have the same SSN. In some hospitals they don't even have a local standard for IDs; the staff just makes something up on the spot. It happens that the SSN they made up collides with a valid SSN for another person.


Immigrants and refugees will get a "replacement CPR number", which is outside the normal space/range of CPR numbers. Once registered as living in Denmark, they'll get a real CPR number.

Danish SSNs don't have a long or short form; you use the seventh digit to do a table lookup to see whether the person was born in 18XY, 19XY or 20XY. The date of birth is always ddmmyy, there is no long form. So if the seventh digit is 9 and digits 5-6 are between 00 and 36, then you were born between 2000 and 2036; if digits 5-6 are 37 to 99, then you were born in the 1900s. But you need the published table to figure that out.
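
A minimal sketch of that lookup in Python, covering only the seventh-digit-9 rule described above (the other rows still need the published table; digit positions assume the 10 digits written without a hyphen):

  # Sketch only: implements just the rule quoted above (seventh digit == 9);
  # the remaining rows of the official table are deliberately left out.
  def birth_century(cpr: str) -> int:
      """cpr: the 10 digits with no hyphen, ddmmyy + 4-digit sequence number."""
      yy = int(cpr[4:6])        # digits 5-6: two-digit birth year
      seventh = int(cpr[6])     # digit 7: selects a row in the official table
      if seventh == 9:
          return 2000 if yy <= 36 else 1900
      raise NotImplementedError("use the published table for the other digits")

  print(birth_century("0101059123"))  # yy=05, seventh digit 9 -> 2000
  print(birth_century("0101599123"))  # yy=59, seventh digit 9 -> 1900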

Last point: there is a backup system for unconscious patients, but it should be the same across all medical records, as these are somewhat standardized.


Also: the personnummer carries a date that is usually the birth date, but there are cases where it's not, yet I've seen a few systems that just assume it's the same.

> SSN from different ID space gets assigned to immigrants; when they become citizens they are assigned a new permanent SSN

Having gone through that, my personnummer didn't change. Maybe that doesn't happen anymore?


It does, but if you already have a Personnummer, or get a residence permit right away so that you are eligible for one, you don't get the temporary Samordningsnummer.

And the samordningsnummer isn't only for immigrants - it's also for Swedes born abroad who have never been folkbokförd.

They're still an immigrant. The numbers are for residents (at some point in time), and citizenship isn't reflected by them.

Interesting, I've never heard of anyone referring to a citizen as an immigrant, but looking up the legal definition, they certainly are. TIL.

> - SSN:s have a long and a short form; the short form which cuts off century information can be the same for someone who is 5 years old and someone who is 105 years old.

We don't do that. Instead, we shove that extra bit of information into the digit that follows the two-digit year, via a table: https://da.wikipedia.org/wiki/CPR-nummer#Under_eller_over_10...


They can say whatever they want on that site. To me Firefox feels much slower and it's not just placebo. I suspect it has to do with poor multithreading. I still get hangs or slow responsiveness of the entire Firefox chrome because of a single misbehaving tab. I can also see that some media sites (e.g. di.fm or SoundCloud) use more CPU when playing media.

It's almost certainly just placebo. I've been using Firefox next to Chrome for years and it's completely unnoticeable outside of a handful of Google-specific sites (YouTube, mostly).

Well, with 211 tabs open for a couple of weeks, I don't get CPU spikes, slow tabs, or a misbehaving chrome.

This is not on a cutting edge PC. It's a core i7-7700.

Firefox's soft spot is DNS response speed. If your DNS is slow, Firefox is slow. Otherwise it's indistinguishable from Google Chrome speed-wise.


I don't need 211 tabs open. One tab is enough to notice the difference.

Have you used Firefox recently, in 2024? I switched permanently to Firefox in early 2023. But even in late 2023 there were issues with "misbehaving tabs", as you put it. However, now in mid-2024, everything is butter smooth. A fair number of pending issues were fixed in Firefox over the last year. Including a 25-year-old bug!

A lot of Firefox contributors rolled up their sleeves over the last year. Chrome is FULLY in the "Enterprise Spyware" space now with the glaring introduction of ad topics over and above community objections. The need for an alternative cross platform browser that is not corporate and ad-controlled has solidified.


I've used Firefox for at least five years. I switched to Vivaldi (which is Chromium-based) about two weeks ago and the speed difference feels liberating.

Agreed. It's not just a placebo effect, it really is slower. People often don't notice this because they use relatively high-end machines. If you're skeptical, try running it on a low-end laptop or a Raspberry Pi. The difference is like night and day.

I get everything you mentioned for Chrome. Especially if I open the Meta store for Quest apps.

Doesn't happen with Firefox.


Without JWT, how do you support login using third-party identity providers like Office 365, Google or Facebook?

What would you recommend if requests are highly parameterized and some can be many orders of magnitude more taxing on the system than others?


We usually implement queueing on the route to those specific things that are vulnerable. If you cannot discern what traffic does what, you just need to move the rate limiter close to the application or the problem hot spot. It's perfectly valid to give an HTTP 429 response from a backend and let your frontend handle that in some graceful way. The same is valid for an exception in code; the nearer the problem spot you get, the harder it is to get right.

EDIT clarification.
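
For illustration, a minimal token-bucket sketch of that idea in Python (the route, the limits, and do_expensive_work are all made up; a real setup would sit in front of or inside the hot backend path):

  import time

  class TokenBucket:
      """Tiny token-bucket limiter; one instance per vulnerable route/hot spot."""
      def __init__(self, rate_per_s: float, burst: int):
          self.rate = rate_per_s
          self.capacity = burst
          self.tokens = float(burst)
          self.last = time.monotonic()

      def allow(self) -> bool:
          now = time.monotonic()
          self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
          self.last = now
          if self.tokens >= 1.0:
              self.tokens -= 1.0
              return True
          return False

  expensive_route = TokenBucket(rate_per_s=5, burst=10)   # made-up numbers

  def handle_expensive_request(payload):
      if not expensive_route.allow():
          # Let the frontend turn this into something graceful (retry, message, spinner).
          return 429, "Too Many Requests"
      return 200, do_expensive_work(payload)              # hypothetical application call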


In the abstract sense, instead of pulling from queues round-robin you can assign "tokens" to each queue round-robin. When the number of tokens a queue has covers the cost of the request at the head of that queue, reset the tokens and pull that request.

This can also be used to handle priority. Maybe paying customers or customers on the enterprise plan get 2 tokens per round or their requests only have half of the cost.
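
A rough sketch of that bookkeeping in Python (the queue names, weights, and costs are invented for illustration):

  from collections import deque

  # Each queue holds (cost, request) pairs; a higher weight means more tokens
  # per round (e.g. enterprise customers).
  queues = {
      "free":       {"weight": 1, "tokens": 0, "q": deque()},
      "enterprise": {"weight": 2, "tokens": 0, "q": deque()},
  }

  def next_request():
      """Hand out tokens round-robin and pull the first request whose
      accumulated tokens cover its cost."""
      while True:
          progressed = False
          for name, state in queues.items():
              if not state["q"]:
                  continue
              progressed = True
              state["tokens"] += state["weight"]
              cost, request = state["q"][0]
              if state["tokens"] >= cost:
                  state["tokens"] = 0          # reset, as described above
                  state["q"].popleft()
                  return name, request
          if not progressed:
              return None                      # nothing queued anywhere

  queues["free"]["q"].append((4, "cheap report"))
  queues["enterprise"]["q"].append((4, "big export"))
  print(next_request())   # enterprise reaches 4 tokens first (2 per round)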


I believe that reducing the power consumption and increasing the speed of AI inference will be best served by switching to analog, approximate circuits. We don't need perfect floating-point multiplication and addition; we just need something that takes two input voltages and produces an output voltage that is close enough to what multiplying the input voltages would yield.


I know someone working in this direction; they've described the big challenges as:

  * Finding ways to use extant chip fab technology to produce something that can do analog logic. I've heard CMOS flash presented as a plausible option.
  * Designing something that isn't an antenna.
  * You would likely have to finetune your model for each physical chip you're running it on (the manufacturing tolerances aren't going to give exact results)
The big advantage is that instead of using 16 wires to represent a float16, you use the voltage on 1 wire to represent that number (which plausibly has far more precision than a float32). Additionally, you can e.g. wire two values directly together rather than loading numbers into an ALU, so the die space & power savings are potentially many, many orders of magnitude.


> which plausibly has far more precision than a float32

If that was true, then a DRAM cell could represent 32 bits instead of one bit. But the analog world is noisy and lossy, so you couldn't get anywhere near 32 bits of precision/accuracy.

Yes, very carefully designed analog circuits can get over 20 bits of precision, say A/D converters, but they are huge (relative to digital circuits), consume a lot of power, have low bandwidth as compared to GHz digital circuits, and require lots of shielding and power supply filtering.

This is spit-balling, but the types of circuits you can create for a neural-network-type chip are certainly under 8 bits, maybe 6 bits. But it gets worse. Unlike digital circuits, where a signal can be copied losslessly, a chain of analog circuits compounds the noise and accuracy losses stage by stage. To make it work you'd need frequent requantization to prevent getting nothing but mud out.
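
A toy simulation of that compounding (made-up per-stage noise level, unity-gain stages, nothing circuit-accurate) just to show how effective resolution falls with chain depth, which is why periodic requantization would be needed:

  import numpy as np

  rng = np.random.default_rng(0)
  signal = rng.uniform(-1, 1, size=100_000)
  noise_std = 0.002                        # made-up per-stage noise level

  for n_stages in (1, 10, 100):
      y = signal.copy()
      for _ in range(n_stages):
          y = y + rng.normal(0.0, noise_std, size=y.shape)   # each stage adds noise
      rms_err = np.sqrt(np.mean((y - signal) ** 2))
      eff_bits = np.log2(2.0 / rms_err)    # crude "useful bits" over a +/-1 signal range
      print(f"{n_stages:4d} stages: RMS error {rms_err:.4f} (~{eff_bits:.1f} bits)")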


You can get 8-bit analog signal resolution reasonably easily. The Hagen mode [1] of BrainScaleS [2] is essentially that. But... yeah. No way in hell you are getting 16 bits with that kind of technology, let alone more.

And those things are huge, which leads to very small network sizes. This is partially due to the fabrication node, but also simply because the tooling for analog circuits is even less well developed than for digital ones, which in turn lag behind software compilers.

[1] https://electronicvisions.github.io/documentation-brainscale... [2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8907969/ [3] https://arxiv.org/pdf/2003.11996


> which plausibly has far more precision than a float32

+/- 1e-45 to 3.4e38. Granted, roughly half of that range is between -1 and 1.

When we worked with low power silicon, much of the optimization was running with minimal headroom - no point railing the bits 0/1 when .4/.6 will do just fine.

> Additionally, you can e.g. wire two values directly together rather than loading numbers into an ALU

You may want an adder. Wiring two circuit outputs directly together makes them fight, which is usually bad for signals.


Do you have any references to papers/people working on this? I'm very interested in the possibilities that lie here, but have no idea where to start.


An analog value in such a chip has far, far less resolution than a float32. Maybe you get 16 bits of resolution, more likely 8, and your multiplications are going to be quite imprecise. The whole thing hinges on the models being tolerant of that.


I think we're far away from analog circuits being practically useful, but one place where we might embrace the tolerance for imprecision is in noisy digital circuits: accepting that, say, one in a million bits in an output will be flipped to achieve a better performance/power ratio. Probably not when working with float32s, where a single infinity [1] could totally mess things up, but for int8s the occasional 128 when you wanted a 0 seems like something that should be tolerable.

[1] Are H100s' matrix floating point units actually IEEE 754 compliant? I don't actually know.
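
As a toy sanity check of that intuition (random int8 data and a made-up flip rate, not a real model):

  import numpy as np

  rng = np.random.default_rng(0)
  acts = rng.integers(-64, 64, size=1_000_000, dtype=np.int8)   # fake int8 activations

  flip_prob = 1e-6                                   # "one in a million" elements get a flipped bit
  flip_mask = rng.random(acts.shape) < flip_prob
  bit = rng.integers(0, 8, size=acts.shape)          # which bit to flip
  flipped = acts ^ (1 << bit).astype(np.int8)        # 1 << 7 wraps to the sign bit
  noisy = np.where(flip_mask, flipped, acts)

  # A rare flip barely moves an aggregate statistic over a million elements.
  print("elements corrupted:", int(flip_mask.sum()))
  print("clean mean:", acts.mean())
  print("noisy mean:", noisy.mean())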


I'd go a step further: something which resembles how "wet brains" (biological ones) actually work, but which could be produced easily.

Biological neural networks are nowhere near as connected as ANNs, which are typically fully connected. With biological neurons, the ingress/egress factors are < 10, so they are highly local.

It is also an entirely different model, as there is no such thing as backpropagation in biology (that we know of).

What they do have in lieu of backpropagation is feedback (cycles).

And maybe there are support cells/processes which are critical to the function of the CNS that we don't know of yet.

There could also be a fair amount of "hard coded" connectedness, even at the higher levels. We already know of some. For instance, it is known that auditory neurons in the ears are connected and something similar to a "convolution" is done in order to localize sound sources. It isn't an emergent phenomenon - you don't have to be "trained" to do it.

This is not surprising given that life has had billions of years and a comparable number of generations in order to figure it out.

I guess in theory this could all be done in software. However, given the tens of billions of neurons (and trillions of synapses) in primate/human brains, this would be incredibly challenging even on the thousand-core machines we have nowadays. And before you scream "cloud", it would not have the necessary interconnectedness/latency.

It would be cool if you could successfully model, say, a worm/insect with this approach.


> What they do have is lieu of backpropagation is feedback (cycles)

I wonder where the partial data / feedback is stored. Don't want to sound like a creationist, but it seems very improbable that "how good my sound localization is" is inferred exclusively from the # of children I have.


It’s evolved in simpler organisms with a much shorter generation cycle and more offspring per generation.

Being able to localize sound source has a lot of benefits including predation avoidance and prey detection.


Almost convincing, except there's no animal composition (beyond gut microbes)! You can't stick 2 animals that each evolved 1 thing together.


Sounds pretty impossible to me to do that with a sufficient combination of range and precision.


What do you mean by impossible? You are aware that what radio equipment does is often the equivalent of analog operations like multiplication, addition, etc., just at high frequencies?

Sure, accuracy is an issue, but this is not as impossible as you may think it would be. The main question will be whether the benefits of going analog outweigh the issues arising from it.


In general the problem with analog is that every sequential operation introduces noise. If you're just doing a couple of multiplications to frequency-shift a signal up and down, that's fine. But it's a different story if you've got hundreds of steps and you're also trying to pack huge numbers of parallel steps into a very small physical area.


TBH that sounds like a nightmare to debug.


How do you inspect what is happening then without having ADCs sampling every weight, taking up huge die area?


Maybe a silly question (I don't know anything about this) - how do you program / reprogram it?


Realistically, you'd train your model the same way it's done today and then custom-order analog ones with the weights programmed in. The advantage here would be faster inference (assuming analog circuits actually work out), but custom manufacturing circuits would only really work at scale.

I don't think reprogrammable analog circuits would really be feasible, at least with today's tech. You'd need to modify the resistors etc. to make it work.


Here's an example of Veritasium talking about this from 2022: https://www.youtube.com/watch?v=GVsUOuSjvcg


Even staying within digital, GPUs are not totally designed for AI learning or inference. But the more important problem right now is standardization.


I don’t know why you’re being downvoted, that’s an active area of research AFAIK


Maybe because that is a VERY different problem than the one discussed here.

Building a single analog chip with 1 billion neurons would cost billions of dollars in a best-case scenario. An Nvidia card with 1 billion digital neurons is in the hundreds-of-dollars range.

Those costs could come down eventually, but at that point CUDA may be long gone.


Do you have any references to papers/people working on this? I'm very interested in the possibilities that lie here, but have no idea where to start.


But they will be as soon as this sees widespread use.


It won't be widespread IMHO, not when you share your email address with other parties that then lose/sell your details. Fastmail-like "temporary" email addresses could help, however.


Or Sweden.


Sorry, do you mean on Android or iOS? (I live in Sweden ... and can download on iOS.) If Android, then it is because we are waiting for Google Play Store approval.


On Android.


You can do all that and still honor the points of the parent comment though. I agree that it's poor etiquette to make intrusive changes and hold the original author accountable for them.


Yes, if I read the article right, every object is being allocated on the heap. That is a no-go for systems programming as far as I'm concerned.


I read it the same way you did. But, I’d be really surprised if there was no stack allocation in the language, given the author’s experience.


In an arena allocator, the stack can be just a special case arena that gets discarded automatically for you on function exit.


My gut feeling agrees with you, but I would really like more detailed reasons why this is the case. Is memory fragmentation that big of an issue? Are heap allocations more expensive somehow (even if memory is not fragmented yet)? Is there something else? Does re-arranging memory in the heap make performance unpredictable, like in GC languages?


Memory allocation is slow and nondeterministic performance-wise. Some allocations also require a global lock at the system level. It's also a point of failure if the allocation doesn't succeed, so there's an extra check somewhere. Furthermore, if every object is a pointer you get indirection overhead (small, but existent). Deallocation incurs an overhead as well. Without a compacting GC you run into memory fragmentation, which further aggravates the issue. All of this overhead can be felt in tight loops.


Due to the quick fit algorithm, fragmentation is no longer an issue for memory allocators. Heap allocations are still a bit slower than stack allocations since you need some way to release memory. Stack allocations are released at virtually zero cost (one assembly instruction). Hence sophisticated compilers perform escape analysis to convert heap allocations into cheaper stack allocations. But escape analysis, like all program analysis, is conservative and won't convert as many allocations as a human programmer could.

However, in the grand scheme of things heap vs stack allocation is minuscule. Many other factors are much more important for performance.


For one thing, allocating every object on the heap leads to a lot of cache misses because the data you're working with is not contiguous in memory. It may also make it harder for the CPU to do speculative fetches from memory because it needs to resolve the value of a pointer before it knows where to fetch data. With the stack, the address is much more obvious since it's all constant offsets relative to the frame pointer.

Also, heap allocation is unpredictable. It is more likely to cause unexpected page faults or thread congestion (multiple threads often share the same heap so they need to synchronize access to memory book-keeping structures). Especially when it comes to kernel drivers, a page fault can lead to a deadlock, infinite recursion, or timeouts.

I'm not saying heap is always bad, not even that it's bad most of the time. But if a language doesn't at least give you the _option_ of having objects live on the stack, I wouldn't consider it a serious systems programming language.


There is no inherent difference. It's all memory.

That said, as a sibling already pointed out, it's standard to control stack allocation with a single counter. It's kind of standard to control heap allocation with an index and a lot of book-keeping.

But you are allowed to optimize the heap until there's no difference.


I think the mistake people make is assuming that "probability" is a simple concept.

If there are 50K possible tokens and I don't have any other information, I could make a naive estimate that every token has equal probability and start generating text that is just gibberish. With the simple single-token Markov-chain example I would estimate probabilities based on the previous token, and that probability estimate would be much better. If you use it for generating text it will look like something that is almost, but not quite, entirely unlike human speech. [1]

The difference lies entirely in how accurately you model the world and what information you have available when estimating probabilities. Models like GPT4 happen to be very good at it because they encode a huge amount of knowledge about the world and take a lot of context into account when estimating the probability. That's not something to be taken lightly.

[1] https://projects.haykranen.nl/markov/demo/
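
For anyone who wants to poke at the single-token case without the linked demo, a tiny word-level sketch (corpus, start token, and length are arbitrary):

  import random
  from collections import defaultdict

  corpus = "the cat sat on the mat and the dog sat on the rug".split()

  # Count, for each token, how often each next token follows it.
  transitions = defaultdict(lambda: defaultdict(int))
  for prev, nxt in zip(corpus, corpus[1:]):
      transitions[prev][nxt] += 1

  def generate(start="the", length=12):
      """Sample each next token from the counts conditioned on the previous one."""
      out = [start]
      for _ in range(length - 1):
          options = transitions.get(out[-1])
          if not options:
              break
          tokens, counts = zip(*options.items())
          out.append(random.choices(tokens, weights=counts, k=1)[0])
      return " ".join(out)

  print(generate())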


I am skeptical that anyone saying this is making a mistake: it only ever really comes up when someone has specific priors they want to litigate - best summarized by the timeless line: you cannot make a man understand something when his paycheque depends on his not understanding it.

