jonathaneunice's comments

"Real-time latin-hypercube-sampling-based Monte Carlo Error Propagation" isn't the most "Hey, beginners! Come on in and try this!" description / synopsis. Ditto "Second Order Error Propagation."

The "free, cross-platform program that transparently handles calculations with numbers with uncertainties" module may not be as strong, but definitely sells and describes itself better.

mcerp might be better rated if it hoisted its "easily and transparently track the effects of uncertainty through mathematical calculations" promise to its headline.


I never imagined a future in which PDFs talked back. Now I can.


PostScript is Turing complete and your PDF reader is a PostScript interpreter. So, yeah, potentially any PDF is instructions to a general-purpose computer.


PostScript is deprecated in PDF though.


Older versions of PDF files and readers don't know or care about what happened later. PDFs also did shady shit like being able to embed JavaScript actions and other files. They also became Turing complete when they adopted Type 4 functions in PDF 1.3 that still aren't deprecated in 2.0.

LaTeX subsumed most of the human authoring uses of PS where it was used in academia.


PostScript files are dynamic code. You can create polygons dynamically with commands. And, of course, font effects, styles, ellipses...

Also, there's a Z-machine interpreter (text adventure player) written in PostScript which can play Zork and some libre games such as Calypso with just Ghostscript, the PostScript interpreter most software uses to render PostScript files.


llama.pdf when?


If neither GitHub nor GitLab, what are you recommending? There are a few other non-DIY hosted options, but it's hard for me to translate your "fully noscript/basic (x)html friendly" spec into an actionable list of options. Or is this a "host it yourself" / DIY plea?


Sourcehut works with noscript. No need to host or DIY anything.

Codeberg has a message that says "This website requires JavaScript." but I was able to use it without JS to browse around and look at code properly.


Fascinated this uses the Unicode glyphs / symbols for unit and record separator rather than the unit and record separators themselves (ASCII US and RS).

Perfect deployment of David Wheeler's aphorism:

> All problems in computer science can be solved by adding another level of indirection.

https://en.wikipedia.org/wiki/David_Wheeler_(computer_scient...



The answer makes sense to me, but I wish we could fix editors to properly handle the ASCII separators (1C, 1D, 1E, 1F) instead of resorting to Unicode control picture characters (241C, 241D, 241E, 241F).

Maybe if editors are fixed up we could adopt ASCII Separated Values (ASV) as the new standard.


Emacs has handled literal ASCII control characters correctly I believe since around the time I was born - probably somewhat earlier, if we count back further than GNU.

Unicode works fine there too, so it makes no nevermind to me which flavor people use. I just think it's funny how "everything old is new again".


Yes you're right. That's a long term goal.


Why not combine zero width character with visible character, i.e. use 2 characters for separators?

,<FS> for fields \n<RS> for records

This removes ambiguity in parsing and remains user readable. It's also relatively easy to auto-fix files edited by users in normal editors.

It also mostly removes need for escaping.

It's also smaller or same size as unicode multibyte characters (haven't checked).
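A rough sketch of what that could look like in Python. I'm taking <FS> and <RS> to mean ASCII 0x1C and 0x1E, which is an assumption about the notation above; `dumps` and `loads` are hypothetical helper names, not part of any existing tool. It also settles the size question: each two-character separator is 2 bytes in UTF-8, against 3 bytes for a control picture like U+241F.

```python
# Two-character separators: a visible character followed by an ASCII control
# character. Assumption: <FS> is 0x1C and <RS> is 0x1E.
FIELD_SEP = ",\x1c"
RECORD_SEP = "\n\x1e"


def dumps(rows):
    """Serialize rows, terminating every record with RECORD_SEP."""
    return "".join(FIELD_SEP.join(row) + RECORD_SEP for row in rows)


def loads(text):
    """Parse text produced by dumps back into a list of rows."""
    chunks = text.split(RECORD_SEP)
    # The terminator after the last record leaves one empty trailing chunk.
    if chunks and chunks[-1] == "":
        chunks.pop()
    return [chunk.split(FIELD_SEP) for chunk in chunks]


# Plain commas and newlines inside fields need no escaping, because only the
# two-character sequences act as separators.
rows = [["Smith, John", "line1\nline2"], ["x", ""]]
round_tripped = loads(dumps(rows))

pair_size = len(FIELD_SEP.encode("utf-8"))      # comma + control character
picture_size = len("\u241f".encode("utf-8"))    # Unicode control picture
print(round_tripped == rows, pair_size, picture_size)
```

A field that happens to contain the exact two-character sequence would still need some escape, so this reduces escaping rather than eliminating it.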


Wouldn't this easily break your file in subtle ways when someone tries to edit it in their editor and the zero width character is not visible?

How could you tell it apart from a standard CSV file if it looks like a standard CSV file?

They explain why they don't use control characters. Editors are not consistent in how they show control/zero-length characters:

https://github.com/SixArm/usv/tree/main/doc/faq#why-use-cont...


I tried to create a "printable binary" format for better visual inspection of raw binary data (and a to/from format)

https://github.com/pmarreck/elixir-snippets/blob/master/prin...


Indeed, if the result is to be encoded with UTF-8, using 1-byte separators vs the multi-byte encoding of (241F) would make sense to me.
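To put numbers on that, a quick Python check (my own sketch, not taken from the USV repo) comparing the UTF-8 size of each ASCII separator with its Unicode control picture:

```python
# ASCII separators and the Unicode "control picture" glyphs that depict them.
ASCII_SEPARATORS = {"FS": "\x1c", "GS": "\x1d", "RS": "\x1e", "US": "\x1f"}
CONTROL_PICTURES = {"FS": "\u241c", "GS": "\u241d", "RS": "\u241e", "US": "\u241f"}


def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# Map each separator name to (ASCII byte count, control picture byte count).
sizes = {
    name: (utf8_len(ASCII_SEPARATORS[name]), utf8_len(CONTROL_PICTURES[name]))
    for name in ASCII_SEPARATORS
}

for name, (ascii_bytes, picture_bytes) in sizes.items():
    print(f"{name}: ASCII {ascii_bytes} byte(s), control picture {picture_bytes} byte(s)")
```

Each ASCII separator is a single byte, while each control picture takes three bytes in UTF-8.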

I'd also prefer if escapes were done in the "traditional" manner of, for example, "\t" for a tab because you can then read in stuff with something like input.split("\t").map(unescape); you know any actual tab character in the input is a field separator, and then you can go through the fields to put back the escaped ones.


> you can then read in stuff with something like input.split("\t").map(unescape)

What about input lines like 'asdf\\thjkl\tzxcvb'? That should be two fields, one the string ‘asdf\thjkl’ and the other the string ‘zxcvb.’

I think that your way is a bit like trying to match context-free grammars with a regular expression. The right way is to parse the input character by character.


> you know any actual tab character in the input is a field separator, and then you can go through the fields to put back the escaped ones

The "\t" in "split" is not a "slash-tee" but an actual tab character and then escape sequences in fields are handled by the "unescape" function.


I think the suggestion is that the field separator is an actual tab character (ascii code 9) but tabs inside the field are `\t`. So, splitting on the tab character always works because fields cannot contain ascii code 9 but must use the two character escape instead.
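A minimal Python sketch of that scheme, assuming only three escapes (`\t`, `\n`, `\\`); the function names are mine and this is not any particular TSV library's behavior. The split is on real tab characters, and `unescape` then walks each field one character at a time, so a doubled backslash followed by `t` comes back as a literal backslash plus `t`.

```python
def escape(field: str) -> str:
    """Replace backslash, tab, and newline with two-character escapes."""
    # Backslashes go first so the escapes added afterwards are not re-escaped.
    return field.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


def unescape(field: str) -> str:
    """Reverse escape() with a single left-to-right scan."""
    mapping = {"t": "\t", "n": "\n", "\\": "\\"}
    out = []
    chars = iter(field)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, "")
        # Unknown escapes (or a trailing backslash) are passed through as-is.
        out.append(mapping.get(nxt, "\\" + nxt))
    return "".join(out)


def format_line(fields):
    """Join escaped fields with real tab characters."""
    return "\t".join(escape(f) for f in fields)


def parse_line(line: str):
    """Split on real tabs, then unescape each field."""
    return [unescape(f) for f in line.split("\t")]


# On the wire: asdf, two backslashes, thjkl, a real tab, zxcvb.
wire = "asdf\\\\thjkl\tzxcvb"
fields = parse_line(wire)
print(fields)
```

Here the first field decodes to `asdf`, one backslash, `thjkl`, and the second to `zxcvb`, because the only real tab on the line is the one between them.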


Although matching up nested pairs of brackets requires something at least as powerful as a pushdown automaton (CFG matcher), discriminating between an arbitrary number of escaped backslashes followed by an unescaped 't' versus an arbitrary number of escaped backslashes followed by the '\t' escape sequence doesn't require anything more powerful than a finite state machine.


Indeed... I didn't read the standard in detail to check whether escaping is allowed/taken into account, but what if my data contains those symbols? I mean, they are perfectly legal Unicode printable characters, unlike the ASCII ones.


I once attempted to write a blog post about escaping in RSS feeds. The feed was technically correct, but nothing could parse it.


There's an escape.


I thought the point is you don't need escapes?

If you still need to implement escape mechanism, might as well do CSV/TSV.


The point is ASCII DSV, which gives innately better hierarchy than CSV, but with visible tokens and stream accommodation. You should read the github readme. It's not that long.

https://github.com/SixArm/usv/tree/main/doc/faq#why-choose-u...

As for still needing escapes, using obscure symbols instead of ones that are extremely common in writing inherently means needing far far faaaaaaar fewer of them.


What's the point of visible tokens if it's all squished in one line? You are not going to be editing this in a regular editor once you have a non-trivial amount of data.

And yes, I read README and source code, so I know that newlines are optional, existing tools don't generate them, and multi-line examples are basically fake.


> What's the point of visible tokens if it's all squished in one line?

It doesn't have to be all squished in one line, it just doesn't hurt anything. Visually splitting squished lines for presentation or perusal is trivial because of the record separator.
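For example, a tiny Python sketch (mine, not what the USV crate does) that puts a newline after every record separator glyph for display. It assumes the input has no newlines after the separators already, and the sample data is made up.

```python
RECORD_SEPARATOR = "\u241e"  # the visible record separator glyph
UNIT_SEPARATOR = "\u241f"    # the visible unit separator glyph


def one_record_per_line(usv_text: str) -> str:
    """Insert a newline after each record separator, for display only."""
    return usv_text.replace(RECORD_SEPARATOR, RECORD_SEPARATOR + "\n")


# Hypothetical squished input: two records of two units each.
squished = (
    "a" + UNIT_SEPARATOR + "b" + RECORD_SEPARATOR
    + "c" + UNIT_SEPARATOR + "d" + RECORD_SEPARATOR
)
pretty = one_record_per_line(squished)
print(pretty)
```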

> You are not going to be editing this in a regular editor

I know (or at least I think) that you meant this in relation to squished lines getting very long, but maybe we can talk about it in a broader context, since record splitting is trivial...

One could easily say these same words about documents written in right-to-left languages. But people in Israel manage to create files too somehow, so that's clearly not an insurmountable barrier.


Editors generally support composing right-to-left languages that way? So I suppose the metaphor suggests that all editors should directly support the visible glyphs semantically?

And yet, that's explicitly not the semantic purpose of those glyphs. The actual delimiters already exist at a lower code point. If we're asking editors to semantically support delimiters we should be asking them to support the semantic delimiters.


Good point. I'm adding automatic record separator newlines to the crate now.


You shouldn't need escapes for separator characters precisely because they are not designed for data. Their entire purpose is to separate hierarchical data.

If it turns out that escaping is needed, it will still be far rarer than escaping commas and newlines.


This makes me sad; such a missed opportunity.


(For text processing, I use octal \034 all the time.)

Perhaps there is a software developer version of "Needs more cowbell" called "Needs more complexity"

Computer languages generally use the Latin alphabet. And even in a case like APL, which some HN commenters call "hieroglyphics", the number of symbols is limited and each is precisely defined (cf. potentially up to 1.1 million Unicode symbols and "emojis" that are open to interpretation).


Well, yeah, not every language uses the Latin alphabet.


Perfect deployment of HL33tibCe7’s aphorism:

> For every interesting HN post, there’s at least one smug commenter who thinks he knows better, but actually doesn’t

https://github.com/SixArm/usv/tree/main/doc/faq#why-use-cont...


The OP was probably assuming no human would want to actually read a CSV raw, and so was probably correct from their POV. Your POV is probably from someone who reads CSVs raw. You don't have to be so rude about it, you're being even more smug than the OP, probably.


One of the two likely works with CSVs for a living, and it's definitely not the person suggesting "What if it just was hard to eyeball/edit".

If you don't understand why something is the way it is, it might be better to start with a question than with a statement implying the tech misses existing tech. Chesterton's fence still applies, and ignoring it means you're outsourcing your work to others. RTFM is a perfectly valid answer at that point.


I use CSVs for a living but I rarely read them manually. I’d rather have ASCII than Unicode in my CSVs.

My point above, though, is that everyone has opinions and you don’t have to be a dickhead about “correcting” them.


The problem with bespoke, homegrown, and DIY isn't that the solutions are bad. Often, they are quite good—excellent, even, within their particular contexts and constraints. And because they're tailored and limited to your context, they can even be quite a bit simpler.

The problem is that they're custom and homegrown. Your organization alone invests in them, trains new staff in them, is responsible for debugging and fixing when they break, has to re-invest when they no longer do all the things you want. DIY frameworks ultimately end up as byzantine and labyrinthine as Kubernetes itself. The virtue of industry platforms like Kubernetes is, however complex and only half-baked they start, over time the entire industry trains on them, invests in them, refines and improves them. They benefit from a long-term economic virtuous cycle that DIY rarely if ever can. Even the longest, strongest, best-funded holdouts for bespoke languages, OSs, and frameworks—aerospace, finance, miltech—have largely come 'round to COTS first and foremost.


> We believe that most messaging apps are secretly to-do lists in disguise; you have to read, respond, or do some task when you receive a thread.

Perfect observation. Thank you for focusing on this reality. They aren't benign messages; they're summons, pleas, assignments, and to-dos.


A few days ago I felt guilty for asking a colleague for something in Slack.


Contrary take: Thank goodness! Finally a useful distinction between "we want to include you" and "you must interrupt what you're doing to look at this now, now, _NOW_!"

De facto other people already have control of interruptions, but with umpteen concurrent notifications (sometimes in multiple places, like DMs to say "did you see the notification in channel X / thread Y yet??" overlapping the original notifications), how to know where to focus? @ vs ! finally gives people a way to distinguish between inclusion and urgent interruption.


Nice repurposing. Bq it is!


IBM I and Z (each with a dizzying array of branding zigs and zags going back decades) are both pretty interesting technologically.

I has a fancy memory architecture, very smart disk controllers (essentially distributed intelligence, like an octopus), a virtual instruction set (that has been used multiple times to almost seamlessly jump huge under-the-hood processor changes), and historically a reliability record second to none (the old box in the wiring closet running for years upon years, completely untended). Z has even more toys, including some of the strongest clustering, partitioning, security found anywhere. Sysplex, LPARs, and RACF are all impressive, especially given how many decades ago they started. We won't even talk about the DBMS and transaction monitors, which are their own brand of crazy strong.

Those immersed in the higher-volume, standard microprocessor, Unix/Linux or Windows, cloud mainstream don't give "proprietary systems" much thought or respect. But we probably should. Those who knew the IBM I or Z, or the DEC VAX/VMS, HP MPE, Tandem NonStop, etc.—they were too expensive, too few in number, too quirky—but what they did well, they did outstandingly well in their purpose-focused, allopatrically speciated ways. Better in many cases than we can do today with the latest 2024 gear.


> they were too expensive

I think this is the biggest problem, plus the fact these systems tend to be tied to proprietary - and also very expensive - hardware platforms. If I want to learn about GNU/Linux or BSD, all I need is a computer (PC in most cases, but other options exist) and an Internet connection. These days, most people (at least in Europe and North America) have these anyway, so it's really easy to get started in the comfort of one's own home.

Having a free account on a public machine is cool, but it's not the same as having your own system, especially if you want to learn about system administration.


> too expensive

The killer, of course. As they say: anyone can build a bridge that stands, but it takes an engineer to build a bridge that barely stands. In this game, a solution that's too expensive is often not a solution at all.


Does not seem to include assisted-living facilities ("nursing homes lite").

