> easily half the code I was writing in Multics was error recovery code.
And it can be worse. IBM, I think it was, back in the 1970s or 80s, said that 80% of the lines of production-ready code dealt with errors; only 20% implemented the actual program functionality.
> I've yet to see numbers on the probability of losing access to my email on Gmail vs another provider.
This is a great point. People often act as if the other providers were 100 percent reliable, without any numbers to back it up. The grass is always greener on the other side. To be fair, Google’s customer service is non-existent, though.
Thing is, with other providers, all I'm getting is email. With Google, I'm getting a bunch of services, all interconnected, and any of them could potentially get my entire Google account banned. One of the fuck-ups I can recall is a bunch of people getting their Google accounts banned because they typed in the chat of a YouTube livestream and some algorithm falsely flagged it, cutting them off from everything.
I own the domain, I control DNS, I pay a provider for email, and my phone and laptop have full downloads via IMAP. The last step aside, I don't think that's an uncommon setup. Uptime might be worse, but I don't see a real risk of losing access in that arrangement.
I feel copyleft licenses look more favourable at this point in time. What’s the value of more free/business-friendly licenses if you can’t guarantee that the same license will apply to all future releases? It looks more like a bait-and-switch policy.
Am I right in understanding that the relicensing was possible because of the CLA, not just because of the BSD license? Would a permissively licensed project that didn't use a CLA be vulnerable in the same way?
A key concern is that BSD isn't viral, so anyone can take BSD Redis and fork it into a commercial offering. If you want to, you can. The Redis trademark prevents anyone but Redis the company from calling their fork "Redis".
A CLA may impact relicensing; it depends on the terms. A simple CLA may only say "I am the owner of the code and I release it under $LICENSE". The current Redis CLA also has a copyright grant, which gives Redis the company greater rights.
“Viral” just means that the license has a “no additional restrictions” clause, not that you can’t make a commercial offering out of it. That’s why GPL and AGPL don’t really solve the problem.
And the problem with the trademark model is that AWS, and especially Microsoft, already have established brand recognition with the people who sign the big SaaS and support contracts. The people who know what a Redis is are just nerds with no money, the real big shots do everything in Microsoft Excel.
A permissively licensed project without a CLA would be similarly vulnerable, because the BSD license allows them to make releases that include your code under a stricter license. To prevent them from relicensing you would need both a strong copyleft license and no CLA/copyright assignment (like e.g. Linux, which couldn't move to GPLv3 even if they wanted to, because it would be simply impossible to get every contributor's permission).
No, since you can include BSD-licensed code in non-free software with just an attribution. The only difference between relicensing Redis from BSD+CLA to SSPL and from plain BSD to SSPL is that the former would've had a more detailed REDISCONTRIBUTIONS.txt.
The copyright owners of GPL software can do whatever they want with future versions, even going proprietary. The problem is that all the owners must agree on that. That's why some GPL software only accepts contributions from people who assign copyright to a single maintainer entity. An example is the FSF's copyright transfer, which to be fair is more nuanced than that and also serves other purposes.
The big outlier not listed here is Apple. Quick overview from someone who's written binary analysis tools targeting most of these:
Mach-O is the format they use, going back to NeXTSTEP. They use it everywhere, including their kernels and the internal-only L4 variant they run on the Secure Enclave. Instead of being structured as a declarative table of the expected address space when loaded (like PE and ELF), Mach-O is built around a command list that has to be run to load the file. So instead of an entry point address in a table somewhere, Mach-O has a 'start a thread with this address' command in the command list. Really composable, which means binaries can do a lot of nasty things with that command list.
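For a sense of what that looks like, here's a minimal sketch (my own illustration, not from any particular tool) that walks the command list of a 64-bit image already mapped into memory, using the structs from <mach-o/loader.h>. Real code would validate ncmds and each cmdsize against the file bounds:

    #include <mach-o/loader.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Walk the load-command list of a 64-bit Mach-O image at `base`.
     * The commands start right after the header; each one carries its
     * own size, so you just hop from command to command. */
    void walk_load_commands(const uint8_t *base) {
        const struct mach_header_64 *hdr = (const void *)base;
        if (hdr->magic != MH_MAGIC_64) return;

        const struct load_command *lc = (const void *)(base + sizeof(*hdr));
        for (uint32_t i = 0; i < hdr->ncmds; i++) {
            switch (lc->cmd) {
            case LC_SEGMENT_64: puts("map a segment");                   break;
            case LC_MAIN:       puts("entry point (used by dyld)");      break;
            case LC_UNIXTHREAD: puts("legacy 'start a thread' command"); break;
            default:            printf("cmd 0x%x\n", lc->cmd);           break;
            }
            lc = (const void *)((const uint8_t *)lc + lc->cmdsize);
        }
    }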
They also very heavily embrace the dyld shared cache, to the point that they don't bother including a lot of the system libraries on the actual root filesystem anymore. The kernel is ultimately a cache as well: the minimal kernel image plus at least the drivers needed to boot, all in one file (and on iOS I think it's all of the drivers, with loading anything outside the cache disabled?).
There's a neat wrapper format for Mach-O called "fat binaries" that lets you hold multiple Mach-O images in one file, tagged by architecture. This is what lets current macOS ship the same application as both a native arm64 binary and a native x86_64 binary; the loader just picks the right one based on the current architecture and settings.
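The fat wrapper is simple enough to parse by hand. A minimal sketch, assuming a 32-bit fat header mapped at `base` (the fields are always stored big-endian, and there's a 64-bit FAT_MAGIC_64 variant this ignores):

    #include <mach-o/fat.h>
    #include <libkern/OSByteOrder.h>
    #include <stdint.h>
    #include <stdio.h>

    /* List the architecture slices in a fat binary. Each fat_arch entry
     * records which CPU the slice targets and where its Mach-O image
     * lives inside the file. */
    void list_fat_slices(const uint8_t *base) {
        const struct fat_header *fh = (const void *)base;
        if (OSSwapBigToHostInt32(fh->magic) != FAT_MAGIC) return;

        const struct fat_arch *arch = (const void *)(fh + 1);
        uint32_t n = OSSwapBigToHostInt32(fh->nfat_arch);
        for (uint32_t i = 0; i < n; i++) {
            printf("cputype 0x%x: offset %u, size %u\n",
                   OSSwapBigToHostInt32(arch[i].cputype),
                   OSSwapBigToHostInt32(arch[i].offset),
                   OSSwapBigToHostInt32(arch[i].size));
        }
    }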
I think those are the main points, but I might have missed something; this was pretty off the cuff.
> Mach-O is built around a command list that has to be run to load the file. So instead of an entry point address in a table somewhere, Mach-O has a 'start a thread with this address' command in the command list. Really composable, which means binaries can do a lot of nasty things with that command list.
Conceptually not much has changed since the book was written, but in practice there has been a lot of advancement. For example, ASLR and the increase in the number of libraries have greatly increased the pressure to make relocations efficient, modern architectures' PC-relative load/store and branch instructions have greatly reduced the cost of PIC code, and code signing has made mutating program text to apply relocations problematic.
On Darwin we redesigned our fixup format so it can be efficiently applied during page-in. That did include adding a few new load commands to describe the new data, as well as a change in how we store pointers in the data segments, but those are not really properties of Mach-O so much as of the runtime.
I generally find that a lot of things attributed to the file format are actually more about how it is used than about what it supports. Back when Mac OS X first shipped, people argued about PEF vs Mach-O, but what all the arguments boiled down to was the calling convention (TOC vs GOT), either of which could have been supported by either format.
Another example is symbol lookup. Darwin uses two-level namespaces (where binds include both the symbol name and the library it is expected to be resolved from), and Linux uses flat namespaces (where binds only include the symbol name, which is then searched for in all the available libraries). People often act as though that is a file format difference, but Mach-O supports both (though the higher-level parts of the Darwin OS stack depend on two-level namespaces, the low-level parts can work with a flat namespace, which is important since a lot of CLI tools that are primarily developed on Linux depend on that). Likewise, ELF also supports both: Solaris uses two-level namespaces (they call it ELF Direct Binding).
I don’t disagree about the nature of load commands but Apple has been moving away from, say, LC_UNIXTHREAD for years at this point. For the most part load commands are becoming more and more similar to what ELF has, with a list of segments, some extra metadata, etc.
The mechanisms for Windows DLLs have changed a lot (like how thread-local vars are handled). Besides, this book could not cover C++11's magic statics, or Windows' ARM64X format, or Apple's universal2, because these things are very new. Windows now has the API set concept, which is quite unique; on top of it there are direct forwarders and reverse forwarders.
I think for C/C++ programmers it is more practical to know that:
1. The construction/destruction order for global vars in DLLs (shared libs) is very different between Linux and Windows. It means the same code might work fine on one platform but not on the other, which imposes challenges on writing portable code.
2. On Linux it is hard to get a shared lib cleanly unloaded, which can affect how global vars are destructed and might cause unexpected crashes at exit.
3. Since Windows holds a loader lock while DLLs initialize, there are a lot of things you cannot do in C++ constructors/destructors if the class could be used to define a global variable. For example, no thread synchronization is allowed (see the sketch after this list).
4. It is difficult to clean up a thread-local variable if the variable lives in a DLL and its destructor depends on another global object.
5. When one static lib depends on another, the linker doesn't use this information to sort the initialization order of global vars. That means A.lib can depend on B.lib and yet A.lib gets initialized first. The best way to avoid this problem is to use dynamic linking.
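To make point 3 concrete, here's a minimal C sketch of the classic loader-lock deadlock (a global C++ object's constructor in a DLL runs in the same loader-lock context as this DllMain body):

    #include <windows.h>

    static DWORD WINAPI worker(LPVOID arg) {
        (void)arg;
        return 0;
    }

    /* DllMain runs with the loader lock held. A freshly created thread
     * must itself take the loader lock to deliver DLL_THREAD_ATTACH
     * notifications before `worker` runs, so waiting on it here can
     * never finish: a deadlock. */
    BOOL WINAPI DllMain(HINSTANCE inst, DWORD reason, LPVOID reserved) {
        (void)inst; (void)reserved;
        if (reason == DLL_PROCESS_ATTACH) {
            HANDLE t = CreateThread(NULL, 0, worker, NULL, 0, NULL);
            WaitForSingleObject(t, INFINITE); /* DON'T: deadlocks */
            CloseHandle(t);
        }
        return TRUE;
    }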
For Windows related topics I highly recommend the "Windows Internals" book.
I have a hard copy of this book, and it seems like the PDF isn't the final version, judging by the hand-drawn illustrations at least.
The book does dive into some old and arcane object file formats, but that's interesting in its own way. Nice to get a comparison of how different systems did things.
After reading that book and other sources, I examined how people typically create libraries in the UNIX/Linux world. Some of the common practices seem to be lacking, so I wrote an article about how they might be improved: https://begriffs.com/posts/2021-07-04-shared-libraries.html
Edit: That covers the dynamic linking side of things at least.
Still pretty much relevant in terms of introductory understanding. Some specific details seem slightly anachronistic for the general population (segmented memory for example, which still exists in modern PC hardware but is of little import to the great majority).
The concepts are still relevant, but the specifics are mostly outdated. If you read this and then read the relevant standards documents for your platform, you should have a good grounding. I don't know of any other books that cover the topic well.
The syntax and also the semantics. For instance, you can take the sizeof() of a variably-modified type, or the offsetof() of one of its fields, and the compiler has to do all the layout calculations implied by the type declaration at runtime. These features are part of what motivated the mandatory support. The only part that is still optional is using such types by value as stack variables (i.e., variables with automatic storage duration).
Snarky, or do you just know a lot? It changes quite a bit how the compiler works. It has to make sure malloc gets the array element size argument multiplied at runtime by n. To a mere user, it broke my mental shorthand of how a C compiler works.
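A minimal sketch of that runtime multiplication, assuming nothing beyond C99 (*p has a variably-modified type, so sizeof *p is computed at runtime):

    #include <stdio.h>
    #include <stdlib.h>

    /* sizeof on a variably-modified type is evaluated at runtime, so
     * malloc really does receive n * sizeof(int) with n multiplied in
     * on each call. */
    void demo(size_t n) {
        int (*p)[n] = malloc(sizeof *p); /* array of n ints on the heap */
        if (!p) return;
        printf("sizeof *p = %zu\n", sizeof *p); /* n * sizeof(int) */
        free(p);
    }

    int main(void) {
        demo(3);   /* typically prints 12 */
        demo(100); /* typically prints 400 */
        return 0;
    }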
so you can have "char *y = malloc(sizeof(x)); memcpy(y, x, sizeof(x));" and it must work, since C89 at least. The main problem with VLAs is that they make the exact stack frame size unknown until runtime, which complicates function prologues/epilogues, but that's a problem in the codegen part of the backend; the semantics machinery is mostly in place already.
P.S. And yes, uecker is a member of the ISO C WG14 and a GCC contributor, according to his profile.
I think there is a real difference: in the static case, the compiler can just recurse into the type definition at any point, compute fully-baked sizes and offsets, and cache them for the rest of the compilation. But in the dynamic case, you end up with the novel dataflow of types that depend on runtime values, and more machinery is necessary to track those dependencies.
Of course, this runtime tracking has always been necessary for C99 VLA support, but I can easily see how it would be surprising for someone not deeply familiar with VLAs, especially given how the naive mental model of "T array[n]; is just syntax sugar for T *array = alloca(n * sizeof(T));" is sufficient for almost all of their uses in existing code. (In any case, it's obviously not "creating an object on the heap" that's the unusual part here!)
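For instance, here's a small sketch (my own example, not from the thread) of where the alloca shorthand stops being enough: a two-dimensional VLA parameter, where every index expression needs stride arithmetic derived from a runtime value:

    #include <stdio.h>

    /* The compiler derives the row stride from `cols` at runtime:
     * grid[i][j] addresses element i * cols + j, arithmetic that no
     * fixed alloca-style layout could express. */
    void fill(size_t rows, size_t cols, int grid[rows][cols]) {
        for (size_t i = 0; i < rows; i++)
            for (size_t j = 0; j < cols; j++)
                grid[i][j] = (int)(i * cols + j);
    }

    int main(void) {
        int g[3][4];
        fill(3, 4, g);
        printf("%d\n", g[2][3]); /* 11 */
        return 0;
    }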