
The Worm Ouroboros - octosphere
https://semantic-domain.blogspot.com/2018/08/the-worm-ouroboros.html
======
13of40
I ran into a somewhat similar issue recently. I sometimes work on a piece of
software that classifies files based on their contents, and with so many file
formats implemented as "just stuff it in a zip file", it ends up identifying a
lot of files by looking for well-known filenames in zip archives. An annoying
quirk of the zip file format is that the authoritative list of the offsets
of the files in the archive is at the end of the zip file, but not at any
particular offset from the end, because there's an optional, variable-length
comment field in it. That means the only way for a program to correctly parse
it is to read backwards from the end of the file looking for a magic number.
The folks who wrote the .NET API for enumerating zip entries knew about this
and wrote their code accordingly.
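
As a rough illustration of that backwards search (a minimal Python sketch, not the .NET implementation): the end-of-central-directory record starts with the magic number `PK\x05\x06`, has a 22-byte fixed part, and stores the comment length in a 16-bit field, so the record has to begin somewhere in the last 22 + 65,535 bytes. The name `find_eocd` is hypothetical, the sketch ignores ZIP64, and it assumes `read()` returns everything requested in one call.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"   # end-of-central-directory magic number
EOCD_MIN_SIZE = 22               # fixed-size part of the record
MAX_COMMENT = 0xFFFF             # comment length is a 16-bit field


def find_eocd(stream):
    """Return the absolute offset of the EOCD record, or None if absent."""
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    tail_len = min(size, EOCD_MIN_SIZE + MAX_COMMENT)
    stream.seek(size - tail_len)
    tail = stream.read(tail_len)          # a single read of the tail

    pos = tail.rfind(EOCD_SIGNATURE)
    while pos != -1:
        if pos + EOCD_MIN_SIZE <= len(tail):
            # The comment length sits 20 bytes into the record.
            (comment_len,) = struct.unpack_from("<H", tail, pos + 20)
            # Accept the candidate only if the comment ends exactly at EOF.
            if pos + EOCD_MIN_SIZE + comment_len == len(tail):
                return size - tail_len + pos
        pos = tail.rfind(EOCD_SIGNATURE, 0, pos)   # try an earlier match
    return None


# Usage: build a small archive in memory with a trailing comment.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.comment = b"some comment"
archive = buf.getvalue()
offset = find_eocd(io.BytesIO(archive))
```

Checking that the comment length lands exactly on the end of the file is one way to reject a stray magic number that happens to appear inside the comment itself.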

For our file classification tool, we made it work from a System.IO.Stream,
with the only requirement being that the stream was seekable, and we have it
classifying hundreds of millions of filesystem-backed file streams per day, no
problem.

So we recently shared it with a partner team, and all of a sudden, in their
environment, every once in a while they get a file that takes five minutes to
identify. The difference turned out to be that their stream implementation was
backed by a database: even though it was seekable, every read turned into a
database query. If you passed in a big zip file with the directory
information missing from the end, .NET would read through the entire stream
backwards, looking for the magic number, one 32-byte read (database query) at
a time...
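
To put a number on that, here is a small Python sketch that counts `read()` calls for two search strategies on a 1 MB stream containing no magic number. `CountingStream`, `scan_backwards`, and `scan_tail_once` are hypothetical names made up for this illustration; the 32-byte chunk size comes from the story above, and nothing here is the actual .NET code.

```python
import io

EOCD_SIGNATURE = b"PK\x05\x06"


class CountingStream(io.BytesIO):
    """In-memory stream that counts read() calls (stand-in for a costly backend)."""

    def __init__(self, data):
        super().__init__(data)
        self.reads = 0

    def read(self, size=-1):
        self.reads += 1
        return super().read(size)


def scan_backwards(stream, chunk=32):
    """Search from the end toward the start, one small read at a time."""
    stream.seek(0, io.SEEK_END)
    pos = stream.tell()
    carry = b""   # first bytes of the later chunk, in case a match straddles
    while pos > 0:
        start = max(0, pos - chunk)
        stream.seek(start)
        block = stream.read(pos - start) + carry
        hit = block.rfind(EOCD_SIGNATURE)
        if hit != -1:
            return start + hit
        carry = block[:3]
        pos = start
    return None


def scan_tail_once(stream, max_tail=22 + 0xFFFF):
    """Fetch the largest region that can hold the record in a single read."""
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    start = max(0, size - max_tail)
    stream.seek(start)
    hit = stream.read(size - start).rfind(EOCD_SIGNATURE)
    return None if hit == -1 else start + hit


data = bytes(1_000_000)            # 1 MB of zeros: no magic number anywhere
slow = CountingStream(data)
fast = CountingStream(data)
slow_result = scan_backwards(slow)
fast_result = scan_tail_once(fast)
print(slow.reads, fast.reads)
```

If each read is a database query, the gap between the two counts is the whole difference between "no problem" and five minutes per file.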

------
bumholio
_> can the behaviour of the following bit of C code depend

> on a modern computer, a pointer dereference could very well lead to the execution
> of Python code._

So the BEHAVIOR of the C code does not change. The dereference of the pointer
triggers a memory page load, and if that load is successful, a numeric value
is returned and added to the array. If the load fails, you will have the
undefined result of accessing uninitialized memory.

In both cases, the behaviour of the code remains squarely within the C
standard - with the actual result of the computation contingent on various
external factors.

~~~
vokep
The behavior isn't changed, right, but it _depends_ on Python behavior. If the
Python behavior changed somehow, then the C behavior might change as well, and
due to the A implies B & B implies A basic structure of the system, very weird
unpredictable behavior might result.

~~~
bumholio
But the behavior of any data-dependent algorithm depends on any behavior
change in the upstream data source. If you pipe SETI@home data into the array,
the behavior of the program might depend on what some distant alien
intelligence did thousands of light-travel-years ago. Is it "weird
unpredictable" or is it exactly what you programmed into it?

------
dblotsky
Almost any “can” question will lead to something like this.

This _can_ happen if the OS lets you use user-space file systems, the same way
it _can_ happen if you have custom hardware with memory-mapped temperature
sensors (and suddenly your code literally “depends” on the weather).
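
A toy version of the memory-mapped sensor, sketched in Python: a temporary file stands in for the device, and `read_sensor` is a hypothetical name. It assumes an OS where a shared file mapping sees writes made through the file, which is the usual case on Linux.

```python
import mmap
import os
import struct
import tempfile


def read_sensor(view):
    """Decode a little-endian 32-bit integer from the start of the mapping."""
    return struct.unpack_from("<i", view, 0)[0]


# A temporary file stands in for a memory-mapped device register.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, struct.pack("<i", 20))
    view = mmap.mmap(fd, 4, access=mmap.ACCESS_READ)
    first = read_sensor(view)

    # Something outside the reading code updates the backing store.
    with open(path, "r+b", buffering=0) as writer:
        writer.write(struct.pack("<i", 35))

    second = read_sensor(view)     # same call, same mapping, new value
    view.close()
finally:
    os.close(fd)
    os.remove(path)
```

The reading code is identical both times; what it returns is decided by whoever writes to the backing store.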

------
api
What's new? Unless you are manipulating logic gates directly you are working
at some layer of abstraction, and there is no way to tell how many layers are
below you. Modern CPUs long ago ceased being directly manipulated gate
machines where instructions mapped straight into logic. Instead they are
basically VMs that execute microcode.

~~~
mannykannot
You might as well ask "what's new? the laws of physics haven't changed" about
anything in computer science, or, indeed, in any technology. It's not a useful
point of view.

------
insulanian
What is the relation of the title to the content? (no arrogance - genuine
question)

~~~
weinzierl
I don't understand it either and almost ignored it because of that. I suggest
changing the title on HN to something like "Can the behaviour of C code depend
upon the semantics of Python?"

~~~
DDR0
The conclusion of the article is that the correctness of the memory model is
circular, feeding upon itself, like the Ouroboros of myth.

------
roghummal
Our Ob or Ros?

