Data is an old and fundamental idea. Machine interpretation of un- or under-structured data is fueling a ton of utility for society. None of the inputs to our sensory systems are accompanied by explanations of their meaning. Data - something given, seems the raw material of pretty much everything else interesting, and interpreters are secondary, and perhaps essentially, varied.
The point here is that you were able to find the interpreter of the sentence and ask a question, but the two were still separated. For important negotiations we don't send telegrams, we send ambassadors.
This is what objects are all about, and it continues to be amazing to me that the real necessities and practical necessities are still not at all understood. Bundling an interpreter for messages doesn't prevent the message from being submitted for other possible interpretations, but there simply has to be a process that can extract signal from noise.
This is particularly germane to your last paragraph. Please think especially hard about what you are taking for granted in your last sentence.
Take a stream of data from a seismometer. The seismometer might just record a stream of numbers. It might put them on a disk. Completely separate from that, some person or process, given the numbers and the provenance alone (these numbers are from a seismometer), might declare "there is an earthquake coming". But no object sent an "earthquake coming" "message". The seismometer doesn't "know" an earthquake is coming (nor does the earth, the source of the 'messages' it records), so it can't send a "message" incorporating that "meaning". There is no negotiation or direct connection between the source and the interpretation.
We will soon be drowning in a world of IoT sensors sending context-or-provenance-tagged but otherwise semantic-free data (necessarily, due to constraints, without accompanying interpreters) whose implications will only be determined by downstream statistical processing, aggregation etc, not semantic-rich messaging.
If you meant to convey "data alone makes for weak messages/ambassadors", well ok. But richer messages will just bottom out at more data (context metadata, semantic tagging, all more data) Ditto, as someone else said, any accompanying interpreter (e.g. bytecode? - more data needing interpretation/execution). Data remains a perfectly useful and more fundamental idea than "message". In any case, I thought we were talking about data, not objects. I don't think there is a conflict between these ideas.
The defining aspect of data is that it reflects a recording of some facts/observations of the universe at some point in time (this is what 'data' means, and meant long before programmers existed and started applying it to any random updatable bits they put on disk). A second critical aspect of data is that it doesn't and can't do anything, i.e. have effects. A third aspect is that it does not change. That static nature is essential, and what makes data a "good idea", where a "good idea" is an abstraction that correlates with reality - people record observations and those recordings (of the past) are data. Other than in this conversation apparently, if you say you have some data, I know what you mean (some recorded observations). Interpretation of those observations is completely orthogonal.
Nothing about the idea of 'data' implies a lack of formatting/labeling/use of common language to convey the facts/observations, in fact it requires it. Data is not merely a signal and that is why we have two different ideas/words. '42' is not, itself, a fact (datum). What constitutes minimal sufficiency of 'data' is a useful and interesting question. E.g. should data always incorporate time, what are the tradeoffs of labeling being in- or out-of-band, per datom or dataset, how to handle provenance etc. That has nothing to do with data as an idea and everything to do with representing data well.
But equating any such labeling with more general interpretation is a mistake. For instance, putting facts behind a dynamic interpreter (one that could answer the same question differently at different times, mix facts with opinions/derivations or have effects) certainly exceeds (and breaks) the idea of data. Which is precisely why we need the idea of data, so we can differentiate and talk about when that is and is not happening - am I dealing with facts, an immutable observation of the past ("the king is dead") or just a temporary (derived) opinions ("there may be a revolt"). Consider the difference between a calculation involving (several times) a fact (date-of-birth) vs a live-updated derivation (age). The latter can produce results that don't add up. 'date-of-birth' is data and 'age' (unless temporally-qualified, 'as-of') is not.
When interacting with an ambassador one may or may not get the facts, and may get different answers at different times. And one must always fear that some question you ask will start a war. Science couldn't have happened if consuming and reasoning about data had that irreproducibility and risk.
'Data' is not a universal idea, i.e. a single primordial idea that encompasses all things. But the idea that dynamic objects/ambassadors (whatever their other utility) can substitute for facts (data) is a bad idea (does not correspond to reality). Facts are things that have happened, and things that have happened have happened (are not opinions), cannot change and cannot introduce new effects. Data/facts are not in any way dynamic (they are accreting, that's all). Sometimes we want the facts, and other times we want someone to discuss them with. That's why there is more than one good idea.
Data is as bad an idea as numbers, facts and record keeping. These are all great ideas that can be realized more or less well. I would certainly agree that data (the maintenance of facts) has been bungled badly in programming thus far, and lay no small part of the blame on object- and place-oriented programming.
Not only is your perception based on an interpreter, but how can you be sure that you were even given all of the relevant bits? Or, even what the bits really meant/are?
But we can expound on this problem in general. In any experiment where we gather data, how can we be sure we have collected a sufficient quantity to justify conclusions (and even if we are using statistical methods that our underlying assumptions are indeed consistent with reality) and that we have accrued all the necessary components? What you're really getting at is an __epistemological__ problem.
My school of thought is that the only way to proceed is to do our best with the data we have. We'll make mistakes, but that's better than the alternative (not extrapolating on data at all.)
For parallel ideas and situation, take a look at Lincos
Now the question is, what if there are "objects" more advanced than others and what if advanced-object sends a message concealing an trojan horse?
I think this question was also brought up in the novel/movie too...
I think this is a real life and practical show stopper to develop this concept...
In the ideal homogeneous world of smalltalk, it is a less issue. But if you want a Windows machine to talk to a Unix, the remote context becomes an issue.
In principle we can send a Windows VM along with the message from Windows and a Unix VM (docker?) with a message from Unix, if that is a solution.
Think about this as one of the consequences of massive scaling ...
The association between "patterns" and interpretation becomes an "object" when this is part of the larger scheme. When you've just got bits and you send them somewhere, you don't even have "data" anymore.
Even with something like EDI or XML, think about what kinds of knowledge and process are actually needed to even do the simplest things.
I don't really know anything at all about microbiology, but maybe climbing the ladder of abstraction to small insects like ants. There is clearly negotiation and communication happening there, but I have to think it's pretty well bounded. Even if one ant encountered another ant, and needed to communicate where food was, it's with a fixed set of semantics that are already understood by both parties.
Or with honeybees, doing the communication dance. I have no idea if the communication goes beyond "food here" or if it's "we need to decide who to send out."
It seems like you have to have learning in the object to really negotiate with something it hasn't encountered before. Maybe I'm making things too hard.
Maybe "can we communicate" is the first negotiation, and if not, give up.
Another technique that occurs to me is from type theory ... here, instead of objects we'll talk in terms of values and functions, which have types. So e.g. a function a encountering a new function b will examine b's type and thereby figure out if it can/should call it or not. E.g., b might be called toJson and have type (in Haskell notation) ToJson a => a -> Text, so the function a knows that if it can give toJson any value which has a ToJson typeclass instance, it'll get back a Text value, or in other words toJson is a JSON encoder function, and thus it may want to call it.
For contrast, one could look at a much more compact way to do this that -- with more foresight -- was used at Parc, not just for the future, but to deal gracefully with the many kinds of computers we designed and built there.
Elsewhere in this AMA I mentioned an example of this: a resurrected Smalltalk image from 1978 (off a disk pack that Xerox had thrown away) that was quite easy to bring back to life because it was already virtualized "for eternity").
This is another example of "trying to think about scaling" -- in this case temporally -- when building systems ....
The idea was that you could make a universal computer in software that would be smaller than almost any media made in it, so ...
However since PC revolution, the mainstream seemed to take on the "data" path for whatever technical or non-technical reasons.
How do you envision the "coming back" of image path via either bypassing the data path or merging with it in a not so faraway future?
This also obtains for "thinking" and it took a long time for humans to even imagine thinking processes that could be stronger than cultural ones.
We've only had them for a few hundred years (with a few interesting blips in the past), and they are most definitely not "mainstream".
Good ideas usually take a while to have and to develop -- so the when the mainstream has a big enough disaster to make it think about change rather than more epicycles, it will still not allocate enough time for a really good change.
At Parc, the inventions that made it out pretty unscathed were the ones for which there was really no alternative and/or no one was already doing: Ethernet, GUI, parts of the Internet, Laser Printer, etc.
The programming ideas on the other hand were -- I'll claim -- quite a bit better, but (a) most people thought they already knew how to program and (b) Intel, Motorola thought they already knew how to design CPUs, and were not interested in making the 16 bit microcoded processors that would allow the much higher level languages at Parc to run well in the 80s.
On the other hand due to the exponential growth of software dependency, "bad ideas" in software development are getting harder and harder to remove and the social cost of "green field" software innovation is also getting higher and higher.
How do we solve these issues in the coming future?
But e.g. the possibilities for "parametric" parallel computing solutions (via FPGAs and other configurable HW) have not even been scratched (too many people trying to do either nothing of just conventional stuff).
Some of the FPGA modules (like the BEE3) will slip into a Blades slot, etc.
Similarly, there is nothing to prevent new SW from being done in non-dependent ways (meaning the initial dependencies to hook up into the current world can be organized to be gradually removeable, and the new stuff need not have the same kind of crippling dependencies).
Part of this is to admit to the box, but not accept that the box is inescapable.
It is the best online conversation I have ever experienced.
It also reminded me inspiring conversations with Jerome Bruner at his New York City apartment 15 years ago. (I was working on some project with his wife's NYU social psychology group at the time.) As a Physics Ph.D. student, I never imaged I could become so interested in Internet and education in the spirit of Licklider and Doug Engelbart.
Any Meaning can only be the Interpretation of a Model/Signal?
So there's this inherent tradeoff between "easy to process" and "expressive" -- and I imagine deciding which side you want to lean toward depends on the context.
Check this out for a practical example:
(not a Ruby article, but instead about essential structure of messages, loosely inspired by ideas in Gödel, Escher, Bach)
Interesting. But, practically, the interpreter would need to be written in such a way that it works on all target systems. The world isn't set up for that, although it should be.
Hm, I now realize your point about HTML being idiotic. It should be a description, along with instructions for parsing and displaying it (?)
This -- and the Parc PUP "internet" which preceded it and influenced it -- are examples of trying to organize things so that modules can interact universally with minimal assumptions on both sides.
The next step -- of organizing a minimal basis for inter-meanings -- not just internetworking -- was being thought about heavily in the 70s while the communications systems ideas were being worked on, but was quite to the side, and not mature enough to be made part of the apparatus when "Flag Day" happened in 1983.
What is the minimal "stuff" that could be part of the "TCP/IP" apparatus that could allow "meanings" to be sent, not just bits -- and what assumptions need to be made on the receiving end to guarantee the safety of a transmitted meaning?
The value of data is determined by the intelligence of those interpreting it, not those who recorded it.
Of course, this dynamic is sometimes positive. The Babylonians kept excellent astronomical records though apparently making little theoretical advance in understanding them. Greeks with an excellent grasp of geometry put that data to much better use very quickly. But if they had had to wait to gather the data themselves, one can imagine them waiting a long time.
If I speak something to a rock, what is it to the rock? Is it "signal," or "data"?
Making the concept a little more interesting, what if I resonate the rock with a sound frequency? What is that to the rock? Is that "signal," or "data"?
Up until the Rosetta Stone was found, Egyptian hieroglyphs were indecipherable. Could data be gathered from them, nevertheless? Sure. Researchers could determine what pigments were used, and/or what tools were used to create them, but they couldn't understand the messages. It wasn't "data" up to that point. It was "noise."
I hope I am not giving the impression that I am a postmodernist who is out here saying, "Data is meaningless." That's not what I'm saying. I am saying meaning is not self-evident from signal. The concept of data requires the ability to interpret signal for meaning to be acquired.