sentence --> noun_phrase(Num), verb_phrase(Num).
sentence --> noun_phrase_sg, verb_phrase_sg.
sentence --> noun_phrase_pl, verb_phrase_pl.
Most natural languages are not context-free, but he gives the wrong example. To provide a more convincing one: in Dutch you can construct phrases like:
dat ik Marie Jan de dozen zie helpen inpakken (lit: that I Marie Jan the boxes see helping packing)
The NPs and the verbs correspond in the following manner: ik - zie, Marie - helpen, Jan - inpakken. Or, more formally, the structure is: np_a np_b np_c np_x v_a v_b v_c. Of course, this closely resembles a copy language, which cannot be expressed using a CFG, but (obviously!) can be expressed using a DCG:
s(L) --> w(L), w(L).
w([a|Rst]) --> [a], w(Rst).
w([b|Rst]) --> [b], w(Rst).
w([]) --> [].
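For readers without a Prolog system handy, the same membership test can be sketched procedurally. This is a hypothetical `is_copy` helper, not something from the thread; it assumes plain Python strings over the alphabet {a, b}:

```python
def is_copy(s):
    # A string belongs to the copy language {ww} exactly when it
    # splits into two identical halves.
    n = len(s)
    return n % 2 == 0 and s[:n // 2] == s[n // 2:]

print(is_copy("abab"))  # True  -- the DCG above accepts [a,b,a,b] too
print(is_copy("abba"))  # False -- a palindrome, not a copy
```

The point stands either way: the check is trivial to state, but no context-free grammar can express it, while a DCG can thread the list `L` through both halves.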
Which can then be turned into:
"Programmers who do not understand history are doomed to reimplement it - poorly"
I thought both of them were good sayings to keep in mind while reading the above post.
At least the ACTUAL developers of UNIX created Plan 9; they didn't throw up their hands and say "this is how you do things, no improvements can be made".
For example, when I see the lame "it's how it was always done" arguments against a sane FS hierarchy like the one GoboLinux attempts, I want to tear my hair out.
Programmers, by virtue of being a little bit understanding and forgiving of computers' nature, can enjoy software that frustrates the hell out of normal people.
It gets pounded into us over and over again that normal people are the right standard to build software to, but once in a while we should allow ourselves to enjoy something like the blocks program, even if it would never pass an A/B test against a blank screen.
Lisp has this. It probably invented this.
NoSQL databases (and not all of them are hierarchical) solve problems Codd wouldn't dream of having, but that doesn't mean they solve every current problem. I've seen people abusing them in horrible ways.
1. There's a very stringent one-to-one mapping between RDBMS & relational logic. To put it another way, BCNF is not a matter of taste. It's a matter of adhering to the rules. If two dozen data architects are given the same many-to-many relation, and asked to take it through 1NF to 2NF to 3NF to BCNF and then stop, every single one of them will have the exact same answer. And this is because normalization is an extremely formal process => formal in the mathematical sense of the word. Two sets that have a given many-many relation will decompose to the exact same subsets because the mathematics of BCNF is unambiguous. This is incredibly powerful => because there's a wrong answer! So if you hire a data architect & the guy gives you a BCNF & you crosscheck that BCNF with a more reliable architect, you are checking whether the answers agree objectively. It's just straight math, not a matter of subjective taste as to which should be the primary key & which should be the foreign key & how the transitive dependence works out.
2. With objects in the mix, the relationship is already on a creaky foundation. ORMs have an impedance mismatch because the relationships captured by notions of inheritance, polymorphism, and mixin traits don't have an exact mapping onto the set-theoretic notions captured by BCNF. Moreover, there is a matter of subjective taste. You might design your UML differently from me & you might have valid reasons to do so. So while your BCNF exactly matches mine, your object model won't.
3. With kv stores, your relations are even more mathematically fuzzy. You are storing entire multi-valued objects, your rows have a variable number of columns & at times you are dealing with a completely denormalized structure. The math simply doesn't apply at these levels, and people tasked with populating a NoSQL structure will come up with wildly differing solutions that are completely unsound from a mathematical standpoint, but will nevertheless perform (up to a certain threshold) simply because machines have gotten faster & network latency issues aren't holding them back.
I see more & more firms pick #3 because of trends (latest & greatest), lack of expertise to do #1 properly, or because of some false sense that scalability will bite them in the ass if they do #1 instead of #3, but primarily because #3 is EASY and #1 is hard. It's nice to not worry about normal forms & just have a multi-column row which grows & shrinks...but is it sound? I'm not sure.
In general, you'll have a bunch of concerns.
5. Ease of implementation
Before noSQL, you'd prioritize in that exact order.
Post noSQL, the order has precisely reversed. There used to be pricey Oracle consultants (and Sybase consultants & DB2 consultants & so on....data architects of all stripes). They'd come in, study your data in some depth & figure out which attributes were single-valued, which ones were multi-valued, which bunch went into which tables, what the primary keys & foreign keys were, what the typical views would look like, what joins those would involve & whether you could do away with some of those joins....hell, even the typical interview questions for a fresh grad having absolutely nothing to do with databases would still involve Outer Join, Inner Join, the difference between the two, here take the Employee spreadsheet and the Company spreadsheet and decompose into tables up to 3.5NF, list all the keys & tell me how many joins you'd need for a view that lists all employees in Accounts making over 70k between the ages of 35 and 45 in Illinois.
All of the above is pure relational logic. It's completely tech-agnostic. You don't even need an actual database to answer any of that.
Nowadays, the questions are more along the lines of: How do you deploy MongoDB? Great, you're hired, 'cause Mongo will handle the scalability & performance & if it doesn't we'll just add another node....imo, this is a vacuous line of thought.
1. Informal. A person who is notably stupid or lacking in good judgment.
An even bigger letdown for me is the whole NoSQL situation. The proponents of NoSQL aren't even aware of how they are repeating history by recreating the hierarchical databases of yore. E.F. Codd was explicitly trying to rectify these mistakes when he formulated relational database theory 40 years ago, and now we've come so far along that we've wrapped around and apparently must re-learn these mistakes a second time. The morons promoting this technology have no clue, they see themselves as the new sheriffs in town, putting their scalability law above all else.
If you have an issue with his argument, address it! Don't use passive ad-hominem attacks. Are the creators of data stores that identify as "NoSQL" repeating mistakes made in history? Are they a different class of technology than those that were invented 30, 40 years ago?
Let's examine his claims:
"The proponents of NoSQL aren't even aware of how they are repeating history by recreating the hierarchical databases of yore."
Says who? The author doesn't bother justifying this statement, he just throws it out there. It's completely baseless, and likely a straw-man argument (as I have a hard time believing that everyone at Google, Facebook, Twitter and all the other large cloud companies are completely clueless about database history).
"E.F. Codd was explicitly trying to rectify these mistakes when he formulated relational database theory 40 years ago"
This sounds a little more plausible, but which mistakes is he talking about? And when did Codd say this? Citation, please!
"and now we've come so far along that we've wrapped around and apparently must re-learn these mistakes a second time"
Which mistakes exactly? Don't make me guess!
"The morons promoting this technology have no clue"
Ad-hominem attack. Let's move on.
"they see themselves as the new sheriffs in town, putting their scalability law above all else."
Finally we learn his beef with NoSQL databases, but he fails to justify his argument any more than this.
There are strong reasons for putting scalability above other concerns. The scalability of S3 matters far more to me than whether updates are atomic, for instance.
Frankly, I'm confused why you think "his argument is strong". His argument consists of unjustified statements, straw men, and ad-hominem attacks. There's utterly no substance to it.
In your original comment you said,
The author seems to have the attitude of "I don't understand why people do X, therefore people who do X are morons."
Now that you've explained your position it makes a lot more sense.
However I never did say whether I agree with the author or not.
I'm still curious as to whether the author might have a point despite how poorly constructed his argument is. You made some really interesting points that I'm probably going to follow up on. What were these 'mistakes' that the author keeps referring to? What was the motivation for the invention of the RDBMS? How similar are NoSQL systems today to the systems which the RDBMS was supposedly invented to replace?
An ad hominem argument is an attempt to win an argument by making claims about the person's character that are irrelevant to the argument. Saying, "I don't think this person understands what he is talking about," is not an ad hominem attack, because it is relevant to the argument. It might be unjustified, but that's another matter entirely.
Apparently, since I just looked this up, calling someone a moron isn't an ad hominem attack either, but merely verbal abuse, so I retract my earlier statement where I accused the author of an ad hominem attack.
As to whether the author has a point, he might do, but unless he puts forward a good argument we can only wonder.
> Says who? The author doesn't bother justifying this statement, he just throws it out there. It's completely baseless, and likely a straw-man argument (as I have a hard time believing that everyone at Google, Facebook, Twitter and all the other large cloud companies are completely clueless about database history).
For one, he doesn't need to "justify the statement". It's a HISTORICAL argument. So take a computing history book and look for the "hierarchical databases". You'll find them easily. Check how and if they are alike to current NoSQL databases. (Actually he not only justifies the argument, but he goes on to talk about Codd).
The second part of your argument, "I have a hard time believing that everyone at Google, Facebook, Twitter and all the other large cloud companies are completely clueless about database history" is an argument by authority: "surely because they are big companies they should know what they are doing".
Well, big companies do idiotic stuff too, all the time. Remember how all those big companies agreed on the SOAP fiasco? Or the CORBA fiasco before that? Or how IBM, SUN, Oracle touted EJB 2? I do.
> This sounds a little more plausible, but which mistakes is he talking about? And when did Codd say this? Citation, please!
And what are those databases? And SQL? What is a computer?
At some point you assume your readers do their work. It's not as if Codd is some mysterious figure, or as if the things relational theory solves (and proves mathematically) are lost in legend. This is database theory 101; no citation needed.
Now, if you are not a Computer Scientist, then the article is not meant for you, anyway.
No, it is not.
The author claims that the proponents of NoSQL are unaware they are repeating history. It's not an argument over history, it's an argument over what a certain group of people know.
"So take a computing history book and look for the "hierarchical databases". You'll find them easily. Check how and if they are alike to current NoSQL databases."
Irrelevant. The point is not whether hierarchical databases are alike to current NoSQL databases, but whether or not the proponents of NoSQL are aware of past database designs.
Nor is it an "argument by authority" to suggest that the author's statement needs to be justified. The author is claiming that everyone who designs or promotes NoSQL databases has less understanding of databases than a first-year CompSci student. This is not an argument from authority, but an argument about authority.
"At some point you assume your readers do their work. It's not as if Codd is some mysterious figure, and the things the relational theory solves"
Again, you valiantly miss my point. The question is not whether Codd invented the relational model, but whether he "was explicitly trying to rectify these mistakes" (whatever "these mistakes" are; the author never explains).
The relational model has its advantages, but it's not a silver bullet, and understanding the limitations of a model or algorithm is just as important, if not more so, than understanding its advantages. Codd would understand the limitations of the model he devised more than anyone.
Computer science is all about finding the right tools for the job. Codd added a useful tool to our collective toolbox, but it's not the only tool we have available. Nor are hierarchical databases inherently inferior to relational databases; sometimes a hierarchy is exactly what you want. A lot of websites, including Hacker News, would fit the hierarchical model better than the relational model.
"Now, if you are not a Computer Scientist, then the article is not meant for you, anyway."
It's painfully clear the article is not meant for computer scientists of any calibre.
It doesn't matter if they are aware that they're copying the old hierarchical DBs or not; this won't change the final outcome. What matters is whether they DO copy them, warts and all. And this, you can check.
That said, you can also check if they are aware or not. Read their interviews, papers etc, and you can easily check if this is the case or not. It's not like it's something they keep in their heads.
Again, you valiantly miss my point. The question is not whether Codd invented the relational model, but whether he "was explicitly trying to rectify these mistakes"
And again, YES, Codd was bloody "EXPLICITLY trying to rectify these mistakes". It's DB history 101 that he did so. Read even just the bloody abstract of his paper: http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf
Nor are hierarchical databases inherently inferior to relational databases
As a mathematical abstraction for storing data, they are. Relational algebra is a mathematically proven and sound system, not an ad-hoc thing. (Not that common relational dbs respect relational algebra 100%).
Usually it's up to the person making the statement to justify their own arguments.
"YES, Codd was bloody "EXPLICITLY trying to rectify these mistakes". It's DB history 101 that he did so. Read even just the bloody abstract of his paper"
You're misreading it. All tools are inadequate in some way; Codd was not trying to claim that relational databases were a universal panacea for database storage. That's not how computer science works.
"As a mathematical abstraction for storing data, they are."
So if I have a tree of data, in your opinion the optimum method of representing that data is as a set of tuples and not, say, a tree?
"Relational algebra is a mathematically proven and sound system, not an ad-hoc thing."
If the relational model is so perfect, why do SQL databases include so many compromises and deviations from the theoretical model?
Not if they can be easily fact-checked and they are addressed to a knowledgeable audience.
No, that's just how math --you know, like relational algebra-- works.
Plus, relational databases not being a "panacea" doesn't mean tried-and-failed solutions like hierarchical DBs should come back.
If the relational model is so perfect, why do SQL databases include so many compromises and deviations from the theoretical model?
Because nobody cared to do it properly, for one. Because SQL caught on, and vendors followed that, instead of a purer relational-algebra solution. Because some DBs like Oracle are based on three to four decades of legacy code that didn't follow RA right in the first place. Because some other DBs like MySQL were written very improperly, but "got the job done" and were freely available.
What exactly are you suggesting? That the relational model is the optimal solution for all database needs?
"Plus, relational databases not being a "panacea" doesn't mean tried-and-failed solutions like hierarchical bds should can come back."
Tried and failed? What?
You do realise that every computer contains a hierarchical database in the form of a filesystem, right?
And you do understand that we're talking on a forum with hierarchical posts, something a hierarchical database would be ideally suited for?
And of course you're aware that in order to represent hierarchical data in relational databases, you need to inefficiently simulate a hierarchical database on top of a relational database?
And knowing all this, you still believe that hierarchical databases are a "tried-and-failed" solution?
No, that the relational model is the optimal solution for all data modeling needs. And that every database system that follows it is more correct (as in provably correct) and generic than any ad-hoc solution.
You do realize that a filesystem is not a generic data storage solution to use with your arbitrary data, but a specialized data structure for specific items called files and folders?
Plus, do you realize that OS research, from MS and Apple and academia, has been going in the direction of making filesystems more generic, flexible, and database-like, and abstracting their hierarchical nature away?
You do realize that the "inefficiently" is just an implementation detail that has nothing to do with the relational model, right? And that the relational representation is the most flexible and correct, mathematically, possible representation of a set of data? It's basically set theory.
That's so unbelievably incorrect I'm not sure quite where to start.
Okay... So let's say we have a tree structure:
      A
     / \
    B   C
   / \   \
  D   E   F
But if we wanted to represent this as a relation, we'd need to transform the data into a set of tuples. A naive implementation would be:
id  value  parent
 1  A      _
 2  B      1
 3  C      1
 4  D      2
 5  E      2
 6  F      3
This works for simple lookups, but fetching an entire subtree with this adjacency-list scheme takes one self-join or query per level of depth, which classic SQL handles poorly. So we need a more complex representation, such as the nested set model:
id  value  lft  rgt
 1  A        1   12
 2  B        2    7
 3  C        8   11
 4  D        3    4
 5  E        5    6
 6  F        9   10
The nested set model has its own weakness: inserting a node means renumbering the lft and rgt values of every node to the right of the insertion point. To get around this problem, the numbers used for the lft and rgt values are often large integers with gaps left between them. We can't use floats for obvious reasons, and exact decimals tend to be slower than raw integers.
So the best way of representing our tree in a relation looks something like:
id  value   lft    rgt
 1  A      1000  12000
 2  B      2000   7000
 3  C      8000  11000
 4  D      3000   4000
 5  E      5000   6000
 6  F      9000  10000
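To make the trade-off concrete, here is a minimal sketch over in-memory rows mirroring the tables above (hypothetical helper names, not from the thread). The adjacency list forces recursion, one pass per tree level, while the nested set needs a single range comparison, which is exactly the kind of lookup a tree-shaped store gives you for free:

```python
# Rows from the naive (adjacency list) table: id -> (value, parent)
adj = {1: ("A", None), 2: ("B", 1), 3: ("C", 1),
       4: ("D", 2), 5: ("E", 2), 6: ("F", 3)}

def subtree_adj(root):
    # Adjacency list: recurse, i.e. one scan/query per level.
    out = [root]
    for node, (_, parent) in adj.items():
        if parent == root:
            out += subtree_adj(node)
    return out

# Rows from the nested set table: id -> (value, lft, rgt)
nested = {1: ("A", 1, 12), 2: ("B", 2, 7), 3: ("C", 8, 11),
          4: ("D", 3, 4), 5: ("E", 5, 6), 6: ("F", 9, 10)}

def subtree_nested(root):
    # Nested set: one range filter, no recursion at all.
    _, lft, rgt = nested[root]
    return [n for n, (_, l, r) in nested.items() if lft <= l <= rgt]

print(sorted(subtree_adj(2)))     # [2, 4, 5]
print(sorted(subtree_nested(2)))  # [2, 4, 5]
```

Both return B's subtree (B, D, E), but only the nested set does it without walking the tree, and that is the complexity you pay for simulating a hierarchy on top of relations.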
Now, are you going to continue to insist that the relational model is always the best way of representing data?
Are you seriously going to claim the relation above is a better way of representing a tree than, well, a tree?
I want this.
This is what our good old friend the core dump is for. Load the core dump in GDB and there you are.
I'd argue that on Windows it's even easier. You can create a symbol server to store all the debugging info (the PDB files), and the debugger can download the correct symbols for that particular compiled version.
The PDB format supports having arbitrary commands added to procure each source file. You can connect this up to your version control system so that not only do you have the stack and symbols, you can actually browse around the source code of whatever version of the code actually crashed, just as if it had crashed locally.
On Linux you have to work out a system for archiving your symbols and getting them for the correct version of your software yourself, but the effort is certainly worth it.
We have a system to generate HTML reports each day by grouping crash dumps by the functions near the top of the stack (if the top 5 functions are the same, it's considered the same crash), so each day we can see what the most important crash is.
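That bucketing rule is simple enough to sketch. The frame names below are hypothetical, and the real pipeline presumably works on symbolized minidumps rather than plain lists, but the idea is just: key each dump by its top five frames, then count:

```python
from collections import Counter

def crash_key(stack, depth=5):
    # Two dumps count as "the same crash" when their top `depth`
    # frames match.
    return tuple(stack[:depth])

# Hypothetical symbolized stacks, innermost frame first.
dumps = [
    ["frob", "update", "tick", "loop", "main", "start"],
    ["frob", "update", "tick", "loop", "main", "crt0"],
    ["draw", "render", "tick", "loop", "main", "start"],
]
counts = Counter(crash_key(s) for s in dumps)
worst_key, worst_count = counts.most_common(1)[0]
print(worst_count, worst_key[0])  # 2 frob
```

The first two dumps share their top five frames (they differ only below the cutoff), so they land in one bucket, which becomes "the most important crash" for the day.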
Whereas, generally with Common Lisp, I can trap the condition, fix the problem, and keep running.
I know which languages I'd consider starting grounds for 100% uptime systems (not 5 9s), and languages that segfault and die as part of their usual crash handling are not among them.
(Erlang, Lisp, and Smalltalk would be my shortlist)
On Linux the situation is a little harder. I am not aware of edit-and-continue on that platform.
Any kind of crash is catchable. For example, we get minidumps of our Windows software uploaded by the crashing process, by catching every kind of crash that is possible and handling it.
However, I would argue that you want to crash and dump a core as fast as possible when something unanticipated is happening. Process death should not be a problem in a well designed system!
3. You enter some code and hit proceed
4. The app continues like nothing happened
Winograd talks about three different metaphors for HCI: conversation, manipulation and locomotion. In the 70's he gave up on conversation for manipulation, and was right to do so, but with modern computing power and databases, conversation is coming back as a relevant model.
I found it especially relevant how he illustrated thinking in terms of imprinting and erosion. I've been thinking about my personality, and how I think about things, in different, but oddly similar terms.
I also agree that personal computing is, to a large extent, re-inventing the past. The best quote I heard on that, as best I can recall, is "The PC recapitulates the mainframe", which is borrowed from Biology's "Ontogeny Recapitulates Phylogeny".
Except for processes.
X is just a protocol; you can happily think about X apps in terms of processes and sockets.
Now, if I only had the same thing for Cocoa Touch - after all, both languages are supposed to inherit from Smalltalk.
The scrollbar seems due to the conversation with the computer about the blocks; barring that, nothing else is off-page.