The formulation of the periodic table is quite like modern-day data science. Chemists even before Mendeleev observed patterns and tried to build a model to predict the missing elements. The beauty is that this was done without understanding the underlying mechanism of how it all worked, i.e. what makes an element (protons) and what makes their periodicity (electrons). That came 50+ years later when we discovered protons and electron orbitals.
So going back to the article, their approach is fine because they're just formulating some model from observed patterns. Their reasoning for the underlying mechanism could be completely wrong but that's fine too until a better model comes along.
That's because data science is just science. The only distinction is that you're studying somebody's sales pipeline instead of the secrets of the universe.
In "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire" http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.1... they talk about morphisms: Hylomorphism, Anamorphism, Catamorphism, Paramorphism.
If you break them down using Joy (programming language) in terms of the genrec general recursion combinator, they form simple patterns, which then suggest at least two other unnamed combinators:
Hylo-, Ana-, Cata-
H == [P ] [pop c ] [G ] [dip F ] genrec
A == [P ] [pop ] [G ] [dip swap cons] genrec
C == [not] [pop c ] [uncons swap] [dip F ] genrec
P == c swap [P ] [pop] [[F ] dupdip G ] primrec
? ==  swap [P ] [pop] [[swap cons] dupdip G ] primrec
? == c swap [not] [pop] [[F ] dupdip uncons swap] primrec
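For readers who don't know Joy, here is a rough list-based sketch of the four named morphisms in Python. It is not a translation of the genrec/primrec definitions above; the function names (ana, cata, hylo, para) and the countdown example are assumptions chosen for illustration.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
S = TypeVar("S")


def ana(step: Callable[[S], Optional[Tuple[A, S]]], seed: S) -> List[A]:
    """Anamorphism (unfold): grow a list from a seed.

    step returns None to stop, or (item, next_seed) to continue.
    """
    out: List[A] = []
    while True:
        result = step(seed)
        if result is None:
            return out
        item, seed = result
        out.append(item)


def cata(combine: Callable[[A, B], B], base: B, items: List[A]) -> B:
    """Catamorphism (right fold): collapse a list into a single value."""
    acc = base
    for item in reversed(items):
        acc = combine(item, acc)
    return acc


def hylo(step: Callable[[S], Optional[Tuple[A, S]]],
         combine: Callable[[A, B], B], base: B, seed: S) -> B:
    """Hylomorphism: an unfold followed by a fold."""
    return cata(combine, base, ana(step, seed))


def para(combine: Callable[[A, List[A], B], B], base: B, items: List[A]) -> B:
    """Paramorphism: like cata, but combine also sees the remaining tail."""
    acc = base
    for i in range(len(items) - 1, -1, -1):
        acc = combine(items[i], items[i + 1:], acc)
    return acc


def countdown(n: int) -> Optional[Tuple[int, int]]:
    """Hypothetical unfold step: n, n-1, ..., 1."""
    return None if n == 0 else (n, n - 1)


# Factorial as a hylomorphism: unfold 5 into [5, 4, 3, 2, 1], then multiply.
factorial_5 = hylo(countdown, lambda x, acc: x * acc, 1, 5)

# Non-empty suffixes as a paramorphism, since each step needs the tail.
suffixes = para(lambda x, tail, acc: [[x] + tail] + acc, [], [1, 2, 3])
```

The hylomorphism here builds the intermediate list explicitly to keep the sketch short; a fused version would avoid materializing it.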
I think it's also interesting to distinguish between systems that reveal intrinsic order (e.g. periodic table), and systems that superimpose external order (e.g. a performance review framework).
Behavioural economics is an example of a field that desperately needs a modern-day equivalent of Mendeleev to come along and give it structure.
I am craving more examples of this, please, anyone, share!
Newton’s theory of gravitation unified the motion of apples falling from a tree, cannonballs thrown into walls, the motion of planet earth around the sun, and even motion of all the known planetary bodies.
Electromagnetic theory gives us one understanding of X-rays, gamma radiation, visible light, infrared, radio waves, etc.
Quantum field theory unifies every microscopic theory of matter and energy we have ever known.
This list could be much longer.
In general, see the Classification of Finite Simple Groups. In terms of specific examples, the minimal classifications of knots, trees, and braids are 3 instances that popped to mind. From a number-theoretic perspective, the paper Enumerating the Rationals and its derivative works are interesting reads. In terms of spatial structure and spatial decomposition, look at the classification of crystal structures, lattices, polytopes (zonotopes are particularly interesting), periodic and aperiodic tilings, the classification of space-filling curves, and the classification of closed surfaces. From a relational perspective, look at the classification of topological spatial relations and Allen's interval algebra. And the one my intuition finds most intriguing is the classification of Group Theory Single Axioms and its relation to my conjecture that division is primary.
 Classification of Finite Simple Groups https://en.wikipedia.org/wiki/Classification_of_finite_simpl...
 Prime Knots https://en.wikipedia.org/wiki/List_of_prime_knots
 Enumerating Trees https://www.cs.virginia.edu/~lat7h/blog/posts/434.html
 Braid Group https://en.wikipedia.org/wiki/Braid_group
 Crystal Structure https://en.wikipedia.org/wiki/Crystal_structure
 Classification of Closed Surfaces https://en.wikipedia.org/wiki/Surface_(topology)#Classificat...
 Topological Spatial Relations https://en.wikipedia.org/wiki/Spatial_relation
 Allen's Interval Algebra https://en.wikipedia.org/wiki/Allen%27s_interval_algebra
 "Yet Another Paper on Group Theory Single Axioms" https://www.cs.unm.edu/~mccune/projects/gtsax/
P.S. Clifford Algebra / Geometric Algebra unifies much of mathematics, e.g. in GA, Maxwell's equations can be reduced to one equation and expressed on one line:
 Geometric Algebra https://en.wikipedia.org/wiki/Geometric_algebra
 Maxwell's equations formulated in terms of Geometric Algebra https://en.wikipedia.org/wiki/Mathematical_descriptions_of_t...
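For reference, the one-line form usually quoted is the spacetime algebra version below. This is a sketch under assumptions not stated in the comment: SI units, F = E + IcB as the electromagnetic field bivector (I being the unit pseudoscalar), J as the spacetime current, and the nabla symbol as the spacetime vector derivative.

```latex
\nabla F = \mu_0 c J
```

Splitting this equation by grade is what recovers the four familiar Maxwell equations.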
Their figure 1, which appears to describe fundamental design tradeoffs, makes me think of CAP theorem "maps", like this:
But the whole idea of figuring out first principles and "fundamental" design choices is quite appealing. They have a section saying that a big goal here would be automatic design of optimal data structures. They talk of machine learning techniques (they mention Bayesian optimization and reinforcement learning).
That seems a very interesting research direction. In the same vein there's this talk by Jeff Dean about how they use ML at Google to replace all sorts of heuristics in data structure design (bloom filters etc.) with machine learning to optimize performance, though from what I recall it doesn't automatically change the algorithm itself:
(discussed on HN previously https://news.ycombinator.com/item?id=15892956)
EDIT: I think they cite a paper by Dean and others which was part of that talk
This is a sort of functional bug in forums such as HN.
Commenting on a paper like this requires time to digest and reflect on the content. For a conceptual paper such as OP I would need at least a day (to figuratively sleep on it), but returning a couple of days later to make a (hopefully) informed comment is no longer interactive. There is archival and future reference value, of course, but then rebuttals to possible misconceptions will not accompany the comment.
I only post when I think I can somehow add value to the discussion, at least value for some people who might read the comment. Obviously some people with in-depth knowledge of the paper and field in general will probably already be aware of the link I posted (for example), but others might find it an interesting avenue to explore. I know I often come to HN comments for this reason: a topic seems interesting, I want to learn of related stuff and/or get opinions from people with deeper expertise.
So even if it's only for "future reference", it might actually bring value _for some_. At least that's what I tell myself :)
Need quick reads? Use Y. Quick writes? Use X. Etc.
Michael S. Kester
Assistant Professor of Computer Science
"""Mendeleev was a friend and colleague of the Sanskritist Böhtlingk, who was preparing the second edition of his book on Pāṇini at about this time, and Mendeleev wished to honor Pāṇini with his nomenclature. Noting that there are striking similarities between the periodic table and the introductory Śiva Sūtras in Pāṇini's grammar"""
Most of the theoretical work is in picking the right properties that expose the right trade-offs. Beyond that, the method is deceptively simple.
A while back (order months or years but certainly within last 2 years) someone mentioned a paper, I believe it was here on HN either as the topic or in a comment, but perhaps it may have been on IRC.
I am no longer able to find this paper again, and have only a vague recollection of it (I had read the abstract and the introduction and then postponed reading until I forgot about it).
It dealt with the problem of say 2 peers each having their own list or set, and part of their list is in common, but both may also have entries the other doesn't have. The problem was finding the most efficient way such that both end up with the union of both lists or sets. A brute force way would be to each send a copy of their list to the other, a slightly less brute way would be to have only one send a full copy, and have the other return his difference. But the paper detailed a more efficient method, which obviously I can't remember...
Does this description ring a bell? Does anyone know the paper I am trying to locate?
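To make the two baselines described above concrete, here is a minimal sketch in Python that counts how many items each one puts on the wire. It does not attempt the more efficient method from the paper, which is not identified here; the function names and the example sets are assumptions for illustration only.

```python
from typing import Set, Tuple


def reconcile_full_exchange(a: Set[int], b: Set[int]) -> Tuple[Set[int], Set[int], int]:
    """Brute force: each peer sends its entire set to the other.

    Returns (peer A's final set, peer B's final set, items transmitted).
    """
    sent = len(a) + len(b)
    return a | b, b | a, sent


def reconcile_one_way_then_diff(a: Set[int], b: Set[int]) -> Tuple[Set[int], Set[int], int]:
    """Slightly less brute: A sends everything, B replies with only what A lacks."""
    sent_a_to_b = len(a)          # A ships its whole set
    missing_at_a = b - a          # B can now compute what A does not have
    sent_b_to_a = len(missing_at_a)
    return a | missing_at_a, b | a, sent_a_to_b + sent_b_to_a


# Hypothetical example: two peers with a partial overlap.
alice = {1, 2, 3, 4}
bob = {3, 4, 5}

full = reconcile_full_exchange(alice, bob)
one_way = reconcile_one_way_then_diff(alice, bob)
```

In both baselines the cost grows with the size of the sets rather than with the size of their difference, which is the inefficiency the paper being sought presumably addresses.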
The notes were a (draft?) PDF for a textbook on algorithms--much like Mathematics for Computer Science by Lehman & Leighton--but instead, the topic was narrowly restricted to algorithms with a cryptographic or otherwise number-theoretic basis. In particular, hashes, content-addressable storage, and Merkle trees were covered.
Boaz Barak, "An Intensive Introduction to Cryptography"
S. Idreos, et al., “The Periodic Table of Data Structures,” Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, vol. 41, no. 3, pp. 64-75, 2018.
> This is why many argue that research on algorithms and data structures is
"is" refers to "research", not "data structures"
> Data structures is how we store and access data.
which is arguably just a bad sentence.
Language is weird sometimes, so I’m not certain, but your argument seems circular and incorrect. Use of ‘is’ doesn’t cause ‘data structures’ to be a singular collective noun. The noun is plural, so the correct verb is ‘are’. The sentence would need to name the singular collective in order to use ‘is’. For example: “A set of data structures is how we store and access data.” That would be correct. Arguing that “Hard disks is how we store and access data” is correct and that “hard disks” is a singular collective noun, because the sentence used ‘is’, is not normally accepted grammar.