Hacker News | daniel-levin's comments

>> “This was so strange that we sat on this observation for several years”

>> "We tried for five years to model the production of the positrons"

Why would a scientist withhold data for 6 years? How typical is it for scientists to not reveal data until they can explain it using current models? I would think that Dwyer would have rushed to publicize such fascinating results.

It mostly depends on how close to tenure they are, and how controversial the data is.

E.g. Dan Shechtman, who recently got a Nobel Prize for his work in crystallography, was an outcast for a long while because his data did not fit the prevailing model, to the point that people in his lab refused to look through his microscope eyepiece because what he said they would see there should not have been possible.

Read http://www.theguardian.com/science/2013/jan/06/dan-shechtman... and also http://motherboard.vice.com/read/quasicrystals-are-natures-i... (it's about a lot more, but also talks about Shechtman's history)

There are a few other cases like this: Robin Warren (Ulcer/Helicobacter connection), Barbara McClintock ("Jumping Genes"). The farther back they are, the harder it is to get the real story, but unfortunately Shechtman's story is far from unique.

"people in his lab refused to look through his microscope eyepiece because what he said they would see there should not have been possible."

Ah yes, the ever popular "Nyah Nyah Nyah I can't hear you" scientific method. I wonder how much progress has been held back due to scientists like that.

Yes, in fields where precision is prized above speed, experimentalists can sit on controversial results for a long time looking for the cause of a discrepancy.

If I tell you that General Relativity is fine, and Einstein's a little more right, you might be impressed and give me a job. If I report that there's something unexpected about gravity at distances less than a millimeter, and I'm not absolutely correct, it might end my career.

Irreversibly-unblinded blind experiments are a technical way to solve this problem (I just unblinded my thesis work a week ago; gravity turned out fine, after a clerical glitch), but they do not remove the social stigma that can be attached to someone who has made a measurement ultimately found to be incorrect.

I just completed a degree in physics from his university, actually. I can think of a few reasons:

1. He wanted to be 100% sure about the data and not rush it, especially if it could be an error with the instruments

2. He has funding for other projects that needed to be worked on in the meantime, and didn't have a lot of time to devote to this work (this can make #1 take longer)

Depends on many things, but if the data appears "controversial yet conclusive" then you investigate more deeply until you have ruled out just about everything else. A long time ago, I was part of a research team that sat on a 3.5-sigma result that only got stronger as we got more restrictive with the data set. We ran for two more years looking for other explanations. I was in grad school and could only think "why don't we publish?!?!", but the more experienced folks won out and, upon further review, no Nobel Prizes were awarded and the reputations and funding remained in place. I think the adage "you only get to cry 'wolf!' once" applies here.

If the most reasonable explanation for the data is an error in measurement...

Publish too early and you get people assuming that you're a crank because you haven't done the research yet.

Darwin waited 20 years, and only published after another naturalist was about to publish the same theory.

Those were other times. What stopped this researcher from sharing this on his personal blog to see if other people were able to see the same thing?

Well, possibly the point here is a double-pronged play on the fact that Mac OS stopped at X (ten) and the common expression 'turn it up to eleven' [1].

[1] http://en.wikipedia.org/wiki/Up_to_eleven


The flip side of the coin is that they didn't want to name themselves after a satirical skit that emphasises the stupidity of the person uttering the phrase.


Wouldn't be the first, or worst, marketing whore-out to happen these days


looks like you wooshed a little bit


This needs considerably more content. I find that 'graphical' linear algebra becomes a powerful tool when it is used to develop visual intuition for concepts such as determinants (areas/volumes in space and what it means for this quantity to be zero), solutions to linear systems (intersecting lines/planes/hyperplanes), subspaces (lines embedded in planes/hyperplanes and planes embedded in hyperplanes), why a transform from R^2 -> R is not invertible (it collapses a plane into a line), what an eigenvector looks like, and how eigenvectors relate to other matrix properties such as invertibility (with illustrative examples such as a scaling matrix and a rotation matrix).

Having said that, I like the Lego analogy for direct sums and why they don't commute. It explains concepts a lot more abstract than the basics of linear algebra. I was not expecting that from the title.
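
The scaling-vs-rotation contrast mentioned above is easy to check numerically: real eigenvalues of a 2x2 matrix fall out of the characteristic polynomial. A minimal plain-Python sketch (the helper name `real_eigenvalues_2x2` is mine, purely for illustration):

```python
import math

def real_eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] (empty list if none)."""
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det      # discriminant of x^2 - trace*x + det
    if disc < 0:
        return []                       # complex pair: no real eigenvectors
    r = math.sqrt(disc)
    return sorted([(trace - r) / 2, (trace + r) / 2])

# Scaling matrix diag(2, 3): each axis is mapped onto itself, stretched.
print(real_eigenvalues_2x2(2, 0, 0, 3))    # [2.0, 3.0]

# Rotation by 90 degrees: no direction is mapped onto itself,
# so there are no real eigenvalues (and no real eigenvectors).
print(real_eigenvalues_2x2(0, -1, 1, 0))   # []
```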


I just recently discovered an awesome visual intuition for what a determinant is, and why one cannot solve a linear system with a zero determinant. I'd love to contribute to this!


You've piqued my interest.

Would you mind explaining it in a few words?


In 2 dimensions, the absolute value of the determinant of a matrix is just the area of the parallelogram spanned by the row vectors of the matrix. If this area is zero, it means that the vectors are collinear. The same idea holds in higher dimensions, with the 3D case being similarly possible to visualise (as a volume). Since the unique solution of a linear system is the point at which a bunch of lines/planes/hyperplanes meet, a zero determinant implies a zero area, which means at least two of them are parallel or coincident, and hence there is no unique solution (the solution set is either empty or infinite).

[1] https://www.math.ucdavis.edu/~daddel/linear_algebra_appl/App...
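
The area picture is easy to play with in code. A minimal plain-Python sketch (helper names are mine, not from the linked page):

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def parallelogram_area(u, v):
    """Area of the parallelogram spanned by 2-D row vectors u and v."""
    return abs(det2(u[0], u[1], v[0], v[1]))

# Unit square: rows (1, 0) and (0, 1) span area 1.
print(parallelogram_area((1, 0), (0, 1)))   # 1

# Collinear rows: (2, 4) is a multiple of (1, 2), so the parallelogram
# degenerates to a line segment, the area is 0, and the corresponding
# linear system has no unique solution.
print(parallelogram_area((1, 2), (2, 4)))   # 0
```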


>> The Camry ETCS code was found to have 11,000 global variables. Barr described the code as “spaghetti.” Using the Cyclomatic Complexity metric, 67 functions were rated untestable (meaning they scored more than 50). The throttle angle function scored more than 100 (unmaintainable).

>> Toyota loosely followed the widely adopted MISRA-C coding rules but Barr’s group found 80,000 rule violations. Toyota's own internal standards make use of only 11 MISRA-C rules, and five of those were violated in the actual code. MISRA-C:1998, in effect when the code was originally written, has 93 required and 34 advisory rules. Toyota nailed six of them.

How the ACTUAL FUCK did this happen!? The article makes Toyota's engineering team seem egregiously irresponsible. Is it typical for vehicle control systems to be this complicated? I would love to hear the other side of the story (from Toyota's engineers). Maybe the MISRA-C industry standard practices are ridiculous, out of touch and impractical.


I'm going to go out on a huge limb here and am prepared to get shot down. I've seen this first hand as inheritor of spaghetti firmware.

The limb is: don't let EE's lead a firmware group.

It's a natural division of labor. The hardware guys are familiar with all the component datasheets and the bus timings and all the other low level details. If the prototype is misbehaving, they'll grab a scope and figure it out. They're probably the only guys who can. Software is an afterthought, a "free" component.

Those same EE's who have performed that role get promoted to buck stoppers for hardware production. They're familiar with hardware design, production, prototyping, Design for Manufacturability, and component vendor searches. They might cross over with production engineering and that six sigma goodness. They're wrapped up in cost-per-unit to produce which is direct ROI. Software remains a "free" component; you just type it up and there it is.

The culture of software design for testability, encapsulation, code quality, code review, reuse, patterns, CMM levels, etc etc is largely orthogonal to hardware culture.


Oh yeah.

Corollary: Don't let EEs hire your firmware engineers. They have no idea what to look for.


> The culture of software design for testability, encapsulation, code quality, code review, reuse, patterns, CMM levels, etc etc is largely orthogonal to hardware culture.

I'm confused by this statement. Substitute, say, ASME for CMM and the rest of this sounds like good engineering practice in general. What, for example, about "hardware culture" is orthogonal to testability?


Well, for one thing, every instance of mission critical hardware needs a full test, because hardware is messy and stuff happens. You wouldn't want someone hurt because a gate on a chip got a stuck-at fault or one of your vendors had a little extra variation etc etc.

Software only needs testing when the tools or inputs change. So design-for-testability of software is driven by the needs of a development group (e.g. simulators, debug statements), while design-for-testability of hardware is (at a minimum) aimed at a post-production group (e.g. JTAG, test fixtures).

I agree there are many parallels, but I think the culture is different.


As a hardware designer I agree that hardware people are typically not very good software engineers.

I find we can reliably write small programs that will do everything they need to. But as the program grows, we are not adept at managing the explosion in complexity.

However I don't believe that's because hardware culture doesn't value software or anything like that. We just have no proper education in software engineering.

Speaking for myself, I grasp most fundamental software concepts. Memory structure, search algorithms, that sort of thing. I appreciate the value of abstract code qualities like testability, simplicity, reusability, etc.

I just have no actual education in how to design large programs from scratch that will achieve those goals.


I don't think it's about education. It's just that software isn't what hardware people mostly work with and you only get experienced in what you mostly work with. And when you're experienced, you gain some insight into how things will work.

I know some basics of EE, but I have no gut feeling for how to build or analyze any hardware for real. It's just something I've read about and tried to understand, but it's not something I know.

Conversely, I've been writing software on the lowest to highest levels for decades and I have a hunch of how to start building something that will eventually grow really big and complex. I don't know exactly how I do it but depending on what I want to achieve I have, already at the very beginning, a quite strong sense of what might work and also what will definitely NOT work.

My take on hardware is that it's mostly a black box that usually does most of what's advertised (but workarounds are regularly needed, and I guess it must be really difficult to build features into silicon and have it work 100% as designed), and these quirks had better be encapsulated in the lowest levels of the driver so that we'll get some real software building blocks sooner.

This would be no basis to design hardware. I would make such a terrible mess of it that even if I managed to design a chip and make it appear to work somewhat, it would fail spectacularly in all kinds of naive corner cases that a real EE would never have to solve, because s/he would never venture to build any of them, just because s/he would know better from the start.


> Maybe the MISRA-C industry standard practices are ridiculous, out of touch and impractical.

I currently work in the static analysis industry. Broadly speaking, the MISRA-C ruleset is, in my experience and in the experience of many customers who apply it, neither impractical nor out of touch; it is clearly of benefit, and the majority of rule violations (including a number of the rules Toyota violated) are easily found with static analysis tools.

I even know that Toyota is a customer of (at least) one of the static analysis tool companies (obviously Toyota is a huge company, and I've no idea if the specific muppets making this clusterf had any static analysis tools; just that Toyota as a company definitely has someone buying them). It seems that they either just didn't use it, or just didn't care (or were told not to care).

As an extra point of data, I sit opposite someone who is on the MISRA-C committee. He is a solid C coder who knows a great deal about the language and how to get it wrong. His day job is writing/maintaining static analysis tools for C programs. Obviously this is no guarantee that the committee as a whole is good at maintaining the MISRA-C ruleset, but they do have at least one active, experienced and competent C coder (with lots of experience of having to actually automate detection of rule violations) at the table.


It's a matter of scale and economics. Every dollar of cost reduction per unit is significant over the multiyear production cycle. The changes made for cost reduction make the design process more complex, and a Camry, like any car, is full of legacy features (the model designation is nearly 35 years old).

On top of that, car companies are massive and capital-intensive, with all the lack of agility that implies: retooling a production line takes many years from design to cars at the dealer. The critical decisions about the programming stack had to be made in the early 1980s, when the first microprocessor-based control systems appeared in automobiles.

Back then, automotive engineers and executives could hardly have predicted the software complexity that was on the horizon. It was state-level actors who came up with Ada and DRAKON, but those were "money is no object, failure is not an option" efforts. Toyota probably made money via its level of software quality; shutting down production for a year to get it right would have cost billions. It followed the bean counting and rode the tiger.


More interesting: how does it compare to the rest of the industry at that time?


I'm not really qualified here (and if someone is, please post), but I've been poking around with the ECU on my 2001 Audi and it doesn't seem to have ECC memory either. A section of the AM29F800BB EEPROM is used as random-access memory. It seems to stick a couple of CRC bits in every line, so maybe this is how they're doing it?
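
For illustration only, here is a toy sketch of how a per-line check value can flag corrupted memory. It uses Python's `binascii.crc32` as a stand-in; the ECU's actual scheme (check width, polynomial, where the bits live) is unknown, and the helper names are mine:

```python
import binascii

def store_line(data: bytes) -> tuple:
    """Store a 'memory line' together with its CRC-32 check value."""
    return (data, binascii.crc32(data))

def read_line(line: tuple) -> bytes:
    """Return the data, recomputing and verifying the CRC on every read."""
    data, crc = line
    if binascii.crc32(data) != crc:
        raise ValueError("memory line corrupted")
    return data

line = store_line(b"\x12\x34\x56\x78")
assert read_line(line) == b"\x12\x34\x56\x78"

# Flip one bit of the stored data: the CRC check now fails on read.
corrupted = (b"\x12\x34\x56\x79", line[1])
try:
    read_line(corrupted)
except ValueError:
    print("bit flip detected")
```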


I am starting to think that safety-critical systems need to have their schematics, PCB layouts, design documents and source code registered with the government before they touch the public.

I remember when nearly everything came with schematics so it could be fixed.


When hardware designers write code, the results are often ugly. Device drivers, firmware, embedded code, ...


Engineers tend to read specs and apply standards.


>I would love to hear the other side of the story

Me too, it's probably indeed bad, but there's ALWAYS another side to such stories. Unfortunately, we rarely get to hear those.


MISRA-C is practical enough that NASA's JPL used it as a basis for their coding standard (http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf), and everyone would agree those Martian robots work pretty well...


> Barr’s group found 80,000 rule violations

I'm a bit puzzled by this figure given that according to the article the typical LOC count of this kind of software is within that order of magnitude ("tens of thousands of lines of code").


"Normal" C code that is written in a coding style that disagrees with the MISRA guidelines can have more than 1 violation per line of code, even if the quality of the codebase is otherwise quite good.

For example, the static analyser I'm working on finds 32,700 violations of MISRA C:2012 in the SQLite codebase (130 kLOC).

This is because MISRA is quite strict about the use of C: a simple statement like `if (ptr) f();` already violates two rules:

* use explicit `ptr != NULL` comparison

* always use braces with `if`

Once you get around to integer arithmetic, where MISRA is quite strict about not relying on implicit conversions or numeric promotion (which is non-portable due to depending on the size of `int`), it's not rare to have 5 or more violations in a single line of code.


I'm in no way excusing Toyota. For those who want to understand how this happens, I can shed some light.

Early firmware was a replacement for analog electronics. Consider modeling the classic lunar lander. A simple digital computational loop with no branches, just math, has advantages over analog circuitry, such as temperature stability, noise immunity, reproducibility, etc.

From there, embedded firmware grew in the same way complex mechanical systems grew, like mills and robots. Meanwhile, non-embedded software on workstations was growing, theories were developing, algorithms and organization and business models were applied. And firmware grew and grew, but mostly like an engine grows in that abstractions are minimized, optimization happens in-place and theory is largely tribal knowledge.

It's only in the last 10 years or so that firmware has finished its metamorphosis and is confronting the rest of the software world. It can look especially conflictual (if that's a word) to high-level devs working in Ruby or what-have-you.

So yeah, this is no excuse for negligence and ignorance. I just hope my perspective helps people evolve.


The Apollo guidance computer actually used a sophisticated software interpreter for most of its code, not just a "computational loop with no branches".


In addition, there is no way in hell that the control algorithms it was using could have been developed without the use of computers. State-space control theory was specifically developed to take advantage of discrete-time control systems.



Unless of course the language is designed and optimised for it. For instance, Jane Street use OCaml - in which recursion is a standard language primitive - and they sometimes prove their mission critical code correct.


Your discussion has been hijacked by somebody posting /r/spacedicks type pictures.


Yeah I disabled images for anonymous comments. Should have guessed that would happen.


Thank you very much for this!


How interesting. I would have thought such a common database operation (querying by ranges) would have been a better solved problem by now.

Also, how come the author of the blog post writes O(k) instead of O(1) for constant time? Is it because 1 is as arbitrary a constant as any or is there some difference that I am not aware of?

Link to the original paper [1]

[1] http://www.vldb.org/pvldb/vol6/p1714-kossmann.pdf


(I'm the author of the post)

You're right: in terms of big-O it's O(1); that is, O(k) and O(1) mean the same thing. I said O(k) because Bloom filters normally require multiple, but a constant number of, operations (hashes).
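
To make the O(k) point concrete, here is a minimal Bloom filter sketch (not the implementation from the post or the paper; the class and the salted-SHA-256 hashing are illustrative choices of mine). Each query computes k hashes and probes k bits, regardless of how many items have been stored:

```python
import hashlib

class BloomFilter:
    def __init__(self, m_bits: int, k_hashes: int):
        self.m = m_bits
        self.k = k_hashes
        self.bits = [False] * m_bits

    def _indexes(self, item: str):
        # Derive k bit positions from k salted hashes of the item.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item: str):
        for idx in self._indexes(item):
            self.bits[idx] = True

    def might_contain(self, item: str) -> bool:
        # k hash computations and k bit probes: O(k), i.e. constant,
        # independent of the number of stored items.
        return all(self.bits[idx] for idx in self._indexes(item))

bf = BloomFilter(m_bits=1024, k_hashes=3)
bf.add("hello")
print(bf.might_contain("hello"))    # True (added items always match)
# Not added, so very likely (though not guaranteed) absent:
print(bf.might_contain("goodbye"))
```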


I must respectfully disagree that Haskell's memory footprint is simply 'low'. This is because the memory footprint of a given Haskell program is not at all transparent, and Haskell is notorious for leaking memory in a maddeningly opaque fashion [1, 2, 3, 4]. Space leaks might be relatively straightforward to diagnose and fix for a true domain expert, but I would not want to have to rely on someone having such abstruse knowledge in a production application. It goes without saying that a space leak in a production app is a really, really bad thing.

Although, I suppose one's choice of Haskell is a function of one's own risk/reward profile. Haskell and its failure modes are hard to understand. That induces extra risk that some people (myself included) might be uncomfortable with. That said, I am now enthusiastically following you guys and hope to see the proverbial averages get sorely beaten.

[1] http://neilmitchell.blogspot.com/2013/02/chasing-space-leak-...

[2] http://blog.ezyang.com/2011/05/calling-all-space-leaks/

[3] http://blog.ezyang.com/2011/05/space-leak-zoo/

[4] http://blog.ezyang.com/2011/05/anatomy-of-a-thunk-leak/


