I used to work at SpaceX on the team that built a piece of software called "WarpDrive". It was a massive monolithic ASP.NET application, with large swaths written in ASP.NET WebForms and a slow frontier of ASP.NET MVC gradually growing while I was working there. This application was responsible for practically everything that ran the factory: inventory, supply chain management, cost analysis, etc. Elon is a big Windows fan and pushed hard to run the whole shop on Microsoft tech. Thankfully the rockets fly with a heavily customized Linux install.
I used to work at SpaceX on avionics software, in a role very similar to OP, and my experience was similar in some respects.
The tech and products were complex. The turnover rate was high and training new hires was a lengthy process. The new projects coming down the pipeline never ceased (this was during a period when FH/F9-1.1/Dragon/Crew were all under design/development and constant iteration).
It was fun for a young engineer, but burnout is real.
*WarpDrive was actually pretty impressive given the amount of stuff that it did.
I heard the same thing about the burnout and turnover. Do companies really think they are saving money by paying peanuts, grinding people down to burnout, and then constantly having to rehire/retrain new people as the old ones leave? Meanwhile the code is a mess because nobody has been there longer than a year and there is no architecture or design continuity. Just frantic patches over other frantic patches by burnt out junior engineers. It makes no sense!
I think it's a case of what is seen vs. what is unseen combined with "you get what you measure."
"We're paying more for programmers than almost anyone else in the industry!" is an obvious thing a bean counter would notice and point out. Productivity is harder to measure, and all that time lost to training on the job doesn't immediately leap out in spreadsheets because it's blended with actual work.
> "We're paying more for programmers than almost anyone else in the industry!"
I can't speak to development specifically, but I looked into one of Tesla's devops offerings and I make slightly more for a mid-level engineer than they offer for a senior position. I also 'only' work 40-hour weeks and my cost of living is about 20-30% lower than there.
But no employer tracks that, not even in academia...
I can’t find the reference I’m looking for, but the cost associated with turnover due to narcissists is apparently as high as the cost of taking care of ASD. I would imagine the turnover cost associated with burnout caused not by narcissism but simply by poor, short-term management and vision to be at least on par.
Looking at what's been done isn't a good way of determining whether your process is efficient at doing said thing. You need to be able to compare it to something else.
To put it another way: If your method of writing novels is to hire an infinite number of monkeys and put them to work on typewriters, you can't say "Something about this model must be working right, I came out of it with the complete works of Shakespeare!"
They landed a rocket on a barge in the ocean. Maybe with a better process, they could have done that two years faster, for 1/100th the cost, with no burnout. You don't know, and you can't say the model works right just because there's something to show for it.
All you know is that the process is able to eventually land a rocket on a barge. It doesn't tell you whether it's good at it.
True, but they also did that while maintaining the lowest launch prices in the industry and presumably they can now decrease that even further if they need to.
So while it may not be the most efficient process, the overall process is much better than their competitors since they can launch for so much less money.
I'm not sure that's quite a fair way of wording that (legitimate) question.
The whole company seems to be "burning the candle at both ends," not just the workers at the bottom. Also, it's not just "saving money" but pushing super hard to accomplish something extraordinary, i.e. generating new revenue, not just reducing costs. Additionally, the workers are partially compensated via stock options, so they share in the success of the company even if not through higher wages alone. So I'm not sure "mistreatment" is the right word to use.
At the end of the day, SpaceX (and Tesla) are not for everyone forever. I am not in a station in life to want to join right now, but may in the future. And maybe this strenuous effort is not especially profitable for SpaceX because of the churn that it creates. But that churn IS helpful for the industry (and thus, in my opinion, society) at large because it has spread SpaceX's know-how throughout the US aerospace community and resulted in alumni founding probably dozens of companies that can leverage the lessons learned from SpaceX. But some people work well in that environment and stay long term (which isn't to say it can't be improved).
So I am glad SpaceX is the way it is, and I hope they're successful in the future. But it also doesn't have to be the model for everyone else to copy. It might not work for everyone else, nor should it be expected to.
> Additionally, the workers are partially compensated via stock options, so they share in the success of the company even if not through higher wages alone.
I wonder if the constant burnout and churn keeps employees from vesting and thus ever collecting much if anything in stock?
Of course you cannot answer that based on a single company's culture.
I'm refuting the statement that "saving money by paying peanuts, grinding people down to burnout, and then constantly having to rehire/retrain new people as the old ones leave" is unanswerable in the current context based on one company, especially because this company seems to be destroying its competition.
Part of the fun of the monkeys/Shakespeare mot is that it's completely inapplicable to the real world, and thus absurd. The point of the "landed a rocket on a barge in the ocean" response is that whatever methodology SpaceX uses is not in that category -- it's an existence proof that what they are doing works in the real world, and is thus not absurd.
If they have a process proven to work, in a world where they are already doing things no one else has been able to do, changes to that process should be introduced very slowly.
Although your simile is reductio ad absurdum, a version of this idea does occasionally get surfaced, in the form of using (modern) small, lower-power processors (e.g. Atom, mobile ARM) in very large numbers in the datacenter.
While touting the purchase cost or energy benefits, these ideas routinely ignore the overhead cost inherent in a distributed system, let alone the Fallacies [1], which is the GP's and OC's (and possibly your) point, I believe.
I assume it means an above-median annual salary but a very poor hourly rate. Usually these discussions omit a calculation of to what extent the engineers took the job knowing this already and still decided to do so of their own free will, and how much is... not so voluntary.
True. From Blind, I think it matches average startup pay. However, that does not account for the fact that most people work very long hours. My brother started off with 80-hour weeks and then came down to 60-hour weeks when he got into his groove at Tesla. So when you account for that, it is 33% below average startup pay (assuming other companies are doing 40), which itself is below big tech pay.
I don’t think other companies are doing 40. In fact, I don’t know anyone making six figures that does 40. (I’m not saying no such jobs exist, just that no one I know works in one of them.)
If other high end software jobs are paying the same for 45-50 that Tesla is paying for 60, that’s on the low side hourly, but the low side of the high end.
75% (45/60) of 150k is still $112.5k plus presumably good benefits and some sort of equity component. That’s damn fine compensation for someone fresh out of undergrad even in 2018.
I wouldn’t want to work 60 indefinitely even for great pay, but that’s a separate issue.
Maybe I am just lucky but I have worked for 2 of the major big tech companies and came out of college making 100k+ while primarily working 40 hour weeks at both.
> 75% (45/60) of 150k
Tesla new grad software engineer total comp is 150k? Damn, in that case they are pretty close with big tech (Amazon is 145k and G/FB is 165k from what I have heard). I assumed it was lower since my brother was a PM with 6 YOE and got paid 130k a year.
> That’s damn fine compensation for someone fresh out of undergrad even in 2018.
Oh totally, my girlfriend is probably going to make like 60k out of grad school. However, while that is much better than what anyone besides my finance friends is making, it does not mean they are paying well relative to the tech industry.
Sorry if I was being unclear. I have no inside knowledge and just wanted to throw some numbers out there so we didn't continue to talk past each other. If Tesla is paying $120k to fresh CS grads and expecting them to work 60 hour weeks, I'm still not sure I'd say they were "paying peanuts" but it's at least getting there.
Capitalism. Need for money and results. I don't think that anybody wants to burnout anybody, it's just the pressure of the whole system I guess. Money pushes to do things fast. Shareholders and clients to keep happy.
In my opinion, this is the biggest flaw of the system. For many investors, a company is less about what it makes and more like a process to grow their money. Even when a company becomes profitable, there's always pressure to make it even more profitable quarter after quarter.
You do get the fact that a company with no growth but paying dividends will be worth (including the dividends you extracted) exactly what you paid for it?
It would be worse than buying a bond: you'll get the risk of equity with the returns of a bond.
The question, of course, is whether the high turnover actually is the most efficient way: you're spending far more employee time on training (learning the company's approaches at a high level, the problem space, and their way around the code base) than you would if you had higher retention.
While you might need to pay more and improve working conditions, would you be more efficient given the individual staff would spend more time being productive?
Burning people out at a large rate is not effective, and persistent crunch is counterproductive. It is less about capitalism and more about wishful thinking combined with a wish to be seen as a tough manager.
I'd say it's the pace at which modern capitalism operates, particularly in tech. When the rate of production goes higher and higher due to automation and a globally outsourceable workforce, the shareholders chase after higher and higher profits.
Tesla can't outsource globally and does not have profits yet. Effectiveness is about whether burnout brings higher profits, not about whether people unable to gain those profits burn out themselves, or those under them, through wishful thinking.
Nothing has changed, and ironically, this whole part of the question - is it better to burn people out and rehire - is a capitalist question, and answered with capitalist goals.
Agree about WarpDrive being pretty amazing for all the stuff it did. Although amazing things tend to just clump up from all the features that you need, and you end up with an app that is hard to manage.
As someone who interfaced with the Warp system (both software and the organization), I agree with @cbanek. The number of features and the custom nature of it are impressive, but for anyone who had to deal with the politics of improving the system, it was a nightmare. There was (continues to be?) an effort to deconstruct the monolith, but it was a very painful process. Motives from different departments were in constant competition. Getting something done in Warp meant calling hours of meetings and getting the ear of a director/VP who would champion your cause -- and an associated PM that would be ready to serve said VP. This process was so backwards that, if the actual technical work that went in didn't burn someone out, the politics sure could.
It wasn't all bad. There were other groups at X that provided pretty amazing tools for people to get things done (thanks, @cbanek and friends!)
Elon nearly brought the newly merged X and Confinity to its knees during a critical period in its development by 1) insisting on the "X" brand when Paypal was more popular with users, and 2) insisting on Windows over Linux despite the protests of his tech team.
Soon after he was ousted and Thiel was made CEO. Interesting to see he's still pushing Windows.
The argument (at the time) was that Windows C++ tools were better than anything on GNU/Linux of the time. Also, in most non-CS fields (aka "real" engineering fields, ahem) Windows, for better or worse, is still heavily entrenched. I have mixed feelings on this, as I am a Unix junkie, but it requires a lot of arcane knowledge to be effective, which many folks don’t have the wherewithal to acquire.
> The argument (at the time) was that Windows C++ tools were better than anything on GNU/Linux of the time.
That's still true today and it will probably always be true unless Microsoft ports Visual Studio one day. It's not a super great reason to choose it as an operating platform though. :\
There are many engineers who actually like Windows. Especially if they're not software engineers. A lot of industrial equipment runs some embedded Windows.
I would risk stating that all non-software engineers have to like Windows because there is no software for them running on Linux: all the CAD/CAM programs, MS Office, etc.
Thinking from first principles, you can reason about why Elon pushed for Windows. Elon is a guy who wants to get things done. On Windows you get linear output for the time you spend on it; on Linux the output is exponential, but relative to Windows a lot of upfront time is required to get started. Since Elon is a man of output, he didn't want to spend that much upfront time. He also doesn't want to let go of control, since he is a micromanager.
Okay, I will give it another try. Elon was doing a PhD and discontinued it when he founded his first company in the 90s. Since computers were not his main interest, he used Windows to get the work done, because Windows is easier to learn and start getting output from. From then on Elon was crazy busy with lots of other things and never had time to learn Linux. At the same time, Elon wants control over everything, which means that at any point in time he wants to understand what the software is doing and even be involved in making changes to it firsthand if required. With that background, Elon obviously went with Windows, because he didn't have to learn Linux and could use the Windows knowledge he'd already acquired.
> the rockets fly with a heavily customized Linux install
My jaw hit the floor the first time I heard this. Why Linux instead of an RTOS?? Apparently Tesla's autopilot also runs Linux, which seems like a huge accident waiting to happen (pun intended).
As a person who used to write navigation and control software for autonomous vehicles who occasionally gets downvoted for telling people that you really don't need a fancy RTOS for this stuff, you really don't need a fancy RTOS for this stuff. Linux is a very common platform for highly responsive robotic systems. I promise that their pid control isn't an Electron app.
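For the curious, here is roughly what that kind of control law looks like; a minimal sketch in C with made-up gains and a toy plant, not anyone's actual flight code:

```c
/* Minimal PID controller sketch. Gains, the 100 Hz period, and the
 * toy plant are placeholder assumptions for illustration only. */
#include <stdio.h>

typedef struct {
    double kp, ki, kd;   /* proportional, integral, derivative gains */
    double integral;     /* accumulated error */
    double prev_error;   /* error from the previous step */
} pid_ctrl;

/* One control step: returns the actuator command for this tick. */
double pid_step(pid_ctrl *c, double setpoint, double measured, double dt)
{
    double error = setpoint - measured;
    c->integral += error * dt;
    double derivative = (error - c->prev_error) / dt;
    c->prev_error = error;
    return c->kp * error + c->ki * c->integral + c->kd * derivative;
}

int main(void)
{
    pid_ctrl heading = { .kp = 1.2, .ki = 0.1, .kd = 0.05 };
    double dt = 0.01;                 /* 100 Hz control loop */
    double actual = 0.0, target = 1.0;
    for (int i = 0; i < 5; i++) {
        double u = pid_step(&heading, target, actual, dt);
        actual += u * dt;             /* toy plant model */
        printf("step %d: output=%.4f state=%.4f\n", i, u, actual);
    }
    return 0;
}
```

The point being: the math per step is trivial for any modern CPU; the hard part is tuning and integration, not scheduling headroom.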
As someone who worked on RTOS systems and now works on autonomy.
These things frighten me every day. What frightens me even more is the people who work on autonomy without a real grasp of determinism. It's unfortunate that the people who have the most high-tech backgrounds applicable to autonomy (PhDs in computer vision, AI, etc.) have never implemented safety-critical autonomous systems outside of a research project that tested some aspect of detection or control in a test environment and had to work only once to get a paper published.
For general robotics linux is great. But there is an enormous difference between a robot roaming around your house bumping off walls, and a vehicle carrying a whole family at 70mph.
Most of the Linux-based systems I have worked with have some form of redundancy, whether it be other chips running Linux or, ideally, ECUs running an RTOS that perform monitoring, gating, and/or some level of safety fallback control. The RTOS-based redundancies are often what provide ASIL-D. Trusting a single Linux processor is what everyone does to get funding, but when you go out and test on public roads with human lives at stake, or start selling a product, you'd better have some quantitative guarantees other than "It's been fine so far..." That kind of stuff makes me angry.
Having worked several years on critical embedded systems in aerospace, I would tend to agree with you.
But on the other hand, has any Tesla car ever had an accident because of this? At some point, "heavily tested and validated end to end in real-life conditions for years" and "formally proven on a simplified model using reasonable assumptions made by human engineers" become relatively close in terms of how much trust you can put in a system.
But somehow we tend to prefer the latter. I am not sure this paradigm is still relevant these days.
SpaceX's Linux-based engine controllers at least have a decent amount of redundancy. They're all triple-redundant, and each of the three components consists of two cpu cores running in lockstep that are validated against each other as well.
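The voting itself is conceptually simple; here is a toy 2-of-3 majority check in C (the real fault-handling logic isn't public and is surely far more involved):

```c
/* Toy 2-of-3 majority voter over redundant computer outputs.
 * A sketch of the general pattern only, not SpaceX's actual logic. */
#include <stdbool.h>
#include <stdio.h>

/* Returns true and stores the agreed value if at least two of the
 * three redundant outputs match; returns false on total disagreement. */
bool vote3(int a, int b, int c, int *out)
{
    if (a == b || a == c) { *out = a; return true; }
    if (b == c)           { *out = b; return true; }
    return false;  /* no majority: fall back to a safe mode */
}

int main(void)
{
    int cmd;
    if (vote3(42, 42, 7, &cmd))  /* one string disagrees, majority wins */
        printf("agreed command: %d\n", cmd);
    return 0;
}
```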
+1. I wrote some near flight software for an instrument using Ubuntu with a RT patch on modern hardware, in parallel with another team that took the traditional approach. Not to brag, but our “undeterministic” system ran far more reliably than the real time one that didn’t have the advantage of modern application libraries. Plus, even though we had that pesky operating system in the way, we ran on blazing fast modern architectures and were actually more deterministic than the slow as hell hardware we were benchmarking against.
I’ve seen the same pattern a few other times. Slow, hand-built, rad-hard systems CAN be more stable and demonstrably safer... but that is rarely the case, and the effort required to get such a system right is orders of magnitude greater than using standard “undeterministic” systems. That engineering effort can be better spent innovating and building fundamentally more advanced solutions.
Having done just enough robotics stuff to have used linux before (though not controlling anything even kind of large), I find this an odd notion. It's not that I think linux would go wrong for this sort of thing often, but when dealing with things like rockets or self-driving cars I'd think you want more assurance than "well we haven't had it be slow yet."
I haven't messed with an RTOS before but have done some fooling around with scheduling on microcontrollers and I can see why linux is tempting for ease and speed. But we're talking rockets and self-driving cars. These things are expensive as hell, can easily kill people, or both. It seems like the exact sort of place you'd want to take the time and effort to be sure.
Agree completely. Hard real time is possible with Linux; we use it for sub-millisecond control of traffic lights. The only issue we ever hit is proving that the code running is the stuff we expected to run.
> we use it for sub-millisecond control of traffic lights
Why would anyone need "sub millisecond control of Traffic Lights"?
Traffic lights are mission critical systems, of course, but even millisecond precision should be more than enough, and possibly even 0.5x-1 second precision...
For legal reasons. Regs say that stop cycles have to meet minimum times in each state; if you're off by even a tiny bit, it could potentially be challenged legally in court.
Bear in mind that the software only controls the cycle. The lights are electrically wired so that it is impossible for example to have "GREEN" illuminated in crossing directions. Or so I've read -- I don't work in traffic control software or hardware.
Yeah, I still think that's overkill, simply because the bulk of the computation is done on a remote server anyway. All you really need on the frontend is a TCP/IP stack to send telemetry and receive commands.
If the connection is lost, the exchange can just fall back to "naive" mode.
I guess using off-the-shelf mass market hardware combined with a software stack anyone can design, setup, and implement is way easier and cheaper than a customized solution.
Consider the overhead of maintaining two entirely separate software stacks, with different libraries and controllers, then? And the ongoing cost of discovering your "minimal" hardware can't accommodate a future improvement, compared to just using general compute at a marginal upfront cost and then having everything else be familiar?
Depends on the area, but yeah, many intersections use sensors (cameras, sometimes under-the-road pressure sensors, etc.) to make anywhere from subtle to extreme changes based on traffic patterns. The under-the-road pressure sensor has been around for decades.
When I was a kid there was one light that, when you drove over the pressure sensor, it wouldn't really do much. But if you backed up and drove over it again it must have registered an additional car coming through and the light would almost immediately go through its light cycle to change. It was really interesting to see!
Nowadays I think it's mostly cameras? We have a light near my home and the left signal will literally never trigger unless someone is in one of the left lanes.
It's not a pressure sensor, but an induction loop. Basically, there is a coil placed on the road that has a small AC current passed through it. When a car (metal) sits on top of the coil, the two "coils" couple, changing the overall inductance. A simple sensor can detect this change.
No, the current passes through the coil, and the coil has no physical contact with the vehicle.
Look up inductive coupling. The basic idea is that a changing (AC) current in a conductor generates a changing magnetic field (Ampere's law). This changing magnetic field then induces a voltage in the second conductor (Faraday's law). This is the principle behind how transformers work.
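Quantitatively (a rough sketch of the physics, not a spec): the loop is the L of an LC oscillator, and the detector watches its resonant frequency.

```latex
% Resonant frequency of the loop circuit:
f_0 = \frac{1}{2\pi\sqrt{LC}}
% Eddy currents in the car body act like a shorted secondary winding,
% lowering the effective inductance L. Since f_0 \propto L^{-1/2}:
\frac{\Delta f}{f_0} \approx -\frac{1}{2}\,\frac{\Delta L}{L}
% so a drop in L produces a measurable rise in frequency.
```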
As far as I understand, it's just electromagnetic waves. No current passing through to the car, but the coil can "register" a change in its magnetic field and can determine that it's a car and how fast it goes
Can confirm. Also interesting to note that motorcycles often have trouble triggering these sensors (a common trick is to stick a heavy duty magnet underneath).
Venturing a guess -- there's either some X% accuracy standard required by the government for who-knows-what reason, or it's for red light cameras and ticketing systems.
"Ironically, the biggest concern with red-light camera systems is that they are so precise. They measure a driver’s speed and exact location within a fraction of a second — but do not leave any wiggle room for the errors of traffic signals such as inconsistent yellow light times"[0]
If there's not an accuracy threshold for safety reasons, there's gotta be one when traffic ticketing revenue is on the line (also I guess determining fault at accidents, vehicular manslaughter cases, etc.)
Why aren't there efforts to create "autonomous only" traffic management scenarios, where people drive into a given, known area, and the area then takes control of managing the traffic and vehicles? Such that you relinquish control of the vehicle to that area's control system, with your destination stated, and then your vehicle is managed accordingly.
For example, a parking lot for a really large venue with an autonomous valet system.
You drive up and get out, and then the system takes over your car, drives off with it, and parks it, and you recall it when needed...
Or managing traffic in a very heavily trafficked bottleneck of a grid, such as the Bay Bridge merging egress from the SF financial district.
If you put in your destination and join the group, all the cars could then be managed to get onto the bridge more rapidly...
Autonomous doesn't need to drive me from SF to LA, but it would be great if an autonomous hive mind could get all the cars to up throughput in given situations, no?
That seems to match the right tool (AI driving) to the right job (well-defined, well-controlled situations).
I seem to recall that similar ideas go back to the early 1990s, at least, for highways: Drive your car to the entrance ramp, plug in your destination, and the autonomous system takes it from there.
But for many of these things, such as the Bay Bridge or a highway, it seems like there is a simpler solution: Put the cars on a train and take them across by rail. I suspect I'm not the first person to think of it so I wonder why it's never been done (i.e., what problem I'm overlooking).
I suspect it's never been tried because the cost necessary to get from where we are now to there outweighs the potential benefit compared to more conventional transit solutions, carpooling, etc. Once automated driving gets to a point where it's possible to implement "autopilot-only" lanes (and doing so gets past the sociopolitical hurdles), I suspect those will come into play too, though.
I've been picturing the rail problem for some time as well.
Not just for cars, but also for cargo... just have a constant gondola-like conveyor that detaches a platform from the line to slow it enough to allow cargo to get on, then re-zips it back into the line and speeds it along, and de-rails it once it hits its exit/location...
Ideally, though, in cities there would be no surface streets and all cars would have their own level below that of bikes and pedestrians.
What would SF look like if a superstructure was built above all streets and all pedestrian and bike traffic was moved up there? (sure, SF may be a poor example, so just select [city])
Look at Singapore's vast underground connecting malls between facilities. Those are pretty amazing.
> What would SF look like if a superstructure was built above all streets and all pedestrian and bike traffic was moved up there?
The street level would be dark and storefronts would become difficult to access. If the stores moved up to the 2nd floor (a massive transformation of real estate, probably greatly reducing available living space), what would go on the first level? Not many people would want to live in the dark.
Besides, the best integrated transport solution in the world already exists in places like Utrecht, Groningen and Assen thanks to reforms that started decades ago.
The dynamics of mine development show interesting parallels with tech, actually.
Lots of mines start from little companies searching for a possible ore body (the idea or market fit), then raising money to perform a closer survey (seed funding). If the closer geological work is promising they often obtain a lease (patents or other IP).
At this point it goes one of two ways. Either they raise enough money to start and operate the mine themselves (series A, B etc, leading to an IPO) or they sell the prospect to a major company.
Then the newly-minted millionaires, who know a lot about mining, invest in the next crop of junior miners.
So as with tech there are conceptual, exploratory, growth and liquidity phases, followed by a process of reinvestment.
I remember realising this when living in Perth and being frustrated that, with quite literally billions of dollars sloshing around the city looking to invest, you'd be hard-pressed to pitch anything smarter than a brochureware website to the local investment class.
There were other structural problems. Stock options are not A Thing for various legal reasons. Failure in starting a high-risk business is a bit of a black mark. There are VCs but so much of their money came from governments trying to jump-start a market that they were about as risk-taking as a loans officer at a bank (what government wants "10 MILLION WASTED ON PHONE APPS" as a headline?).
Meanwhile the super funds are collectively sitting on trillions of dollars[0] and investing an absurdly dumb fraction of it in the ASX. Putting just 0.5% of their holdings into VC would unlock tens of billions of dollars of potential investments.
For which, hey, VCs who lurk here and want to raise a fund: go talk to the Australian superannuation industry. It is a massive pool of underperforming cash languishing in the same dozen public companies and, because Australian law forces all Australians to set aside at least 9.5% of income for retirement, the industry will never stop having incoming funds. There will always be new money to raise[1] and it will probably be the 2nd largest pool of pension investments sometime in the next 10-15 years.
I will accept finder's fees and/or massively remunerative job offers as reward for this insight.
Well, TCP doesn't do collision avoidance; that's a link-layer thing. And on Ethernet, it is collision detection on the shared medium. Wireless does avoidance due to the hidden terminal problem.
Neither of these models is really analogous to cars on the road.
But applying collision detection and exponential back off in road traffic is a "fun" thought experiment.
A more apt model would be critical sections and semaphores from concurrent programming. Which is named after a collision avoidance scheme used to control trains. And we all know how difficult concurrent programming can be. I don't want traffic with deadlocks, starvation, busy waiting or live locks.
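For anyone who wants to play with the thought experiment, Ethernet's binary exponential backoff is only a few lines; sketched here in C:

```c
/* CSMA/CD-style binary exponential backoff, as classic Ethernet does it.
 * Shown purely for the "fun thought experiment" above, not a proposal
 * for actual road traffic. */
#include <stdio.h>
#include <stdlib.h>

/* After the nth consecutive collision, wait a random number of slot
 * times in [0, 2^n - 1], capping the exponent at 10 and aborting
 * after 16 attempts, per the Ethernet standard. */
int backoff_slots(int collisions)
{
    if (collisions > 16) return -1;      /* give up on the transmission */
    int k = collisions < 10 ? collisions : 10;
    return rand() % (1 << k);
}

int main(void)
{
    for (int n = 1; n <= 5; n++)
        printf("collision %d -> wait %d slots\n", n, backoff_slots(n));
    return 0;
}
```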
Well this would mean collaboration between autonomous car manufacturers to build a common protocol. And this does not fit with their business model of getting massive investment on the grounds of potentially being the first player on the market.
I don't think there is any possibility of large scale autonomous driving without a shared control infrastructure. Autonomous driving will only work as long as autonomous cars are a small minority.
As soon as they stop being in the minority, some shared control infrastructure is necessary.
Case in point: 4 cars arriving in a no-lights 4 way intersection simultaneously will cause a deadlock. A tie breaking scheme requiring some form of communication is necessary.
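A tie-break could be as simple as comparing broadcast IDs; a hypothetical sketch in C (no such protocol is actually standardized):

```c
/* One conceivable tie-break for four cars at a 4-way stop: each car
 * broadcasts an ID, the lowest ID proceeds first, everyone else
 * re-queues. Entirely hypothetical. */
#include <stdint.h>
#include <stdio.h>

/* Return the smallest broadcast ID among the n waiting cars. */
uint32_t pick_winner(const uint32_t ids[], int n)
{
    uint32_t best = ids[0];
    for (int i = 1; i < n; i++)
        if (ids[i] < best) best = ids[i];
    return best;
}

int main(void)
{
    uint32_t cars[4] = { 0xB33F, 0xCAFE, 0x0042, 0xF00D };
    printf("car %#x goes first\n", (unsigned)pick_winner(cars, 4));
    return 0;
}
```

Of course the hard part isn't picking a winner; it's agreeing on the participant set over a lossy radio link, which is exactly the consensus problem the parent is gesturing at.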
Not just between car manufacturers, but also between the cars and the area control system. Basically, all car manufacturers would have to agree on a common API that allows the control system to take over, with full access to sensors and drive controls. A manufacturer could not simply refuse to implement the API, because this would effectively make their cars unable to use certain parts of the road network.
Even if manufacturers would somehow manage to agree on an API, it could then be "abused" by competitors or accessory vendors to sell their own customized car assistants, which would instantly work with any car brand - without them having to negotiate with the manufacturers.
I fear we will sooner have a usable open IoT standard than manufacturers giving up that level of control.
Traffic signals have hardware interlocks so that can never happen. I don't know if they are using mechanical relays or what, but you can't turn both sides green in software.
That’s good to know. Indeed, the Pathfinder’s OS (VxWorks) had priority inheritance but it wasn’t enabled on a particular mutex and enabling it was the fix.
Priority inversion had been known about since the 70s. Priority inheritance seems to have first been proposed in 1990:
The Pathfinder engineers were apparently unaware of the priority inheritance option available in VxWorks until they had to debug the issue live from a few hundred million km away.
Automotive ADAS systems generally require ASIL-D certification, which is much easier with an RTOS than Linux. I don't have much experience with real-time embedded Linux, but my understanding is that it is very difficult or impossible to certify to ASIL-D. Can someone correct me?
An RTOS helps because the vendor will usually provide the RTOS already certified for ASIL-D applications. The rest of the software components will also need to reach ASIL-D, but having the RTOS at ASIL-D makes things a little bit easier.
Usually this boils down to ASIL-D RTOS systems being much smaller, i.e. much simpler to verify, leaving the developers of the systems above with much more work to verify their parts.
Also, in my experience it might be easier to reach the ASIL-D requirements using a smarter combination: a limiter on an RTOS, with more generic code on something like Linux for more of the system. That would probably also mean relying on more widely used and tested applications, gaining stability (though that's partly outside ASIL-D).
Functional safety and ISO 26262 are much misunderstood in automotive development and architecture.
Also, IMHO the certifications without the safety case are kind of useless. You still have to assess how you will find the problems in your use case, which might differ ever so slightly from what they certified. The automotive industry, though, loves to have someone else to blame, e.g. the supplier of the RTOS, compiler, etc. Using Linux makes the blame game hard.
Agreed. I work on both VxWorks and Linux in the defense industry for a very popular armored fighting vehicle, and despite popular belief, the Linux kernel with the RT patch works well enough that neither the cost of VxWorks nor the difficulty of finding developers to maintain it is exactly justifiable anymore. Without going into too much detail, there have been a good few studies internally showing that our current fire control unit doesn't need the hard real-time precision it once required on legacy hardware, and all of the Linux ports with the RT patch perform just as well. The biggest hurdle, of course, is not exactly the performance. It's the certification process.
Is that like, you really don't need a fancy RTOS for this stuff 99.999% of the time, although sometimes you do? Or truly, despite the life safety element, there is never any need for RTOS.
In every application I can think of off the top of my head, but mostly in the ones that apply to Tesla, you truly don't, except when the law or a contract says otherwise. I'm sure there are exceptions for things that don't apply to Tesla (or for that matter SpaceX).
Here's what's inside of every autonomous vehicle ever made: a message-passing subsystem, sensors, fusers, navigation, dynamic control, actuator device drivers, and thruster device drivers.
Sensors measure things and emit readings. Your most expensive, highest frequency general purpose sensors emit new readings at something too fast for a human but hella slow for a computer, like 100Hz-5KHz. Your common sensors, a video camera for instance, don't get even close to that. These sensors are often connected, even today because milspec companies hate modernity, via RS-232 serial cables. For those younger than 30, RS-232 is what non-Apple computers used for non-keyboard/mouse peripherals prior to the introduction of the first iMac in 1998 because USB didn't really take off until then.
Sensors send their readings via the message-passing subsystem to fusers.
Fusers take the readings from the sensors and, hur hur, "fuse" them together into a description of where the vehicle is and what the environment is like. This usually involves something like a Kalman filter. Fusing even your very fastest sensors, the 5KHz IMUs of the world, is just a small bit of math and basically takes no time at all.
Fusers send their fused states via the message-passing subsystem to navigation.
Navigation takes the fused sense of self and the world and decides which direction to head and how fast to go. The objective could be something like hitting route waypoints or it could be something like staying in a lane and not being rear-ended and avoiding obstacles. Car navigation probably doesn't act on new input more frequently than 100Hz, you certainly can't act on new input more frequently than 100Hz, and it takes basically no time at all.
Navigation sends its directives via the message-passing subsystem to dynamic control.
Dynamic control takes navigation's "which way" and "how fast" directives and turns them into more realistic short-term goals accounting for hysteresis and other physical limitations of the system like minimum turn radius. This is just a small bit of math and basically takes no time at all.
Dynamic control sends its directives via the message-passing subsystem to the actuator and thruster drivers.
Actuator drivers convert dynamic control's "go more left" message into trying to go more left.
Thruster drivers convert dynamic control's "go more fast" message into trying to go more fast.
Actuator and thruster drivers send readings (hopefully) from the actuators and thrusters, because those are also sensors, back to dynamic control and fusion.
Sensors feed into fusers, fusers feed into nav, nav feeds into dynamic control, dynamic control feeds into actuation and thrust. When you have new data, you do something new with it which is technically doing the same old thing with it and just producing new output.
Now there aren't that many sensors. There are way fewer fusers. There's only one navigation. There's probably only one dynamic control, though there could be a couple.
Anything else that I haven't already described, like Waymo's machine learning object classifying 4D mustache adding hotdog detectors, are just sensors and fusers sitting on their own computers feeding new lat/lng/heading/speed to navigation at a rate that is hella slow for a computer. And for sure Waymo's convolutional neural network middle-out jaywalking yoga mom detector takes a lot of processing, but it's running on its own computer, not competing for resources, and emitting its fused readings at some hella slow for a computer rate.
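If it helps make the shape concrete, here's that whole pipeline as a skeleton in C. The structs, rates, and math are placeholders, and a real system would run each stage as its own process over a message bus:

```c
/* Skeleton of the sensor -> fuser -> nav -> control pipeline described
 * above. Every struct field and formula here is a stand-in. */
#include <stdio.h>

typedef struct { double gyro[3], accel[3]; } ImuReading;     /* <=5 kHz */
typedef struct { double lat, lng, heading, speed; } FusedState;
typedef struct { double target_heading, target_speed; } NavDirective;
typedef struct { double steer, throttle; } ControlOutput;

FusedState fuse(const ImuReading *imu)       /* e.g. a Kalman update */
{
    FusedState s = { .heading = imu->gyro[2], .speed = imu->accel[0] };
    return s;  /* "a small bit of math, basically no time at all" */
}

NavDirective navigate(const FusedState *s)   /* <=100 Hz is plenty */
{
    NavDirective d = { .target_heading = s->heading, .target_speed = 20.0 };
    return d;
}

ControlOutput control(const NavDirective *d, const FusedState *s)
{
    ControlOutput c = { .steer = d->target_heading - s->heading,
                        .throttle = d->target_speed - s->speed };
    return c;  /* real code clamps for turn radius, hysteresis, etc. */
}

int main(void)
{
    ImuReading imu = { .gyro = {0, 0, 0.1}, .accel = {15.0, 0, 0} };
    FusedState s = fuse(&imu);
    NavDirective d = navigate(&s);
    ControlOutput out = control(&d, &s);
    printf("steer=%.2f throttle=%.2f\n", out.steer, out.throttle);
    return 0;
}
```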
Nice high level conceptual model of such a system. What you conveniently ignore is the complexity that is necessarily introduced by hard real time constraints, safety and all the reliable communication required.
This stuff really does get complex. A sensor controller will likely be on multiple cycles internally: one for oversampling the sensor hardware and one for transmitting the (filtered/corrected/calibrated) results. A "fuser", as you call it (never heard that term before), needs to make sure that it never acts on stale sensor information (sensor malfunction, accumulated communication issues). Transmission errors need to be detected. Random bit flips in values that are stored in volatile memory for long time spans need to be checked and acted upon.
Every independent controller in such a system requires some kind of watchdog that needs to be reset periodically. Too many watchdog resets in a row indicate a failure and the affected system must shut down in a defined way. You need ways to deal with any combinations of controllers going belly up and avoid taking unsafe actions. For many systems transitioning into a totally inert safe mode is sufficient, but not always.
All of the hardware must constantly run self-tests. That includes periodic CPU and memory tests (both volatile and non-volatile memory) and also all peripherals that are involved. If, for example, a DAC is used to send a signal, the resulting signal must be read back by different hardware to check that the generated voltage is indeed correct.
Manually threading together all these different kinds of cycles and asynchronous events without a RTOS scheduler is hard and becomes error prone. The result is likely less resilient than a preemptively multithreaded firmware.
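The watchdog pattern at least is easy to show; a minimal sketch against the standard Linux watchdog device (the device path and timing are typical defaults, nothing vehicle-specific):

```c
/* Sketch of the watchdog pattern described above, using the standard
 * Linux watchdog device. Opening the device arms the hardware timer;
 * if userspace ever stops writing to it, the hardware resets the CPU. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/watchdog", O_WRONLY);  /* arms the hw watchdog */
    if (fd < 0) { perror("open"); return 1; }

    for (;;) {
        /* ... run one iteration of the control/self-test cycle ... */
        if (write(fd, "\0", 1) != 1)           /* "pet" the watchdog */
            break;
        usleep(100 * 1000);  /* pet every 100 ms, well inside timeout */
    }
    return 0;
}
```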
I had a very mean line-by-line response, but then I deleted it because it wasn't in keeping with HN guidelines. I apologize for having written it even though you'll never read it. Instead I'll just say that literally nothing you've mentioned has anything to do with whether you choose to use Linux. Yes your serial lines will be noisy. Yes you have to write software. Yes you need specific domain knowledge to do it well. Nobody has ever said otherwise. None of that, none, has anything to do with the operating system.
An operating system, real time or otherwise, handles activating processes, IO, and interprocess messaging. That's it. You don't get magical serial line noise clearing pixies with it, and it doesn't make your actuators less drunk.
And...
> Manually threading together all these different kinds of cycles and asynchronous events without a RTOS scheduler is hard and becomes error prone.
You just said "getting input and sending output is way too hard for software on a Linux kernel." That's a crazy person statement. It turns out that Linux is, and has been for a loooong time, very good at doing operating system things like activating processes and interprocess messaging.
> The result is likely less resilient
Saying "likely" here suggests that you don't know what the result actually is. So what are you arguing? What was my statement?
Whatever assumption you're making that says "This. This right here is the reason why we definitely need an RTOS." Just don't make that assumption. That assumption is wrong.
We need to both take a step back here and look at what we are saying. I am working on safety critical embedded software running mostly on small(ish) microcontrollers. It's the kind of environment where you're running either bare metal or an RTOS at best. There is simply no way to run anything else in this kind of very constrained hardware we have.
So to me, "you don't need an RTOS" means that you're running on bare metal. And that would be hard to pull off for the reasons that I outlined above. And I think this is where we ended up misunderstanding each other.
I enjoy the kinds of restricted RTOS environments that we use because their simplicity means that I can get a total understanding of what is going on quite easily.
This does not mean that Linux is completely inappropriate for real-time tasks. I am sure that you could analyze and patch the kernel to match pretty high standards (others mentioned patches). Given the relative size and complexity of a Linux system, this is no simple task. But if you run it on appropriate hardware (not your run-of-the-mill x86), I don't see why you couldn't get reliable real-time responses.
But safety essentially means that the software will not fail more often than once every x hours where x ranges between 10^5 and 10^8, depending on the level of safety required. Proving that for a complex system is hard. For example, how do you show that the essentially indeterministic pattern of dynamic memory allocations happening in a Linux system will never lead to memory exhaustion by fragmentation?
I know of no version of the Linux kernel (or GCC, for that matter) that got a functional safety certification. Safety standards are transitioning away from allowing positive track records as sufficient proof that a piece of software meets safety standards. DO-178 now only allows certified software AFAIK, and I expect this to be carried over into IEC 61508 and ISO 26262. This means a regulated development process, pretty strict coding standards, complete test coverage, full documentation, and so on, also for all 3rd-party software. Not sure how this transition is going to play out in practice.
Do you know any ASIL-C or ASIL-D (or SIL-2/SIL-3) software that is running on Linux? I am curious whether anybody managed to get that certified. I know that Linux is running on some class II medical equipment, but then, standards for these devices are inexplicably lower in practice in my experience.
A common distinction is between hard real-time and soft real-time tasks. Hard realtime means that missing a deadline results in a complete failure of the system. A soft real-time deadline means that missing a deadline will not result in system failure. Many soft real-time problems are incorrectly and unnecessarily promoted to being described as hard real-time. This can lead to completely different system architectures and much longer development time.
A stock modern Linux kernel on almost any hardware platform will give millisecond level responses. Much of the old PREEMPT_RT patch set features from the old 2.6.x days for real-time response has been merged to the mainline kernel.
There are lots of problems in software where you are controlling something physical with a control loop from 1 to a couple hundred Hz. Many people assume a hard real-time deadlines are necessary for this sort of system, but through good system design practices it often is not necessary. For example, if something physical must be sampled with very low jitter, let some hardware do the sampling and latch it in a register and then let the software come in with a variance of hundreds of microseconds to get its work done. Once again, write the output to a latched register and let hardware worry about taking the shadow register with very low jitter.
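Concretely, the stock-Linux version of that design is a few syscalls; here is a minimal sketch of a 100Hz loop using SCHED_FIFO and an absolute-time sleep (the priority and period are illustrative):

```c
/* Minimal periodic real-time-ish loop on stock Linux: lock memory,
 * request a high FIFO priority, and step on an absolute clock so
 * jitter doesn't accumulate across iterations. */
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

#define PERIOD_NS (10 * 1000 * 1000)  /* 10 ms -> 100 Hz loop */

int main(void)
{
    mlockall(MCL_CURRENT | MCL_FUTURE);       /* no page faults mid-loop */

    struct sched_param sp = { .sched_priority = 80 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler (needs root/CAP_SYS_NICE)");

    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < 1000; i++) {
        /* ... read latched input register, compute, write output latch ... */
        next.tv_nsec += PERIOD_NS;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec += 1;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return 0;
}
```

Sleeping to an absolute deadline rather than a relative delay is the key trick: each iteration targets the same fixed grid of wakeup times, so late wakeups don't drift the schedule.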
Having worked on bare metal microcontrollers, to various RTOSes, to higher-performance embedded CPUs with Linux, I prefer Linux on higher-performance hardware. Obviously, this isn't always possible, especially in power constrained situations. But with Linux, when you suddenly need to have support for an arbitrary network protocol, a database, a filesystem, graphical output, etc. you can have something together in no time. It is often a monumental effort for such a task when bare metal or with a RTOS. It is often difficult to get the supporting software and libraries to build on an RTOS in the first place.
Among other things, determinism, since timing can be guaranteed (within a margin). An RTOS will run with consistent timing, which is guaranteed and facilitated by control over task priority and checks on whether timings are met. You could probably do that without a hard RTOS, but without any sort of formal guarantee. So it might work all fine and dandy, until it doesn't. "Doesn't" should not exist in a hard RTOS, by definition and by proof.
You don't need an RTOS to know that a system without extraneous background processes isn't doing extraneous background processing. Where precisely do you think your timings are going? You're not also running a Minecraft server on the nav computer. Using Linux doesn't mean you also need to enable stupid PC things like search indexing or seti@home. Navigation isn't resource constrained. The only resource-constrained component in the whole system is the computer vision module, and it's going to be on its own processors, and the rate at which it can hand off new output is on far, faaaarrrr longer timescales than your interprocess communication latency.
Yes, that's one of the RTOS things, but almost always, when you try to pin down what these timing limits actually are, they are kinda arbitrary: a gut feeling put into a number. There are exceptions, yes, but usually those exceptions apply only to a limited subpart of the system. So using an RTOS for everything is as stupid as not using an RTOS (or possibly even putting those parts into hardware, an ASIC or FPGA) for those small subsystems.
The timing achieved is also mostly up to the scheduler, and Linux can be run with a scheduler that gives it capabilities similar to most RTOS systems. It's a blurry grey area at best.
Yes, this is so true. When someone has a true jitter-sensitive task that needs to run on say microsecond or sub-microsecond accuracy, that is not for the realm of a high-performance CPU with caches even if it is on a RTOS. My first question is if that tight of a bound is truly necessary or just an over-specified requirement. If it is necessary, I say do that in hardware or on a simple microcontroller (e.g., Cortex-R) if that is truly your requirement.
Same reason SpaceX eschews radiation-hardened processors for redundant off-the-shelf cores: supplier competition. There aren't many RTOS engineers on the market; there are many Linux engineers. Once they got over the cost of hardening the kernel, SpaceX found itself at a scaling advantage versus RTOS-based competitors.
Not just that but holy shit some of those commercial RTOSes have major issues. I work in aerospace and we recently used one where the whole system would crash after ~230 days of uptime.
At least with Linux you're getting a system that's been used so much that all major issues like that are ironed out. Nothing beats a few million testers.
edit: I saw the same on a fleet of thousands of JVMs which hung on 100% CPU after 248 days very consistently. Closest thing to an explanation I ever got was perhaps it is storing uptime in hundredths of a second (why not ms???) in signed 32 bit integers, see: https://ma.ttias.be/248-days/
In the end we solved it by restarting with a cronjob between 2am and 4am after 247 days...
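The arithmetic checks out for the signed-32-bit-hundredths guess:

```c
/* Sanity check on the 248-day figure: a signed 32-bit counter of
 * hundredths of a second overflows after (2^31 - 1) ticks. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    double days = (double)INT32_MAX / 100.0 /* 100 ticks per second */
                  / 86400.0;                /* seconds per day */
    printf("overflow after %.2f days\n", days);  /* prints 248.55 */
    return 0;
}
```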
One thing to look at is the sum of AnonPages if you have THP enabled. That was enabled by default after CentOS 6.2. The usage itself isn't an issue, but there is a known memory leak in THP, and the fragmentation can get wedged after a couple hundred days based on the usage characteristics of the server.
What about people who make video games? There is a great video on YouTube called "Software powering Falcon 9 & Dragon - Simply Explained" which goes more into this topic.
I would imagine commoditization generally increases those variables which the market values. In this case I presume the variables include reliability, since that stuff is probably used in IoT systems where minimizing maintenance can probably be considered an asset.
Although I don't know the market. It might be skewed by some weird market dynamic.
Arduino boards aren't designed to operate in a rugged environment such as a car, which is a pretty hostile environment regarding vibration, electromagnetic interference, and thermal cycling.
Don't get that worked up my friend! It was meant as a compliment for the sly sleight, I do have a sense of humor... (oblique, but that depends on the crowd's judgement.)
The GP's joke was good dry humour that, given the lack of downvotes, was clearly appreciated. Reddit is where you go for low-hanging fruit. So no, I'm going to push back against turbonerd NO FUN ALLOWED types when the jokes are actually funny.
I'm sure anything embedded is almost nothing like the unmodified stock. The few papers I've read from embedded people suggest that they know how to, and will, strip everything down until things are the way they need.
"strip everything down" consists in removing packages, especially daemons, you don't need.
You don't (want to) make huge changes to the kernel and libraries codebase tho, even if the changes are meant to remove code you don't need, because testing a heavily modified OS gets prohibitively expensive.
Especially on "modern" embedded from the last 10 years were RAM and storage are not that limited.
Ubuntu and CentOS come pretty naked; what would you strip, and why?
I'm asking because I run a few instances with very heavy traffic and have no issues whatsoever. I just added nano and fail2ban, and it has run with no issues for about 2 years now.
Any kernel module that's not required (wifi, graphics, sound, USB, etc depending on application), any security system like selinux, any unnecessary libraries, helper utilities, etc etc.
Basically you rip out anything not strictly required for the task at hand.
Running on embedded hardware is quite different from running on server hardware, disk space and memory are measured in megabytes, not gigabytes...
I once ran an "embedded" Linux on a $10 Marvell SoC. It was a pretty vanilla kernel running a basic Debian install: 10MB out of 128MB RAM used most of the time.
Obviously you can go much slimmer, but a $10 board is surprisingly capable.
Not necessarily. I'm working with B&R's range of industrial controllers at the moment, which are Atom processors with a few hundred MB of RAM and CF cards up to 32GB, but still running a traditional RTOS with hard timing guarantees. They have built-in web servers and (basic) web browsers...!
Arch? Arch ships packages with debug symbols and docs included, and takes over a hundred MB for just a base install! Alpine is way smaller; base image under 10MB, packages broken apart so you only get binaries unless you ask for more, linked with musl to make it even smaller.
EDIT: This is meant to be a bit tongue-in-cheek, but I seriously do prefer Alpine over literally every other Linux distro I've yet seen for minimalism. Also geared towards embedded-type work.
Fair enough, and you're quite right about Alpine being geared more for embedded. For general use I find Alpine a bit of a pain due to lack of systemd (bring on the hate ;)), and of course lack of docs hurts usability a bit. With regard to debug symbols, I like what Redhat and Debian are doing with an embedded build id linking binaries to separate debug packages.
yeah come on, don't bring Alpine in the mix, we were talking about Ubuntu, not DamnSmallLinux. Alpine is great, no doubt, but it's in a class of its own.
Certification. Many commercial RTOSes are already certified to ASIL-C or ASIL-D, which requires extensive testing to verify that the system will work as designed and that every code path is covered by tests. Products like Automotive Grade Linux exist, but you will have to go through the verification and certification process yourself, which isn't quick or cheap. I'm not even sure it's possible to certify Linux to ASIL-D.
So yes, I'm sure that Linux can work. But it will be difficult to prove to auditors that it will always work.
I'm speculating that he does know. But with all the filters you added (right hardware, right patches, etc.), you would be better off adding the right patch: -Linux/+<real_realtimekernel>
Besides, if you follow Linux kernel development, you see that the effort is virtually never for real-time but for general purpose.
I've worked with RTLinux in automotive. It is only used in R&D and testing. It's much worse than INtime (a Windows RTOS) or L4, which are usually used. I've even patched g++ to work on RTLinux, eliminating all the dynamic, unreliable stuff.
People are using Linux when they need HW and driver support, e.g. gigabit Ethernet, FireWire and such. RTOS vendors charge a shitload of money for those drivers.
I trust the Linux drivers more than the RTLinux scheduler or libc. But well, recently networking went to hell, so even there they've started fucking up.
Pretty much anything that actually needs hard realtime is very likely running on a dedicated MCU/FPGA/ASIC. Linux is vastly easier to use for everything else, though. Navigation or communication doesn't need hard realtime, for example.
Now just hold on a minute. I’ll bet you five bucks that Linux is just the brains of the rocket, controlling a network of RTOS-running (or not) microcontrollers (or maybe ARM SBCs).
They could have used RTEMS (it was designed for missiles). On some level, if you design accordingly, you don't need a real-time operating system anymore; you can get by without it, and in my opinion it's easier to find people to design software for a non-RTOS.
The operating system provides a standardized IO API and schedules processes/threads. If you don't have fancy IO (LCD, hard drive, network, etc...) and multiprocessing, there is no need for an OS.
Sorry, I don't understand your point at all. Even fairly simple looking applications can end up doing all kinds of things at once / in an interleaved manner. Preemptive scheduling is simpler and therefore safer than trying to squeeze everything into some kind of gigantic global state machine.
Yeah I don't even do embedded work or anything real-time and even I know that in applications like this you should probably be using something like Green Hills Integrity RTOS.
I'm amazed that anyone would think that. I've worked on all sorts of embedded systems and worked with many more people who've worked on far more than I have, and RT Linux was very common and suitable for these sorts of things 10-15 years ago, never mind now.
That's a reasonable starting assertion, I guess, but all kinds of things have long histories of use without precluding the validity of using other things. Sometimes things get used because of inertia. Sometimes things get used because people who don't really know the difference say things like "we definitely must use X" (historically X might be a megacorp technology company like SAP or Oracle or IBM) despite Y being just as good or better.
For instance, did you know that Windows XP has a long history of being used in military embedded devices that store user data (on a writeable, obviously, FAT32 file system) where the way you turn them off is to just cut the power? I shit you not. I've seen state-of-the-art Navy-used sonars where the internal computer was running Windows, and you would transfer data off of the internal hard drive by FTP over Ethernet, and it had no on/off switch, just power or no power.
Given the many ASP.NET software packages that handle all the backoffice functions you've described, it sounds like they chose the right platform for that.
Why would you want to run backoffice on Linux and then re-create all those wheels by hand in-house? Relying on the expertise of other companies for basic backoffice systems is actually recommended practice until you become big enough to actually need custom software (generally, north of ten thousand employees).
Custom as in customized Oracle/SAP, or custom as in from-the-ground up custom?
The former is generally a given for a company of any significant size (employees or business activity). The latter is unheard of for most backoffice functions (other than specialized accounting and finance functions) since it's a waste of money and would place the company at significant legal and regulatory risks--it would require effectively becoming experts in accounting, HR, etc.
In my opinion, once you start to need heavy customization in an off-the-shelf ERP solution (i.e., because you do not fit the vertical the system is designed for), you are generally better off writing the whole thing in-house. And if you want to do it as a web application, then ASP.NET with WebForms is one of the more productive approaches to that problem.
As someone who now works in the "backoffice" I would say that building backoffice solutions from scratch is the height of hubris if your primary business doesn't involve those backoffice functions. There's a million small things on the compliance side that need to be addressed, and which the incumbents know about and already handle.
This applies even if you need heavy customization. In fact, it applies even more--since that level of customization usually means sufficient complexity of backoffice needs that only the pre-built service providers will have the sufficient depth and scope to cover you.
There seems to be a misconception that enterprise (specifically, places where software is not considered a product) means Java and .NET with frameworks. The reality is that this is just the tip of the iceberg: a bank, a hotel chain, etc. might use Java or .NET for web applications, yet a great deal runs on proprietary software. Much of that software either runs on machines that are still sold and marketed as mainframes, or runs on commodity hardware (which nowadays offers plenty of vertical scalability in terms of memory and total number of CPU cores) but comes from a mainframe lineage.
It seems as if there were two distinct cultures of engineers. The first worked on workstation-grade hardware networked over TCP/IP (whether running proprietary UNIX, open source UNIX, or Windows NT); Java emerged out of this.
The second culture was developers building mainframe applications, usually working on problems related to data processing, planning, and automation for businesses (not just enterprises but also many SMBs, government organizations, hospitals, etc.).
Java clearly emerged from the first culture, being built by a vendor of networked UNIX workstations. Some of Java's most memorable failures, whether exceedingly complex and brittle systems like RMI, JMS, and J2EE (I mean this literally: not modern Java EE like Jersey/CDI/etc. but EJB 2.0) or features that were in retrospect far ahead of their time (JINI or JXTA; compare with consul/etcd/zookeeper and the idea of a service mesh today), came as attempts to commoditise approaches commonly used by the first group into frameworks for solving the domain-specific problems of the second.
This reply really exemplifies clear thinking. It is likely you will fit the role of a solution architect, rising above the menial arguments that frequently occur between developers.
The majority of the professional world runs on Windows. There are many advantages:
(1) Stable platform that's backward compatible over long periods of time.
(2) Very good rapid application development tooling, e.g. Visual Studio which is probably still the best IDE overall.
(3) A huge trained developer base making it easy to recruit. Same goes for IT personnel.
(4) A huge pool of software, custom dev firms, etc.
(5) Certification for US DOD and other certification-heavy environments where Windows is used heavily, which may be important for an aerospace company.
(6) Integration with everything in the business and government world is already done.
(7) Windows has a lot of complex user, permission, and policy management stuff. Active Directory is The Standard for UAM in the corporate world.
The cost of Microsoft licensing is chicken feed compared to the cost of building and launching rockets.
Overall I don't think it's a bad decision. Not everything is an Internet startup or hacker project. Right tools for the job.
I don't agree with most of it (except that Visual Studio is a really good product). Windows is a consumer platform for playing games and having fun at home. It is also very expensive in the corporate/server space. I have seen backwards compatibility broken many times (drivers no longer supported), it barely supports architectures other than x86, and it is bloated.
For a long time the tools were either proprietary or GNU. Sadly, gcc worked well enough that its policy of not exposing anything that could be abused by proprietary software meant that any tooling that surfaced had to work around the free compiler. Remember the day emacs got full-featured refactoring support based on GCC? Stallman singlehandedly killed that for exposing too much. Good C++ tooling only started to turn up when Apple moved from gcc to llvm/clang and we actually got competition in the free-compiler space, plus a compiler-based framework to build tools on top of.
What's wrong with ASP.NET? It might not be the most sexy framework out there, but it's great for this sort of thing - there's a huge business application ecosystem surrounding it.
When Musk became CEO of PayPal, he tried to switch the servers from Unix to Windows. Of course the founders revolted and fired him for such a dumb and wasteful idea.
What, like SAP? And then spend tens of millions customizing it for their needs and dealing with an arcane, unintuitive interface? I've seen this over and over again.
I understand SAP's business model: common core business logic code that helps your legal compliance, then lightly customized at great expense.
What I don't understand is why the UI sucks so very, very much every single time. And why it's so very, very slow. It seems like it has to be on purpose. Can anyone with insight explain it to me?
This is discussed at length in the first chapter of Founders at Work. The chapter is written by Max Levchin (co-founder of PayPal) and discusses his bitter feud with Elon Musk over Musk's desire to convert systems over to Windows. Interestingly Musk is never mentioned by name.
This was also in Eric Jackson's history of PayPal. Jackson was a marketing guy, so had no technical dog in the fight, but he wrote this particular fight up.
Jackson was the guy who realised that PayPal and eBay had massive synergy and worked super hard to get PayPal in there. Eventually leading to eBay buying them out, and Musk and Thiel going from merely rich to actually billionaires.
Yeah, it was (the Ashlee Vance book): he chose Windows over Linux when developing PayPal back in the day because the tooling on Windows was far more advanced (the Visual Studio IDE), due to the parallel games industry driving development on that platform.
Some of the known hardships of working in the games industry seem to pervade present-day Tesla and SpaceX, particularly working overtime to get things done "in time". The upshot of SpaceX (and maybe other Musk ventures) is that devs are at least building things slightly more tangible than pure entertainment.
Musk has even admitted as much: he prefers game developers. Maybe he sees the parallel for working overtime and uses this "perk" to his advantage? From an article from 2015[0]:
> "We actually hire a lot of our best software engineers out of the gaming industry," said SpaceX CEO Elon Musk, when Fast Company posed this question during the May 29 Dragon V2 unveiling. "In gaming there's a lot of smart engineering talent doing really complex things. [Compared to] a lot of the algorithms involved in massive multiplayer online games…a docking sequence [between spacecraft] is actually relatively straightforward. So I'd encourage people in the gaming industry to think about creating the next generation of spacecraft and rockets."
I heard in some YouTube video that one of the first post-merger conflicts at PayPal was about Elon pushing for Windows NT (and presumably MSSQL) instead of Oracle (presumably on Solaris).
When NT4's kernel was released and Linux was on 2.2, there was a good reason to choose Windows for stability - or at least, there were trade-offs that were acceptable up to Windows 2000. After that, it became a battle for libraries. If you're using C# or other dotNet, then you're on Windows (or Mono?!?), otherwise your platform is Linux.
Both are reasonably capable of high service uptimes and solid performance. With Server Core and PowerShell, there's a lot more parity than my fellow Linux admins want to admit, but either is a viable choice for general IT services at this point.
Note - I'm excluding licensing entirely from this, as well as infrastructure maintenance and control surfaces. Nobody likes DSC, and there are several superior config management solutions for Linux that don't have meaningful analogs on Windows.
> Thankfully the rockets fly with a heavily customized Linux install.
That's good to know. When I saw the inside of the Dragon capsule, with its shiny touch controls, I was already imagining astronauts having to deal with installing Android updates on their control tablets mid-flight, or some other crazy stuff along those lines.
The .NET CLR and core C# runtime libraries are really nice to work with. But things become somewhat Microsoft-y (that is, nice-looking but amazingly half-assed in the most unexpected ways) when you start to do things like writing GUIs.
I've been learning WPF, and I want to shoot myself. It looks pretty, and is very flexible, but it's just so much damn typing. Plus the errors you get out of it are often pretty useless.
Maybe my brain just doesn't get it, but the documentation makes me crazy too. I just hate everything about it.
WPF is a huge change from "traditional" UI frameworks like Windows Forms or Swing. It requires some rethinking and to get the most out of it you should really do things the WPF way in many (not all) cases, even though other options appear to work (they're just more work in the end and less flexible).
The documentation is actually fairly good in my eyes iff you're only writing applications. As a library and custom control vendor (a position I find myself in at work) it can be atrocious and sourceof.net is hugely helpful (and I still find myself wanting to debug framework source code at times).
Still, if you have specific questions, you can throw me an e-mail if you want.
I came to WPF having not touched WinForms for nearly a decade and I fell in love immediately.
Once you accept that reactive data models are the 'correct' pattern things become so much simpler.
Throw in JSON.net and QuickType (amazing if you haven't seen it: feed it JSON or a JSON Schema and it outputs correct serialization code in about 25 languages, pretty much idiomatically; for C# it uses JSON.net, and for TypeScript it emits interfaces and, if you want, runtime validation).
The actual principle is simple: your classes have private fields which contain the thing, you expose get/set via public properties, and in the setter you raise a property-changed event (declared via an interface).
WPF binds to those objects and when you change the thing via a public property the change notification is fired and the UI updates.
The docs you want are MVVM and particularly INotifyPropertyChanged.
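WPF's actual mechanism is the C# INotifyPropertyChanged interface; stripped of the framework, though, it's just notify-on-set. A rough sketch of that bare idea (in C, for neutrality across the stacks in this thread; every name here is invented for illustration, this is not WPF's machinery):

    /* The notify-on-set pattern behind data binding, minus the
       framework: a setter that fires a registered callback whenever
       the backing field actually changes. */
    #include <stdio.h>

    typedef void (*change_cb)(const char *prop, int new_value);

    typedef struct {
        int speed;            /* private backing field */
        change_cb on_change;  /* whoever is "bound" to this model */
    } view_model;

    static void set_speed(view_model *vm, int value) {
        if (vm->speed == value) return;  /* no-op sets raise no events */
        vm->speed = value;
        if (vm->on_change) vm->on_change("speed", value);
    }

    /* In WPF the binding engine plays this role and redraws the UI. */
    static void ui_binding(const char *prop, int v) {
        printf("UI refresh: %s = %d\n", prop, v);
    }

    int main(void) {
        view_model vm = { 0, ui_binding };
        set_speed(&vm, 88);  /* prints: UI refresh: speed = 88 */
        return 0;
    }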
Thanks, I appreciate the offer for help. If I get stuck again, you might hear from me =D
I think a lot of the problem probably depends on the work you're doing. I'm an engineer, usually I just want a GUI that shows me the information I need, I don't care much for design aesthetics beyond making sure it's not hideously ugly. Winforms was good for this, but it definitely looks dated and needed to be replaced or massively overhauled.
I can absolutely see how somebody doing more attractive design work would like WPF. It just makes me grouchy. Somebody else mentioned mixing Blend into their workflow; maybe I'll take a look at that.
Expression Blend saves a lot of typing. Also, if you provide design-time data, it saves time, because WYSIWYG gives faster feedback than the change-build-test cycle.
When I work on XAML-based GUI, I open the project in both VS and Blend, and use them alternately.
Some languages are unfit for certain purposes. For example, Python is not meant for low-level programming, and C is not meant for writing secure, mission-critical applications.
Furthermore, some languages tend to attract less skilled programmers, in part due to having a lower barrier to entry and requiring less domain knowledge to crank out something functional quickly.
> C is not meant for writing secure, mission-critical applications
What OS kernels are actually used that are written in anything other than C? Plenty of them (INTEGRITY, VxWorks, QNX) written in C are used in secure, mission-critical applications.
Why is this interesting or surprising? Of course almost all challengers have hacky tech under the hood because they don't have the resources. Winning from that position is possible not because of generally better tech but by delivering something that the incumbents don't.
Remember the demo of the first iPhone (vs the huge expertise of Nokia). Or how Microsoft won the desktop starting from a single user, cooperative multitasking system (vs all the sophisticated Unix-based systems). Or Facebook running on despised MySQL and PHP.
These hacks are part of the strategy. It's risky but probably doing these 'properly' would increase the risk even more.
Cars weigh thousands of pounds and routinely drive upwards of 60 miles per hour.
The success of the first iPhone or of Facebook didn't depend on using them to navigate situations that are life-and-death not only for the users but for everyone around them.
There are places for 'move fast and break things'. But cars move fast already, and they can really break things.
I think this is a naive perspective. I've heard many similar stories from friends that work in the German automotive industry, and I think you would be surprised how many payment systems are tied together (e.g. (insecure) FTP servers to sync daily payments).
Every organization has these types of things internally. As an engineer, I don't like it, but it's a fact of life.
Banks can reverse transactions if they want to. Tesla's cavalier attitude towards manufacturing has yielded a 14% first-pass-through (FPT) rate on the Model 3 line (which is abysmal and, along with declining Model 3 demand, may bankrupt the company) vs. the industry standard of ~80% FPT.
We're at a point in the Tesla story similar to 2008, when Dr. Burry et al. were watching the housing market collapse around them and the banks wouldn't re-price the swaps. Tesla is already bankrupt; most people don't know this yet. But they will.
I don't think the rework rate is a big deal for the customer. Ultimately manufacturing is about increasing the yield rate so that costs are lowered, because rework is expensive. But if you can make money with lots of manual rework on every product, it's no big deal. Something to improve next quarter.
I have so many electronic devices, from cheap to expensive, that have some passive component manually bodged on somewhere. They work fine. It just means that paying someone to rework 1000 boards was cheaper than throwing the boards away and spinning up Revision B immediately. It's not a big deal. Waste is worse than a product that is imperfect immediately off the assembly line.
People are waiting 3 or 4 months for replacement parts for their vehicles, or Tesla has had possession of the vehicle (and the person is using a loaner) for a similar amount of time. That won't fly in the mass market.
>>> It just means that paying someone to rework 1000 boards was cheaper than throwing the boards away and spinning up Revision B immediately. It's not a big deal.
It's not about money. Correcting and re-fabricating a board takes time, on the order of months even for a simple one. That's too long.
That link is not a source for declining Model 3 demand. Source?
My info shows that
• In July, over 60,000 test drive requests in the US alone
• 5000 Model 3 new net orders in one week in mid-July
• Total deposits greater than total refunds in the twelve months ending April 2018
• Reports of cancellations outpacing orders proved to rest on misleading wording
Meaningless negative indicators include Goldman Sachs analyst David Tamberrino saying Model 3 social media activity had lessened, and frequent critic Latrilife saying Tesla's Burbank Airport lot is under 24/7 surveillance.
While I agree that 14% is horrible, that was a point-in-time number that is being compared to an average. One would hope that factories regularly run well above 80%. I'd also bet that some occasionally drop well below it for a day or so.
The details matter here and given the quality that I see in the field, I'm not convinced that this is such a horrible situation.
>I think you would be surprised how many payment systems are tied together (e.g. (insecure) FTP servers to sync daily payments).
It's not like those matter. The bank itself guarantees the integrity of your account and can reverse charges. And of course would be insured for such losses.
Tesla’s infotainment and IT infrastructure is unrelated to their safety. If this guy worked on motor control or braking system firmware then that would be scary, but he didn’t.
If the infotainment system caused the MCUs to reboot while someone was traveling "130mph on San Mateo Bridge", and that caused the brake system to segfault due to the unconventional way parts of the firmware are loaded, it could easily be a life-and-death situation. The examples in that thread go on, literally hundreds of them!
Well, you can reboot the system while driving (both console and dash); nothing special happens other than the AC turning off for a brief period of time. Brakes, wheel, and throttle all respond normally.
Source: Done this a few times to clear bad map data or occasional glitch.
Doesn't surprise me -- a couple of times my new 3 has had nothing on the display but it's still happy to let me put it in gear and drive away, and the display pops up within a second or two.
To be sure, it's a bit unsettling, and I wouldn't be thrilled about it deciding to not work for a day or two, but, in a way, it makes me MORE comfortable that the vehicle control systems function as expected.
There’s a distinction between safety critical equipment and essential equipment. If the former fails, it could kill you. If the latter fails, you can’t drive anymore but you won’t die if it happens on the road. Brakes are in the former category, while things like HVAC and instruments are in the latter.
Safety equipment must not fail, but essential equipment can fail as much as your customers will tolerate.
Seriously, yes, at least ventilation. Anybody who doesn't believe this: try turning off your AC completely and see how quickly your windscreen fogs up to the point you can't see out (depending on which climate you live in).
Even if this were true (and as another commenter says, it's not) I don't see how that fills in this gap. It would make sense if the first step were "MCU goes crazy," but not with a simple reboot.
There’s no concept of “reboot” with MCUs, since there’s usually no OS. Likewise there’s usually no concept of segfault, because segfault requires memory protection which is something most MCUs don’t use.
This must be an acronym mixup. We’re talking about the Media Control Unit, i.e. the giant screen in the center of the dashboard responsible for zero safety-critical systems.
Yes. The infotainment stuff, like the screen, interior lights, and speakers, is connected via a low-profile third bus system, certainly not the main CAN bus or PROFIBUS or FireWire. Some, like in BMWs, are even connected via WiFi.
These are the systems that usually run Linux or Windows or Android, on top of the important stuff.
Ignoring the infotainment system for this argument (as they firewall it off from CANBUS and other life critical systems [1]), I argue that their IT infrastructure is safety related, as it governs Tesla's velocity in getting patches and security fixes out to vehicles in a timely manner.
Can you imagine a zero day being found in Windows with Windows Update being down?
There's no guarantee that any given car is connected and receives updates, so the safety-critical systems need to be good enough when the car ships. They might mess up, but then they'd at least be able to patch cars faster, while other manufacturers would have to do a recall.
> Tesla’s infotainment and IT infrastructure is unrelated to their safety.
Only if it's deliberately isolated in the vehicle. It should be. Aaaand, it isn't: the firmware upgrades to all other car computing elements go through it.
Runtime isolation is distinct from compile/build-time isolation. You're citing the latter, but it's the former that matters. Tesla gets this right, e.g. an interrupt in the MCU does not have any effect on braking, drive-by-wire, or ADAS systems while a car is in operation.
If that's the standard, then almost everything becomes safety critical. Drivers can easily get distracted by malfunctioning smartphones or apps. (Or, for that matter, properly functioning smartphones or apps.) Yet the prior discussion was based on the idea that things like iPhones and Facebook aren't safety critical the way this is.
>If that's the standard, then almost everything becomes safety critical.
Almost everything in a car's front panel and dashboard can be. I've read somewhere that people have been killed even by something as quaint as the wrong placement of a car ashtray.
One time I pulled up at the red light behind a Tesla while I was on a bicycle. Straight through the rear window I could see the driver and front passenger being distracted by the massive flat touch screen. The traffic light turned green, and had I been beside the Tesla, I would have beaten it across the intersection despite all that electric motor tech in the car.
Anyway, this little illustrated anecdote of mine aside, driver distraction is a genuine issue, even for drivers at red lights. The number of times I've seen this type of behaviour carry over into moving off while remaining distracted (whether that's heads-down visually or just mentally) is too boringly frequent to detail. Sometimes the distracted drivers even creep forward unconsciously while traffic is flowing across them. Emergency vehicles can't get through, drivers end up splitting their attention, pressure mounts once proper movement starts again, and all the while they don't realise they don't have full attention in a changing environment.
I've missed the light turning green while just looking out the window (our car has no screen). As long as people are only distracted while waiting at a light I wouldn't worry too much.
Except they're not only distracted while looking at their screen when stopped. The distraction continues after they start rolling again - one thing directly leads to another here.
It is my understanding that two major reasons for various ugly UI and unintuitive UX in automotive infotainment systems are patents and safety certifications.
The iPhone has to reliably connect for emergency calls and provide accurate location. That can be life or death. In fact, "Apple isn't ready to engineer phones at a life-or-death standard" was a fairly common critique of the iPhone in the early days. In response, Apple ran a little PR campaign around their radio engineering efforts--they had a web page that showed all the cool-looking rooms for testing radios, which they invited a couple reporters to tour, etc.
> The iPhone has to reliably connect for emergency calls and provide accurate location.
An iPhone handling an emergency call is an exceptional use case.
A car driving is not an exceptional use case for a car.
A life-and-death scenario occurs every few minutes, or continuously on something like a mountain road, if something like the brakes, throttle limiting, or steering were to fail.
But I would, naively, assume that the operation of these critical components is in no way tied to the timing requirements of some process in Linux.
That PR tour was in response to "AntennaGate" when Apple failed to engineer a phone to a life-or-death standard (well, if you were "holding it wrong").
Are there any interesting papers on formal verification of some of the most modern machine learning algorithms?
Clearly when we "verify" Waymo or Tesla auto-pilot, we're going to want to use that stuff, right? Surely they won't just provide insurers with some data about the billions of miles they've driven without accidents and how humans can only drive like a million miles without an accident and try to get the insurers to give them policies...
Just like when we hand out licenses, we always check to make sure the 16 year old took some formal driving classes from professional driving instructors... I wish things were better but we don't care about this stuff as a society until much later usually. When did the car first appear? When did the first seatbelt law appear?
Read the Barr report on Toyota’s unintended acceleration incident. They didn’t have the right failsafes and watchdogs, and a single bit was responsible for a critical feature, meaning memory corruption or a cosmic ray flipping that bit could cause disaster. They didn’t follow anywhere near best practices in their firmware development. It’s not just the young upstarts that mess this up by being “reckless”.
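For flavor, the textbook mitigation for that single-bit problem is redundant storage: keep a critical value and its bitwise complement, and treat any mismatch as corruption. A sketch of the pattern (mine, not Toyota's code):

    /* Redundant storage against bit flips: a critical flag is held
       twice (value and complement); any disagreement means memory
       corruption, so the caller drops into a failsafe. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint32_t value;
        uint32_t check;  /* always ~value while memory is intact */
    } guarded_u32;

    static void guarded_set(guarded_u32 *g, uint32_t v) {
        g->value = v;
        g->check = ~v;
    }

    /* Returns 0 on success; nonzero means corruption was caught. */
    static int guarded_get(const guarded_u32 *g, uint32_t *out) {
        if (g->value != (uint32_t)~g->check)
            return -1;  /* a flipped bit in either copy is detected */
        *out = g->value;
        return 0;
    }

    int main(void) {
        guarded_u32 throttle_enabled;
        uint32_t v;
        guarded_set(&throttle_enabled, 1);
        if (guarded_get(&throttle_enabled, &v) != 0)
            puts("corruption detected: close throttle, fail safe");
        else
            printf("throttle_enabled = %u\n", v);
        return 0;
    }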
There's still the problem with exploding phones and batteries so even if it's just a phone it can go wrong in dangerous ways. (If it happens on a plane for example)
a) interesting, because it shows how other companies solve (or fail to solve) specific common problems.
b) surprising, because I thought they'd be much better at it.
Car infotainment is generally not categorized as safety-critical and can be quite similar to regular software development. Having remote access changes things: I always thought their security must be top notch if they had the confidence to launch that. Ssh-ing and deleting files sounds like the opposite of that.
Btw, saying that doing things properly increases risk must be the biggest load of nonsense I have ever seen on HN.
Especially considering that the engineer was partially blaming Bosch as a supplier that made these processes necessary.
Bosch supplies VW (including Audi, Seat, Skoda, ...) and various other large incumbents, as one of the biggest automotive suppliers. Why do people think the incumbents are considerably better off? Their practices are often equally ridiculous and dangerous.
As a rat in that cage you do whatever you need to make things work.
It sounds more like they didn't follow the specific way Bosch requires you to use their hardware. I highly doubt VW or the others would use Bosch if hacks were required to get their parts to work.
In the automotive industry, updating the software of a node is usually only allowed by trained technicians. Allowing updates to be done without bringing the car in to a shop is the big difference.
Software updating is complex, hard, and error-prone, thanks to quirks like the Bosch stuff here. Many of the big players see it as almost impossible to do what Tesla does and apply updates at the user's home.
Remember the Toyota uncontrolled acceleration bug and the report that came out of the trial. It seems like software is a bit of an afterthought for some incumbents too.
Not sure if you're being sarcastic or not, but the Toyota issues were not caused by any software bug. Most cases were caused by people hitting the wrong pedal. The others were caused by floor mats that were a bit too long, causing the gas pedal to stick.
Regardless of whether the acceleration was caused by the software or not, the testimony from the software experts called in to review Toyota's source code for the case was eye-popping:
> Skid marks notwithstanding, two of the plaintiffs’ software experts, Phillip Koopman, and Michael Barr, provided fascinating insights into the myriad problems with Toyota’s software development process and its source code – possible bit flips, task deaths that would disable the failsafes, memory corruption, single-point failures, inadequate protections against stack overflow and buffer overflow, single-fault containment regions, thousands of global variables. The list of deficiencies in process and product was lengthy.
>There are a large number of functions that are overly complex. By the standard industry metrics some of them are untestable, meaning that it is so complicated a recipe that there is no way to develop a reliable test suite or test methodology to test all the possible things that can happen in it. Some of them are even so complex that they are what is called unmaintainable, which means that if you go in to fix a bug or to make a change, you're likely to create a new bug in the process. Just because your car has the latest version of the firmware -- that is what we call embedded software -- doesn't mean it is safer necessarily than the older one….And that conclusion is that the failsafes are inadequate. The failsafes that they have contain defects or gaps. But on the whole, the safety architecture is a house of cards. It is possible for a large percentage of the failsafes to be disabled at the same time that the throttle control is lost.
People quote this repeatedly here, but I'm not sure what it's intended to demonstrate.
Most code is buggy, and the more closely you look, the buggier it is. Most of that same code operates without noticeable error.
That code analysis turns up "inadequate protections against stack overflow and buffer overflow" does not actually suggest that there was any stack overflow or buffer overflow, and people quote this as if it does.
Meanwhile, it is all-but-certain that the overall findings were correct: people hit the wrong pedal regularly, and if enough press attention is given, all of those wrong-pedal-pushers find each other and try to blame the manufacturer instead.
You started with "Regardless of whether the acceleration were caused by the software or not," but I think that's the important thing, and that software is buggy should surprise nobody.
> Meanwhile, it is all-but-certain that the overall findings were correct: people hit the wrong pedal regularly, and if enough press attention is given, all of those wrong-pedal-pushers find each other and try to blame the manufacturer instead.
At risk of beating a dead horse, I do find it interesting how despite the myriad of "driver hit the wrong pedal" news stories out there, only the Prius one starts from the "it had a mind of its own" angle.
I think what most people don't realise is that it's so incredibly easy to be fallible with "automatic" familiar everyday activities (i.e., not maths or remembering facts). I recall in driving school approaching a quiet intersection where I once lost proprioception of my foot and had to call the instructor to use her dual brake because I didn't know where my foot was or could no longer reach the brake pedal (I guess I'd have attempted the hand brake if I were alone).
From this single experience, along with running other dark hypotheticals through my head, I am so cautious about many things that it bewilders me that activities such as "carpool karaoke" are not considered atypical.
But otherwise, yep, same thing with software, only the stakes are often lower thanks to the sheer number of non-mission-critical projects out there. It's only the mission-critical bugs that get the most attention and surprise.
No, I was just using a fancy word I've always liked because it seemed to describe things as best I could.
At one point in my life, I was simply an inexperienced learner driver. It isn't and wasn't a medical condition (for me anyway). I literally couldn't find where the brake pedal was because I couldn't really position my foot in the right place as I had clearly lost track of where it was in the footwell area. I also didn't want to take my eyes off the road and knew I was going slowly enough that nothing bad would happen (because the professional driving instructor had a dual brake, and I had a working handbrake too which I never used - we'll call that risk compensation).
Now all this said, it's incredible how many people are completely unaware of their own inabilities when it comes to mission-critical tasks (the Dunning-Kruger effect). This isn't even always limited to "incompetent" people: experienced professionals also make "simple" blunders. See also Air France 447, or various deceased pro/amateur race car drivers.
Speaking of failures though - another time (also while I was learning), I was reversing out of the car shed with my dad supervising. Before leaving, I complained to my mother (who was just outside the car) that my brake was difficult to press. She replied "just press harder", so I shrugged, obliged and went on with the "supervised" driving practice. Her confidence in me had me convinced that everything was fine so off I went.
Damn, the brake was so difficult to press - I needed both feet as well as some bodyweight on it to get the car to stop at each intersection - it hardly budged each time and felt like a dodgem car brake. I figured at the time that if I just kept things slow and got the handbrake ready, things would be okay? Anyway nothing bad happened; it was only a short session before dinner on quiet streets.
The next morning, my mother went to drive the car out of the shed and was alarmed at how broken the brake was. What was I to know - I was pretty new to driving and thought it was just the car being old and figured it was good enough at the time for my two feet. It was probably bad enough to require towing but I think my dad ended up driving it very slowly/carefully to the mechanic.
That was a bit of a long story, but I'm detailing this to highlight how "stupid" cascading failures are "alarmingly" common - it's not unheard of to exist in both the human side, or the electronic/mechanical side. It doesn't really faze me whether the Prius "acceleration bug" was human or machine induced. Both are often as bad as each other. The only thing we should depend on are multiple layers of redundancy and good systems design.
Edit: When I first started driving, I mulled over whether to go barefoot or not (either is legal here so long as the driver has control). These days, I fortunately can handle either confidently. I'm also much better at sport now (and therefore coordination) than when I was a teenager.
Koopman has a talk here on his blog that has more details about how ETCS (Electronic Throttle Control System) is safety critical. "If a driver pumps brakes, loses vacuum power assist...WOT requires an average of 175 lbs of force on brake pedal"
And what was particularly negligent about Toyota's software practices that "more likely than not" caused issues with ETCS.
I actually experienced this with a rental car in South Africa. I drove for 10+ seconds with the engine revving at full blast with the car in neutral and no mat impacting the pedal. Both feet completely removed from any pedal, using the handbrake to decelerate and pull onto the verge. The floor mat thing is total deflection by Toyota. The only thing that reset the accelerator was turning the ignition off.
In this case, it could possibly have been limp-home mode. I believe this engages if the ECU thinks the throttle pedal is faulty. It disables the throttle and applies a constant throttle setting to allow you to continue driving slowly; it sounds alarming in neutral, but once you put the car in gear, the revs drop as the engine is required to produce torque.
Happened to my 2001 Clio several years back, although in my case it was when I started the car, not during driving. My hypothesis was that it triggered due to my bad habit of turning the car off in traffic queues then restarting it with the pedal held down.
The security researchers who were expert witnesses in the Toyota case and that analyzed Toyota's source code testified in court that they managed to reproduce the unintended acceleration with a real car on a dynamometer.
The linked trial testimony says otherwise. The engineer states he found bugs and that a failure might theoretically be possible, but he was unable to reproduce it. Further, he evaluated the software against his own coding standard, which was a variant of another.
The engineer was questioned:
" Q. Now, you have not reproduced in vehicle testing your
25 theory that there's a software bug that opens the
THIS TRANSCRIPT IS NOT PROOFREAD
30
1 throttle and then the task dies, have you?
2 A. No."
That's from page 245. Further reproduction questions were likewise all answered with a no.
To get a task-death failure to occur at all (what the engineer wanted to happen), he had to modify the source code.
Further, in that limited testing, even with the modifications, the failsafe triggered 100% of the time.
I once had a project with a major lobbying organisation with links throughout the automotive industry. The one that published regular reports on real vs declared emissions discrepancies long before the VW scandal.
They were unanimous when asked about the unintended acceleration bug. It was a case of mass hysteria.
Is it possible that the timing on your contact with them made a difference? It seems that NASA/NHTSA concluded in 2011 that it was a mechanical defect, but research concluding in 2013 showed that it was totally possible to have been caused by the software.
At least according to the summary based on sources used for Wikipedia:
That’s what it takes to be a challenger in that space. For example, most biotech Series As are $100M, because that’s just what it takes to compete (buy/lease equipment, etc.).
When Windows came out, there were desktop unix(ish) systems, e.g. from Sun or Apollo. Of course they were an order of magnitude more expensive than a PC, so it's hard to argue that they really competed. At that time, if you needed a unix workstation, a Windows PC was not a viable substitute.
Theranos never had a real product or sales and was a fraud from the beginning of institutional investment. Comparing them to Tesla is pointless and has been done to death.
My point certainly wasn't to compare Tesla to Theranos. I didn't mention the two in one sentence.
My point is that for every good example of a success story (about startups in this case), there's a good story of a failure. And the failures are much more than the successes. If you want to discount the failures because they don't fit your narrative, then you should equally oppose the successes when they don't fit your narrative.
TL;DR my post is a reply to a post and should be seen in that context. You've taken my post out of context; please don't do that.
Moreover, I recently read the book Bad Blood and found it interesting to get an inside look at a startup presenting itself as better than it actually is. I don't believe that part of the Theranos debacle is so uncommon. The severity and the unique market, though, are. And that's actually underlined by the Twitter thread (the pictures). Another similarity is the mass quitting and burnout of quality personnel, the fear of being fired for standing up, and low morale. Those are, IMO, interesting similarities.
>The iPhone could play a section of a song or a video, but it couldn’t play an entire clip reliably without crashing. It worked fine if you sent an e-mail and then surfed the Web. If you did those things in reverse, however, it might not. Hours of trial and error had helped the iPhone team develop what engineers called “the golden path,” a specific set of tasks, performed in a specific way and order, that made the phone look as if it worked.
>They had AT&T, the iPhone’s wireless carrier, bring in a portable cell tower, so they knew reception would be strong. Then, with Jobs’s approval, they preprogrammed the phone’s display to always show five bars of signal strength regardless of its true strength.
>None of these kludges fixed the iPhone’s biggest problem: it often ran out of memory and had to be restarted if made to do more than a handful of tasks at a time. Jobs had a number of demo units onstage with him to manage this problem. If memory ran low on one, he would switch to another while the first was restarted. But given how many demos Jobs planned, Grignon worried that there were far too many potential points of failure.
That's actually pretty interesting to me. I specifically remember seeing those multiple iPhones in the demo and wondering what they were for. Jobs kept putting one down and picking up another one to do a different part of the demo. It never occurred to me that the iPhone wasn't fully cooked and couldn't handle the demo without problems.
Steve was a full on sales person and was really, really good at presentations.
At DEF CON 23 there were two car-hacking talks that made an impression on me: one about hacking a Tesla [1] and one about hacking a Jeep [2]. FWIW, I came away thinking that the software architecture of the Tesla was light-years ahead of what Jeep (and presumably many other legacy manufacturers) was shipping:
I didn't watch this talk, but I had a blast fuzzing CAN messages to various ECUs when I was in that business. I would be very easily convinced that RCE and memory overwriting (aka "recalibration") were trivially achievable around 2006 MYs
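On Linux the barrier to entry for that kind of poking is low these days thanks to SocketCAN. A minimal fuzzing sketch (my own, not from the talk; only ever point this at a virtual vcan interface, never a real vehicle bus):

    /* Blast random frames at a CAN interface via Linux SocketCAN.
       Assumes "vcan0" exists (e.g. ip link add vcan0 type vcan). */
    #include <linux/can.h>
    #include <linux/can/raw.h>
    #include <net/if.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int s = socket(PF_CAN, SOCK_RAW, CAN_RAW);
        if (s < 0) { perror("socket"); return 1; }

        struct ifreq ifr;
        strcpy(ifr.ifr_name, "vcan0");
        if (ioctl(s, SIOCGIFINDEX, &ifr) < 0) { perror("ioctl"); return 1; }

        struct sockaddr_can addr = {0};
        addr.can_family  = AF_CAN;
        addr.can_ifindex = ifr.ifr_ifindex;
        if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("bind"); return 1;
        }

        for (int i = 0; i < 1000; i++) {
            struct can_frame frame = {0};
            frame.can_id  = rand() & 0x7FF;     /* random 11-bit ID */
            frame.can_dlc = 8;
            for (int j = 0; j < 8; j++)
                frame.data[j] = rand() & 0xFF;  /* random payload */
            write(s, &frame, sizeof(frame));
        }
        close(s);
        return 0;
    }

Real fuzzing is smarter than this (replay captured traffic with mutations, watch for bus-off states or diagnostic responses), but even the dumb version finds ECUs that fall over.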
Re QA: To quote Dodge (via Deming), "You can not inspect quality into a product."
This poster seems to think well of the QA team they had. However,
It doesn't matter how great your QA team is, if the quality isn't in the system the QA team can't put it there. All they can do is tell you that you're making poor quality products, and attempt to inform management (as QA rarely has real authority) who should then act on that and work to improve the system. If quality isn't part of the corporate ethos, then they (QA) will make little difference in the end.
QA done well is more than testing. It's managing risk and ensuring the process matches your risk tolerance. Car entertainment systems and car braking systems can be handled differently because the consequences of failure are different.
I agree. QA has to inform management, who ought to work with the teams producing (software, widget, service) to address the causal factors of the discovered defects and deficiencies.
By inform, I don't mean "rat out". I mean, management has to have a learning objective. To understand the system that they're managing (because no one understands it fully, their mental model is different than what's actually happening). QA, along with other sources, inform the model of management who can then work with teams to improve the overall system.
This is OK if you can guarantee 100% separation of concerns, but the news reports where hackers were able to take over control systems in the car through bugs in the entertainment system should be a huge note of caution.
There is a difference between an entertainment system and an infotainment system.
Infotainment systems are increasingly used as the primary UI for almost anything in the car that is not controlled by the steering wheel and pedals, and thus have to be able to communicate with almost everything in the car. Tesla is a pretty extreme example of this, but it works this way for most manufacturers.
If the only interaction between the automated systems and the braking system is to apply more braking power, and the car's brakes can significantly overpower the engine, then nothing the infotainment system can do would prevent the car from coming to a complete stop.
Similarly, if the power steering was limited to applying say 5lb of force to the steering wheel (as felt by someone holding the outer edge) a driver could overpower any steering adjustments with minimal effort.
With just those two choices the risks associated with hacking the car dramatically decrease. Yes, the car could prevent someone moving, but that’s also an inherent risk from engine failure.
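That authority-limiting idea is simple enough to show in code. A toy sketch (the 5 lb figure is the hypothetical from the comment above, not any real spec, and all names are invented):

    /* Authority limit: software may request any assist force, but the
       command sent to the actuator is clamped so a human at the wheel
       can always overpower it. Numbers and names are illustrative. */
    #include <stdio.h>

    #define MAX_ASSIST_LBF 5.0  /* max force software may add at the rim */

    static double clamp_assist(double requested_lbf) {
        if (requested_lbf >  MAX_ASSIST_LBF) return  MAX_ASSIST_LBF;
        if (requested_lbf < -MAX_ASSIST_LBF) return -MAX_ASSIST_LBF;
        return requested_lbf;
    }

    int main(void) {
        /* a compromised controller asks for 40 lb of steering force */
        printf("actuator command: %.1f lbf\n", clamp_assist(40.0));
        return 0;  /* prints 5.0: the request was clamped */
    }

The point is that the clamp has to live below everything hackable; enforce it in dumb hardware or a tiny verified controller, and the blast radius of an infotainment compromise shrinks dramatically.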
Check out the 20/20 report (or maybe it was 60 Minutes) where researchers who hacked into a car brought it to a complete stop on a busy highway (quite idiotically, IMO). Yes, other failures can cause this, but it can still be extremely dangerous.
Referring to an event where nothing bad actually happened does not support a "very dangerous" argument. If you simply mean it's possible, and that it's then possible something bad would happen, then I agree.
I am simply saying it’s vastly lower risk than a car that’s impossible to steer or slow down without causing massive mechanical failure.
No, the large touchscreen should be the single point of vehicle config for most functions. I'm driving a car right now with totally different screens, controls, and designs for audio vs. car settings, and it's nuts.
Isn't the point of inspection that you can perform a REJECT, based on some rejection-rate parameter (which is chosen by business/economic reality)?
If your QA is only doing inspections, they're only doing a fraction of the job. Rejecting the work and performing rework or starting over does not address the causal factors in the system of the rejection in the first place.
You will always have some product that needs to be reworked or rejected. The goal is to ensure quality at every stage in order to reduce the amount of rework. You don't want to go the way of the old US car companies, which suffered major setbacks in the market because they were spending gross amounts of time reworking defective cars before they could ship them. I don't have a copy in front of me, but one of Womack's books had some numbers: a double-digit percentage of a car's production time was spent reworking it after it had been built, just to make it suitable for sale.
Now, a premise of Agile (and Lean) is to improve the feedback loop. This is where testing and other inspections come into play. By doing them more often (run unit tests on check in, reject if any formerly passing tests start failing, for instance) you can address some quality concerns earlier.
But you still have to address the cause of the rejections. If Stage 10 of production consistently requires rework, it's great that you're catching, addressing, and doing the rework right then. At least it's better than after Stage 20. But management has to work with QA to not just inspect and accept/reject, they have to address the systemic causes of the failure.
I have a Model S, and Tesla pushed buggy firmware to it last week. It causes the entire instrument cluster, including the speedometer, to disappear periodically while driving. Tesla knows about the bug and it’s apparently a “high priority”. It’s not entirely clear to me that they are capable of rolling back the update.
I join the III family this coming week. I am really interested in how well the updates are managed, as the idea of cratering my car isn't something anyone is keen on.
Yeah, you can choose to postpone an update. The car may occasionally prompt you again, but I don't think it will force you to install it if you really don't want to. I've never postponed for more than a day or two though.
I have a Model 3 and so far haven't had any problems with updates. Installed the most recent one (2018.32.2) last night.
The most disturbing part of this, for me, is this line in the very first tweet:
"... caused almost the entire fleet to reboot loop ..."
I am not interested in being part of someone's fleet. The fact that they use this language at all to describe end-users who have purchased an automobile suggests that their expectations and my own, of what it means to purchase and operate a car, are deeply (possibly dangerously) misaligned.
""Fleet" is longstanding car industry lingo. Not something Tesla dreamed up to keep you under their boot."
I am aware of that and I hear that terminology used by rental car companies and equipment dealers, etc.
My objection is to what I hear as a subtle difference: the post-sale automobile of a private end-user is still referred to as belonging to their fleet, as if one's ownership and use of the car were a minor detail.
I dislike this subtle shift in language and attitude.
What's being scrutinised here is the difference between being forced to be part of a fleet and separately owning and controlling your own fleet.
If you buy a Tesla, you will not be able to control access to it without limiting its normal set of features.
If you buy a normal bicycle, you always have full direct control of it. No software updates, no data uploading, no tracking.
Normally "fleet" is reserved for ownership in the management sense, not the micromanagement sense. Even a Navy fleet has autonomy within it. Not so with Tesla software, by default.
This could be rms territory. Free vs non-free, or even Airbus vs Boeing, etc.
I mean, sure, but the point I'm trying to make is that this isn't some encroachment by Tesla. At any car company, for decades, the set of cars for which you're currently responsible for sustaining engineering has been known as a fleet.
And I think it's fair for the engineers that have responsibility to have an internal sense of partial ownership.
Do you prefer the alternative, where after you've bought the car Tesla tells you to screw off if there's something wrong? "It's your car now, no more bug fixes."
As long as they still have responsibilities, they also have partial ownership.
This question is probably better answered by car owners, because I plan to never own a car. It doesn't really concern me like it might one of the (grand)parent comments. It still makes me wonder though as to what the best approach is - at the moment I'm sceptical that Tesla has an optimal approach to solving the world's problems (as some would appear to believe).
Parent is not objecting to the term; he's objecting to the implication that all of these dependencies and connections make the car more akin to something you rent rather than own, which is a legitimate concern.
Your post conveniently ignores the fact that over the last 20 years cars and trucks have become physical delivery vehicles for software. In fact, the complexity of the software running in any manufacturer's modern car dwarfs the complexity of the hardware in the car itself.
So, if you have a car manufactured any time in the last 25 years, you're running software that hasn't likely been patched in years, has a ton of unknown defects and bugs, and might kill you if you hit an edge case that wasn't tested before it was shipped to you.
I much prefer Tesla's ability to fix software defects remotely than driving a defective piece of software. For example, I had a 2009 Hyundai Genesis that worked great until I had about 75,000 miles on it, then it mysteriously started losing engine power and I had the entire computer reboot a couple times while I was driving. Imagine your entire instrument cluster going dark and losing engine, braking, and steering power while you're traveling 75 mph on the freeway. The Hyundai dealership said the only way I could get a software/firmware upgrade was by purchasing a $500 maps DVD and having them update my system manually, which takes several hours, for which I'd have to pay one of their trained technicians to do it. Fuck them.
I vowed after that to never buy a car again that can't receive OTA updates. Would you buy a smart phone that can never get security fixes or updates? Given the tech in our cars now, why would you buy a car that wouldn't either?
This is why I do not have a car that was manufactured any time in the last 25 years. You can't trust software, and you definitely can't trust software developers.
This is probably going to sound outlandish to the majority, but I take it a step further and don't trust any car (+driver), period. Including drivers with massive amounts of experience or skill (even the best have failed spectacularly).
So what do I do instead? I mostly cycle and ride where motorists don't drive and set the largest possible safety margins. I recognise this is not immediately practical for everyone but I'm fortunately set up in the right place with the right knowledge to achieve this.
So far this year, I've been in a car four times, a train twice and a plane twice. Musk is pushing for a world where everyone is dependent upon a form of low-occupancy heavy motorised transportation (including wanting to reinvent the train). Naturally, I recoil at this and so should more. More cars will never save the world.
We already live in that world (or country, at least). Your situation is an outlier, unfortunately. I'd personally rather have those cars be electric than burning fossil fuels, and if anyone can reinvent mass transit and make it available to people outside of major cities I'd happily take a train.
So you're trading off the theoretical unsafeness of software (how many car accidents have been caused by faulty software?) against the huge improvements in crash safety (engineered crumple zones, AEB, etc) in the past 25 years
> ol' musky isn't totally paranoid - we did catch bad actors doing stuff and they were nailed to the wall. finding a real apt in your network can be some next level shit
That looks like confirmation that there really were internal threats found as claimed by Elon.
Is there any corroboration of this? Like that this is even someone who worked at Tesla? I'm certainly not discounting that it could be real, but with so many people shorting Tesla, any uncorroborated information should be looked at with a high degree of skepticism.
I think people are shorting the company because it's bankrupt and way behind on fulfilling orders, plus having major rework problems in its factories.
Elon likes to push that narrative that there's this epic battle between himself and people with short positions in a good vs. evil way.
In fact, it seems pretty apparent that the company is in trouble, and the people saying that have put their money where their mouth is. They also probably aren't obsessing over it every last minute or pushing narratives to try to get the stock to tank; it'll do that on its own. In fact, I bet the majority of people holding Tesla shorts aren't sneaky oil execs, but hedge fund guys who want a hedge against the market as a whole. It's pretty common to short stocks that look weak to protect against general market volatility, for when your main portfolio takes a little dip.
The conspiracies and the cult of personality surrounding a company in moderate financial trouble with problems delivering their product is pretty strange to me. Startups fail all the time, they also over promise and under deliver all the time.
There haven't been any real claims of people trying to influence the Tesla stock price, other than Elon's. If he has this kind of info, he should send it to the SEC, since they should be able to track down the nefarious bastards that hold a short position.
This is one of those situations that is probably exactly what it looks like. Elon is learning that hardware is much tougher to build than software.
I have nothing against shorts -- they're betting on what they believe and I respect that. I can't even say I disagree, though I hope the longs are right.
My point was only that when there's a lot of money on the line, we should be skeptical of uncorroborated info (good or bad).
My personal expectation that it's all true is about 60% right now. Some of the figures could be exaggerated or misremembered, but most of the hackiness described is plausible.
> As a long-ago ex-Googler I find it amusing to watch employees name-drop where they work, always with some underlying subtext implying authority [...] Parent comment doesn't simply imply this but explicitly uses it as the basis for calling OP out. It's amusing because the namedrop is always accompanied by some claim they felt they couldn't support solely through a sound argument.
Why did you leave out the part where he admits to name-dropping in that same thread?
Also, you also left out the part where he is actually agreeing with the name-dropper person:
"Meanwhile the points made are relatively sound, 100kLOC C firmware updater set off my BS alarm (assuming 98k of that wasn't some static data tables), and the 700TB MySQL DB while possible is extremely unlikely.. even given a huge JBOD setup which nobody in their right mind would plug into a single server, InnoDB tops out at 64TB per table in the best case, which means OP would need a single database with somewhere around 10 64TB tables"
It's certainly not at all at odds with what I would expect to be the prevailing case at Tesla. I have always heard from ex-employees that the company is schizophrenic and unfocused under the micromanagement of Elon Musk. Media reports are beginning to paint a consistent picture of "shitshow" conditions at the company. This also all fits with stories about his earlier ventures like Zip2.
- The spacecraft has multiple onboard computers, all running Linux
- Triple String Architecture: 3 redundant computers whose results are cross-checked with majority voting before being applied in real time, which lets them use radiation-tolerant rather than radiation-hardened hardware (roughly the scheme sketched below)
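For readers unfamiliar with that scheme, here's a minimal sketch of 2-out-of-3 majority voting; this is illustrative only, since the real cross-check logic isn't public:

    # Hypothetical 2-out-of-3 vote across redundant flight computers.
    from collections import Counter

    def vote(a, b, c):
        """Return the majority value, or raise if all three strings disagree."""
        value, count = Counter([a, b, c]).most_common(1)[0]
        if count < 2:
            raise RuntimeError("no majority: all three strings disagree")
        return value

    print(vote(42, 42, 41))  # -> 42; the single faulty string is outvoted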
More than anything else, this makes the SSH keys a very high-value target.
I may trust Tesla not to crash my car or violate my privacy, but I also have to trust them to sufficiently secure the SSH keys from bad actors who would do both (or who would sell the keys on the black market, where they would almost certainly land in malevolent and capable hands). The latter is what scares me.
Sometimes, when I consider the quality of the software/systems engineering work that we do, I wonder how everything is not crashing and burning all the time.
> "the interior is a disaster, there's no instrument cluster which takes your eyes off the road"
This is the most WTF aspect of the Model 3 to me, and it seems to confirm that there isn't going to be some last-minute mitigation. I know it was done because Autopilot was expected to be mature by the time the Model 3 rolled out, but I don't see how it can even be street-legal when you need to look away from the road to determine your speed.
I'll take your word for it. From the pictures I've seen the vertical distance between the screen and the windshield looked big enough to block most of your view of the road when looking at the screen. Unlike a Toyota where the display is vertically positioned the same as any other dashboard.
There's an entire generation of traffic safety cameras that detect people looking down, away from the road, and it looks to me like they'd be tripped by a Model 3 driver just reading their speed...
I see what you're saying, but the P3D's speed readout was top left on the screen, and it felt like a non-issue. Maybe I'd feel differently driving it longer; it was a test drive.
Content aside, what is the deal with images of screenshots of a forum, then posted to Twitter? Why not include a fax machine in there while you’re at it? Just link to the forum and we can all read the text.
Bigger medium than Something Awful. Easier to read without all the other people’s replies mixed in. Some additional anonymity for the poster. The content will stay if the original thread is removed.
Twitter has a character limit which is bypassed through screenshots. Screenshots also are a sort of backup in case the original content is deleted for whatever reason. Not a good backup, but a backup nonetheless.
It's also faster and easier to take screenshots than it is to preserve formatting of HTML/CSS and it suits Twitter.
Related topic: Azealia Banks's recent screenshots of messages relating to Elon Musk, his companies, and other things.
The OP user did end up posting a link to the thread, but I never bothered clicking it. The first screenshot grabbed my attention sufficiently that I just continued looking at the next. They almost act in a similar manner to Wikipedia link previews. Twitter for better or worse is where things still go "viral". I had a love-hate relationship with it but no longer spend a lot of time there these days. There is some method to the madness...
The really scary part of this is that they apparently run safety-critical updates through this same crappy system. If someone manages to slip in a modification that bricks Teslas, or worse, makes them do a quick left turn in the morning rush hour, they could sink the company.
There are lots of people with motivation to pull it off, either for money or for some political reason that has nothing to do with Tesla.
Tesla is not alone. Hardware companies in general don't think about software security at all. They are security nightmares waiting to happen.
Not likely - he even says they do mTLS (mutual TLS, where both the client and server verify SSL certificates).
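For anyone unfamiliar with mTLS, here's a minimal client-side sketch using Python's ssl module; the hostname and file paths are placeholders, not anything from Tesla:

    import socket
    import ssl

    # Verify the server against a fleet CA *and* present a client certificate.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile="fleet-ca.pem")
    ctx.load_cert_chain(certfile="client.pem", keyfile="client.key")

    with socket.create_connection(("updates.example.com", 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname="updates.example.com") as tls:
            print(tls.version())  # an mTLS server rejects clients without a valid cert

The point is that a stolen update URL isn't enough; an attacker would also need a valid client certificate and key to get the server to talk to them at all.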
While there is some cringeworthy stuff in there, I've seen way worse in healthcare when people's personal data and medical history was at stake. Nothing made me so alarmed that I wouldn't drive my Tesla.
Innovation is certainly not always 'clean' and 'pretty' (under the hood, so to speak)... any company bringing something new to market (product or service) bears the same type of profile.
I really don't think Tesla is the 'worst offender' when it comes to the type of horror stories we read in this article.
Does anything about this thread really surprise you, though? Enough to be unbelievable? Do you think that other car manufacturers are that much better? I've read so many insane stories about software in Toyotas (or was it Hondas?), and those are very old, very well-established companies. That a company like Tesla would have rock-solid software would be a miracle.
Didn't we just have this yesterday, with the junior shocked to find that code in the real world doesn't look at all like the Ruby "made with heart in Portland" stuff that has all the stars on GitHub?
This is what systems look like in every industry. Look no further than the crappy Java SWT GUIs you see in those leaked abhorrent NSA powerpoints.
They can. Typically they are 2-5 years, but you have to read the specific NDA to be sure.
I will never sign an NDA that doesn't expire within 10 years after cessation of work, as it's just too risky to hold information secret indefinitely at the level most NDAs require. Management changes: what the current management is fine with, future management might not be, or you do some work the company doesn't like 10 years down the road and they dig through the filing cabinet looking for something to hit you with.
Hmm. Is it possible to include a hard non-modification clause in an NDA (so nobody in the future can change it), or is that sort of implied by the contractual nature?
Interesting to note that Tesla's bug bounty program specifically excludes "TLS/SSL issues, including BEAST, BREACH, insecure renegotiation, bad cipher suites, expired certificates..."
I think a lot of HN readers over a certain age have spent time on SA. It’s easy to code switch and I wouldn’t have trouble mocking my own comments there in the house style.
So, after separating from the company, you could release all proprietary information with only limited liability? I thought there would be larger recourse. Not familiar with the law, though.
So basically the code base and politics of any large (successful) company. I wouldn't want to work with this dude. It's one thing to talk shit with your coworkers and another to do it online. I'm sure everyone is trying their best and all I'm reading are rantings from an ivory tower troll that did nothing to try and change things. I'm no fan of Tesla and I've shorted them so I guess thanks?
> So basically the code base and politics of any large (successful) company.
How does that invalidate anything of what he said or vindicate Tesla exactly? Do you believe that because some large companies are shitty places to work that everyone should just accept it as the status quo and shut the fuck up about it? Why are you expecting a single employee to try to 'change things' and why do you think that him speaking up about the issues is not an attempt to do just that?
>So basically the code base and politics of any large (successful) company.
If you say so. As an embedded systems engineer, I mentioned this Twitter thread while shooting the breeze with my boss, because it reminds me that we hold ourselves to a high standard, even when my brain turns into a perfectionist that can only see the problems with our product.
I don't know how to say this without going into full-on humblebrag mode, so I'll give the condensed version. He touched on a handful of points that are common between our companies/products, and where I often worry about quality. In those cases I was reassured by the fact that our problems aren't nearly that severe, and that, the way our processes are designed, that kind of thing would never be allowed out of the lab.
A company with such an awful engineering ethic is pushing for mass-market self-driving cars. Given what happened after the Uber incident, I can only hope those idiots at Tesla don't ruin it for all of the players.
Every IT system in the world is the worst in the world according to some tech who performs firefighting on it. These tweets apply to nearly every system. Singling out Tesla is pretty unfair!!!
> China has a law in place that mandates all electric cars send real time telemetry to their government servers - model s/x/3, NIO cars and any other electric car if they're driving already complies with that law to be road certified. don't be surprised if that becomes a mandate in other countries
They've been going 100mph for years now. It doesn't come as a surprise to me that they haven't gone back and made it pretty.
While not even remotely on the same scale, I just spent 3 weeks building an app as fast as possible because my client's old one was causing 75% of their support calls. The code isn't pretty, but it works and it works a lot better than the old one. I know deep down that code is going to stay ugly for a while but that isn't what's important right now.
So one ugly hack was replaced with another somewhat less ugly hack, which will be replaced by another hack at a later date.
Sounds like every corporate environment I've ever encountered, and Tesla, it seems. It usually happens because they have the money to do it wrong multiple times, rather than no choice but to get it right the first time.
I guess I was commenting more on the general messiness he was describing. I can empathize with it, is all. I can see how quickly and easily it happens in the small apps I build; I can't imagine how hard it would be to rein it all in at Tesla's scale, at the pace they've been going.
Did you ever have to stand outside in the winter, in ice and snow, with the hood propped open, spraying starting fluid into the carburetor and manually holding the throttle open, so that your engine could get to the proper temperature?
Did you ever floor a car and have the engine shut off because there was too much fuel?
I had that happen once. I rebuilt my Rochester Quadrajet and it purred like a kitten for a decade. I eventually gave that car to my friends that had a similar model to use for parts, but it was the best car I ever had. Chevy Caprice Classic, former undercover cop car with a 350 and a shift kit.
I certainly miss those days. Working on my current vehicle is a PITA. The fuel rails, coils and plugs take hours to change. On my old car, I could do that in 15 minutes even if the engine was hot. I can't even imagine how proprietary the components on a Tesla must be or how difficult it will be to fix myself.
I don't think we have to go back to carbs with manual chokes (not that they were all that bad anyway), but frankly, an electronic TBI like GM put on the very last Generation I small-block Chevy engines, with wasted-spark ignition instead of a distributor, would have been a good place to stop. They're dead simple to work on and program, unlike a lot of newer FI setups, and they overcome the major problems with carbs, like needing altitude and temperature compensation.
Or fiddle with the manual choke to find a setting where the engine would sort of idle. Or wish you had a manual choke, when the automatic choke wouldn't go on -- or off.
I have never operated a carbureted vehicle that I did NOT wish was fuel-injected.
I've never operated a fuel-injected vehicle that I wished had a carb.
The truck I learned to drive on would cut out if you turned left abruptly, or hit a slight bump while turning left somewhat less abruptly, while off the throttle. We lived on a really steep hill with a ravine at the bottom which meant if you weren't careful to give it gas on the (downhill!) left turn, you would soon find yourself operating a vehicle with no power steering and no power brakes, trying to use one arm to steer the now-manual steering, while using the other to try to restart it, while stomping on the manual brakes, hoping everything came together and restarted before the ever-increasing downward grade ran you across a busy road and into a ravine.
My first motorcycle had a carb, and fuelling issues on motorcycles are even WORSE because they screw up your balance.
I wasn't aware that a Tesla would need as many backend services as he says, but I guess it makes sense. Do you think that someday they could open up their platform (out of necessity, maybe) to third-party providers for each of the services? Some competition would mitigate the broken-services issues.
This was the most intense and compelling piece of writing I've read all year. I love hearing this gnarly warts & all reporting from the front lines. Guy sounded really smart, able to wrap his head around so many types of systems and roles in the company.
Oh no. I happened to see the link to this guy's Twitter earlier today in the HN comments (probably where OP saw it too, judging by the timing), read the posts, and thought to myself, "Wow, I'm surprised this has been posted to HN and seemingly hasn't garnered any response at all."
In the few hours since I stopped procrastinating at work today, every post in that thread has gone to multiple thousands of likes/shares/etc. Guess that's my answer.
Which probably implies that it is significantly less shitty than the hodgepodges of SAP, non-standard EDIFACT profiles, and other legacy crud held together with CSV-transforming shell scripts run from cron that most of the automotive industry uses.
Woah, woah, woah. The computers on Teslas are managed?! There are programmers/sysadmins at Tesla who can SSH into any car at any time. These computers can steer and operate the car.
I thought about that as well. I am looking at a remote part of snow country. Every time someone says something can't be hacked into, my coworkers prove them wrong, using tools that are already installed on the victim hosts.
I mean, I feel you; they'll pry my manually driven motorcycle out of my cold, dead hands. But I bet any automated vehicle can be remotely accessed somehow... most of the trains in Taipei are automated, and I'd argue that's the cleanest, fastest, bestest train system I've ever been on in my life.
Well, that was a butter clenching read. It’s not hard to understand why their chip designers bailed as soon as possible, and they seemed to perpetually bleed technical talent. Is this a matter of incompetence or the usual anything-to-win “move fast and break things” approach?
The latter fits better with their marketing vs. reality gap on “autopilot” and subsequent crashes.
That's how I read it. I had to go back and re-read it to see it says "butter" actually. I guess I've successfully installed anti-autocorrect in my brain.
Let's look at the facts: a) Tesla is the most shorted stock in history, b) 1000s of hackers are working hard to break things like this, c) if it were possible to break it and damage the company, someone would have done it already. This just feels like a disgruntled employee.
Counterpoint: when you ignore Musk’s Twitter tirades about shorts, and his uberfans’ subsequent aping of those tirades a simple truth remains: Tesla is one of the most overvalued stocks in history, and far from being mustache-twirling villains with limitless resources and no morals, people shorting TSLA are doing so in anticipation of reality catching up with the listed price. They’re not undermining Tesla, they’re not hoping for it to fail, they’re just waiting for the stock price and the reality of the company to intersect. It would be a lot less frustrating talking to hardcore Tesla/Musk fans if you didn’t have to wade through conspiracy theories to do it.
a) car manufacturing is one of the most competitive industries, b) 1000s of engineers are working on making better cars, c) if it was possible to make profitable, affordable, self-driving electric cars, somebody would have done it. Comes off as a snake oil salesman.