
My guess -- there are two separate pipelines: one for code changes and one for data files.

Pipeline 1 --

Code updates to their software are treated as material changes that require non-production and canary testing before global roll-out of a new "Version".

Pipeline 2 --

Content / channel updates are handled differently -- via a separate pipeline -- because only new malware signatures and the like are distributed via this route. The new files are just data files -- they are supposed to be in a standard format and only read, not "executed".

This pipeline itself must have been tested originally and found to be working satisfactorily -- but inside the pipeline there is no "test" stage that verifies the integrity of the data file so generated, nor - more importantly - one that checks whether this new data file works without errors when deployed to the latest versions of the software in use.
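To make the missing "test" stage concrete, here is a minimal sketch of what such a gate could look like. Everything in it is an assumption for illustration: the function name `validate_release`, the idea that the pipeline records a SHA-256 digest when the file is generated, and the `parsers` mapping (one loader callable per agent version still in the field) are hypothetical and not taken from any vendor's actual pipeline.

```python
import hashlib
from typing import Callable, Mapping


def validate_release(
    blob: bytes,
    expected_sha256: str,
    parsers: Mapping[str, Callable[[bytes], object]],
) -> list[str]:
    """Return a list of problems; an empty list means the file may ship.

    `parsers` maps an agent version label to the function that version
    uses to load a data file (hypothetical; supplied by the caller).
    """
    problems: list[str] = []

    # Step 1: integrity. The bytes about to ship must match the digest
    # recorded when the file was generated.
    if hashlib.sha256(blob).hexdigest() != expected_sha256:
        problems.append("checksum mismatch")
        return problems  # no point loading a corrupted file

    # Step 2: compatibility. Load the file with every supported agent
    # version and record any failure instead of letting it propagate.
    for version, parse in parsers.items():
        try:
            parse(blob)
        except Exception as exc:
            problems.append(f"agent {version}: {type(exc).__name__}: {exc}")

    return problems
```

The point of the second step is that the file is exercised by the same loading code that runs on customer machines, so a file that is intact but unreadable still gets caught before roll-out.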

The agent software that reads these daily channel files must have been "thoroughly" tested (as part of pipeline 1) for all conceivable data file sizes and simulated contents before deployment. (Any invalid data files should simply be rejected with an error ... "obviously".)
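As a rough illustration of "rejected with an error", a reader for a data file can check every structural assumption before it touches the contents. The file layout below (a `CHNL` magic, a version field, a record count, fixed-size records) is entirely made up for this sketch and is not the real channel file format.

```python
import struct

# Hypothetical layout, invented for this example:
#   header: 4-byte magic, uint16 format version, uint32 record count
#   record: uint32 rule id, 16-byte pattern
MAGIC = b"CHNL"
HEADER = struct.Struct("<4sHI")
RECORD = struct.Struct("<I16s")
MAX_RECORDS = 100_000


class InvalidChannelFile(ValueError):
    """Raised when a data file fails validation; the caller keeps the old file."""


def parse_channel_file(blob: bytes) -> list[tuple[int, bytes]]:
    if len(blob) < HEADER.size:
        raise InvalidChannelFile("file shorter than header")

    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise InvalidChannelFile("bad magic")
    if version != 1:
        raise InvalidChannelFile(f"unsupported format version {version}")
    if count > MAX_RECORDS:
        raise InvalidChannelFile(f"record count {count} exceeds limit")

    # The declared record count must match the bytes actually present,
    # so no read can run past the end of the buffer.
    expected = HEADER.size + count * RECORD.size
    if len(blob) != expected:
        raise InvalidChannelFile(f"expected {expected} bytes, got {len(blob)}")

    return [
        RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
        for i in range(count)
    ]
```

In a memory-safe language a bad file costs an exception at worst. Code running in kernel mode has no such safety net, which is why the same checks have to be written out explicitly there.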

But the exact scenario here -- possibly caused by a broken pipeline in the second path (pipeline 2) -- created invalid data files with some quirks. And THAT specific scenario was not imagined or tested in the software version dev-test-deploy pipeline (pipeline 1).

If this is true --

The lesson obviously is that even for "data" only distributions and roll-outs, however standardized and stable their pipelines may be, testing is still an essential part before large scale roll-outs. It will increase cost and add latency, sure, but we have to live with it. (Similar to how people pay for "security" software in the first place.)

Same lesson for enterprise customers as well -- test new distributions on non-production within your IT setup, or have a canary deployment in place before allowing full roll-outs into production fleets.
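For what a canary deployment could look like on the customer side, here is a small sketch. The ring names, percentages, and thresholds are made-up defaults, and `ring_for` / `may_advance` are hypothetical helpers, not part of any real endpoint management product.

```python
import hashlib

# Cumulative upper bounds in percent: 1% canary, next 9% early, rest broad.
# These numbers are illustrative assumptions, not recommendations.
RINGS = [("canary", 1), ("early", 10), ("broad", 100)]


def bucket(host_id: str) -> int:
    """Map a host to a stable bucket in 0..99 by hashing its identifier."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def ring_for(host_id: str) -> str:
    """Return the roll-out ring a host belongs to."""
    b = bucket(host_id)
    for name, upper in RINGS:
        if b < upper:
            return name
    raise AssertionError("unreachable: last ring covers all buckets")


def may_advance(
    healthy: int,
    failed: int,
    max_failure_rate: float = 0.01,
    min_reports: int = 50,
) -> bool:
    """Decide whether the update may move on to the next ring.

    Requires enough hosts to have reported back, and a failure rate
    at or below the threshold.
    """
    total = healthy + failed
    if total < min_reports:
        return False
    return failed / total <= max_failure_rate
```

Hashing the host ID keeps ring membership stable across updates without storing any state. A host that never reports back after an update should be counted in `failed`, since a machine stuck in a boot loop cannot send a failure report.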




> Same lesson for enterprise customers as well -- test new distributions on non-production within your IT setup, or have a canary deployment in place before allowing full roll-outs into production fleets.

It was mentioned in one of the HN threads that the update was pushed, overriding the settings the customer had [1]. What recourse can any customer have in such a case?

1. https://news.ycombinator.com/item?id=41003390


Ah that was me. We don’t accept “content updates” and they are staged.

We got this update pushed right through.


> What recourse can any customer have in such a case?

Sue them and use something else.


Nice.

But the problem here is that the code runs in kernel mode. As such, any data that it may consume should have been tested with the same care as the code itself, which has never been the case in this industry.


> It will increase cost

And of course that cost would be absolutely insignificant relative to the potential risk...



