Topic: Is there something to learn for embedded/IOT from the Crowdstrike disaster?


Online langwadt

  • Super Contributor
  • ***
  • Posts: 4565
  • Country: dk
It is easy to say that security software can be worse than the problem it is trying to protect against. It is like installing malware to protect against malware. The latest "innovation" is when the security software randomly deletes files it doesn't like, or blocks installers from completing their tasks. Like, hey, isn't that what malware does?


I remember, maybe 25 years ago, a guy at work spent days trying to figure out why a file used to flash-update a DSP kept getting corrupted. It turned out that something in that binary file triggered a match in the antivirus, which then just silently nulled that part.
 

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 297
  • Country: de
... the signature from MS doesn't say "this is high-quality, secure software", it just says "this doesn't fall over quite as easily as shoddy drivers in past windows versions often did".

which would be better than CrowdStrike causing what is likely the biggest, most expensive outage in history

No, you are misunderstanding. That is what happened. Crowdstrike doesn't fall over quite as easily as shoddy drivers in past windows versions often did. That explicitly does not mean that it doesn't ever fall over, so in particular, things falling over doesn't mean that the process didn't work as designed.

This failure mode was not a common cause of problems in the past, which is why it is simply an inappropriate expectation that processes that are designed to minimize failures from common past  causes of failure would prevent it. That was never the goal.

apparently MS did try to force antivirus vendors to use a special MS API instead of everyone inventing their own stuff running in kernel mode, but were told that was monopolistic so they couldn't do that, but Apple could

Does Apple even have antivirus software?
 

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 297
  • Country: de
For once, it's definitely none of MS's fault (apart from the kernel not being a microkernel, but none of the major OSs currently is a microkernel, so...) If anything, they should change their WHQL policy to prevent kernel drivers from doing what Crowdstrike did, that is, run as an interpreter for code that can be updated without requiring new driver tests. But maybe even just that would be considered monopolistic. Who knows.

I would think that it would only be considered monopolistic if they themselves had better conditions?

I mean, in practice, I would think that this is not a viable strategy because I suspect that any testing done by MS on drivers simply takes so long that it makes anti-virus software even more useless than it already is anyway. And to avoid accusations of monopolistic behavior, they'd have to subject their own solutions to the same testing delays. After all, deploying detection of an exploit after the vulnerability fix has been deployed is kinda pointless (even though that's kinda the primary business model of anti-virus software).

Also, I'm not really sure that this is even correctly attributing the cause. I mean, sure, if this Crowdstrike stuff were some userspace process that only interacted with the kernel via an API that doesn't allow random corruption of the kernel, this particular method of crashing the system wouldn't have worked. But would that have changed anything about the effective result? I mean, if the security concept of your system is based on this Crowdstrike thing being active ... then, if that Crowdstrike userspace process were to crash, you would still have to halt the system, wouldn't you? So, what would be the difference?
 

Online woofy

  • Frequent Contributor
  • **
  • Posts: 349
  • Country: gb
    • Woofys Place
Dave Plummer has an update worth a watch.
https://youtu.be/ZHrayP-Y71Q?si=gr81yvJK_4nVib30

Online wraper

  • Supporter
  • ****
  • Posts: 17367
  • Country: lv
Blame the EU https://news.microsoft.com/2009/12/16/microsoft-statement-on-european-commission-decision/
https://www.pcmag.com/news/why-did-crowdstrike-update-only-hit-windows-blame-the-eu-microsoft-says
Quote
As Microsoft's Chief Communications Officer, Frank X. Shaw, noted on X, a 2009 agreement between the European Commission and Microsoft required Redmond to give security software the same level of access to Windows as Microsoft itself.
The agreement says: "Microsoft shall ensure that third-party software products can interoperate with Microsoft’s Relevant Software Products using the same Interoperability Information on an equal footing as other Microsoft Software Products.
"Microsoft shall make available to interested undertakings Interoperability Information that enables non-Microsoft server Software Products to interoperate with Windows Server Operating System on an equal footing with other Microsoft Server Software Products," it adds.
« Last Edit: July 24, 2024, 08:39:55 pm by wraper »
 

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 297
  • Country: de
Blame the EU https://news.microsoft.com/2009/12/16/microsoft-statement-on-european-commission-decision/

So, where exactly did the EU force companies to install crappy software?

Yeah, exactly, it didn't, but if some PR guy from a big company tells a story about how they would have saved the world if only the EU hadn't prevented them from abusing their monopoly position, as they have done many times in the past, then you sure are happy to eat up their propaganda, as long as they blame the EU, right?

Also, not to forget that this whole "security software" industry only exists because of Microsoft's recklessness concerning IT security in the first place.
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12056
  • Country: us
It seems the EU position on this is not quite what those words imply. Evidently, Microsoft had proposed to create a special security API so that vendors could write security products that interact with the OS in a safe way without destabilizing it. This is something that Apple has apparently already done. It seems the EU viewed this proposal as a way of putting independent software vendors at a disadvantage, so they rejected it. The EU position is, of course, nonsense. The existence of a safe API in no way prevents vendors from continuing to write and install kernel mode drivers, if they so wish. It's just that no customers should want to buy those products. Every time my computer does a blue screen, it is due to a third party device driver (yes, Logitech, I'm looking at you).
 

Online wraper

  • Supporter
  • ****
  • Posts: 17367
  • Country: lv
Blame the EU https://news.microsoft.com/2009/12/16/microsoft-statement-on-european-commission-decision/

So, where exactly did the EU force companies to install crappy software?
Because of the EU, MS had no choice but to allow Crowdstrike to run at kernel level. At 6:00, Dave Plummer says MS was working to move antivirus software out of the kernel but was hit by EU regulators.

 
The following users thanked this post: SiliconWizard

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3812
  • Country: us
- doing a rolling deployment,

This is basically the entire solution.

If you distribute updates to more than a few hundred systems, you need a randomized phased deployment protocol. 

Having test systems and beta tracks is great; you want to catch as many bugs as you can there. But these systems and their deployment process never perfectly reflect reality, and some problems will still only show up in production. So you randomly select 0.01% or so of your fleet to receive the update first, and you monitor for problems. Then you do 0.1%, 1%, 5%, watching for failures at each point. You want it to be random so that you get good coverage of the possible client configurations. It's not foolproof, but as long as you do the rollout slowly enough to actually respond to early problems, this approach will catch a very large fraction of defective updates and keep the impact manageable.

I don't know why crowd strike doesn't do this. Maybe they do but bypassed it in this case?
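
A minimal sketch of the randomized phased rollout described above, in Python. Everything named here is an assumption for illustration: the stages beyond 5%, the `max_failure_rate` threshold, and the `apply_update` / `is_healthy` callbacks are hypothetical, not anything a particular vendor is known to use. A real system would also wait for a soak period between stages.

```python
import random

# Stage fractions: the first four come from the percentages mentioned above;
# 25% and 100% are added here only to complete the illustration.
STAGES = [0.0001, 0.001, 0.01, 0.05, 0.25, 1.0]


def phased_rollout(fleet, apply_update, is_healthy, max_failure_rate=0.01, seed=None):
    """Update `fleet` in randomized, growing cohorts.

    Returns (updated, halted): the devices that received the update, and
    whether the rollout stopped early because a cohort looked unhealthy.
    """
    rng = random.Random(seed)
    order = list(fleet)
    rng.shuffle(order)  # random order gives each cohort a mix of configurations

    updated = []
    done = 0
    for fraction in STAGES:
        # Cumulative target for this stage; always at least one new device.
        target = min(len(order), max(done + 1, int(len(order) * fraction)))
        cohort = order[done:target]
        if not cohort:
            break  # nothing left to update
        for device in cohort:
            apply_update(device)
        updated.extend(cohort)
        done = target

        # Gate: count unhealthy devices in the cohort that was just updated.
        failures = sum(1 for device in cohort if not is_healthy(device))
        if failures / len(cohort) > max_failure_rate:
            return updated, True  # halt; the rest of the fleet is untouched
    return updated, False
```

With a fleet of 10,000 devices and an update that breaks every device it touches, this sketch stops after the first cohort of one device instead of reaching the whole fleet.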
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6593
  • Country: fi
    • My home page and email address
We've used Berkeley Packet Filter state machines to do with network communications what virus scanners do with storage I/O for decades: a filter interpreter running in privileged mode "executing" unprivileged filter "code".  Even the per-process seccomp filters in Linux are based on BPF.

There is ample experience and research on embedding interpreters and state machines in privileged environments safely, even when the instructions or filter is unprivileged/untrusted.  (The equivalent is BPF/eBPF filter validation (offline and runtime) and privilege separation.)
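
To make the validation idea concrete, here is a toy sketch in Python. It is not BPF and not any real product's instruction set: the three opcodes (`ld`, `jeq`, `ret`), the `MAX_INSNS` limit, and the function names are all made up for illustration. It only borrows the general approach of checking an untrusted filter before running it: known opcodes only, forward-only jumps that stay inside the program, and a final return, so every accepted program terminates. At runtime, an out-of-range read of the input rejects the input instead of crashing the interpreter.

```python
MAX_INSNS = 64  # arbitrary limit chosen for this sketch


def validate(program):
    """Reject any program that could loop forever or jump outside itself."""
    if not 0 < len(program) <= MAX_INSNS:
        raise ValueError("bad program length")
    for pc, insn in enumerate(program):
        if not isinstance(insn, tuple) or not insn:
            raise ValueError(f"insn {pc}: malformed")
        op, args = insn[0], insn[1:]
        if not all(isinstance(a, int) and a >= 0 for a in args):
            raise ValueError(f"insn {pc}: operands must be non-negative ints")
        if op in ("ld", "ret") and len(args) == 1:
            continue
        if op == "jeq" and len(args) == 3:
            _, jt, jf = args
            # Offsets are relative to the next instruction and never negative,
            # so control flow only moves forward: no loops are possible.
            if pc + 1 + max(jt, jf) >= len(program):
                raise ValueError(f"insn {pc}: jump out of range")
            continue
        raise ValueError(f"insn {pc}: unknown or malformed instruction")
    if program[-1][0] != "ret":
        raise ValueError("program must end with ret")


def run(program, data):
    """Validate, then interpret `program` against the bytes in `data`."""
    validate(program)
    acc, pc = 0, 0
    while True:
        insn = program[pc]
        op = insn[0]
        if op == "ld":
            k = insn[1]
            if k >= len(data):
                return 0  # out-of-range read: reject the input, do not crash
            acc = data[k]
            pc += 1
        elif op == "jeq":
            _, value, jt, jf = insn
            pc += 1 + (jt if acc == value else jf)
        else:  # "ret"
            return insn[1]
```

For example, `[("ld", 0), ("jeq", 0x7F, 0, 1), ("ret", 1), ("ret", 0)]` returns 1 for input starting with byte 0x7F and 0 otherwise, including for empty input, while a program whose jump points past its end is refused before it ever runs.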

Even if CrowdStrike implemented a completely novel kernel-mode interpreter, they should be well aware of, and experienced with, the security aspects, since these are well known and well researched; this kind of failure really does indicate horrible development practices.  It should not have been possible for it to happen, and it is not an understandable accident or failure at all.

The question is why CrowdStrike implemented their kernel-mode interpreter so poorly that it could hang Windows machines at bootup, not why the EU required Microsoft to allow CrowdStrike to implement a kernel-mode interpreter.  The difference is similar to blaming a hardware store for selling a hammer to a person who later went on a rampage with it.

To be clear, I personally do not blame Microsoft or EU at all here; it's all purely on CrowdStrike.
« Last Edit: July 25, 2024, 03:51:49 am by Nominal Animal »
 

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 297
  • Country: de
Blame the EU https://news.microsoft.com/2009/12/16/microsoft-statement-on-european-commission-decision/

So, where exactly did the EU force companies to install crappy software?
Because of the EU, MS had no choice but to allow Crowdstrike to run at kernel level. At 6:00, Dave Plummer says MS was working to move antivirus software out of the kernel but was hit by EU regulators.

Did you notice how that is not an answer to what I wrote?

I'll spell it out for you: Just because Microsoft is not allowed to prevent some other company from building a product, doesn't mean that other people are forced to use that product.

Also, the premise of your argument, if one can call it that, is that Microsoft would have been guaranteed to build a better product.

For all we know, Microsoft would have built that kernel driver ("the API") with a bug in it that would have caused millions of windows machines to crash while using Crowdstrike, and then you would be going on about how the EU allowed Microsoft to prevent Crowdstrike from building their own kernel driver that wouldn't crash machines.

The point of having a market economy is that the market participants decide which products thrive and which don't, rather than having the government decide, because the idea is that the broad market is much better at picking out the best products vs. the government picking the products that are allowed to exist.

Your "argument" here is that the EU government should decide which company is allowed to build the kernel component of security solutions, rather than have the market decide which vendor they prefer. That's what is called a planned economy, i.e., what the USSR tried. As you might be aware, that experiment failed catastrophically.

All the EU did here is that they guaranteed equal market access to all vendors, so that the market has a chance to decide. Your demand is that the EU should have picked Microsoft and should have allowed Microsoft to prevent competition.
« Last Edit: July 25, 2024, 04:45:23 am by zilp »
 

Offline zilp

  • Frequent Contributor
  • **
  • Posts: 297
  • Country: de
It seems the EU position on this is not quite what those words imply. Evidently, Microsoft had proposed to create a special security API so that vendors could write security products that interact with the OS in a safe way without destabilizing it. This is something that Apple has apparently already done. It seems the EU viewed this proposal as a way of putting independent software vendors at a disadvantage, so they rejected it. The EU position is, of course, nonsense. The existence of a safe API in no way prevents vendors from continuing to write and install kernel mode drivers, if they so wish. It's just that no customers should want to buy those products. Every time my computer does a blue screen, it is due to a third party device driver (yes, Logitech, I'm looking at you).

Do you have a source for this? Because based on what I've read about this so far, things were about "moving antivirus software to a new API", not about "providing a new API that antivirus software can use", including the quotes in this thread, like:

The agreement says: "Microsoft shall ensure that third-party software products can interoperate with Microsoft’s Relevant Software Products using the same Interoperability Information on an equal footing as other Microsoft Software Products."

Nothing in that suggests that Microsoft is not allowed to provide such an API. All it says is that it can not prevent other companies from building a competing product to that API.
 
The following users thanked this post: Siwastaja, Nominal Animal

Offline madires

  • Super Contributor
  • ***
  • Posts: 7996
  • Country: de
  • A qualified hobbyist ;)
Back then it was a measure against Microsoft creating an anti-virus product, equivalent to Internet Explorer, and taking over the anti-virus market. If Microsoft wanted to establish a safer API for anti-virus software, they could have done that, on the condition that they provide access to that new API to all anti-virus companies. Microsoft partly blaming the EU for the Crowdstrike disaster is just classic FUD to influence current EU antitrust investigations into Microsoft's latest endeavors.
 
The following users thanked this post: Siwastaja, Nominal Animal

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3847
  • Country: gb
  • Doing electronics since the 1960s...
Quote
I don't know why crowd strike doesn't do this. Maybe they do but bypassed it in this case?

Probably zero-day exploits.

Also people don't have a choice on what to buy. Corporate IT life is about ticking boxes, to demonstrate "best practice". And if M$ bundle a product, you have to buy it. It ranks up there with LGBTQ+ recruitment, etc.

That is also what enabled Office 365 to blow up. It is another way of ticking the best practice box, plus permanently valid software licensing which avoids employing somebody who keeps checking PCs to make sure they are not running bootleg software :)
« Last Edit: July 25, 2024, 04:25:30 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8355
  • Country: fi
Even if you fail to do anything else, do updates with canary testing. In other words, update say 1% of your customers first. Wait for a few days, update 10% and so on. This costs almost nothing to implement, and is the last line of defense against poor quality updates, greatly limiting the extent of the damage. 100 pissed customers is so much better than 100 000 pissed customers.
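
One hedged way to pick "1% first, then 10%" on the device side, without keeping a server-side list of who is in which wave: each device hashes its own ID together with the update ID and compares the result with the rollout percentage the server currently publishes. The function names, the `update_id` string, and the 10,000-bucket granularity below are assumptions for illustration only.

```python
import hashlib


def rollout_bucket(device_id: str, update_id: str) -> int:
    """Map a device to a stable bucket in 0..9999 for a given update."""
    digest = hashlib.sha256(f"{update_id}:{device_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10000


def is_eligible(device_id: str, update_id: str, percent: float) -> bool:
    """True if this device falls inside the first `percent` of the fleet."""
    return rollout_bucket(device_id, update_id) < int(percent * 100)
```

Because the bucket is fixed per device and per update, a device that was eligible at 1% stays eligible at 10%, and mixing the update ID into the hash means the same devices are not the canaries for every release.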
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12056
  • Country: us
Let's be brutally cynical about this. Companies like CrowdStrike tend to have their origin as startups with leaders who spin a good yarn to get venture capital investment, and young, inexperienced developers whose ambitions exceed their abilities. Expecting this event to be due to anything other than incompetence is to give far too much credit to those concerned.
 
The following users thanked this post: nctnico, SiliconWizard, Nominal Animal

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8355
  • Country: fi
Let's be brutally cynical about this. Companies like CrowdStrike tend to have their origin as startups with leaders who spin a good yarn to get venture capital investment, and young, inexperienced developers whose ambitions exceed their abilities.

Which is completely OK - they say themselves not to use their product for anything serious, but idiots are buying. The only thing to blame is the sick corporate culture of "no one ever got fired for buying IBM <insert trend thing of the year>". Certain products (for example, in the open source world, Docker) get a cult status and are used just because, to the point that using something else for good technological reasons is considered some kind of extreme rebellion needing almost impossibly strong arguments, when it should be the other way around: the choice of a certain product should be justified by arguments. (In a healthier culture, this would lead to varied choices being made in a distributed way, by companies not even knowing about each other. But if everyone copies others, that puts all the eggs in one basket.)

Those tech team middle managers should have some serious explaining to do to their company higher-ups and finally shareholders. Why choose a product the manufacturer of which says it is not suitable for the job, just because others are doing the same?
 

Offline glenenglish

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: au
  • RF engineer. AI6UM / VK1XX . Aviation pilot. MTBr
Yes, Crowdstrike's fault, pure and simple.

So what's the likelihood of crippling litigation?

Yes, sure, the fine print says "we won't be responsible for anything", but that matters little in a court where the judge can probably be fairly easily convinced of cowboy behaviour. The Crowdstrike people knew their kit was being used in commercially critical infrastructure, with big dollars at stake due to non-performance; they can't hide from that.
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12056
  • Country: us
Yes, Crowdstrike's fault, pure and simple.

So what's the likelihood of crippling litigation?

Yes, sure, the fine print says "we won't be responsible for anything", but that matters little in a court where the judge can probably be fairly easily convinced of cowboy behaviour. The Crowdstrike people knew their kit was being used in commercially critical infrastructure, with big dollars at stake due to non-performance; they can't hide from that.

Leonard French posted a video on YouTube where he suggested that if litigants could prove gross negligence on the part of CrowdStrike, then this might persuade the courts to put aside any limitation of liability claimed in the EULA.
 

Offline radiolistener

  • Super Contributor
  • ***
  • Posts: 3624
  • Country: ua
The lesson learned is to keep away from software/hardware products that are controlled by some worldwide corporation.
Any kind of OS like Windows, Android, iOS, ChromeOS is a real danger and should be avoided.
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3219
  • Country: ca
The lesson learned is to keep away from software/hardware products that are controlled by some worldwide corporation.
Any kind of OS like Windows, Android, iOS, ChromeOS is a real danger and should be avoided.

Exactly, it is the very mechanism of forced updates that is at fault here. Companies should be in full control of their servers. The companies who trade their freedom for false security deserve neither and will lose both.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8355
  • Country: fi
Exactly, it is the very mechanism of forced updates that is at fault here. Companies should be in full control of their servers. The companies who trade their freedom for false security deserve neither and will lose both.

So you should not outsource to other companies? How about outsourcing to consultants? Hiring under your own payroll? Or maybe the CEO should run the whole business alone?

In other words, I disagree in general. Usually, outsourcing to specialists is a good idea, so that you can concentrate on your core business. If everyone invented their own security solutions and hired their own security experts to write/manage/update their own firewall software, quality in general would be an order of magnitude worse.

Then again, the other extreme, everyone buying from one giant supplier, isn't that great either. Healthy competition and a somewhat diverse ecosystem of technical solutions is what offers a good middle road in reliability, cost, and limitation of consequences if something goes wrong.
 
The following users thanked this post: wraper

Offline quince

  • Contributor
  • Posts: 48
  • Country: us
Quote
Is there something to learn for embedded/IOT from the Crowdstrike disaster?

Yes. Write your embedded/IoT firmware in Rust. If this is too hard for you, maybe you should pivot to woodworking.

 

Offline eutectique

  • Frequent Contributor
  • **
  • Posts: 425
  • Country: be
Yes. Write your embedded/IoT firmware in Rust. If this is too hard for you, maybe you should pivot to woodworking.

Does Rust do automagic sanity checks of user input?
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8355
  • Country: fi
Quote
Is there something to learn for embedded/IOT from the Crowdstrike disaster?

Yes. Write your embedded/IoT firmware in Rust. If this is too hard for you, maybe you should pivot to woodworking.

Too obvious troll post. Almost got me, but not quite. 2/5.
 
The following users thanked this post: madires

