Author Topic: Proof that software as service/cloud based, will never work for long term ... (Read 149572 times)

tautech · « **Reply #875 on:** July 20, 2024, 02:42:44 am »

Quote from: Halcyon on July 20, 2024, 02:22:53 am

Quote from: tautech on July 19, 2024, 09:43:41 am
Quote from: paulca on July 19, 2024, 08:59:47 am
https://www.theregister.com/2024/07/19/microsoft_365_azure_outage_central_us/
LOL, it's worldwide !

But smart dudes previously active members here are fixing it bit by bit.

Well yesterday was fun. Thankfully we are 50/50 Microsoft/Apple so our business operations weren't impacted too much. Each time could still do their job, albeit at reduced capacity. This is why I always say not to put your eggs in one basket. Those guys who rely on Windows endpoints and servers and Hyper-V will still be picking up the pieces this weekend.

And well into next week once they hit their desks next week only to find their systems still down.

Halcyon · « **Reply #876 on:** July 20, 2024, 05:41:20 am »

Quote from: tautech on July 20, 2024, 02:42:44 am

Quote from: Halcyon on July 20, 2024, 02:22:53 am
Quote from: tautech on July 19, 2024, 09:43:41 am
Quote from: paulca on July 19, 2024, 08:59:47 am
https://www.theregister.com/2024/07/19/microsoft_365_azure_outage_central_us/
LOL, it's worldwide !

But smart dudes previously active members here are fixing it bit by bit.

Well yesterday was fun. Thankfully we are 50/50 Microsoft/Apple so our business operations weren't impacted too much. Each time could still do their job, albeit at reduced capacity. This is why I always say not to put your eggs in one basket. Those guys who rely on Windows endpoints and servers and Hyper-V will still be picking up the pieces this weekend.
And well into next week once they hit their desks next week only to find their systems still down.

The fix is pretty easy. It just depends on how well they've planned and designed their systems. It'll be a learning curve for many, that's for sure.

tautech · « **Reply #877 on:** July 20, 2024, 08:00:43 am »

SiliconWizard · « **Reply #878 on:** July 20, 2024, 08:27:53 am »

mikeb1279 · « **Reply #879 on:** July 20, 2024, 09:18:24 am »

Quote from: PlainName on July 19, 2024, 11:39:08 pm

Quote
As someone pointed out above, anti-viruses has been screwing OS's since the dawn of time.

Which is actually preferable to being hit by ransomware or worse. At least problems with AV (and they are actually few) aren't malicious.

Quote
On-prem IT can manage this by patching non-critical boxes first to test etc

With other stuff, sure. But AV is often pushing out critical patches for 0-day exploits, and if you hang around a week for IT to try it on some spare kit you might be too late to apply it. While your IT bods are having a good play the bad guys are deconstructing it to find the hole it patches, and then hope they get to your setup before your IT people finally give the OK and think about rolling it out.

It's a matter of risk, and effectively you're outsourcing the testing and stuff to a third party who should know their onions - your local IT bods generally don't have a clue because they don't have the mindset of do-badders. Just think of how many security holes there are all over the place (requiring AV to stop them being exploited) - the developers don't have the mindset to see them, and IT support are not really any different (and if they were, you wouldn't want to be employing them).

Well I mean if your IT people are incompetent then sure. Otherwise they would apply the AV patch as an expedited change, starting with the non-critical boxes no? This should be their bread and butter kind of stuff in a moderate to large organization.

That's the theory, but every experience I have had with SaaS has resulted in inferior solutions with more downtime and buggy features. What you are saying is an advantage I am saying is a liability. Maybe we have just had different experiences though?

PA0PBZ · « **Reply #880 on:** July 20, 2024, 09:34:04 am »

Quote from: mikeb1279 on July 20, 2024, 09:18:24 am

Well I mean if your IT people are incompetent then sure. Otherwise they would apply the AV patch as an expedited change, starting with the non-critical boxes no? This should be their bread and butter kind of stuff in a moderate to large organization.

Crowdstrike is fully in control over the patches/updates, there's nothing local IT can do about it, there's no setting like in Windows where you can delay the update.

Marco · « **Reply #881 on:** July 20, 2024, 11:38:54 am »

Since Microsoft is getting blamed any way, they should just give themselves the power to fix it in the future. Have a PXE server store optional patches and let the bootmanager apply them. A bios update could still brick below that level, but at least drivers would be fixable.

madires · « **Reply #882 on:** July 20, 2024, 12:37:33 pm »

The cause for the CrowdStrike disaster is a missing pointer check:
- Crowdstrike causes the largest IT outage in history, massive questions about testing regime (https://techau.com.au/crowdstrike-causes-the-largest-it-outage-in-history-massive-questions-about-testing-regime/)

BravoV · « **Reply #883 on:** July 20, 2024, 12:55:06 pm »

Quote from: madires on July 20, 2024, 12:37:33 pm

The cause for the CrowdStrike disaster is a missing pointer check:
- Crowdstrike causes the largest IT outage in history, massive questions about testing regime (https://techau.com.au/crowdstrike-causes-the-largest-it-outage-in-history-massive-questions-about-testing-regime/)

What testing regime ? Clearly the new update deployed was never tested, not even once.

C'mon, a blue sreen problem is easily spotted, either manual or automated testing.

m k · « **Reply #884 on:** July 20, 2024, 01:30:00 pm »

Any idea how that old boot over network is doing?

This case would be a bit easier if that kind of a "mainframe" was available.
At least with all these "terminals" that are around.

iMo · « **Reply #885 on:** July 20, 2024, 01:57:25 pm »

The simplest improvement of the crowdstrike's deployment process of their updates delivered to their lovely customers would be to deploy their updates into their own company first. The day after they may go with the update out to the masses (if they will still be able to go, sure)..

Marco · « **Reply #886 on:** July 20, 2024, 02:29:47 pm »

Quote from: m k on July 20, 2024, 01:30:00 pm

Any idea how that old boot over network is doing?

Doesn't help without ipmi for remote reboot.

TimFox · « **Reply #887 on:** July 20, 2024, 03:30:43 pm »

Quote from: iMo on July 20, 2024, 01:57:25 pm

The simplest improvement of the crowdstrike's deployment process of their updates delivered to their lovely customers would be to deploy their updates into their own company first. The day after they may go with the update out to the masses (if they will still be able to go, sure)..

I didn't get hit by this disaster, but with other smaller dumpster-fire software upgrades I always wondered if anyone had deployed the updates first to a reasonable-sized system before foisting them onto the customer base.

Karel · « **Reply #888 on:** July 20, 2024, 04:08:35 pm »

Google continues to "ruin" Fitbit for users by discontinuing web app interface

Seventy-nine pages of mostly negative customer feedback with no acknowledgments to voiced concern

https://www.techspot.com/news/103874-google-continues-ruin-fitbit-users-discontinuing-web-app.html

When Google was a lowly upstart, its motto was "Don't be evil." It even listed the phrase prominently in its corporate code of conduct.
After the Alphabet restructuring in 2015, it was changed to the tamer-sounding "Do the right thing."
It's telling that by 2018, Google no longer had a motto and had removed both phrases from the company CoC.
It makes sense, considering the company no longer lives by either creed.

coppercone2 · « **Reply #889 on:** July 20, 2024, 04:22:16 pm »

their not being ethical to the share holders with those mottos

coppice · « **Reply #890 on:** July 20, 2024, 05:35:11 pm »

Quote from: Karel on July 20, 2024, 04:08:35 pm

Google continues to "ruin" Fitbit for users by discontinuing web app interface

Seventy-nine pages of mostly negative customer feedback with no acknowledgments to voiced concern

https://www.techspot.com/news/103874-google-continues-ruin-fitbit-users-discontinuing-web-app.html

When Google was a lowly upstart, its motto was "Don't be evil." It even listed the phrase prominently in its corporate code of conduct.
After the Alphabet restructuring in 2015, it was changed to the tamer-sounding "Do the right thing."
It's telling that by 2018, Google no longer had a motto and had removed both phrases from the company CoC.
It makes sense, considering the company no longer lives by either creed.

The next step will be more honesty with the motto "do the far right thing".

coppercone2 · « **Reply #891 on:** July 20, 2024, 06:08:49 pm »

Quote from: TimFox on July 20, 2024, 03:30:43 pm

Quote from: iMo on July 20, 2024, 01:57:25 pm
The simplest improvement of the crowdstrike's deployment process of their updates delivered to their lovely customers would be to deploy their updates into their own company first. The day after they may go with the update out to the masses (if they will still be able to go, sure)..

I didn't get hit by this disaster, but with other smaller dumpster-fire software upgrades I always wondered if anyone had deployed the updates first to a reasonable-sized system before foisting them onto the customer base.

buddy maybe you did not get the memo but someone is getting promotions for being a 'go getter' that can 'assert risk' to maintain business operations. because they hunched and figured that testing costs money and we don't need it! I 'asked' (in the tony soprano sense) the guy if he was sure (10 seconds after he submitted the final revision of the code) it would release OK!

Don't you love those 'cost sensitive' deadlines that supposedly do something more then make your boss feel relaxed because hes not sure if he feels like asking for a extension based on new information?

m k · « **Reply #892 on:** July 20, 2024, 06:15:36 pm »

Quote from: Marco on July 20, 2024, 02:29:47 pm

Quote from: m k on July 20, 2024, 01:30:00 pm
Any idea how that old boot over network is doing?
Doesn't help without ipmi for remote reboot.

It shouldn't be a reboot, nor management.

PXE and what was before it didn't need any special ports.

But special watchdog there must be, no matter what.
After that the style is pretty much irrelevant and management can be automated.
Maybe some motherboards already have something like that.

tautech · « **Reply #893 on:** July 20, 2024, 10:34:52 pm »

https://www.theregister.com/2024/07/18/security_review_failure/

SiliconWizard · « **Reply #894 on:** July 20, 2024, 10:49:23 pm »

Quote from: Marco on July 20, 2024, 11:38:54 am

Since Microsoft is getting blamed any way, they should just give themselves the power to fix it in the future. Have a PXE server store optional patches and let the bootmanager apply them. A bios update could still brick below that level, but at least drivers would be fixable.

Well, both should get the blame.

The fact that some security software could even get the OS on its knees is a sign of a severe OS design flaw overall. But obviously not something they can easily fix unless they redesigned it almost entirely.

After that, a large part of the blame should be on the customers' shoulders - decent sysadmins should never allow third-party companies to remotely deploy such a low-level update on a large scale without testing it first locally. That's insane. They just get what they deserve here, and CrowdStrike should be thanked for making it obvious to everyone.

fzabkar · « **Reply #895 on:** July 20, 2024, 10:54:09 pm »

Quote from: BravoV on July 20, 2024, 12:55:06 pm

What testing regime ? Clearly the new update deployed was never tested, not even once.

C'mon, a blue sreen problem is easily spotted, either manual or automated testing.

Some years ago I uncovered a bug in Seagate's firmware update for their ST3000DM001 HDD. No-one was able to apply the update. I identified the bug, posted a workaround in Seagate's own forum, and made certain Seagate personnel aware of the problem. Four years later the bug was still there. Worse still, the forum had been totally deleted on April's Fools Day.

It turned out that the update was never tested prior to being released. I can confidently say this because the payload files that were bundled with the update package had the wrong filenames. The updater tool expected different file names, so it errored out when those files were not present. The solution was to rename the payloads with their correct names.

ISTM that the testing process must be long and tedious. Presumably some employee decided that a particular change was too insignificant to affect the integrity of the update, so it was decided not to retest the package. I wonder if that is what happened at Crowdstrike.

rsjsouza · « **Reply #896 on:** July 21, 2024, 12:21:31 am »

Quote from: tautech on July 20, 2024, 08:00:43 am

That was awesome! A cautionary tale indeed...

That brought me back to the days of those "Doublespace" DOS disk utilities that compacted your data but a single glitch on the disk would flush your data into oblivion... A mistake my dad and I did just once.

Marco · « **Reply #897 on:** July 21, 2024, 08:58:59 am »

Quote from: SiliconWizard on July 20, 2024, 10:49:23 pm

The fact that some security software could even get the OS on its knees is a sign of a severe OS design flaw overall.

A driver which operates during boot (antivirus, storage, network, whatever) can always do that.

Automatically reverting is not an option because that might be used as part of a downgrade attack. There is no way to design the OS to protect against the initial hang except for mobile phone OS level control. When there are no third party drivers they can't interfere with the boot process. Then they'd need to be designing the hardware too, not just the OS.

What they can do is add a mechanism for remote updates which kicks in before the drivers are running through PXE.

themadhippy · « **Reply #898 on:** July 23, 2024, 12:25:39 pm »

Dont blame microsoft for international IT failure day ,don't blame crowdstrike, its all the fault of that pesky EU.

https://www.euronews.com/next/2024/07/22/microsoft-says-eu-to-blame-for-the-worlds-worst-it-outage

coromonadalix · « **Reply #899 on:** July 23, 2024, 12:31:01 pm »

blame "stupid" people(s) who rely on these cloud based softwares loll would be a lot ... loll


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

Author Topic: Proof that software as service/cloud based, will never work for long term ... (Read 149572 times)

Share me