Linus Torvalds Growing Frustrated By Buggy Hardware & Theoretical CPU Attacks

Nemeski@lemm.ee · 1 month ago

jerakor@startrek.website · edit-2 1 month ago

Fully validating hardware is an insane task that hasn’t really been done in years. It would mean 5 years between chip releases and a 2-5x increase in production cost, and people wouldn’t follow the validated configs anyway. If we followed the validated hardware spec we would have 50-minute boot times and not go past a 3.5 GHz clock.

People have the choice today of whether to run on validated hardware. You can opt for a 2.8 GHz part that supports 2666 MT/s and is mostly tested and validated, or you can get a 5 GHz part that supports 6000 MT/s and is only partially validated. They cost the same. What do folks think people pick?

Ethan@programming.dev · 1 month ago

Who said anything about fully validating hardware? “Hardware vendors should solve their own problems” is not the same as “hardware vendors should fully validate their products”.

jerakor@startrek.website · 1 month ago

Is this really the hardware vendor’s problem though? It’s the consumer’s problem.

I bring up full validation because the concern here is putting in a speculative fix. If the ask is why the hardware was like that in the first place, the answer is that it can’t be fully validated. If the ask is why a speculative fix should go into the kernel, it is because consumers are not on top of tree, and if a fix addresses something that may never be exploited, it still needs to be pulled in years ahead so it lands in an LTR that customers migrate to BEFORE the issue comes up.

Ethan@programming.dev · 1 month ago

If the ask is, why was the hardware like that in the first place the answer is because it can’t be fully validated.

But that’s not the question. There are two questions: Who should be responsible for patching hardware vulnerabilities? And if the answer is “the kernel” then should speculative but never demonstrated vulnerabilities be patched? Linus’ answer is the hardware manufacturer, and no.

Is this really the hardware vendor’s problem though? It’s the consumer’s problem.

Maybe we’re running into the ambiguity of language. If you mean to say, “Who does it cause a problem for? The consumer.” then sure. On the other hand what I mean, and what I think Linus means, is “Who’s responsible for the vulnerability existing? Hardware vendors. Who should fix it? Hardware vendors.”

If the ask is why should a speculative fix go into the Kernel […]

Depends on what you/we/they mean by “speculative”. IMO, we need to do something (microcode, kernel patches, whatever) to patch Spectre and Meltdown. Those have been demonstrated to be real vulnerabilities, even if no one has exploited them yet. But “speculative” can mean something else. I’m not going to read all the LKML emails so maybe they’re talking about something else. But I’ve seen plenty of, “Well, if X, Y, and Z happen then that could be a vulnerability.” For that kind of speculative vulnerability, one that has not been demonstrated to be a real vulnerability, I am sympathetic to Linus’ position.

jerakor@startrek.website · 1 month ago

This is a patch from the hardware vendor so I am assuming that the ask is not that the hardware vendor take responsibility but that they not release buggy hardware. That is what I mean about the validation issue.

The attack vector is shared in the patch so it isn’t entirely a theory.

There is a comment from Linus about how this patch is only needed for some hardware and doesn’t apply to others, but I don’t see the relevance there, as different hardware is validated against different use cases and the underlying logic might be entirely disparate.

So my validation talk is simply saying that bugs happen. My concern here is what more should a hardware vendor do beyond submitting a kernel patch? You can’t just not have the bug, and if you recall the part, someone else will just keep theirs in the field, take all the market share, and roll the dice that their bugs don’t get exploited.

P4ulin_Kbana@lemmy.eco.br · 1 month ago

Could someone please explain to a non-tech expert?

FizzyOrange@programming.dev · 1 month ago

This is about Spectre, not about buggy hardware implementations.

Spectre is a fundamental flaw in speculative execution that means it can leak information, so it’s a security vulnerability. Apparently Intel has been imposing draconian requirements on software to work around the issue rather than fixing it in hardware, which is obviously what they should do, but is not at all trivial.
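For anyone wondering what "working around it in software" can look like, here is a rough sketch in C of one well-known trick for the bounds-check-bypass variant: clamp the index with a branchless mask so that even a mispredicted bounds check cannot read out of range. It is loosely modeled on the idea behind the Linux kernel's array_index_nospec helper, but the names index_mask and load_clamped are made up for this example. It also assumes a compiler that does arithmetic right shifts on signed values (GCC and Clang do), and unlike real kernel code it does nothing to stop the optimizer from dropping the mask.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns all ones when index < size, zero otherwise,
 * without using a branch. Assumes index and size fit in the positive range
 * of intptr_t and that >> on a negative value is an arithmetic shift
 * (implementation-defined in ISO C, true for GCC and Clang).
 */
static inline size_t index_mask(size_t index, size_t size)
{
    intptr_t v = (intptr_t)(index | (size - 1 - index));
    return (size_t)(~v >> (sizeof(intptr_t) * CHAR_BIT - 1));
}

/*
 * Hypothetical lookup: returns table[index], or 0 when index is out of range.
 * The mask forces the index to 0 on a mispredicted path, so a speculative
 * load cannot reach past the table. A production version would also hide
 * the value from the optimizer so the mask is not compiled away.
 */
static inline uint8_t load_clamped(const uint8_t *table, size_t size, size_t index)
{
    assert(table != NULL);
    if (index >= size)
        return 0;
    index &= index_mask(index, size);
    return table[index];
}
```

The extra masking on every guarded array access is a small example of the performance and complexity cost people in this thread are arguing about.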

Rain World: Slugcat Game@lemmy.world · 1 month ago

hardware is like your computer, stuff like the cpu and ram. software is like the programs on that computer. linus torvalds makes a program that has to deal with the details of the computer (the linux kernel). as such, they have to work around problems in the hardware.

wulrus@lemmy.world · 1 month ago

Thanks for the unappreciated ELI2

Rain World: Slugcat Game@lemmy.world · 1 month ago

fuck

El Barto@lemmy.world · 1 month ago

I’m a graybeard software engineer with 30+ years of experience.

I appreciated your explanation.

wulrus@lemmy.world · 30 days ago

Me too; it’s BECAUSE I’m so old that I appreciate a general grounding in what this is all about.

teawrecks@sopuli.xyz · 1 month ago

In the last 10 years there has been a seemingly noteworthy uptick in hardware bugs in both Intel and AMD CPUs. Security researchers find and figure out potential attack vectors that rely on these bugs (e.g. Spectre/Meltdown). Then operating systems have to put workarounds in their kernel code to ensure that these hypothetical attack vectors are accounted for, at the cost of performance and more complicated code.

Linus is saying how annoyed he is with all this extra work they have to do, resulting in worse performance, all to plug vulnerabilities that we’ve never actually seen any real attackers use. He’s saying instead we should just write the code how it should be, and if the hardware is insecure, let it be the hardware company’s problem when customers don’t use the hardware.

The problem is, customers will continue to use the hardware and companies who need a secure OS (all of them) will opt to not use Linux if it doesn’t plug these holes.

ikidd@lemmy.world · 1 month ago

Plus a lot of these bugs don’t get fixed, because they exist to allow the processors to “look ahead” for improved performance, at least on unmitigated benchmark tests.

Kissaki@programming.dev · 1 month ago

we should just write the code how it should be

Notably, that’s not what he says. He didn’t say in general. He said “for once, [after this already long discussion], let’s push back here”. (Literally “this time we push back”)

who need a secure OS (all of them) will opt to not use Linux if it doesn’t plug these holes

I’m not so sure about that. He’s making a fair assessment. These are very intricate attack vectors. Security assessment is risk assessment either way. Whether you’re weighing a significant performance loss against low-risk, potentially high-impact attack vectors or assessing the risk directly doesn’t make that much of a difference.

These are so intricate and unlikely to occur, with other firmware patches in line, or alternative hardware, that there are alternative options and acceptable risk.

Possibly linux@lemmy.zip · 1 month ago

So he is like the rest of us

sleen@lemmy.zip · 1 month ago

Who would’ve thought

5714@lemmy.dbzer0.com · 1 month ago

Hardware people sounds like a euphemism for protogen.

Rain World: Slugcat Game@lemmy.world · 1 month ago

eww put an nsfw tag on that

Blisterexe@lemmy.zip · 1 month ago

TheGrandNagus@lemmy.world · 1 month ago

what

Rain World: Slugcat Game@lemmy.world · edit-2 1 month ago

joking about protogens being made of computer? and therefore completely different?

5714@lemmy.dbzer0.com · 1 month ago

Check your projections

mlg@lemmy.world · 1 month ago

I believe he was also worried that people developing RISC-V would make similar mistakes, which would slow down adoption.

luckystarr@feddit.org · 1 month ago

He’s not the only one. Laptops especially seem to crash regularly nowadays, regardless of the OS.

I’d like to see hardware classified by boringness and thus stability.

arran 🇦🇺@aussie.zone · 1 month ago

I’ve definitely moved back to desktops. Still have my laptops but I use them in limited cases.

Xanthrax@lemmy.world · 1 month ago

I’ve never had my laptop crash unless I was playing STALKER, GAMMA. What makes your laptop crash? I’m not doubting you, I’m just curious.

luckystarr@feddit.org · 1 month ago

Waking up from sleep mode mostly.

jerakor@startrek.website · 1 month ago

Every security feature ever made has basically started by absolutely dumping on S3 recovery. S3 recovery requires every device in the computer to give you a complete understanding of how to bring it up cold without engaging the boot flow. Sometimes devices don’t do this because they are lazy, other times they don’t do this for security reasons.

toothbrush@lemmy.blahaj.zone · edit-2 1 month ago

I have that too! It started after an update at the beginning of this month. It seems to be a new bug that I can’t reliably replicate. Do you have an AMD CPU/GPU?

Xanthrax@lemmy.world · 1 month ago

Not OP, but I have an AMD Ryzen 7 and I haven’t had the issue. I have all Windows bloatware uninstalled or disabled, though.

Björn Tantau@swg-empire.de · 1 month ago

Had that until I stopped using the NVIDIA GPU.

tekato@lemmy.world · 1 month ago

That’s not what this is about. He’s complaining about hardware developers putting more work on kernel developers by making them patch all the CPU vulnerabilities that are introduced by trying to increase performance.

Deckweiss@lemmy.world · 1 month ago

I recently got a Minisforum V3 and put Arch on it.

Not only has it never crashed so far, but sleep and waking up worked out of the box, which was a huge surprise to me.

FizzyOrange@programming.dev · 1 month ago

Not the kinds of bugs he is talking about. This is about Spectre mitigations.

unrushed233@lemmings.world · 1 month ago

Not trying to shill for Apple or anything, but I have found MacBooks (excluding the 2015 MacBook, and the 2016-2020 Air and Pro models) to be extremely stable and reliable, especially since they use Apple’s custom ARM CPUs/SoCs. It reminds me of the good old PowerPC days; those machines were also reliable, basically unbreakable like a tank, in build quality, hardware and software. With the ARM transition, Apple really appears to have brought back the glory days of computing (unfortunately not in terms of upgradability and repairability, but at least in quality, stability and reliability).

I’m even more excited for the continuously improving Linux support on these devices, thanks to the amazing Asahi Linux (!asahilinux@lemmy.world) project. Also consider following them on Mastodon: @AsahiLinux@treehouse.systems

Ptsf@lemmy.world · 1 month ago

I will say I’ve never ever even once had an issue with my M1 Pro 16", can’t say that about any other laptop I’ve owned (be it battery swelling, software bugs, or “issues” one learns to live with like sleep mode causing boot crashes or sleep mode draining battery %). Kinda amazing in hindsight.

unrushed233@lemmings.world · 1 month ago

I will say I’ve never ever even once had an issue with my M1 Pro 16",

Same for my M1 Pro 14". The only issue I have is that the macOS version of Firefox just absolutely obliterates my battery, so I mostly use Safari now because it’s much better optimized. That’s really quite unfortunate, but it’s not Apple’s fault, and I don’t see any hope for this unless Mozilla decides to continue development of the Rust-based Servo browser engine and Firefox eventually switches away from the antiquated and incredibly inefficient Gecko code.