You have to consider what kind of risk you are protecting yourself against.
It's highly unlikely that you would be the target of such a sophisticated attack, but a hacker could get into a place where you left your computer unattended (such as your home or a hotel room) and, in about 15 minutes, install a sniffing device inside it.
If you think you could be the target of such an attack, you could enable a chassis-intrusion alert in your UEFI settings (I know my ThinkPad has that option), or, better, always keep your laptop with you.
I'm mostly asking because the original poster was painting a process that can be sniffed off the bus (that is: buy a stolen laptop off eBay, try to boot it, sniff the key off the bus) as equivalent to a process that requires active targeting and multiple break-ins to work.
It seems like these security discussions always devolve into rather comical goalpost-moving, without ever considering how much work each exploit requires.
The goalposts haven't moved in my mind, but I suppose I didn't make them clear in my first post.
Basically the TPM provides a set of features that are really useful for corporate Windows deployments. No more forgotten passwords, because the self-unlocking disk encryption sends the user straight to the Windows login screen, and helpdesk can reset forgotten Windows passwords remotely.
And for casual home Windows users, it lets them log in with a 4-digit PIN or with biometrics, so it's got usability benefits for them too. If every OS now needs Microsoft's signature of approval, or a really fiddly setup process? Well they were running Windows anyway, so no problem.
These usability/support benefits rely on self-unlocking disk encryption, which is vulnerable to sniffing if someone buys a stolen laptop on eBay.
For the kind of technically sophisticated, security enthusiast users who comment on blog posts about TPMs? We're more than happy to key in a strong unique password at every boot, and if we forget the password and lose access to everything on that disk that's just the system working as it's supposed to.
For us, the benefits of TPMs and measured boot for personal use are a lot more obscure. You'll sometimes hear people claim it protects against 'evil maid attacks' where an attacker repeatedly gets physical access to your laptop. The truth is it provides no such protection.
> For us, the benefits of TPMs and measured boot for personal use are a lot more obscure. You'll sometimes hear people claim it protects against 'evil maid attacks' where an attacker repeatedly gets physical access to your laptop. The truth is it provides no such protection.
TPMs give you fine and adequate protections in many scenarios, even physical ones.
They also provide you with better protection for private key material.
> TPMs give you fine and adequate protections in many scenarios [...] my `ssh-tpm-agent` project
I agree that's adequate, in the sense that keeping an SSH key as a password-protected file on disk is adequate, and having it be a password-protected secret in the TPM is no less secure than that.
But the whole point of binding a key to hardware is to be secure even if a remote attacker has gotten root on your machine. An attacker with root can simply replace the software that reads your PIN with a modified version that also saves it somewhere. Then they can use the key whenever your computer is online, even if they can't copy the key off. And although that's a bit limiting, once they've SSHed to a host as me once they can add their own key to authorized_keys in many cases.
That's why Yubikeys and U2F keys and suchlike have a physical button.
TPMs would be a lot more useful if the spec had mandated a physical button for user presence.
> But the whole point of binding a key to hardware is to be secure even if a remote attacker has gotten root on your machine. An attacker with root can simply replace the software that reads your PIN with a modified version that also saves it somewhere. Then they can use the key whenever your computer is online, even if they can't copy the key off.
It protects against extraction, not usage on the machine itself. Of course they can use the secret on the compromised machine.
> And although that's a bit limiting, once they've SSHed to a host as me once they can add their own key to authorized_keys in many cases.
Assuming they can edit the file.
> That's why Yubikeys and U2F keys and suchlike have a physical button.
The TPM spec includes a policy setup to account for a fingerprint reader that can be used to authenticate. I haven't been able to figure out the how/what/why of the implementation here, but this is very much a thing.
> It protects against extraction, not usage on the machine itself. Of course they can use the secret on the compromised machine.
Yes, this is why I was careful to say that the benefits are obscure, rather than saying they're entirely nonexistent.
I'll admit that's a benefit, but it seems a very small one considering the far-reaching changes it has required: kernel lockdown mode, the Microsoft-signed shim, distro-signed initrds, the difficulties it creates with DKMS, and so on.
Whereas people who need to bind their SSH key to hardware can get a higher degree of security with a far smaller attack surface by simply spending an hour's wages on a Yubikey.
> I'll admit that's a benefit, but it seems a very small one considering the far-reaching changes it has required: kernel lockdown mode, the Microsoft-signed shim, distro-signed initrds, the difficulties it creates with DKMS, and so on
None of this is needed to take advantage of TPMs.
> Whereas people who need to bind their SSH key to hardware can get a higher degree of security with a far smaller attack surface by simply spending an hour's wages on a Yubikey.
Yubikeys are expensive devices, and TPMs are ubiquitous. Better tooling solves this problem.
> None of this is needed to take advantage of TPMs.
You're not binding the secret to PCR values? I thought TPM fans loved those things?
I don't blame you - they look like a design-by-committee house of cards to me, with far too many parties involved and far too much attack surface. Just like the rest of the TPM spec.
> You're not binding the secret to PCR values? I thought TPM fans loved those things?
Binding things to PCR values doesn't imply you need Secure Boot, signed initrd, lockdown mode, shim and signed kernel modules. All of these things are individual security measures that can be combined depending on your threat model.
> I don't blame you - they look like a design-by-committee house of cards to me, with far too many parties involved and far too much attack surface. Just like the rest of the TPM spec.
TPM 2.0 doesn't really make PCR policies easier to use, so I've had trouble getting them properly integrated into the tools I write, as you need to deal with a key to sign updated policies. `systemd-pcrlock` might solve parts of this, but it's all a bit... ugly to deal with, really.
The entire TPM spec is not great, but I find TPMs too useful to ignore.
> Basically the TPM provides a set of features that are really useful for corporate Windows deployments. No more forgotten passwords, because the self-unlocking disk encryption sends the user straight to the Windows login screen, and helpdesk can reset forgotten Windows passwords remotely.
It's unclear why this requires a TPM. Boot the system from a static unencrypted partition containing no sensitive data and display the login screen; when the user authenticates, the system uses their credentials to fetch the FDE decryption key from the directory server. Bonus: the FDE keys are now stored in the directory server, so if the system board in the laptop fails you can remove the drive and recover the data.
An attacker with physical access could modify the unencrypted partition to compromise the user's password the next time the user logs in, but they could do the same thing with a hardware keylogger.
> And for casual home Windows users, it lets them log in with a 4-digit PIN or with biometrics, so it's got usability benefits for them too.
This could be implemented the same way using Microsoft's servers, given that they seem to insist you create a Microsoft account these days anyway.
It's not clear that unsophisticated users actually benefit from default-FDE though. They're more likely to lose their data to it than have it protect them from theft, and losing your family photos is generally more of a harm than some third party getting access to your family photos.
If the machine is already on but asleep, the keys are in memory, they only have to be downloaded from the server on first login. If the machine has been off and you have no network connection then you need the long password to unlock it instead of the short one, but for most users that is already irrelevant because everything else requires a network connection too.
Ah ok, so I'll need to memorize the super long password whenever I'm out and about and want to just check something real quick. I guess I'll just put that on the sticky note on the bottom of the computer.
You want to check something real quick on what... the internet? Then you have internet access. You also have access to the local data on the machine as long as it was asleep rather than off, which will be the case the vast majority of the time.
Keeping the key stored on the machine, TPM or no, is also less secure than keeping it somewhere else. If someone steals your laptop, you deny all access to the key on the server and they can't get it even if they guess the PIN (or the user wrote it on the bottom of the computer), and there is no offline method for extracting the key from the TPM, because it isn't there.
So the sole legitimate use case for a TPM is when you're somewhere with neither cellular service nor Wi-Fi (rare) and your portable device is off rather than asleep (rare) and you can't remember a long passphrase (which doesn't have to be unmemorable; it's just less convenient to type).
This seems like it isn't worth the cost in authoritarianism?
For that matter you could still implement even that with just a secure enclave that will only release the key given the correct PIN (and then rate limits attempts etc.), but then does actually release the key in that case and doesn't do any kind of remote attestation or signing.
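The contract being described can be made concrete with a small sketch (purely illustrative C, with hypothetical names, not real enclave code): the key comes out only on a correct PIN, failures are counted, and the device locks out after a limit, with no attestation of anything running on the host.

```c
#include <string.h>
#include <stdbool.h>

#define MAX_ATTEMPTS 8

/* Toy model of the enclave's state: it lives inside the enclave, and
 * the host only ever sees the key after presenting a correct PIN. */
struct enclave {
    char pin[16];
    unsigned char key[32];
    unsigned failures;
};

/* Returns true and copies the key out on a correct PIN; otherwise
 * counts the failure. Once the limit is hit, even the right PIN
 * is refused (rate limiting / lockout). */
bool enclave_unseal(struct enclave *e, const char *pin, unsigned char out[32])
{
    if (e->failures >= MAX_ATTEMPTS)
        return false;
    if (strcmp(e->pin, pin) != 0) {
        e->failures++;
        return false;
    }
    e->failures = 0;
    memcpy(out, e->key, 32);
    return true;
}
```

The point of the sketch is what's absent: no PCR checks, no remote attestation, no signing of what the host booted; just PIN-gated release with lockout.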
> a secure enclave that will only release the key given the correct PIN
So...a TPM?
> This seems like it isn't worth the cost in authoritarianism?
You know what's really authoritarian? Having your computer practically only decryptable by some remote directory server, potentially not even under your control.
A similar project is rr[0], which is freely available. Like you said, I find that reversible debuggers are a huge improvement over regular debuggers because of the ability to record an execution and then effectively bisect the trace for issues.
The memory is already mapped by the BIOS/EFI firmware, before the kernel takes control.
By default, whenever the memory modules in the different channels have the same size (e.g. two 8 GB modules), the firmware maps the modules with interleaved addresses, which doubles throughput with 2 channels, or triples/quadruples it on workstation/server motherboards with more memory channels.
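As an illustration of the mapping (the interleave granularity is firmware- and chipset-specific; 64-byte cache-line granularity is just an assumption here), consecutive lines simply alternate between channels:

```c
#include <stdint.h>

/* Toy model of N-way channel interleaving at 64-byte granularity:
 * consecutive cache lines alternate between channels, so a streaming
 * read keeps every channel busy. Real memory controllers choose the
 * granularity (and may hash address bits), so this is illustrative only. */
static unsigned channel_of(uint64_t phys_addr, unsigned channels)
{
    return (unsigned)((phys_addr / 64) % channels);
}
```

With 2 channels, addresses 0..63 land on channel 0, 64..127 on channel 1, 128..191 back on channel 0, and so on, which is why a sequential copy sees roughly double the bandwidth of a single module.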
Someone as far ahead of the curve as you clearly are might enjoy the second chapter of Coders at Work, which is an interview with Brad Fitzpatrick. bradfitz wrote everything from memcached to big chunks of Tailscale and much in between.
He studied CS at university but would have been bored to sleep if he hadn't had something else going on, so he founded and ran LiveJournal simultaneously.
I bet someone on this thread knows him and I bet he’d take the time to offer some pointers to an up-and-comer like yourself. I’ve never met him, but I did some business with Six Apart in a previous life and people say he’s a really nice guy.
Unfortunately I haven’t had the time to do a proper benchmark, and the fpng test executable only decodes/encodes a single image which produces very noisy/inconclusive results. However, I’m under the impression that it doesn’t make a large difference in terms of overall time.
fpnge (which I wasn’t aware of until now) appears to already be using a very similar (identical?) algorithm, so I suspect the relative performance of fpng and fpnge would not be significantly impacted by this change.
As someone who has recently been optimising fpnge: Adler32 computation is pretty much negligible in terms of overall runtime.
The Huffman coding and filter search take up most of the time. (IIRC fpng doesn't do any filter search, but its Huffman encoding isn't vectorized, so I'd expect that to dominate fpng's runtime.)
Not sure if any of these would result in meaningful performance gains, but a few ideas I had:
* An avx96/avx128 version. This requires more care than avx32/avx64, because simply extending the coefficient vectors from 0..32 to 0..96/128 would overflow a 16-bit signed number (e.g. 255*96 + 254*96 > 32767). Looking at it now, though, I realize you shouldn't actually need more than one 0..32 coefficient vector.
* The chunk length could be longer because there are 8 separate 32 bit counters in each vector, which can be summed into a uint64_t instead of a uint32_t when computing the modulo.
* As you said, aligning the loads and deferring the `_mm256_madd_epi16` to outside the loop. For deferring the madd specifically, use two separate sum2 vectors and split the `mad` vector in two with `_mm256_and_si256(mad, _mm256_set1_epi32(0xFFFF))` and `_mm256_srli_epi32(mad, 16)`, which should avoid the 5-cycle latency hit incurred by the madd.
Plus I am sure there are many other opportunities to optimize this I have not thought of :)
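For anyone following along, the quantity all of these variants compute is plain Adler-32: two running sums reduced mod 65521. A minimal scalar reference (not the SIMD code under discussion; the vectorized versions compute the same two sums with per-lane counters and coefficient vectors, deferring the modulo):

```c
#include <stdint.h>
#include <stddef.h>

#define ADLER_MOD 65521u /* largest prime below 2^16 */

/* Scalar reference Adler-32: s1 sums the bytes, s2 sums the running
 * value of s1. Any chunked/SIMD variant must match this bit for bit. */
uint32_t adler32(const uint8_t *data, size_t len)
{
    uint32_t s1 = 1, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 = (s1 + data[i]) % ADLER_MOD;
        s2 = (s2 + s1) % ADLER_MOD;
    }
    return (s2 << 16) | s1;
}
```

The fast implementations only take the modulo periodically (zlib uses a chunk size of 5552 bytes for the scalar case) rather than per byte, which is where the overflow bookkeeping in the bullets above comes from.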
> For deferring the madd specifically, using two separate sum2 vectors and splitting the `mad` vector into two
Actually, the idea was to accumulate into 16-bit sums, and only do madd to 32-bit every 4 loop cycles.
I'm not sure splitting it up like that actually helps: the latency can be easily hidden by an OoO processor, and adding more uops could actually be detrimental.
One thing to note is that you've got a dependent add chain on sum2_v, so using two independent sums instead of one could help.
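The dependent-chain point is easiest to see in scalar form (illustrative code, not the actual fpnge loop, which does the same thing with `__m256i` vectors): with one accumulator every add waits on the previous one, while two independent accumulators let adjacent iterations overlap.

```c
#include <stdint.h>
#include <stddef.h>

/* One accumulator: each add depends on the previous result, so the
 * loop is serialized on the add latency. */
uint64_t sum_single(const uint32_t *v, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two independent accumulators: the two adds per iteration have no
 * dependency on each other, so an OoO core can issue them in
 * parallel. The final result is identical. */
uint64_t sum_split(const uint32_t *v, size_t n)
{
    uint64_t a = 0, b = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < n)
        a += v[i]; /* odd tail element */
    return a + b;
}
```

The same transformation applies to the vector sum2 accumulator: two independent sum vectors, combined once after the loop.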
> Plus I am sure there are many other opportunities to optimize this I have not thought of :)
You can debug programs that ran in the past using debuggers like rr[0], which supports both recording an execution for later debugging and stepping backward in a running process.