malloc() == 'reservation' (but not paged in!) memory
// If touched / updated THEN the memory's paged in
A copy _might_ not even become a copy if the kernel's smart enough / able to setup a hardware trigger to force a copy on writes to that area, at which point the physical memory backing two distinct logical memory zones would be copied and then different.
That's a good point that Linux doesn't actually allocate the pages until they're faulted in by a read or write. So, if it were doing some kind of thread inspection optimization, it would presumably just need to check if the faulting thread is currently in a loop that will overwrite at least the full page.
However, that wouldn't solve the problem of other threads in the same process being able to see the page before it's fully overwritten, or debugging processes, or using a signal handler to invisibly jump out of the initialization loop in the middle, etc. There are workarounds to all of these issues, but they all have performance and complexity costs.