You'd have to be doing something where the unified memory is specifically necessary and where it's okay that it's slow. If all you want is to run large LLMs slowly, you can already do that with split CPU/GPU inference on a normal desktop with a 3090, with the added benefit that a smaller model that fits entirely in the 3090 will be blazing fast compared to the same model on the Spark.
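Something like this with llama-cpp-python's layer offloading, just as a sketch (the model path and the 40-layer split are placeholders, tune the split to whatever fits in the 24 GB):

    from llama_cpp import Llama

    # Quantized GGUF model on disk; offload as many layers as fit in the 3090's
    # 24 GB of VRAM, the rest run on the CPU out of system RAM.
    llm = Llama(
        model_path="models/llama-2-70b.Q4_K_M.gguf",  # placeholder path
        n_gpu_layers=40,   # bump this until you hit the VRAM limit
        n_ctx=4096,
    )

    out = llm("Q: Why is unified memory slow for inference? A:", max_tokens=64)
    print(out["choices"][0]["text"])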
Eh, this is way overblown IMO. The product page claims this is for training, and as long as you crank your batch size high enough, you won't run into memory bandwidth constraints.
I've fine-tuned diffusion models streaming from an SSD with no noticeable speed penalty at a high enough batch size.
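For a rough picture of what that setup looks like in PyTorch (the paths, shapes, and tiny stand-in model are all made up): each example is read off the SSD on demand, and a large batch plus a few prefetching workers keeps the GPU busy while the disk catches up.

    import torch
    from torch import nn
    from torch.utils.data import Dataset, DataLoader

    class LatentsOnDisk(Dataset):
        """Streams one .pt tensor per example straight from the SSD."""
        def __init__(self, paths):
            self.paths = paths
        def __len__(self):
            return len(self.paths)
        def __getitem__(self, i):
            return torch.load(self.paths[i])  # hits the disk every time, nothing cached in RAM

    paths = [f"latents/{i:06d}.pt" for i in range(100_000)]    # hypothetical file layout
    loader = DataLoader(LatentsOnDisk(paths), batch_size=256,  # big batch amortizes per-sample I/O
                        num_workers=8, pin_memory=True)        # workers prefetch during the GPU step

    model = nn.Linear(4096, 4096).cuda()                       # stand-in for the diffusion model
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for batch in loader:
        loss = model(batch.cuda(non_blocking=True)).pow(2).mean()
        loss.backward(); opt.step(); opt.zero_grad()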