Maybe my imagination is limited or our documents aren't complex enough, but are we talking about realistic written documents? I'm sure you can take a screenshot of a very complex spreadsheet and it fails, but in that case you already have the data in structured form anyway, no?
It's not really about SEC filings, though. We folks on HN would never think of hard copies of invoices, but much of the world still operates this way.
As mentioned above, I have about 200 construction invoices. They are all formatted in a way that doesn't make sense. Most fail both OCR and OpenAI.
Now if someone mails or faxes you that spreadsheet? You're screwed.
Spreadsheets are not the biggest problem though, as they have a reliable 2-dimensional grid - at worst some cells will be combined. The form layouts and n-dimensional table structures you can find on medical and insurance documents are truly unhinged. I've seen documents that I struggled to interpret.
To be fair, this is problematic for humans too. My old insurer outright rejected documents like that, stating they were not legible.
(I imagine it also had the benefit of reducing fraud/errors).
In this day and age, it's probably easier and better to change the process around that, as there's little excuse for such shit-quality input. I understand this isn't always possible, though.
I think the web UI makes it a little more user-friendly, but I'm not super familiar with FRP. I think we might also have a little more authentication control on top of the tunnel for web traffic.
Yup. aws-cli is not dedicated to file sync. It's strange that there is a sync command at all, when aws-cli was originally meant to be a wrapper around single API calls.
It is not the same as OP's, but according to SigNoz, a similar o11y stack built on top of ClickHouse, ClickHouse-based logging costs less than ELK for storage and performs better: