Could you please link the source code for the WhatsApp client, so that we can see the cryptographic keys aren't being stored and later uploaded to Meta's servers, completely defeating the entire point of Signal's E2EE implementation and ratchet protocol?
This may shock you, but plenty of cutting-edge application security analysis doesn't start with source code.
There are many reasons, but one is that for the overwhelming majority of people on the planet, apps aren't compiled from source on their devices. Since you have to account for the possibility that the app in the App Store isn't what's in some git repo, you may as well start with the compiled, distributed app.
Whether or not other people build from source has zero relevance to a discussion about the trustworthiness of security promises from former PRISM data providers about the closed-source software they distribute. Source availability isn't theater, even when most people never read it, let alone build from it. Neither the existence of surreptitious backdoors nor the usefulness of dynamic analysis is a knock against source availability.
Signal and WhatsApp do not belong in the same sentence. One is open source software developed and distributed by a nonprofit foundation with a lengthy history of preserving and advancing accessible, trustworthy, verifiable encrypted calling and messaging, going back to TextSecure and RedPhone. The other is proprietary software developed and distributed by a for-profit corporation whose entire business model is bulk harvesting of user data, and which has a lengthy history of misleading and manipulating its own users and of distributing user data (including message contents) to shady data brokers and intelligence agencies.
To imply these two offer even a semblance of equivalent privacy expectations is misguided, to put it generously.
These are words, but I don't understand how they respond to the preceding comment, which observes that binary legibility is an operational requirement for real security given that almost nobody uses reproducible builds. In reality, people meaningfully depend on work done at the binary level to ensure lack of backdoors, not on work done at the source level.
The preceding comment is saying that source security is insufficient, not that transparency is irrelevant.
Source availability is what makes a chain of trust possible that simply isn't meaningfully possible with closed source software, even with dynamic analysis, decompilation, reverse engineering, runtime network analysis with TLS decryption, etc.
Both you and the preceding commenter are correct that just running binaries signed and distributed by Alphabet (Google) and/or Apple presents risks beyond those observable in the source code. But the solution to this problem isn't to say "and therefore source availability doesn't matter at all for anyone"; it's to build from source, or to obtain and install APKs built and signed by the developers, for example via Accrescent or Obtainium (which pulls directly from GitHub, GitLab, etc. releases).
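To make the "known-good path" concrete, here's a minimal sketch of the kind of check developer-signed distribution enables: comparing a downloaded APK against a digest the developer publishes out of band. The file name and hash are placeholders, and in practice you'd also verify the signing certificate, but the principle is just this:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 so large APKs never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_apk(apk_path: str, published_hash: str) -> bool:
    """True iff the APK on disk matches the developer-published digest."""
    return sha256_of(apk_path) == published_hash.lower()
```

A mismatch doesn't tell you *what* changed, only that the artifact you received isn't the one the developer published, which is exactly the tampering the store-only distribution model makes invisible.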
There's a known-good path. Most people do not take it. Their choice not to does not invalidate or eliminate the desirable properties of the known-good path (verifiability, trustworthiness).
I genuinely do not understand the argument you and the other user are making. It reads to me like an argument that goes "Yes, there's a known, accurate, and publicly documented recipe to produce a cure for cancer, but it requires prerequisite knowledge to understand that most people lack, and it's burdensome to follow the recipe, so most people just buy their vials from the untrustworthy CancerCureCorporation, who has the ability to give customers a modified formula that keeps them sick rather than giving them the actual cure, and almost nobody makes the cure themselves without going through this untrustworthy but ultimately optional intermediary, so the public documentation of the cure doesn't matter at all, and there's no discernible difference between having the cure recipe and not having the cure recipe."
No, you're completely off the rails from the first sentence. It is absolutely possible --- in some ways more possible[†] --- to make a chain of trust without source availability. Your premise is that "reverse engineering" is somehow incomplete or lossy with respect to uncovering software behavior, and that simply isn't true.
[†] Source is always good to have, but it's insufficient.
Never once anywhere in this thread have I claimed that source code alone is sufficient by itself to establish a chain of trust, merely that it is a necessary prerequisite to establish a chain of trust.
That said, you seem to be refuting even that idea. While your reputation precedes you, and while I haven't been in the field quite as long as you, I do have a few dozen CVEs, and I've written surreptitious side-channel backdoors and broken production cryptographic schemes in closed-source software through binary analysis, as part of a red team alongside former NCC folks. I don't know a single one of them who would say that lacking access to source code increases your ability to establish a chain of trust.
Can you please explain how lacking access to source code, being ONLY able to perform dynamic analysis, rather than dynamic analysis AND source code analysis, can ever possibly lead to an increase in the maximum possible confidence in the behavior of a given binary? That sounds like a completely absurd claim to me.
I see what's happening. You're working under the misapprehension that static analysis is only possible with source code. That's not true. In fact: a great deal of real-world vulnerability research is performed statically in a binary setting.
There's a lot of background material I'd have to bring in to attempt to bring you up to speed here, but my favorite simple citation here is just: Google [binary lifter].
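A toy sketch of the idea, using Python bytecode as a stand-in for native code and the `dis` module as a stand-in for a disassembler/lifter (the snippet being "shipped" is invented for illustration): given only the compiled artifact, with the source discarded, you can still statically enumerate what the code is able to touch, without ever running it.

```python
import dis

# Compile a snippet and throw away the source; the code object is all we keep,
# which is roughly the position you're in with a shipped binary.
shipped = compile(
    "import socket\nsocket.create_connection(('example.com', 443))\n",
    "<shipped>", "exec",
)

def touches_module(code_obj, module_name):
    """Statically scan bytecode (no execution) for any import or load of the
    named module, recursing into nested code objects."""
    for ins in dis.get_instructions(code_obj):
        if ins.opname in ("IMPORT_NAME", "LOAD_NAME", "LOAD_GLOBAL") and ins.argval == module_name:
            return True
    for const in code_obj.co_consts:
        if hasattr(const, "co_code") and touches_module(const, module_name):
            return True
    return False

print(touches_module(shipped, "socket"))  # → True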
That assumption about me is not accurate at all: I've done static analysis professionally on CIL, on compiled bytecode, and on source code. Instead of being condescending and patronizing toward someone you don't know and about whom you've made factually inaccurate assumptions, can you please explain how having just a binary, and no access to source code, gives you more information about, greater confidence in, and a stronger basis for trust in the behavior of a binary than having the binary AND the source code used to build it?
I have no idea who you are and can only work from what you write here, and with this comment, what you've written no longer makes sense. The binary (or the lifted IR form of the binary or the control flow graph of the binary or whatever form you're evaluating) is the source of truth about what a program actually does, not the source code.
The source code is just a set of hints about what the binary does. You don't need the hints to discern what a binary is doing.
I'm not disputing that the binary is the source of truth about behavior. I never said it wasn't, and I don't know where you got the idea that I did. It's been very frustrating to have to keep doing this: you and akerl_ have both been attacking strawman positions I do not hold and never stated, and being condescending and patronizing in the process. Is it possible you're making assumptions about me based on arguments from other people that sound similar to mine? I'd really appreciate not having to keep reminding you that I've never made the claims you're implying I'm making, if that's not too much to ask.
At a high level, what I'm fundamentally contending is that WhatsApp is less trustworthy and secure than Signal. I can have a higher degree of confidence in the behavior and trustworthiness of the Signal APK I build from source myself than I can from WhatsApp, which I can't even build a binary of myself. I'd simply be given a copy of it from Google Play or Apple's App Store.
Signal's source code exhibits known trustworthy behavior, i.e. not logging long-term and ephemeral cryptographic keys and shipping them off to someone else's servers. Sure, Google Play and Apple can modify this source code, add a backdoor, and distribute a binary whose behavior doesn't match the published source. But you can detect this fairly easily, because you have a point of reference to compare against: you know what the bytecode compiled from the source you've reviewed looks like, because you can build it yourself, no trust required[1], and it's not difficult to see when another build differs.
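The "point of reference" comparison above can be sketched in a few lines, since an APK is just a zip archive. This is a rough approximation, under the assumption that the build is reproducible enough for member-by-member hashing to be meaningful; signing metadata under META-INF/ always differs, so it's excluded:

```python
import hashlib
import zipfile

def member_hashes(apk_path: str) -> dict:
    """Map each file inside the APK (a zip archive) to its SHA-256 digest.
    META-INF/ is skipped: signing metadata differs even for identical builds."""
    out = {}
    with zipfile.ZipFile(apk_path) as z:
        for name in z.namelist():
            if name.startswith("META-INF/"):
                continue
            out[name] = hashlib.sha256(z.read(name)).hexdigest()
    return out

def diff_builds(local_apk: str, store_apk: str) -> list:
    """Names of members that differ between your build and the store's build."""
    a, b = member_hashes(local_apk), member_hashes(store_apk)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty diff means the store shipped you what the source produces; a non-empty one tells you exactly which files to pull into a disassembler. Note that none of this is possible for an app you cannot build yourself.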
With WhatsApp, you don't even have a point of reference for known good behavior (i.e. not logging long-term and ephemeral cryptographic keys and shipping them off to someone else's servers) in the first place. You can monitor all the disk writes and all the network activity, but just because YOU don't observe cryptographic keys being logged, in memory, on disk, or sent off to some other server, doesn't mean there isn't code present to perform exactly those functions under conditions you've never met and never would. It's entirely technically feasible for Google and Apple to fingerprint a laundry list of identifiers of known security researchers and ship them binaries whose behavior differs from that of ordinary users, or even to ship targeted backdoored binaries to specific users at the demand of various intelligence agencies.
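The trigger-gated behavior described above is trivial to write and, by construction, invisible to a dynamic-analysis run that never hits the trigger. A deliberately simplified toy (every name and value here is invented; a real implant would hide the trigger far better):

```python
import hashlib

# Hypothetical trigger set: hashes of device IDs the implant activates on.
TARGET_IDS = {hashlib.sha256(b"some-specific-device").hexdigest()}

captured = []  # stand-in for "ship keys to a remote server"

def maybe_exfiltrate(device_id: str, session_key: bytes) -> None:
    """Looks completely inert under ordinary dynamic analysis: unless the
    device ID hashes into the target set, no key ever leaves this function."""
    if hashlib.sha256(device_id.encode()).hexdigest() in TARGET_IDS:
        captured.append(session_key)

# A researcher's instrumented test run observes nothing...
maybe_exfiltrate("researchers-test-phone", b"k1")
print(len(captured))  # → 0
# ...while the targeted device leaks its key.
maybe_exfiltrate("some-specific-device", b"k2")
print(len(captured))  # → 1
```

Hashing the trigger list is the detail that matters: even a reverse engineer who finds this code can see *that* there is a gated path, but not *who* it targets, and a black-box observer who never supplies the magic input sees nothing at all.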
The upper limit for the trustworthiness of a Signal APK you build from source yourself is on a completely different planet from the trustworthiness of a WhatsApp APK you only have the option of receiving from Google.
And again, none of this even begins to factor in Meta's extensive track record of deliberately misleading users about privacy and security through deceptive marketing, and of extensively subverting users' privacy. Onavo wasn't just capturing all traffic; it was literally mounting MITM attacks against other companies' analytics servers with forged TLS certificates. Meta was criminally investigated for this, and during discovery it came out that executives understood what was going on, understood how wrong it was, and deliberately continued the practice anyway. Technical analysis of the binaries and source code aside, it's plainly ridiculous to suggest that software made by that same corporation is as trustworthy as Signal. One of these apps is a messenger made by a company with a history of explicitly misleading users with deceptive privacy claims and employing non-trivial technical attacks against its own users to violate their privacy; the other is made by a nonprofit with a track record of being arguably one of the single largest contributors to robust, accessible, audited, verifiable secure cryptography in the history of the field. I contend that suggesting these two applications are equally secure is irrational, impossible to demonstrate or verify, and indefensible.
[1] Except for your compiler, linker, etc.; Ken Thompson's 'Reflections on Trusting Trust' still applies here. The argument isn't that source availability automatically means 100% trustworthy; it's that the upper bound on trustworthiness is higher than without source availability.
It's clear we're not going to agree on the technical discussion, but I do want to reply to the claim that I've been strawmanning you.
I've been largely ignoring your sideline commentary about not trusting Meta and their other work outside of WhatsApp. Mostly because the whole thrust of my argument is that an app's security is confirmed by analyzing what the code does, not by listening to claims from the author.
Beyond that, I've been commenting in good faith about the core thrust of our disagreement, which is whether or not a lack of available source code disqualifies WhatsApp as a viable secure messaging option alongside Signal.
As part of that, I had to respond midway through because you put a statement in quotation marks that was not actually something I'd said.
Sorry, no, I'm not going to pick this apart. You wrote:
Can you please explain how lacking access to source code, being ONLY able to perform dynamic analysis, rather than dynamic analysis AND source code analysis, can ever possibly lead to an increase in the maximum possible confidence in the behavior of a given binary?
This doesn't make sense, because not having source code doesn't limit you to dynamic analysis. I assumed, 2 comments back, you were just misunderstanding SOTA reversing; you got mad at me about that. But the thing you "never stated it wasn't" is right there in the comment history. Acknowledge that and help me understand where the gap was, or this isn't worth all the words you're spending on it.
Great, then it sounds like we agree: your original equivalence of Signal and WhatsApp was misguided, since one offers a verifiable chain of trust that starts with source availability and the other doesn't, to say nothing of the lengthy history of untrustworthiness and extensive, deliberate privacy violations of the company that owns and maintains WhatsApp, right?
No, we don’t agree. There are things that source code is good for, but validating the presence or absence of illicit data-stealing code in apps delivered to consumers is not one of them. Source code can show you obvious malfeasance, but since it’s not enough to rule out covert malfeasance, you’re stuck analyzing the compiled app in both cases.
The population of users who have a verifiable path from an open source repo to an app on their device is a rounding error in the set of humans using messaging apps.
I think we've both made our positions clear. From my perspective, you're continuing to heavily cite user statistics that are irrelevant to the properties of verifiability or trustworthiness of the applications themselves, the goalposts I am discussing keep being moved, and there is a repeated pattern of neglect to address the points I'm raising. Readers can judge for themselves. Curious readers should also read about the history of Meta's Onavo VPN software and resulting lawsuits and settlements in evaluating the credibility of Meta's privacy marketing.
Just to be crystal clear about the goalposts: I said at the start of this chain that if somebody wants secure messaging, they should use Signal or WhatsApp.
You raised concerns about the lack of source availability, and I’ve been consistent in my replies that source availability is not how somebody who wants secure messaging is going to know they’re getting it. They’re going to get it because they’re using a popular platform with robust primitives, whose compiled/distributed apps receive constant scrutiny from security researchers.
Signal and WhatsApp are that. Concerns about Meta’s other work are just noise, in part because analysis of the WhatsApp distributed binaries doesn’t rely on promises from Meta.