When I was on Google Docs, I watched the Google Forms team build a sophisticated ML model that attempted to detect when people were using it for nefarious purposes.
It underperformed banning the word "password" from a Google Form.
I wonder if this is just an example of Goodhart's law. How did they measure the performance of those models? I would imagine they measured against known cases of Forms misuse, i.e. the forms that contained a 'password' field.
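To make the measurement concern concrete, here is a toy sketch. The data is made up and the function names are hypothetical; it is not anything the Forms team actually ran. It assumes the "known misuse" labels were themselves collected by looking for password fields, in which case a plain keyword rule scores perfectly against them by construction and tells you nothing about misuse that avoids the word.

```python
def keyword_rule(form_text: str) -> bool:
    """Flag a form if it mentions the word 'password' (case-insensitive)."""
    return "password" in form_text.lower()


def recall(predict, labeled_forms) -> float:
    """Fraction of forms labeled as misuse that the predictor flags."""
    positives = [text for text, is_misuse in labeled_forms if is_misuse]
    if not positives:
        return 0.0
    return sum(predict(text) for text in positives) / len(positives)


# Hypothetical evaluation set whose positive labels were found by
# searching for password fields (the assumption being illustrated).
biased_labels = [
    ("Enter your email and password to verify your account", True),
    ("Password reset: confirm your current password", True),
    ("Team lunch order form", False),
]

# Hypothetical misuse that never uses the word 'password'.
unseen_misuse = [
    ("Confirm your bank PIN and card number to claim your prize", True),
]

print(recall(keyword_rule, biased_labels))
print(recall(keyword_rule, unseen_misuse))
```

If the evaluation set looked like the first list, the keyword ban winning would be close to guaranteed, regardless of how good the model was.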
I followed his blog back when he started this descent, and I have a theory that it was hill climbing.
He used to blog about pretty innocent stuff: his wife making fun of him for wearing pajama pants in public, behind-the-scenes posts on drawing comics, funny business interactions he'd had. But then he started getting taken out of context by various online-only publications, and he'd get a burst of traffic and a bunch of hate mail and then it'd go away. And then he'd get quoted out of context again. I'm not sure if it bothered him, but he started adding preambles to his posts, like "hey suchandsuch publication, if you want to take this post out of context, jump to this part right here and skip the rest."
I stopped reading around this point. But later when he came out with his "Trump is a persuasion god, just like me, and he is playing 4D chess and will be elected president" schtick, it seemed like the natural conclusion of hill climbing on controversy. He couldn't be held accountable for the prediction. After all, he's just a comedian with a background in finance, not a politics guy. But it was a hot take on a hot topic that was trying to press buttons.
I'm sure he figured out before most people that being a newspaper cartoonist was a downward-trending gig, and that he'd never fully transition to online. But I'm sad that this was how he decided to make the jump to his next act.
Ahh, so that's what I've internally called "The Sharpiro Effect" really is. Though it's still a bigger shame that a philosophy professor would need to resort to this compared to a newspaper cartoonist.
I should have clarified for people who had the good fortune to not be exposed to these posts, but that was usually his lead-in to his ultra toxic writing, i.e. it was an engaging hook that led to more engaging trolling.
"Chuck Norris facts" was a text-only meme format from the mid '00s. Stuff like "Chuck Norris is the only man to ever defeat a brick wall in a game of tennis" or "When Chuck Norris does push-ups, he doesn't push himself up, he pushes the Earth down." The Jeff Dean Facts use the same format. It doesn't have anything to do with Chuck Norris himself.
I vaguely remember another instance of this around a guy in the army - I forgot if it was at boot camp or what the rank was, but it was something along the lines of “things I’m no longer allowed to do” and just had a bunch of silly military joke/prank type things… man I wonder if I could dig that up again, I think it might have been late 90s internet.
That would be Skippy's List[0], which as far as I know is the seminal work in the genre (at least on the internet). I originally learned about it through a (rather less compact) version about someone's D&D crimes[1], which was closer to my cultural wheelhouse, but the original holds up even if you have to google some phrases.
With the current crop of LLMs/agents, I find that refactors still have to be done at a granular level. "I want to make X change. Give me the plan and do not implement it yet. Do the first thing. Do the second thing. Now update the first call site to use the new pattern. You did it wrong and I fixed it in an editor; update the second call site to match the final implementation in $file. Now do the next one. Do the next one. Continue. Continue.", etc.
Because the systems are so complex and capable of emergent behavior that you need a human in the loop to truly interpret behavior and impact. Just because an alert is going off doesn't mean that the alert was written properly, or is measuring the correct thing, or the customer is interpreting its meaning correctly, etc.
Health probes are at the easiest end of the software complexity spectrum. This has nothing to do with complexity and everything to do with managing reputational damage in a shady way.
Cash flow is another facet of paying off your mortgage early, and I think it’s underrated. Eliminating thousands of dollars from your monthly expenses dramatically increases your flexibility. Since most people have “cash / reserve fund” and “retirement investments (do not touch)” as their major financial categories, it optimizes the one you interact with the most. You don’t need to always make the maximum possible to keep a comfortable amount of cash on hand, which gives you more flexibility to take time off between jobs, or tank a layoff, or take that startup job that pays less (but damn if it doesn’t look fun). Personally I recently bought a second apartment adjacent to my first in order to combine them into a 3br. Paying off the first mortgage years ago was the difference between being able to afford the monthly expenses and not.
Obviously you need to consider both net worth and cash flow when making a decision like that, but don’t underrate the difference that improved cash flow makes!
On top of that, the software world has changed dramatically since Bazel was first released. In practice, a git hash and a compile command for a command runner are more than enough for almost everyone.
What has changed in the past ~15 years? Many libraries and plugins have their own compilers nowadays. This increases the difficulty of successfully integrating with Bazel. Even projects that feel like they should be able to properly integrate Bazel (like Kubernetes) have removed it from the project as a nuisance.
Back when it was first designed, even compiling code within the same language could be a struggle; I remember going through many iterations of DLL hell back when I was a C++ programmer. This was the "it works on my machine" era. Bazel was nice because you could just say "Download this version of this thing, and give me a BUILD file path where I can reference it." Sometimes you needed to write some Starlark, but mostly not.
But now, many projects have grown in scale and complexity and they want to have their own automated passes. Just as C++ libraries needed special library wrappers for autotools within Bazel, now you often need to write multiple library compiler/automation wrappers yourself in any context. And then you'll find that Bazel's assumptions don't match the underlying code's. For example, my work's Go codebase compiles just fine with a standard Go compiler, but gazelle pukes because (IIRC) one of our third-party codegen tools outputs files with multiple packages to the same directory. When Etsy moved its Java codebase to Bazel, they needed to do some heavy refactoring because Bazel identified dependency loops and refused to compile the project, even though it worked just fine with javac. You can always push up your monocle and derisively say "you shouldn't have multiple packages per directory! you shouldn't have dependency loops!", but you should also have a compiler that can run your code just like the underlying language without needing to influence it at all.
That's why most engineers just need command runners. All of these languages and libraries are already designed to successfully run in their own contexts. You just need something to kick off the build with repeatable arguments across machines.
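As a minimal sketch of what "a git hash and a compile command" can look like in practice: the script below pins the build to whatever commit is checked out and runs one fixed command. The `go build ./...` invocation and all function names here are illustrative assumptions, not any project's real setup; substitute whatever your language's native toolchain expects.

```python
import subprocess
from typing import Sequence

# Hypothetical build command; swap in your project's native toolchain call.
BUILD_COMMAND = ["go", "build", "./..."]


def current_commit() -> str:
    """Return the full hash of the checked-out commit (requires git)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def describe_build(commit: str, command: Sequence[str]) -> str:
    """Format a one-line record of which commit was built with which command."""
    return f"{commit[:12]}: {' '.join(command)}"


def run_build(command: Sequence[str] = BUILD_COMMAND) -> int:
    """Log the commit and command, run the build, and return its exit code."""
    print(describe_build(current_commit(), command))
    return subprocess.run(list(command)).returncode
```

Calling `run_build()` from the repository root is the whole workflow: same commit, same arguments, on every machine. It does no caching or dependency analysis, which is the trade-off being argued for here.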
I mostly found TLM a disservice to people who reported to TLMs. They didn't have to earn a promotion as both an engineer and a manager at the same time, so many optimized for their own engineering promotion and any managing they did was out of the goodness of their hearts.
To play devil's advocate (I don't work at Google or in a similar role): if the requirements for an engineering promotion are similar for technical managers and engineers, while the former also have to manage people, then this is just how the system is set up. In that case I think blaming the system more than the people is justified, and Google decided to dismantle the role for some reason.
It underperformed banning the word "password" from a Google Form.
So that's what they went with.