Google really doesn't have a leg to stand on here. They scrape the Internet. They have replaced content against users' wishes multiple times, as with AMP. Their entire recent business model has been to give you answers they learned from scraping your website, and now they want to sue other people for doing the same.
Data wants to be free. They knew that once.
EDIT: Also, to be clear, I am not saying they can't win legally. I'm sure they can play legal games and shop around until they're successful. They are in the wrong conceptually.
As the post says, Google only scrapes the websites that want to be scraped. Sure, it's opt-out (via robots.txt) rather than opt-in, but they do give you a choice. You can even opt out of all scraping or opt out on a per-crawler basis, and Google will absolutely honor your preferences in that regard.
SerpApi just assumes everybody wants to be scraped and doesn't give you a choice.
(whether websites should have such a choice is a different matter entirely).
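For illustration, that per-crawler opt-out looks something like this in robots.txt (Googlebot is the search crawler; Google-Extended is the token Google documents for opting out of AI-training use):

    # Let Google Search crawl everything
    User-agent: Googlebot
    Allow: /

    # But opt out of use for AI training
    User-agent: Google-Extended
    Disallow: /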
Requiring me to explicitly opt out of something is NOT the same thing as getting my consent. So your argument breaks down there.
You know what getting my consent would look like? Google hosting a form where I can tell them PLEASE SCRAPE MY WEBSITE and include it in your search results. That is what consent looks like.
Google has never asked for my consent. Yet they expect others to behave by different rules.
Now, where Google may have a reasonable case is that Google scrapes with the intention of offering the data "for free". SerpApi does not.
It's never been the case that you can put something out in public and still reserve the right to refuse public access. Either it's public and strangers can look at it, or it's private and you need to implement a gate.
If this is about protecting third parties from being scraped, why does Google have an interest at all? Surely Google won't have the relevant third-party data itself because, as you say, Google respects robots.txt. So how can that data be scraped from Google?
I don't think this suit is actually about that, though. I think Google's complaint is that
> SerpApi deceptively takes content that Google licenses from others
In other words, this is just a good old-fashioned licence violation.
Unfortunately they do have a couple of points that may prove salient (though I fully agree about them being scrapers also).
You can search Google _for free_ (with all the caveats of that statement); part of their grievance is that SerpApi sells the scraped data as a paid service.
A lot of Google's bot blocking is also being circumvented, something they seem to have put a lot of effort into over the past year:
- robots.txt directives (fwiw)
- You need JS
- If you have no cookie you'll be given a set of JS fingerprints, apparently one set for mobile and one for desktop. You may have to tweak which fingerprints you give back in order to get results matched to the user agent, etc.
Google was never that bothered about scraping if it was done at a reasonable volume. With pools of millions of IPs and a handle on how to get around their blocking, they're at the mercy of how polite the scraping is. They're maybe also worried about people reselling data en masse to competitors, i.e. their usual "all your data belongs to us, and only us".
I thought the ads counted as payment? That seems to be the logic used to take technical measures against adblockers on YouTube while pushing users towards a paid ad-free subscription, at least.
If viewing ads is payment, then Google isn't a free service. If viewing ads isn't payment, then Google should have no problem with people using adblockers.
I don't disagree with the logic, and it definitely is/was their business model: scraping/crawling the web and subsidising the service with ads. But clicking on ads is optional.
Eh, and in 20 years, if SerpApi or whatever the fuck becomes the next Google, they'll have a blog post titled "Why we're taking legal action against BlemFlamApi data collection".
The biggest joke was all the “hackers” 25 years ago shouting “Don’t be evil like Oracle, Microsoft, Apple or Adobe and charge for your software, be good like Google and just put like a banner ad or something and give it away for free”
We need a legal precedent that enshrines adversarial interoperability as legal so that we can have a competitive market of BlemFlamApis with no risks of being sued.
It could be because, when you leave an SQL server exposed, it often turns into much worse things. For example, without additional configuration, PostgreSQL defaults to a configuration that can lead to the entire host machine being owned. There is probably some obscure feature that allows system process management, uploading a shell script, or something else that isn't disabled by default.
The end result is that "everyone" kind of knows that if you put a PostgreSQL instance up publicly facing without a password or with a weak/default password, it will be popped in minutes, and you'll find out about it because the attackers are lazy and just run crypto-mining malware, etc.
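One concrete example of the kind of feature meant above (this one is real in PostgreSQL): COPY ... FROM PROGRAM executes a shell command as the database server's OS user. It's restricted to superusers, but exposed instances are often running everything as the superuser anyway. A harmless illustration:

    -- Runs a shell command on the database host and captures its output.
    CREATE TABLE cmd_output (line text);
    COPY cmd_output FROM PROGRAM 'id';
    SELECT * FROM cmd_output;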
IMO the important tradeoff here is that a few microseconds spent sanitizing memory saves millions of dollars of headache when memory-unsafe languages fail (which happens regularly).
I agree. I almost feel like this should be a flag in `free`: if you pass 1 or something as a second argument (or maybe a `free_safe` function or something), it will automatically `memset` whatever it's freeing to 0 and then do the normal freeing.
Alternatively, just make free do that by default, adding a fast_and_furious_free which doesn't do it, for the few hotspots where that tiny bit of performance is actually needed.
The default case should be the safe correct one, even if it “breaks” backward compatibility. Without it, we will forever be saddled with the design mistakes of the past.
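A minimal sketch of what such a zeroing free could look like (the name and signature are hypothetical, not any real libc API; note that a plain memset before free can be elided as a dead store, which is why C11 added memset_s and glibc ships explicit_bzero):

    #include <stdlib.h>

    /* Hypothetical zeroing free: the caller passes the allocation size,
       since free() has no portable way to recover it. */
    void free_zero(void *p, size_t n)
    {
        if (p == NULL)
            return;
        volatile unsigned char *q = p;  /* volatile writes resist elision */
        while (n--)
            *q++ = 0;
        free(p);
    }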
(a) they wanted to make it hard to know which console was the newest or oldest, to prompt users to just throw up their arms and buy the newest Xbox?
(b) they wanted each one to sound unique and distinct so they would never get confused with one another, but the sequence is difficult to understand
The current benefit of a Framework is that you can swap out the entire guts without being an expert and everything still works together. Most of the laptops I have provide 2 SO-DIMM slots and a slot for either NVMe or SATA storage.
So for me, there is little value in that in most scenarios. There are a few laptop chassis that I am very fond of and have wished I could "use that chassis with that hardware", but even then I haven't seen Framework chassis designs that give me that impression. I'm not saying they're crappy, but I'm thinking of different types of brushed metal, magnesium alloy stuff, etc.
It makes me wonder who their audience is if they are targeting users that will pay a premium for an upgradable system, but are afraid of modifying the guts of the computer.
Is this why, back in the day, a Linux distro would sometimes have a multi-monitor setup where each monitor was an actually separate desktop (its own cube, for example)? There was a time when, with an Nvidia graphics card in that type of configuration, a window could not be moved from one screen to another, etc.
A very stupid hack that could "fix" this would be to buffer the h264 stream at the data center using a proxy before sending it to the real client, etc.
Yes, but the real issue (IMO) is that something is causing an avalanche of some kind. You would much rather have a consistent 100 ms of increased latency for this application if it works much better for users with high loss, etc. Also, to be clear, this is basically just a memory cache; I doubt it would add any "real" latency like that.
The idea is that if the fancy system works well on connection A and poorly on connection B, what are the differences, and how can we modify the system so that A and B are the same from its perspective?
For all the praise he gets here, few seem interested in his methods: writing complete programs, based on robust computer science, with minimal dependencies and tooling.
When I first read the source for his original QuickJS implementation I was amazed to discover he created the entirety of JavaScript in a single xxx thousand line C file (more or less).
That was a sort of defining moment in my personal coding; a lot of my websites and apps are now single file source wherever possible/practical.
Is there any meaningful project, as large as possible, in a single source file (or a normal project with an amalgamation version), that could be compiled directly with rustc -o executable src.rs? Just to compare build time / memory consumption.
Yes, that's why I asked about possible Rust support for creating such a version of a normal project. The main issue is that I'm unaware of comparably large Rust projects without third-party dependencies.
I believe ripgrep has only or mostly dependencies that the main author also controls. It's structured so that ripgrep depends on regex crates by the same author, for example.
I honestly think the single file thing is best reserved for C, given how bad the language support for modularity is.
I've had the inverse experience: dealing with a many-thousand-line "core.php" file way back in the day, helping debug an ExpressionEngine site (back in the PHP 5.2-ish days), and it was awful.
Unless you have an editor which can create short links in a hierarchical tree from semantic comments to let you organize your thoughts, digging through thousands of lines of code all in the same scope can be exceptionally painful.
C has no problem splitting programs into N files, to be honest.
The reason FB (and myself, for what it is worth) often writes large single-file programs (Redis was split after N years of being a single file) is that with enough programming experience you know one very simple thing: complexity is not about how many files you have, but about the internal structure and conceptually separated module boundaries.
At some point you split mainly for compilation time and to better orient yourself, instead of having to seek around a very large mega-file. Pointing the finger at some program as well written because it's a single file strongly correlates with not being a very expert programmer.
The file granularity you chose was at the perfect level for somebody approaching the source code to understand how Redis worked. It was my favorite codebase to peruse and hack. It's been a decade and my memory palace there is still strong.
It reminded me how important organization is to a project and certainly influenced me, especially applied in areas like Golang package design. Deeply appreciate it all, thank you.
I split to enforce encapsulation by defining interfaces in headers based on incomplete structure types. So it helps me with the conceptually separated module boundaries. Super fast compilation is another benefit.
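A minimal sketch of that pattern (module name hypothetical): the header exposes only an incomplete type, so callers can never reach into the struct, and the layout can change without touching them:

    /* counter.h -- the interface; the struct layout stays hidden */
    typedef struct Counter Counter;        /* incomplete type */
    Counter *counter_new(void);
    void counter_inc(Counter *c);
    long counter_get(const Counter *c);
    void counter_free(Counter *c);

    /* counter.c -- the only file that knows the layout */
    #include <stdlib.h>
    #include "counter.h"

    struct Counter { long n; };

    Counter *counter_new(void) { return calloc(1, sizeof(struct Counter)); }
    void counter_inc(Counter *c) { c->n++; }
    long counter_get(const Counter *c) { return c->n; }
    void counter_free(Counter *c) { free(c); }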
Reminds of one time when I was pair programming and the other chair said “let’s chop this up, it’s too long” and when I queried the motivation (because I didn’t think it was too long), it was something like, “I’m very visual, seeing the file tree helps me reason about internals”. Fair enough, I thought at the time, whatever makes us more productive together.
On reflection, however, I’m unsure how that goes when working on higher-order abstractions or cross-cutting concerns that haven’t been refactored, and it’s too late to ask.
It may not be immediately obvious how to approach modularity since it isn't directly accomplished by explicit language features. But, once you know what you're doing, it's possible to write very large programs with good encapsulation, that span many files, and which nevertheless compile quite rapidly (more or less instantaneously for an incremental build).
I'm not saying other languages don't have better modularity, but to say that C's is bad misses the mark.
Unironically, JavaScript is quite good for single-file projects (albeit a package.json is usually needed).
You can do a huge website entirely in a single file with NodeJS; you can stick re-usable templates in vars and abuse multi-line strings (template literals) for all your various content and markup. If you get crafty you can embed client-side code in your 'server.js' too, or take it to the next level and use C++ multi-line string literals to wrap all your JS, i.e. client.js, server.js, and package.json, in a single .cpp file.
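A minimal sketch of that single-file approach (names hypothetical; just Node's built-in http module, so not even a package.json is needed for this much):

    // server.js -- the whole "site" in one file; templates are just
    // functions returning template literals.
    const http = require('http');

    const page = (title, body) => `<!doctype html>
    <html><head><title>${title}</title></head>
    <body>${body}</body></html>`;

    http.createServer((req, res) => {
      res.writeHead(200, { 'Content-Type': 'text/html' });
      res.end(page('Hello', '<h1>Served from one file</h1>'));
    }).listen(8080);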
I agree: he loves to "roll your own" a lot. Re: minimal dependencies - the codebase has a software FP implementation including printing and parsing, and some home-rolled math routines for trigonometric and other transcendental functions.
Honestly, it's a reminder that, for the time it takes, it's incredibly fun to build from scratch and understand through-and-through your own system.
Although you have to take detours from, say, writing a bytecode VM, to writing FP printing and parsing routines...
Because he chose the hardest path: difficult problems, no shortcuts, ambitious, taking time to complete. Our environment in general is the opposite of that.
We spend a lot of time doing busy work that's part of the process but doesn't actually move the needle. We write a lot of code that manages abstractions, but doesn't do a lot. All of this busy work feels like progress, but it's avoiding the hard work of actually writing working code.
We underestimate how inefficient working in teams is compared with individuals. We don't value skill and experience and how someone who understands a problem well can be orders of magnitude more productive.
You are absolutely wrong here. Most of us wish that somebody would get him to sit for an in-depth interview and/or get him to write a book on his thinking, problem-solving approach, advice etc. i.e. "we want to pick his brain".
But he is not interested and seems to live on a different plane :-(
Sure. You’ll notice no libraries, no CI, no issue tracker, written in C, no landing page, no dashboard.
So much of the discussion here is about professional practice around software. You can become an expert in this stuff without ever actually learning to write code. We need to remember that most of these tools are a cost whose only benefit is managing collaboration between teams. The smaller the team, the less stuff you need.
I also have insights from reading his C style but they may be of less interest.
I think it’s also impressive that he identifies a big and interesting problem to go after that’s usually impactful.
I thought Bellard might be behind even llama.cpp (that would be completely expected for Bellard) but it's actually another great who's done that: Georgi Gerganov: https://github.com/ggerganov
As a maintainer of an ASN.1 compiler, I think his ASN.1 compiler must be quite awesome (it's not open source), and it's brilliant of him to make it proprietary. I bet he makes good money from it.
I remember LZEXE from those olden days. When I discovered the author of FFmpeg and QEMU also created LZEXE, I was so impressed. I've been using his software for my entire computing career.
It's similar to the respect I have for the work of Anders Hejlsberg, who created Turbo Pascal, with which I learned to program; and also C# and TypeScript.
Reading about APE similarly sparked that for me; the amount of ingenuity and sheer amazingness (if not perhaps a touch of depravity) that goes into these kinds of endeavors is awe-inspiring.
Always interesting when people as talented as Bellard manage to (apparently) never write a "full-on" GUI-fronted application, or more specifically, a program that sits between a user with constantly shifting goals and workflows and a "core" that can get the job done.
I would not want to dismiss or diminish by any amount the incredible work he has done. It's just interesting to me that the problems he appears to pick generally take the form of "user sets up the parameters, the program runs to completion".
Reading some of these comments, it's clear very few in here have ever written a production customer-facing full-stack app. "JavaScript is really good for a single file app!!!" OK, maybe if you're rendering static HTML... these are not serious people.
People only deny the existence of such people based on their own ego, believing that no one could possibly be worth 10x more or produce 10x more than they can. Those who have seen those people know full well these people exist.
It's kind of crazy it ever became an accepted worldview, given how every field has a 10xer who is rather famous for it, whether it be someone who dominates in sport, an academic like Paul Erdős or Euler, a programmer like Fabrice or Linus Torvalds, a leader like Napoleon, or any number of famous inventors throughout history.
You can call 1000 average programmers and see if they can write MicroQuickJS in the same amount of time, or call one average programmer and see if he/she can write MicroQuickJS to the same quality in his/her lifetime. 10X, 100X or 1000X measures the productivity of us mortals, not someone like Fabrice Bellard.
Fabrice, if you're reading this, please consider replacing Rust with your own memory-safe language.
The design intent of Rust is a powerful idea, and Rust is the best of its class, but the language itself is under-specified[1] which prevents basic, provably-correct optimizations[0]. At a technical level, Rust could be amended to address these problems, but at a social level, there are now too many people who can block the change, and there's a growing body of backwards compatibility to preserve. This leads reasonable people to give up on Rust and use something else[0], which compounds situations like [2] where projects that need it drop it because it's hard to find people to work on it.
Having written low-level, high-performance programs, Fabrice Bellard has the experience to write a memory-safe language that allows hardware control. And he has the faculties to assess design changes without tying them up in committee. I covet his attention in this space.
I think Rust might trigger a new generation of languages that are developed with the hindsight of Rust.
The principle of zero-cost abstractions avoids a slow slide into creeping abstraction costs, but I think there could be small-cost abstractions that would make for a more pragmatic language. Having Rust to point at, to show what performance you could be achieving, would help avoid bloating abstractions.
For all the praise he's receiving, I think his web design skills have gone overlooked. bellard.org is fast, responsive and presents information clearly. Actually I think the fancier the website, the shittier the software. Examples: Tarsnap - minimal website, brilliant software. Discord - Whitespacey, animation-heavy abomination of a website. Software: hundreds of MB of JS slop, government wiretap+botnet for degenerates.