2. apt repositories are cryptographically signed, centrally controlled, and legally accountable.
3. apt search is understood to be approximate, distro-scoped, and slow-moving. Results change slowly and rarely break scripts. PyPI search rankings change frequently by necessity.
4. Turning PyPI search into an apt-like experience would require distributing a signed, periodically refreshed global metadata corpus to every client. At PyPI’s scale, that is nontrivial in bandwidth, storage, and governance terms.
5. apt search works because the repository is curated, finite, and opinionated.
The install side is basically Merkle-friendly (immutable artifacts, append-only metadata, hashes, mirrors).
Search isn’t. Search results are derived, subjective, and frequently rewritten (ranking tweaks, spam/malware takedowns, popularity signals). That’s more like constantly rebasing than appending commits.
You can Merklize “what files exist”; you can’t realistically Merklize “what should rank for this query today” without freezing semantics and turning CLI search into a hard API contract.
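To make the "what files exist" half concrete, here is a minimal sketch of folding artifact hashes into a single Merkle root. The artifact names and contents are made up for illustration, and the odd-level rule (duplicate the last node) is just one common convention, not something PyPI or apt is known to use in this form.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a list of leaf hashes into a single root hash."""
    if not leaves:
        return sha256(b"")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            # Assumed convention: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical immutable artifacts: name -> file contents.
artifacts = {
    "pkg-1.0.tar.gz": b"release 1.0",
    "pkg-1.1.tar.gz": b"release 1.1",
}

# One leaf per artifact, in a stable (sorted) order so every mirror agrees.
leaves = [sha256(name.encode() + b"\0" + blob) for name, blob in sorted(artifacts.items())]
root = merkle_root(leaves)
print(root.hex())
```

Appending a new artifact changes the root in a predictable way, which is why the install side fits this model. A search ranking has no equivalent stable leaf set: the "leaves" would be reordered and rewritten on every ranking tweak.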
Isn't there a library out there for this common set of problems? I know Unicode provides normalization tables, though I don't know how good they are and I don't know if Unicode also provides a library.
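For what it's worth, Python's standard library ships the Unicode normalization forms in `unicodedata`, and ICU is the Unicode Consortium's own library project. A small sketch, assuming the problem at hand is comparing strings that look identical but are encoded differently (the helper name `same_text` is mine, not a library function):

```python
import unicodedata

composed = "caf\u00e9"     # "é" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after applying the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(composed == decomposed)           # raw comparison of code points
print(same_text(composed, decomposed))  # comparison after NFC normalization

# NFKC additionally folds compatibility characters, e.g. the "fi" ligature.
print(unicodedata.normalize("NFKC", "\ufb01"))
```

NFC/NFD only deal with canonical equivalence; NFKC/NFKD are lossy and fold compatibility characters too, so which form is "good enough" depends on whether you are storing text or matching it.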
once the script is non-trivial, 'install' it using pipx: in editable mode while you work on the script, and as a normal pipx-installed CLI utility otherwise.
I had some young family drama which kept me from studying for my first oral university exam. so I talked with the prof about it. he told me to bring a sick leave attestation from Dr such and such - or to come and give it a shot. gave it a shot. "you can do much better, that's obvious. I'll give you the weakest passing grade or I fail you and you redo the exam. your choice." wow.
Is there some literate programming LSP server around, which under the hood tangles the code chunks for language specific child LSP servers, and proxies those? so you have LSP support in the litprog source?
it would probably also semi-weave the source into a standard, say, markdown or latex or asciidoc and proxy that LSP server on those woven files.
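I don't know of an existing server that does this, but the tangling half is small. A rough sketch, assuming the litprog source is markdown with fenced chunks; `tangle` and `to_source_line` are hypothetical names, and a real proxy would also need column mapping and the LSP message plumbing:

```python
import re
from collections import defaultdict

TICKS = "`" * 3  # fence marker, built this way to keep the example readable
FENCE = re.compile(r"^`{3}(\w+)?\s*$")


def tangle(source: str) -> dict[str, list[tuple[int, str]]]:
    """Collect fenced chunks per language as (source_line_number, code_line)."""
    chunks = defaultdict(list)
    lang = None  # None means we are currently in prose, not in a chunk
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = FENCE.match(line)
        if match:
            # A fence line either opens a chunk or closes the current one.
            lang = (match.group(1) or "text") if lang is None else None
            continue
        if lang is not None:
            chunks[lang].append((lineno, line))
    return dict(chunks)


def to_source_line(chunks, lang: str, tangled_line: int) -> int:
    """Map a 1-based line in the tangled file back to the litprog source."""
    return chunks[lang][tangled_line - 1][0]


doc = "\n".join([
    "Intro prose.",
    TICKS + "python",
    "x = 1",
    TICKS,
    "More prose.",
    TICKS + "python",
    "print(x)",
    TICKS,
])

chunks = tangle(doc)
tangled_python = "\n".join(code for _, code in chunks["python"])
print(tangled_python)
print(to_source_line(chunks, "python", 2))
```

The line map is the important part: the child LSP server reports diagnostics against the tangled file, and the proxy rewrites positions back into the litprog source before forwarding them to the editor.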
It's not that obscure, even in the US. Anyone who takes French in US high school has probably read it in French (it's very easy to read), and even in English it's one of the most common classic children's books.
I think it's rather a schooling-and-education kind of thing.
for schools in a "humanistic" tradition I dare to bet it's canon.
it's a very beautiful read and when you have time, go and grab a sweet illustrated full-text paper copy in your language of choice. it has been translated into nearly every language in the world, and there are wonderful editions of the book. I treasure a large pop-up one.
At first glance it looks and feels like a children's book, but really, is it? Antoine de Saint-Exupéry offers a unique and poetic look at humankind and a truly timeless masterpiece, touching on not-so-childlike topics like different types of vanity, several perspectives on the rat race, addiction, love of course (both "caritas" and "amor", and at an idealistic level also "eros"), and responsibility for nature. it even touches on assisted suicide. but all of these little essays, which are woven into a story arc, are told with deep love and tenderness and clarity.
fine dining, if you wish, a gourmet story, really.
> The spirit of the GPL is the freedom of the user, not the code being freely shared.
who do you mean by "user"?
the spirit is that the person who actually uses the software also has the freedom to modify it, and that the users receiving these modifications have the same rights.
is that what you meant?
and while technically that's the spirit of the GPL, the license is not only about users, but about a _relationship_, that of the user and the software and what the user is allowed to do with the software.
it thus makes sense to talk about "software freedom".
last but not least, about a single GPL function: many GPL-family _libraries_ are licensed less restrictively, under the LGPL.
The GPL does not restrict what the user does with the software.
It can be USED for anything.
But it does restrict how you redistribute it. You have responsibilities if you redistribute it. You must provide the source code, and pass on the same freedoms you received to the users you redistribute it to.
Thinking on though, if the models are trained on any GPL code then one could consider that they contain that GPL code, and are constantly and continually updating and modifying that code, thus everything the model subsequently outputs and distributes should come under the GPL too. It’s far from sufficient that, say, OpenAI have a page on their website to redistribute the code they consume in their models if such code becomes part of the model’s training data that is resident in memory every time it produces new code for users. In the spirit of the GPL all that derivative code seems to also come under the GPL, and has to be made available for free, even if upon every request the generated code is somehow novel or unique to that user.
If the LLM can reproduce the entire GPL'd code, with licence and attribution intact, then that would satisfy the GPL, correct?
If the LLM can invent new code, inspired by but not copied from the GPL'd code, that new code does not require a GPL licence.
This is essentially the same as we humans do: I read some GPL code and go "huh, neat architecture!" and then a year later solve a similar problem using an architecture inspired by that code. This is not copying, and does not require me to GPL the code I'm producing.
But if I copy-paste a function from the GPL code into my code base, I need to respect the licence conditions and GPL at least part of my code base.
I think the argument the author is talking about is whether the model itself should be GPL'd because it contains copies of GPL'd code that can be reproduced. I don't buy this because that GPL code is not being run as part of the model's functioning. To use an analogy: if I create a code storage system, and then use it to store some GPL code, I don't have to GPL the code storage system itself. As long as it can reproduce the GPL code together with its licence and attribution, then the GPL is not being infringed at any point. The system is not using or running the GPL code itself, it is just storing the GPL code. This is what the LLM is doing.
> Thinking on though, if the models are trained on any GPL code then one could consider that they contain that GPL code, and are constantly and continually updating and modifying that code, thus everything the model subsequently outputs and distributes should come under the GPL too.
If you ask a model to output a task scheduler in C, and the training data contained a GPL-licensed implementation of the Fibonacci function in Haskell, the output isn't likely to bear a lot of resemblance to that input. It might even be unrelated enough that adding that function to the training data doesn't affect what the model outputs for that prompt at all.
The nasty thing in terms of using code generated by these things is that if you ask the model to output a task scheduler in C and the training data contained a GPL-licensed implementation of a task scheduler in C, the output plausibly could bear a strong resemblance to that input, without you knowing it. And then if you go and incorporate that into something you're redistributing, what happens?
the fundamental architecture of networks, compilers, disk operating systems, databases and more is implemented in GPL-family licensed code: high-value targets to acquire and master.