As a pip maintainer I primarily think about resolution and resolver performance. I got involved when pip introduced its current resolver around late 2020, and got my first PR landed in mid 2021.
I recently became a packaging maintainer, after working on fixes for edge-case behavior around specifiers and prerelease versions.
When I did some recent profiling I noticed that A LOT of time was being spent in packaging, largely parsing version strings. I found a few places in pip and packaging where the number of Version objects being created could be reduced, and Henry really ran with the idea of improving performance and made big improvements. I'm excited for this to be vendored in pip 26.0, coming out at the end of January.
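To give a flavour of the kind of change involved (a simplified sketch, not the actual pip/packaging patches): caching parsed versions means the same string isn't turned into a new Version object over and over during resolution.

from functools import lru_cache

from packaging.version import Version

@lru_cache(maxsize=None)
def parse_version(version_string: str) -> Version:
    # Parse each distinct version string once and reuse the object.
    return Version(version_string)

# The same strings ("1.26.4", "2.2.0", ...) show up repeatedly across
# candidates during resolution, so the cache hit rate is very high.
assert parse_version("1.26.4") is parse_version("1.26.4")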
If anyone is interested, the next big improvement for pip is likely to be implementing a real CDCL (Conflict-Driven Clause Learning) resolver algorithm, like uv's use of pubgrub-rs. That said, I do this in my spare time, so it may be a year or more before I make any real traction on implementing that.
> At one company I worked at, we had a system where each deploy got its own folder, and we'd update a symlink to point to the active one. It worked, but it was all manual, all custom, and all fragile.
The first time I saw this I thought it was one of the most elegant solutions I'd ever seen working in technology. Safe to deploy the files, an atomic switch-over per machine, and trivial to roll back.
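For anyone who hasn't seen the pattern, here is a minimal sketch of the idea (hypothetical paths, expressed in Python): each release lives in its own directory and a "current" symlink is swapped atomically, so rollback is just pointing the symlink back at the previous release.

import os

def activate_release(release_dir: str, current_link: str = "/srv/app/current") -> None:
    """Atomically point the 'current' symlink at release_dir."""
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # rename() is atomic on POSIX: readers see either the old target or the
    # new one, never a half-updated link.
    os.replace(tmp_link, current_link)

# activate_release("/srv/app/releases/2024-05-01")  # deploy
# activate_release("/srv/app/releases/2024-04-28")  # rollback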
It may have been manual, but I'd worked with deployment processes that involved manually copying files to dozens of boxes and following a 10-to-20-step sequence of manual commands on each box. Even when I first got to use automated deployment tooling at the company I worked for, it was fragile, opaque, and a configuration nightmare, built primarily for OS installation of new servers and forced to work with applications.
I am now feeling old for using Capistrano even today. I think there might be “cooler and newer” ways to deploy, but I never felt the need to learn what those ways are since Capistrano gets the job done.
I did this, but I used rsync, and you can tell rsync to use the previous version as the basis so it doesn't even need to upload everything all over again. Super duper quick to deploy.
I put that in a little bash script, so... I don't know if you call anything that isn't CI "manual", but I don't think it'd be hard to work into some pipeline either.
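For reference, the rsync feature being described is probably --link-dest, which hardlinks files unchanged from the previous release instead of transferring them again. A rough sketch of the idea (hypothetical paths; wrapped in Python just to keep the examples in one language):

import subprocess

def deploy(build_dir: str, new_release: str, previous_release: str, host: str = "app@web1") -> None:
    # Files identical to the previous release are hardlinked on the server
    # instead of being uploaded again, so only the diff goes over the wire.
    subprocess.run(
        [
            "rsync", "-a", "--delete",
            f"--link-dest={previous_release}",
            f"{build_dir}/",
            f"{host}:{new_release}/",
        ],
        check=True,
    )

# deploy("./build", "/srv/app/releases/42", "/srv/app/releases/41")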
Pip has been a flag bearer for Python packaging standards for some time now, so that alternatives can implement standards rather than copy pip's behavior. So first a lock file standard had to be agreed upon, which finally happened this year: https://peps.python.org/pep-0751/
Now it's a matter of the maintainers, who are currently all volunteers donating their spare time, fully implementing support. Progress is happening, but it is a little slow because of this.
Python has about 40 keywords; I'd say I regularly use about 30 and irregularly use about another 5. Hardly seems like a "junkyard".
Further, this lack of first-class support for lazy importing has spawned multiple CPython forks that implement their own lazy importing or a modified version of the previously rejected PEP 690. Reducing the real-world need for forks seems worth the price of one keyword.
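For comparison, the closest thing available today without new syntax is the importlib.util.LazyLoader recipe from the standard library docs; a keyword would make this implicit and usable with a plain import statement (the module name below is just an example):

import importlib.util
import sys

def lazy_import(name: str):
    """Return a module whose real import is deferred until first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

json = lazy_import("json")   # nothing is executed yet
json.dumps({"a": 1})         # the actual import happens here, on first use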
False      await      else       import     pass
None       break      except     in         raise
True       class      finally    is         return
and        continue   for        lambda     try
as         def        from       nonlocal   while
assert     del        global     not        with
async      elif       if         or         yield
Soft Keywords:
match case _ type
I think nonlocal/global are the only hard keywords I now barely use; for the soft ones, I rarely use pattern matching, so 5 seems like a good estimate.
> I had been hoping someone would introduce the non-virtualenv package management solution that every single other language has where there's a dependency list and version requirements (including of the language itself) in a manifest file (go.mod, package.json, etc) and everything happens in the context of that directory alone without shell shenanigans.
Isn't that exactly a pyproject.toml via the uv add/sync/run interface? What is that missing that you need?
A lot of the tools you list here are composable: mypy checks type hints, black formats code, and because of the purpose and design ethos of those two tools they would never think to merge.
So which is it that you want: to just reach for one tool, or to have tools with specific design goals and then have to reach for many tools in a full workflow?
FWIW, Astral's long-term goal appears to be that you just reach for one tool, hence why you can now do "uv format".
> There is one part of Python that I consider a design flaw when it comes to packaging: the sys.modules global dictionary means it's not at all easy in Python to install two versions of the same package at the same time. This makes it really tricky if you have dependency A and dependency B both of which themselves require different versions of dependency C.
But it solves the problem that if A and B both depend on C, the user can pass an object created by C from A to B without worrying about it breaking.
In less abstract terms, let's say numpy one day changed its internal representation of an array, so that if one version of numpy read an array from a different version of numpy it would crash, or worse, read it but misinterpret it. Now if I have one data science library that produces numpy arrays and another visualization library that takes numpy arrays, I can be confident that only one version of numpy is installed and the visualization library isn't going to misinterpret the data from the data science library because it is using a different version of numpy.
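A contrived sketch of why two copies of the "same" library in one process get awkward, simulated with two throwaway modules rather than real numpy:

import types

# Pretend these are two installed versions of the same dependency C,
# each with its own copy of the Array class (purely for illustration).
c_v1 = types.ModuleType("c_v1")
c_v2 = types.ModuleType("c_v2")
exec("class Array:\n    def __init__(self, data): self.data = data", c_v1.__dict__)
exec("class Array:\n    def __init__(self, data): self.data = data", c_v2.__dict__)

arr = c_v1.Array([1, 2, 3])         # produced by the version dependency A imports
print(isinstance(arr, c_v2.Array))  # False: dependency B's checks reject it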
This stability of installed versions has allowed entire ecosystems to build around core dependencies in a way that would be tricky otherwise. I would therefore not consider it a design flaw.
I wouldn't mind a codebase where numpy objects created by dependency B can't be shared directly with dependency A without me first running some kind of conversion function on them - I'd take that over "sorry you want to use dependency A and dependency B in this project, you're just out of luck".
> I wouldn't mind a codebase where numpy objects created by dependency B can't be shared directly with dependency A without me first running some kind of conversion function on them
Given there's no compiler to enforce this check, and Python is a dynamic language, I don't see how you implement that without some complicated object provenance feature, making every single object larger and having every use of that object (calling with it, calling it, assigning it to an attribute, assigning an attribute to it) impose an expensive runtime check.
You let people make the mistake and have the library throw an exception if they do that, not through type checking but just through something eventually calling a method that doesn't exist.
> You let people make the mistake and have the library throw an exception if they do that, not through type checking but just through something eventually calling a method that doesn't exist.
Exceptions or crashes would be annoying but, yes, manageable. Although try telling new users of the language that their code doesn't work because they didn't understand the transitive dependency tree of their install, that it automatically vendored different versions of a library for different dependencies, and that they were supposed to figure all that out from some random exception occurring deep in a dependency.
But as I explain in my example, the real problem is that one version of the library reads the data in a different layout from the other, so instead you end up with subtle data errors. Now your code is working but you're getting the wrong output; good luck debugging that.
The advantage is type hints can be fixed without needing to release a new version of Python. The disadvantage is there's a lot of code in the standard library that was never written with type hints in mind, and representing it with them can be really tricky and not always possible.
I'm surprised to see so many people moving to pyrefly, ty, and zuban so quickly. I was going to wait until some time in 2026 to see which has matured to the point I find useful; I guess some users really find the existing solutions unworkable.
Hmm. Presumably mypy and pyrefly use the same ones, but then I don't understand why pyrefly is complaining and mypy isn't:
ERROR Argument `Future[list[BaseException | Any]]` is not assignable to parameter `future` with type `Awaitable[tuple[Any]]` in function `asyncio.events.AbstractEventLoop.run_until_complete` [bad-argument-type]
 --> xxx/xxx.py:513:33
     |
 513 | loop.run_until_complete(tasks.gather(*x, return_exceptions=True))
     |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(tbh this is rather insignificant compared to the noise from external packages without type annotations, or with incomplete ones… pyrefly's inferences about the existence of attributes and their types are extremely conservative…)
> Hmm. Presumably mypy and pyrefly use the same ones, but then I don't understand why pyrefly is complaining and mypy isn't:
> …where is it even pulling "tuple[Any]" from…
Perhaps it's a bug in pyrefly, or perhaps one of mypy or pyrefly is able to infer something about the types that the other isn't. I would strongly suggest checking their issues pages and, if you don't see a report already, reporting it yourself.
While there is an ongoing push to more consistently document the typing spec (https://typing.python.org/), it does not actually cover all the things you can infer from type hints, and different type checkers have made different design choices compared to mypy and will produce different errors even in the most ideal situation.
This is one of the reasons why I am waiting for these libraries to mature a little more.
> it does not actually cover what rules you can check or infer from type hints
Indeed this is the cause of maybe 30% of the warnings I'm seeing… items being added to lists or dicts in some place (or something else making it infer a container type), and pyrefly then refusing to let other types be added elsewhere. The most "egregious" one I saw was:
def something(x: list[str]) -> str:
    foo = []
    for i in x:
        foo.append(i)
    return "".join(foo)
Where it complains:
ERROR Argument `str` is not assignable to parameter `object` with type `LiteralString` in function `list.append` [bad-argument-type]
 --> test.py:4:20
 4 |         foo.append(i)
Edit: now that I have posted it, this might actually be a bug in the .join type annotation… or something
Edit: nah, it's the loop (and the LiteralString variant of .join is just the first overload listed in the type hints)… https://github.com/facebook/pyrefly/issues/1107 - this is kinda important, I don't think I can use it before this is improved :/
I assume that in your example, if you update the foo declaration to the following, it solves the complaint:
foo: list[str] = []
If so, this is a type-checking design choice:
* What can I infer from an empty collection declaration?
* What do I allow further down the code to update inferences further up the code?
I don't know Pyrefly's philosophy here, but I assume it's guided by opinionated coding guidelines inside Meta, not what is perhaps the easiest for users to understand.
It is a purely subjective design decision, but I personally prefer the stricter rules that don’t do backwards type inferences like this… type hints shouldn’t follow duck typing semantics. Otherwise you’re not providing nearly as much value IMO. TypeScript is really the model organism here. They took the most cursed mainstream programming language, and made it downright good.
Today, the “: list[str]” is 11 wasted characters and it’s not as aesthetically pleasing. Tomorrow, you do some refactor and your inferred list[str] becomes a list[int] without you realizing it… I’m sure that sounds silly in this toy example, but just imagine it’s a much more complex piece of code. The fact of the matter is, you declared foo as a list[any] and you’re passing it to a function that takes an iterable[str] — it ought to complain at you! Type hints are optional in Python, and you can tell linters to suppress warnings in a scope with a comment too.
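A toy version of that failure mode (hypothetical function; the exact checker message will vary): with the explicit annotation the mistake is reported at the append, right where the refactor went wrong, instead of the checker quietly inferring list[int] and erroring, if at all, somewhere downstream.

def collect_ids(rows: list[tuple[str, int]]) -> str:
    ids: list[str] = []   # explicit annotation
    for name, count in rows:
        # A careless refactor appends `count` instead of `name`; with the
        # annotation a type checker flags this line immediately.
        ids.append(count)
    return ", ".join(ids)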
That being said, perhaps these more permissive rules might be useful in a legacy codebase where no type annotations exist yet.
Really, it’d be extra nice if they made this behavior configurable, but I’m probably asking for too much there. What’s next, a transpiler? Dogs and cats living together?!
> Today, the “: list[str]” is 11 wasted characters and it’s not as aesthetically pleasing. Tomorrow, you do some refactor and your inferred list[str] becomes a list[int] without you realizing it… I’m sure that sounds silly in this toy example, but just imagine it’s a much more complex piece of code.
Hmm. I'm looking at a codebase that is still in a lot of "early flux", where one day I might be looking at a "list[VirtualRouter]" but the next day it's a "list[VirtualRouterSpec]". It's already gone through several refactors and it kinda felt like the type hints were pretty much spot on in terms of effort-benefit. It's not a legacy codebase; it has reasonably good type hint coverage, but it's focused on type hinting interfaces (a few Protocol in there), classes and functions. The type hinting inline in actual code is limited.
I do understand your perspective, but tbh to me it feels like if I went that far I might rather not choose Python to begin with…
I agree, but for a type checker it is a subjective choice whether to be explicit and not make assumptions, or to infer certain patterns as probably correct and not be noisy to the user. Very glad to see they plan to fix this.
Are there any reliable decentralized package distribution systems operating within 2 orders of magnitude of that scale? How do they handle administrative issues such as malicious packages or name squatting? Standards updates? Enforcement of correct metadata? And all the other common things package indexes need to handle?
I'm clearly skeptical, but would be very interested in any real world success stories.
There is, the web. The web distributes code directly to end users at a much larger scale. To distribute the bandwidth costs, the web is federated: to depend on a script you refer to its url, and whoever hosts this url foots the bill.
Deno is a Javascript implementation for the backend that attempts to mimic this pattern (it later introduced a more npm-like centralized repository, but afaik it's optional). Deno is of course less popular than Python, but its url-centered model can really scale imo.
> There is, the web. The web distributes code directly to end users at a much larger scale. To distribute the bandwidth costs, the web is federated: to depend on a script you refer to its url, and whoever hosts this url foots the bill.
But the web is notorious for the problems I listed; you end up with standards around not following standards. It leaves almost all the responsibility on the client tool (browser or whatever) to do validation: stopping malicious sites and name squatting, accepting and "fixing" poorly constructed metadata, etc.
> Deno is a Javascript implementation for the backend that attempts to mimic this pattern (it later introduced a more npm-like centralized repository, but afaik it's optional). Deno is of course less popular than Python, but its url-centered model can really scale imo.
I was not familiar with Deno; I've done some shallow reading on it now and it's certainly interesting. I don't know enough about the JavaScript world to comment on the pros or cons.
But I don't think it can work for Python, as transitive dependencies would immediately conflict as soon as two dependencies required different versions of the same transitive dependency. And the guarantee of Python packaging is that you only have a single version of a library installed in an environment; while that can cause some dependency solver headaches, it also solves a lot of problems, as it makes it safe to pass around objects.