As a pip maintainer I primarily think about resolution and resolver performance. I got involved when pip introduced its current resolver around late 2020, and got my first PR landed in mid 2021.
I recently became a packaging maintainer, after working on fixes for edge-case behavior around specifiers and prerelease versions.
When I did some recent profiling I noticed that A LOT of time was being spent in packaging, largely parsing version strings. I found a few places in pip and packaging where the number of Version objects being created could be reduced, and Henry really ran with the idea of improving performance and made big improvements. I'm excited for this to be vendored in pip 26.0, coming out at the end of January.
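To give a flavour of the kind of change involved (a simplified sketch, not the actual pip/packaging patches): caching parsed versions means the same string isn't turned into a new Version object over and over during resolution.

from functools import lru_cache

from packaging.version import Version

@lru_cache(maxsize=None)
def parse_version(version_string: str) -> Version:
    # Parse each distinct version string once and reuse the object.
    return Version(version_string)

# The same strings ("1.26.4", "2.2.0", ...) show up repeatedly across
# candidates during resolution, so the cache hit rate is very high.
assert parse_version("1.26.4") is parse_version("1.26.4")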
If anyone is interested, the next big improvement for pip is likely to be implementing a real CDCL (Conflict-Driven Clause Learning) resolver algorithm, like uv's use of pubgrub-rs. That said, I do this in my spare time, so it may be a year or more before I make any real traction on implementing that.
> At one company I worked at, we had a system where each deploy got its own folder, and we'd update a symlink to point to the active one. It worked, but it was all manual, all custom, and all fragile.
The first time I saw this I thought it was one of the most elegant solutions I'd ever seen working in technology. Safe to deploy the files, an atomic switch-over per machine, and trivial to roll back.
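For anyone who hasn't seen the pattern, here is a minimal sketch of the idea (hypothetical paths, expressed in Python): each release lives in its own directory and a "current" symlink is swapped atomically, so rollback is just pointing the symlink back at the previous release.

import os

def activate_release(release_dir: str, current_link: str = "/srv/app/current") -> None:
    """Atomically point the 'current' symlink at release_dir."""
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # rename() is atomic on POSIX: readers see either the old target or the
    # new one, never a half-updated link.
    os.replace(tmp_link, current_link)

# activate_release("/srv/app/releases/2024-05-01")  # deploy
# activate_release("/srv/app/releases/2024-04-28")  # rollback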
It may have been manual, but I'd worked with deployment processes that involved manually copying files to dozens of boxes and following a 10-to-20-step sequence of manual commands on each box. Even when I first got to use automated deployment tooling at the company I worked for, it was fragile, opaque, and a configuration nightmare, built primarily for OS installation of new servers and forced to work with applications.
I am now feeling old for using Capistrano even today. I think there might be “cooler and newer” ways to deploy, but I never felt the need to learn what those ways are since Capistrano gets the job done.
I did this, but I used rsync, and you can tell rsync to use the previous version as the basis so it doesn't even need to upload everything all over again. Super duper quick to deploy.
I put that in a little bash script, so... I don't know if you call anything that isn't CI "manual", but I don't think it'd be hard to work into some pipeline either.
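For reference, the rsync feature being described is probably --link-dest, which hardlinks files unchanged from the previous release instead of transferring them again. A rough sketch of the idea (hypothetical paths; wrapped in Python just to keep the examples in one language):

import subprocess

def deploy(build_dir: str, new_release: str, previous_release: str, host: str = "app@web1") -> None:
    # Files identical to the previous release are hardlinked on the server
    # instead of being uploaded again, so only the diff goes over the wire.
    subprocess.run(
        [
            "rsync", "-a", "--delete",
            f"--link-dest={previous_release}",
            f"{build_dir}/",
            f"{host}:{new_release}/",
        ],
        check=True,
    )

# deploy("./build", "/srv/app/releases/42", "/srv/app/releases/41")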
Pip has been a flag bearer for Python packaging standards for some time now, so that alternatives can implement standards rather than copy pip's behavior. So first a lock file standard had to be agreed upon, which finally happened this year: https://peps.python.org/pep-0751/
Now it's a matter of the maintainers, who are currently all volunteers donating their spare time, fully implementing support. Progress is happening, but it is a little slow because of this.
Python has about 40 keywords; I'd say I regularly use about 30 and irregularly use about another 5. Hardly seems like a "junkyard".
Further, this lack of first-class support for lazy importing has spawned multiple CPython forks that implement their own lazy importing or a modified version of the previously rejected PEP 690. Reducing the real-world need for forks seems worth the price of one keyword.
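For comparison, the closest thing available today without new syntax is the importlib.util.LazyLoader recipe from the standard library docs; a keyword would make this implicit and usable with a plain import statement (the module name below is just an example):

import importlib.util
import sys

def lazy_import(name: str):
    """Return a module whose real import is deferred until first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

json = lazy_import("json")   # nothing is executed yet
json.dumps({"a": 1})         # the actual import happens here, on first use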
False      await      else       import     pass
None       break      except     in         raise
True       class      finally    is         return
and        continue   for        lambda     try
as         def        from       nonlocal   while
assert     del        global     not        with
async      elif       if         or         yield
Soft Keywords:
match case _ type
I think nonlocal/global are the only hard keywords I now barely use; for the soft ones, I rarely use pattern matching, so 5 seems like a good estimate.
> I had been hoping someone would introduce the non-virtualenv package management solution that every single other language has where there's a dependency list and version requirements (including of the language itself) in a manifest file (go.mod, package.json, etc) and everything happens in the context of that directory alone without shell shenanigans.
Isn't that exactly a pyproject.toml via the uv add/sync/run interface? What is that missing that you need?
A lot of the tools you list here are composable: mypy checks type hints, black formats code, and because of the purpose and design ethos of those two tools they would never think to merge.
So which is it that you want: to just reach for one tool, or to have tools with specific design goals and then have to reach for many tools in a full workflow?
FWIW, Astral's long-term goal appears to be that you just reach for one tool, hence why you can now do "uv format".
> There is one part of Python that I consider a design flaw when it comes to packaging: the sys.modules global dictionary means it's not at all easy in Python to install two versions of the same package at the same time. This makes it really tricky if you have dependency A and dependency B both of which themselves require different versions of dependency C.
But it solves the problem that if A and B both depend on C, the user can pass an object created by C from A to B without worrying about it breaking.
In less abstract terms, let's say numpy one day changed its internal representation of an array, so that if one version of numpy read an array from a different version of numpy it would crash, or worse, read it but misinterpret it. Now if I have one data science library that produces numpy arrays and another visualization library that takes numpy arrays, I can be confident that only one version of numpy is installed and the visualization library isn't going to misinterpret the data from the data science library because it is using a different version of numpy.
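A contrived sketch of why two copies of the "same" library in one process get awkward, simulated with two throwaway modules rather than real numpy:

import types

# Pretend these are two installed versions of the same dependency C,
# each with its own copy of the Array class (purely for illustration).
c_v1 = types.ModuleType("c_v1")
c_v2 = types.ModuleType("c_v2")
exec("class Array:\n    def __init__(self, data): self.data = data", c_v1.__dict__)
exec("class Array:\n    def __init__(self, data): self.data = data", c_v2.__dict__)

arr = c_v1.Array([1, 2, 3])         # produced by the version dependency A imports
print(isinstance(arr, c_v2.Array))  # False: dependency B's checks reject it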
This stability of installed versions has allowed entire ecosystems to build around core dependencies in a way that would be tricky otherwise. I would therefore not consider it a design flaw.
I wouldn't mind a codebase where numpy objects created by dependency B can't be shared directly with dependency A without me first running some kind of conversion function on them - I'd take that over "sorry you want to use dependency A and dependency B in this project, you're just out of luck".
> I wouldn't mind a codebase where numpy objects created by dependency B can't be shared directly with dependency A without me first running some kind of conversion function on them
Given there's no compiler to enforce this check, and Python is a dynamic language, I don't see how you implement that without some complicated object provenance feature, making every single object larger and having every use of that object (calling with it, calling it, assigning it to an attribute, assigning an attribute to it) impose an expensive runtime check.
You let people make the mistake and have the library throw an exception if they do that, not through type checking but just through something eventually calling a method that doesn't exist.
> You let people make the mistake and have the library throw an exception if they do that, not through type checking but just through something eventually calling a method that doesn't exist.
Exceptions or crashes would be annoying but, yes, manageable. Although try telling new users of the language that their code doesn't work because they didn't understand the transitive dependency tree of their install, that it automatically vendored different versions of a library for different dependencies, and that they were supposed to figure all that out from some random exception occurring deep in a dependency.
But as I explain in my example, the real problem is that one version of the library reads the data in a different layout from the other, so instead you end up with subtle data errors. Now your code is working but you're getting the wrong output; good luck debugging that.
The advantage is type hints can be fixed without needing to release a new version of Python. The disadvantage is there's a lot of code in the standard library that was never written with type hints in mind, and representing it with them can be really tricky and not always possible.
I'm surprised to see so many people moving to pyrefly, ty, and zuban so quickly. I was going to wait until some time in 2026 to see which has matured to the point I find useful; I guess some users really find the existing solutions unworkable.
Hmm. Presumably mypy and pyrefly use the same ones, but then I don't understand why pyrefly is complaining and mypy isn't:
ERROR Argument `Future[list[BaseException | Any]]` is not assignable to parameter `future` with type `Awaitable[tuple[Any]]` in function `asyncio.events.AbstractEventLoop.run_until_complete` [bad-argument-type]
 --> xxx/xxx.py:513:33
     |
 513 | loop.run_until_complete(tasks.gather(*x, return_exceptions=True))
     |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(tbh this is rather insignificant compared to the noise from external packages without type annotations, or with incomplete ones… pyrefly's inferences about the existence of attributes and their types are extremely conservative…)
> Hmm. Presumably mypy and pyrefly use the same ones, but then I don't understand why pyrefly is complaining and mypy isn't:
> …where is it even pulling "tuple[Any]" from…
Perhaps it's a bug in pyrefly, or perhaps one of mypy or pyrefly is able to infer something about the types that the other isn't. I would strongly suggest checking their issues pages and, if you don't see a report already, reporting it yourself.
While there is an ongoing push to more consistently document the typing spec (https://typing.python.org/), it does not actually cover all the things you can infer from type hints, and different type checkers have made different design choices compared to mypy and will produce different errors even in the most ideal situation.
This is one of the reasons why I am waiting for these libraries to mature a little more.
> it does not actually cover what rules you can check or infer from type hints
Indeed this is the cause of maybe 30% of the warnings I'm seeing… items being added to lists or dicts in some place (or something else making it infer a container type), and pyrefly then refusing to let other types be added elsewhere. The most "egregious" one I saw was:
def something(x: list[str]) -> str:
    foo = []
    for i in x:
        foo.append(i)
    return "".join(foo)
Where it complains:
ERROR Argument `str` is not assignable to parameter `object` with type `LiteralString` in function `list.append` [bad-argument-type]
 --> test.py:4:20
 4 |         foo.append(i)
Edit: now that I have posted it, this might actually be a bug in the .join type annotation… or something
Edit: nah, it's the loop (and the LiteralString variant of .join is just the first overload listed in the type hints)… https://github.com/facebook/pyrefly/issues/1107 - this is kinda important, I don't think I can use it before this is improved :/
I assume that in your example, if you update the foo declaration to the following, it solves the complaint:
foo: list[str] = []
If so, this is a type-checking design choice:
* What can I infer from an empty collection declaration?
* What do I allow further down the code to update inferences further up the code?
I don't know Pyrefly's philosophy here, but I assume it's guided by opinionated coding guidelines inside Meta, not what is perhaps the easiest for users to understand.
It is a purely subjective design decision, but I personally prefer the stricter rules that don’t do backwards type inferences like this… type hints shouldn’t follow duck typing semantics. Otherwise you’re not providing nearly as much value IMO. TypeScript is really the model organism here. They took the most cursed mainstream programming language, and made it downright good.
Today, the “: list[str]” is 11 wasted characters and it’s not as aesthetically pleasing. Tomorrow, you do some refactor and your inferred list[str] becomes a list[int] without you realizing it… I’m sure that sounds silly in this toy example, but just imagine it’s a much more complex piece of code. The fact of the matter is, you declared foo as a list[any] and you’re passing it to a function that takes an iterable[str] — it ought to complain at you! Type hints are optional in Python, and you can tell linters to suppress warnings in a scope with a comment too.
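A toy version of that failure mode (hypothetical function; the exact checker message will vary): with the explicit annotation the mistake is reported at the append, right where the refactor went wrong, instead of the checker quietly inferring list[int] and erroring, if at all, somewhere downstream.

def collect_ids(rows: list[tuple[str, int]]) -> str:
    ids: list[str] = []   # explicit annotation
    for name, count in rows:
        # A careless refactor appends `count` instead of `name`; with the
        # annotation a type checker flags this line immediately.
        ids.append(count)
    return ", ".join(ids)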
That being said, perhaps these more permissive rules might be useful in a legacy codebase where no type annotations exist yet.
Really, it’d be extra nice if they made this behavior configurable, but I’m probably asking for too much there. What’s next, a transpiler? Dogs and cats living together?!
> Today, the “: list[str]” is 11 wasted characters and it’s not as aesthetically pleasing. Tomorrow, you do some refactor and your inferred list[str] becomes a list[int] without you realizing it… I’m sure that sounds silly in this toy example, but just imagine it’s a much more complex piece of code.
Hmm. I'm looking at a codebase that is still in a lot of "early flux", where one day I might be looking at a "list[VirtualRouter]" but the next day it's a "list[VirtualRouterSpec]". It's already gone through several refactors and it kinda felt like the type hints were pretty much spot on in terms of effort-benefit. It's not a legacy codebase; it has reasonably good type hint coverage, but it's focused on type hinting interfaces (a few Protocol in there), classes and functions. The type hinting inline in actual code is limited.
I do understand your perspective, but tbh to me it feels like if I went that far I might rather not choose Python to begin with…
I agree, but for a type checker it is a subjective choice whether to be explicit and not make assumptions, or to infer certain patterns as probably correct and not be noisy to the user. Very glad to see they plan to fix this.
Are there any reliable decentralized package distribution systems operating within 2 orders of magnitude of that scale? How do they handle administrative issues such as malicious packages or name squatting? Standards updates? Enforcement of correct metadata? And all the other common things package indexes need to handle?
I'm clearly skeptical, but would be very interested in any real world success stories.
There is, the web. The web distributes code directly to end users at a much larger scale. To distribute the bandwidth costs, the web is federated: to depend on a script you refer to its url, and whoever hosts this url foots the bill.
Deno is a Javascript implementation for the backend that attempts to mimic this pattern (it later introduced a more npm-like centralized repository, but afaik it's optional). Deno is of course less popular than Python, but its url-centered model can really scale imo.
> There is, the web. The web distributes code directly to end users at a much larger scale. To distribute the bandwidth costs, the web is federated: to depend on a script you refer to its url, and whoever hosts this url foots the bill.
But the web is notorious for the problems I listed; you end up with standards around not following standards. It leaves almost all the responsibility on the client tool (browser or whatever) to do validation: stopping malicious sites and name squatting, accepting and "fixing" poorly constructed metadata, etc.
> Deno is a Javascript implementation for the backend that attempts to mimic this pattern (it later introduced a more npm-like centralized repository, but afaik it's optional). Deno is of course less popular than Python, but its url-centered model can really scale imo.
I was not familiar with Deno; I've done some shallow reading on it now and it's certainly interesting. I don't know enough about the JavaScript world to comment on the pros or cons.
But I don't think it can work for Python, as transitive dependencies would immediately conflict as soon as two dependencies required different versions of the same transitive dependency. And the guarantee of Python packaging is that you only have a single version of a library installed in an environment; while that can cause some dependency solver headaches, it also solves a lot of problems, as it makes it safe to pass around objects.