
PyPI servers would have to be constantly rebuilding a central index and making it available for download. Seems inefficient.

Debian is somehow able to manage it for apt.

1. Debian is local-first: searches run against a client-side cache (see the sketch after this list).

2. apt repositories are cryptographically signed, centrally controlled, and legally accountable.

3. apt search is understood to be approximate, distro-scoped, and slow-moving. Results change slowly and rarely break scripts. PyPI search rankings change frequently by necessity.

4. Turning PyPI search into an apt-like experience would require distributing a signed, periodically refreshed global metadata corpus to every client. At PyPI's scale, that is nontrivial in bandwidth, storage, and governance terms.

5. apt search works because the repository is curated, finite, and opinionated.
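
For concreteness, here is a minimal sketch of what "local-first" means in point 1: download a metadata index once, cache it, and run every query on the client, the way apt-cache search runs against the Packages files pulled down by apt update. The index URL and its JSON shape are hypothetical; PyPI's real Simple API serves package names, not searchable descriptions.

    import json, time, urllib.request
    from pathlib import Path

    # Hypothetical index URL -- PyPI serves nothing like this today.
    INDEX_URL = "https://example.invalid/packages.json"
    CACHE = Path.home() / ".cache" / "pkg-search" / "packages.json"
    MAX_AGE = 24 * 3600  # refresh at most daily, like a cron'd `apt update`

    def refresh() -> None:
        # Only hit the network when the cached copy is stale.
        if CACHE.exists() and time.time() - CACHE.stat().st_mtime < MAX_AGE:
            return
        CACHE.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(INDEX_URL) as resp:
            CACHE.write_bytes(resp.read())

    def search(term: str) -> list[str]:
        # All queries run locally; "ranking" is just substring match here.
        refresh()
        index = json.loads(CACHE.read_text())  # assumed shape: {name: description}
        term = term.lower()
        return [name for name, desc in index.items()
                if term in name.lower() or term in desc.lower()]

The point of the sketch: once the corpus is on disk, search cost and search semantics are entirely the client's problem, which is what makes apt's model cheap for the server and stable for scripts.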


Isn't this just an incrementally updatable index that could be managed with a Merkle tree? Git-like, essentially?

The install side is basically Merkle-friendly (immutable artifacts, append-only metadata, hashes, mirrors). Search isn’t. Search results are derived, subjective, and frequently rewritten (ranking tweaks, spam/malware takedowns, popularity signals). That’s more like constantly rebasing than appending commits.

You can Merklize “what files exist”; you can’t realistically Merklize “what should rank for this query today” without freezing semantics and turning CLI search into a hard API contract.
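
To make "Merkle-friendly" concrete, here is a minimal sketch of the half that does work: a Merkle root over immutable artifact hashes. The artifact names are made up. Appending a release only adds a leaf and rehashes one path to the root; nothing existing is rewritten, which is exactly the property ranked search results lack.

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def merkle_root(leaf_hashes: list[bytes]) -> bytes:
        # Pairwise-hash up the tree; duplicate the last node on odd levels.
        if not leaf_hashes:
            return h(b"")
        level = list(leaf_hashes)
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])
            level = [h(level[i] + level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]

    # Leaves are stable facts: artifact name plus its content hash.
    # A new release appends a leaf; existing leaves never change, so
    # mirrors and clients can re-verify incrementally.
    artifacts = [
        b"requests-2.32.0.tar.gz sha256=...",
        b"requests-2.32.1.tar.gz sha256=...",
    ]
    print(merkle_root([h(a) for a in artifacts]).hex())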


Are you saying PyPI search is spammed? o-O

Yes, it was subject to abuse, so they had to shut down the XML-RPC API.

That depends on how it can be downloaded incrementally.
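
As a sketch of what "incrementally" could mean, assuming a hypothetical root-hash endpoint and a delta feed (PyPI offers neither): the client keeps the last root it synced to and downloads only the entries added since.

    import json, urllib.request
    from pathlib import Path

    # Both endpoints are hypothetical -- nothing like them exists on PyPI.
    ROOT_URL = "https://example.invalid/index/root"
    DELTA_URL = "https://example.invalid/index/delta/{}"  # changes since a root
    CACHE = Path.home() / ".cache" / "pkg-index"

    def fetch(url: str) -> bytes:
        with urllib.request.urlopen(url) as resp:
            return resp.read()

    def sync() -> None:
        CACHE.mkdir(parents=True, exist_ok=True)
        root_file, index_file = CACHE / "root", CACHE / "index.json"
        local_root = root_file.read_text() if root_file.exists() else ""
        remote_root = fetch(ROOT_URL).decode().strip()
        if local_root == remote_root:
            return  # nothing changed upstream: zero-byte update
        # Fetch only entries added since our last known root. This works
        # for append-only facts (names, hashes), not for rankings that
        # get rewritten in place.
        delta = json.loads(fetch(DELTA_URL.format(local_root or "genesis")))
        index = json.loads(index_file.read_text()) if index_file.exists() else {}
        index.update(delta)
        index_file.write_text(json.dumps(index))
        root_file.write_text(remote_root)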


