The Accidental Library: How The Civilization Library Got Built
Three months ago I was working on StarCache and hit a wall I couldn't think my way around.
StarCache is an orbital compute architecture I've been developing since early 2026. The basic idea is to hold entire AI models in static memory using a novel multi-layer 3D-stacked chip, deployed in low Earth orbit. The project spans semiconductor physics, thermal management in vacuum, radiation hardening, satellite engineering, launch economics, and a LOT of other areas. I had no formal background in any of these fields when I started.
Early on, the project moved fast. Frontier AI models filled the gaps. I'd ask Claude or Gemini about chip fabrication, radiation effects on silicon, and Stefan-Boltzmann radiative cooling. I'd get answers that sounded right. I'd build them into the project and keep moving.
Then I tried to design the chip itself.
Specifically, I needed to understand how I'd actually print the layers using ASML's TWINSCAN EUV lithography system. Not how the system works at the marketing level. How it works at the level where I could specify the optical parameters, the exposure dose, the mask design, the resist chemistry, and the multi-patterning sequence. I needed to know what was physically possible and what wasn't. I needed to know what tolerances I could push and where the hard physical limits sat.
The frontier models could not give me this. They produced confident answers that turned out, on careful checking, to be wrong in important ways. Equation derivations had subtle errors. Cited specifications were slightly off. Whole categories of constraint went unmentioned because they didn't appear in the public web data the models were trained on. The closer I got to the real engineering, the worse the models performed.
I realized I needed the textbooks themselves. Not summaries of them and not forum threads from a decade ago, but the actual references the engineers at ASML, Intel, and TSMC learn this material from. Bakshi on EUV lithography. Hecht on optics. Doering and Nishi on semiconductor manufacturing. Smith and Chetwynd on ultraprecision mechanism design. Real references with real equations and real specifications, downloadable as PDFs and queryable for the answers I actually needed.
So I started collecting them.
What I didn't realize at the time was that the collection wouldn't stop at semiconductor engineering. Once I had the optics and lithography references, I needed thermal management. Then radiation effects. Then satellite engineering. Then launch trajectories. Each textbook opened onto the next field, and each field had its own foundational references I hadn't read. I'd skim each new textbook and commit its table of contents and introductory chapters to memory.
A few weeks in, I took a break to write something I'd been meaning to write for a while: an essay arguing that the 21st century is the first era in history where pursuing polymathy is structurally viable again, because AI tools collapse the access asymmetries that made polymathy impossible in the mid-20th century. I called it The 21st Century Polymath.
While writing it, I noticed I was describing my own current behavior. The collection of textbooks I was building wasn't a side activity to StarCache. It was the actual project. The Polymath article was the thesis. The textbook collection was the infrastructure. And the thing I was using to query the collection (a local Gemma model wired to a RAG database of every textbook I'd assembled) needed a name. I called it POLY.
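For readers who haven't seen the pattern, the retrieval half of a setup like POLY can be sketched in a few lines. This is an illustration only, not POLY's actual code: the sample passages, the `retrieve` helper, and the bag-of-words cosine scoring are all assumptions standing in for a real embedding index, and the step that hands the retrieved passages to a local model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for passages extracted from the textbook collection.
CHUNKS = [
    ("optics", "Diffraction limits the smallest feature an optical system can resolve."),
    ("thermal", "In vacuum a radiator rejects heat only by thermal radiation."),
    ("radiation", "Ionizing particles can flip bits in unshielded memory cells."),
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks whose word counts are most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(q, Counter(tokenize(chunk[1]))),
        reverse=True,
    )
    return ranked[:k]

top = retrieve("How does a radiator reject heat in vacuum?", CHUNKS)
print(top[0][0])
```

In a full pipeline the returned passages would be placed in the model's prompt alongside the question, so the answer is grounded in the textbook text instead of the model's recollection of it.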
The collection kept growing. I expanded beyond semiconductor and physics references into mathematics, biology, medicine, engineering, law, philosophy, history, and trades. Once I'd seen the structure of one field, I wanted the structure of all of them.
I checked whether anyone had already built a modern version of this. The closest things are archival. The Long Now's Manual for Civilization is a crowd-sourced shelf of roughly 3,500 books at The Interval in San Francisco, but it's weighted toward cultural canon, philosophy, and science fiction, and it's built for preservation rather than active technical learning. The Survivor Library leans heavily on 19th and early-20th-century public-domain material. Neither is a modern, professional-grade reference set: a list of books you could write down and use to learn a field today.
So I built it. Three months of curation. 3,000 professional-grade textbooks across every major domain of modern human knowledge. Millions of pages and roughly 1.5 billion words of curated, expert-authored, expert-validated reference material. The Civilization Library.
Building the list took two months of going back and forth with every frontier model, pushing each candidate against four tests: foundational, comprehensive, modern, and actually used by experts today. Another month went to edge cases, the gaps no model thought to mention, which I closed by following my own curiosity until the structure felt complete. I cleaned and processed every file by hand. I could have automated that, but the manual pass was the point. I skimmed each book, absorbed its table of contents and introductory chapters, and indexed it into my own memory as I placed it. The library was being built in two places at once, on disk and in my head. Every book earns its spot.
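The four tests can be written down as a simple filter. This is only a sketch of the idea: the `Candidate` record, its field names, and the placeholder entries are hypothetical, and in practice each judgment came from reading and argument, not from a boolean flag.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """A candidate book scored against the four tests (hypothetical record)."""
    title: str
    foundational: bool
    comprehensive: bool
    modern: bool
    used_by_experts: bool

def passes_all_tests(book):
    """A book earns its spot only if it passes every one of the four tests."""
    return all(
        (book.foundational, book.comprehensive, book.modern, book.used_by_experts)
    )

# Placeholder entries, not real titles from the library.
candidates = [
    Candidate("Hypothetical Reference A", True, True, True, True),
    Candidate("Hypothetical Reference B", True, True, False, True),
]

shortlist = [book.title for book in candidates if passes_all_tests(book)]
print(shortlist)
```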
The original goal was to design a chip. The actual project turned out to be assembling the substrate that lets a single mind navigate the totality of human knowledge.
I'm still going to design the chip. I just have a better library now.