# Background

To simply use TeX, you don’t need to know anything about how the TeX program itself is implemented.

If you do want to read the source code of the TeX program, you can simply do so, as it has been published in book form:

• TeX: The Program, by Donald E. Knuth. ISBN 0-201-13437-3 (Reading, Massachusetts: Addison-Wesley, 1986), xviii+600pp. Volume B of Computers And Typesetting.

and is available as a PDF:

• Just invoke texdoc tex if you have a TeX distribution installed, or else find it online here (3.4 MiB).

However, neither is perfect:

• (Book ≫ PDF) The book has a useful introduction and appendices (including a diagram) that are not present in the PDF, and more importantly every page-spread (pair of facing pages) has a mini-index for all the identifiers that occur on that pair of pages.

• (PDF ≫ book) The PDF has (some) clickable cross-references, whereas following them in the book requires manually turning pages and so on.

This webpage will attempt to provide the best of both worlds, and eventually to explain the program and make it easily understandable. But it is still under construction, and will take some time.

# Book pages

In the meantime, here are some pages from the book that are not in the PDF (scans for now; I hope to transcribe them and/or create equivalents):

• Preface (page v)

• Supplementary Bibliography (publications of the TeX project at Stanford University) (pages vi, vii)

• How to Read a WEB (pages viii-xiii – temporarily taken down till I can double-check permission issues)

• Chart of TeX’s code components (dark) and memory regions (light+dark) (page 594), also available here (via this in TUGboat – but note that the middle section is scaled by amount of code as Knuth says in the videos, and not by memory structure as the caption in TUGboat says: scaling by memory is only for the outer sections).

Additionally, to become familiar with the WEB style of writing programs, you may also want to read:

## Smaller programs

Then, you may want to work your way up to TeX from smaller programs written by Knuth in the same style: I’ve prepared a List of WEB files. Specifically, a possible reading order, from smallest to largest, is:

| Program | Pages | Sections | Fraction of TeX |
|---|---|---|---|
| POOLTYPE | 7 + 4 | 20 + 2 | ≈ 1.4% to 2% |
| GLUE | 8 + 3 | 26 + 1 | ≈ 1.6% to 2% |
| DVITYPE | 47 + 7 | 111 + 2 | ≈ 8% to 10% |
| TANGLE | 66 + 9 | 187 + 2 | ≈ 13.5% to 14% |
| WEAVE | 98 + 12 | 263 + 2 | ≈ 19% to 20.5% |
| TEX | 478 + 57 | 1378 + 2 | 100% |
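
The “Fraction of TeX” ranges appear consistent with dividing each program's page and section counts by TeX's, both with and without the second number in each “a + b” pair. Here is a minimal Python sketch of that arithmetic, assuming this reading of the table; the name `fraction_range` and the tuple layout are mine, for illustration only.

```python
# (program, pages, extra pages, sections, extra sections), copied from the table above.
# Assumption: the "a + b" entries are a main count plus a trailing count.
ROWS = [
    ("POOLTYPE", 7, 4, 20, 2),
    ("GLUE", 8, 3, 26, 1),
    ("DVITYPE", 47, 7, 111, 2),
    ("TANGLE", 66, 9, 187, 2),
    ("WEAVE", 98, 12, 263, 2),
    ("TEX", 478, 57, 1378, 2),
]


def fraction_range(row, tex):
    """Return (low, high) size of `row` as a percentage of `tex`.

    Four measures are compared: pages and sections, each counted
    with and without the second number of the "a + b" pair.
    """
    _, pages, extra_pages, sections, extra_sections = row
    _, t_pages, t_extra_pages, t_sections, t_extra_sections = tex
    ratios = [
        pages / t_pages,
        (pages + extra_pages) / (t_pages + t_extra_pages),
        sections / t_sections,
        (sections + extra_sections) / (t_sections + t_extra_sections),
    ]
    return 100 * min(ratios), 100 * max(ratios)


if __name__ == "__main__":
    tex = ROWS[-1]
    for row in ROWS:
        low, high = fraction_range(row, tex)
        print(f"{row[0]:<9} {low:5.1f}% to {high:5.1f}%")
```

Running it prints one range per program, which can be compared against the rounded figures in the last column.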

Of these,

• POOLTYPE and DVITYPE share a lot of their code with TeX (specifically, TeX’s string handling and DVI output sections respectively),

• GLUE shows alternatives to some code in TeX (and is probably best read after understanding the corresponding parts of TeX, even though it was published first),

• TANGLE and WEAVE are the implementation of WEB, and perhaps worth reading as programs of smaller/intermediate size.

I have separate pages for each of these programs on this site:

Totally unrelated to TeX, but you could look at other “literate programs” entirely: Knuth’s CWEB programs, or the (Academy Award winning!) Physically Based Rendering book (see random chapter).

## Pascal

• You can read Jensen and Wirth (“Pascal user manual and report”), the original tutorial and definition of Pascal.

• An interesting document is Kernighan (the K in K&R) on “Why Pascal is Not My Favorite Programming Language”. Kernighan faced many difficulties with Pascal, and WEB in many ways is a solution to those same difficulties (I wrote more on that at the beginning and end of the annotated version of POOLTYPE).

• You can try writing a few of your own small programs in Pascal (with and without WEB), as with my 7-page “Hello, world!” program here.

• You can try reading some Pascal programs written by others, so that you can at least get past the idiosyncrasies that are common to all Pascal programs. My suggestions are:

• Let’s Build a Compiler by Jack Crenshaw. The original is here, but there’s a very convenient reformatted version here.

• The P4 Compiler and Interpreter by Steven Pemberton and Martin Daniels.

Note that these two programs are both compilers. TeX itself is written like a compiler, so you’ll find many similarities.

• Another personal favourite is the (in-progress after a long hiatus) Let’s Build A Simple Interpreter series by Ruslan Spivak. (No relation, I think, to Michael Spivak the author of AmsTeX and The Joy of TeX.) This is Python code, but in trying to write a Pascal compiler you may learn something about the language. If not, you’ll at least learn about compilers in a fun way.

## Other versions of TeX

• Instead of reading the Pascal source as written by Knuth, there are many other versions you could read. I’ve collected a bunch here.

• Many (not all!) are only of historical interest.

• The most relevant may be LuaTeX, which is under active development, and started with a manual translation of the Pascal code to C. For (a randomly picked?) example, compare section 426 of the TeX program with this part of LuaTeX — it is in more familiar C style, but there are also more cases. (Compare with pdfTeX and XeTeX.)

• A good version to read may be Richard Sandberg’s rsTeX. For example, here’s that same section. See comments at the top of the file.

• web2w is a conversion of TeX from literate programming in Pascal to literate programming in C. See Martin Ruckert’s TUGboat article and website.

## Videos

Finally, after having read these smaller programs and having gained a bit of familiarity with Pascal, before reading the full TeX program I strongly recommend the series of 12 lectures that Knuth gave in 1982 called The Internal Details of TeX82. They are available on YouTube, but I’ve embedded them on this website, with some comments, here.

Also, as you read the program / watch the videos, you can try solving these exercises from a course DEK gave about TeX:

## History

It is unclear how much this would help, but often the earlier versions of a program are less complex, or at least illuminate how the program got into its current state.

• TeX’s early “design documents” (TEXDR.AFT and TEX.ONE) are available on https://www.saildart.org/ (also published in Digital Typography).

• The site also contains early versions of the TeX programs, in the SAIL language.

• In principle these could be cross-referenced against the TeX “errorlog”, to produce something like a mini-version of Diomidis Spinellis’s Unix history repo.

# The program

If you’re done with all the prerequisites and are ready for reading TeX itself, click here for a raw dump of TeX.

# Random observations

A debugger may help. I started writing one (it currently has a really bad interface and is implemented in a stupid way).

• An example is here, showing some of what goes on in TeX’s internal state with the two-line file:
\expandafter\expandafter\expandafter\meaning\expandafter\uppercase\expandafter{a}
\end

After looking at the program.

