Reading awc
Arther Whitney, the creator of the k language(s), writes C code in a unique style. Some see it cryptic while others find it elegant. I’ve known his implementation of J, and the word “Incunabulum”, from Dyalog’22. But never had I tried to reason about his works.
About several months ago, I started to do code-golf using ngn/k, an opensource implementation of K6, as to gain some experiences with the k language. I had tried to read ngn/k’s source code but was scared away by it’s size (translated: due to my laziness). In this January, Arther gave an referential implementation of a small k interpretor. It consists of only around 60 lines of code, in which 33 lines are in the header. So I decided to take a look.
It took me about a hour to understand these 29 lines of C code. Wow, awc is dense. Here are some interesting points I got from the code.
There are only three explicit loops in the whole program. Two
while-loops for formatting integers and for the main loop of the
interpreter. And a single for-loop in the macro i
that’s
responsible for every iteration and vector-traversal in the program.
Yep, no stinky loops!
Common patterns are encapsulated into macros but they leave enough
freedom for customization. For example, n(n,e)
generates a
new vector with each element created by expression e
, and
it is used in everything from indexing a vector by another vector to
being embedded in G(f,o)
for doing inner-product between
arrays.
It uses fixed names a
and x
for input
variables in functions, just like how ⍺
and ⍵
works in APL dfns. As an extension of the previous point, common
operations like referencing the i-th element and increasing reference
count are made into macros for each input variable. This not only frees
the program from boilerplates, but also makes function definitions more
straight forward. Though it still requires cares like in the function
Cat
, x
must be copied before a
since it uses latter’s length, and a
might be deallocated
by _a
if they were copied the other way around.
At first, I had to jump between a.h and a.c constantly for references on what those macros do. After getting familiar with the macros, it reads like a C-style array language, and it’s easy to see how primitives are built from each others. Almost nothing is redundant so nothing to glance over when one reads the code. I think it is an interesting way to formulate programs and it tries to capture their essence. It really makes its reader think about the program and how it is built, rather than just inferencing what it does from its shape in a glance and wonder if there were some off-by-one error caused by silly typos.
Here is the annotated a.c.