Interview Question

Posted by Elf Sternberg as Uncategorized

I won’t reveal where or when I got this question, but it always amused me.  At the time, I answered it using Underscore and CoffeeScript, which the interviewers agreed I would have access to… but here’s a pure ES6 solution.

The problem, simply stated, was “write a function that sums two polynomial equations and prints the results.”  They defined the format for the input this way:

// 5x^4 + 4x^2 + 7 
// 3x^2 + 9x - 7
var e1 = [{x: 4, c: 5}, {x: 2, c: 4}, {x: 0, c: 7}];
var e2 = [{x: 2, c: 3}, {x: 1, c: 9}, {x: 0, c: -7}];

They were kind enough to let me code on my keyboard.  My answer is rather dramatic.

// Reduce any list of equations into an array of maps of exponent:coefficient
var eqns = [e1, e2].map((a) => a.reduce((m, t) => { m[t.x] = t.c; return m; }, {}));

// Find the largest exponent among all the equations
var maxe = Math.max.apply(null, eqns.map((a) => Math.max.apply(null, Object.keys(a))));

// For the range (0 ... maxe), for all equations, sum all the coefficients of that exponent,
// filter out the zeros, sort highest to lowest, create string representations, and print.
console.log(
    Array.from(new Array(maxe + 1), (x, i) => i)
        .map((exp) => [exp, eqns.reduce((memo, eqn) => memo + (eqn.hasOwnProperty(exp) ? eqn[exp] : 0), 0)])
        .filter((e) => e[1] != 0)
        .sort((a, b) => b[0] - a[0])
        .map((e) => e[1] + (e[0] > 1 ? 'x^' + e[0] : (e[0] == 1 ? 'x' : '')))
        .join(' + '));

The interviewer just stared at it, and stared at it, and said, “I’ve never seen anyone solve that in three lines.  Or that fast.”

I shrugged.  “It’s a straightforward map/reduce of the relationship between exponents and coefficients, removing any factors that had a coefficient of zero.  This seemed the least buggy way to do it.  The riskiest part of this equation is the mapping back to string representation.  The nice feature of this function is that if we generalize the first line over an arguments array, it works for any number of equations, not just two.”

He agreed.  They ultimately didn’t hire me.  I had a friend there, and he said, “They really liked you, but it was pretty clear you were already bored where you were and moving from one infrastructure job to another wasn’t going to change that.”  Sad but true.
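For completeness, here’s a sketch of that generalization over any number of equations. The function name `sumPolynomials` is mine, not something from the interview; the logic is the same map/reduce as above, folded into one reusable function.

```javascript
// Sum any number of polynomials given in the interview's [{x, c}, ...] format.
// Hypothetical helper name; a sketch, not the answer as delivered.
function sumPolynomials(...equations) {
    // Reduce each equation into a map of exponent:coefficient.
    const eqns = equations.map((a) =>
        a.reduce((m, t) => { m[t.x] = (m[t.x] || 0) + t.c; return m; }, {}));
    // Find the largest exponent among all the equations.
    const maxe = Math.max(...eqns.map((a) => Math.max(...Object.keys(a).map(Number))));
    // Sum coefficients per exponent, drop zeros, sort high to low, stringify.
    return Array.from(new Array(maxe + 1), (x, i) => i)
        .map((exp) => [exp, eqns.reduce((memo, eqn) => memo + (eqn[exp] || 0), 0)])
        .filter((e) => e[1] !== 0)
        .sort((a, b) => b[0] - a[0])
        .map((e) => e[1] + (e[0] > 1 ? 'x^' + e[0] : (e[0] === 1 ? 'x' : '')))
        .join(' + ');
}

var e1 = [{x: 4, c: 5}, {x: 2, c: 4}, {x: 0, c: 7}];
var e2 = [{x: 2, c: 3}, {x: 1, c: 9}, {x: 0, c: -7}];
console.log(sumPolynomials(e1, e2)); // "5x^4 + 7x^2 + 9x"
```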



Lisp In Small Pieces, Chapter 5: The Storage Story

Posted by Elf Sternberg as Lisp

One other thing about Lisp in Small Pieces chapter 5 jumps out at me: the storage story.

In the interpreter written for Chapter 5, some things are cons lists (most notably, the expression object you pass into the interpreter), and some things are lists, but they’re not built with car/cdr/cons.

In chapter 3, we built an interpreter that used full-blown objects, in which each object had a field named “other” that pointed to the next object. When looking up a variable or an unwind point, the search was an explicit call: starting with the latest object, a search would proceed down the chain for a match and, when found, would trigger either a memory retrieval or a continuation, at which point the interpreter would resume with the relevant memory or continuation. Each object had a “failure” root class that would throw an exception.

In chapter 5, it gets even more functional. Chapter 5 tries to define everything in the lambda calculus, which allows for closures but doesn’t by default support objects. But Queinnec really wanted to teach about allocation issues, especially the boxing and unboxing of values, so to make that point he created two structures: one maps variable names to indexes, and one represents an indexed collection of boxes. Lookup represents the Greek expression σ(ρ ν), which is basically that the environment knows the names of things, and the store knows the locations of things.

But in order to be explicitly literal, Queinnec goes full-on. Both environment and store are represented the same way. He creates a base environment that looks like this:

ρ.init = (ν) -> "Variable name not found."

and then when we add a new variable name to the stack, we write:

(ρ, ν, σ, κ) -> (ν2) -> if (ν2 == ν) then κ(σ) else ρ(ν2)

In this case, we call a function that creates a function that, in turn, says “If the name requested matches the name at creation time, return the stored store point (actually, continue with it), else call the next (deeper) environment, all the way down the stack until you find the thing or hit ρ.init”.
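To make that concrete, here’s the same environment-as-closure idea in Javascript rather than Queinnec’s Scheme or my CoffeeScript. This is a simplified sketch: I’ve collapsed the σ/κ machinery down to plain values, so `extend` stands in for the ρ-extension function above.

```javascript
// The base environment: lookup on anything fails, mirroring ρ.init.
const initEnv = (name) => "Variable name not found.";

// Extending the environment wraps the previous one in a new lookup closure.
const extend = (env, name, value) => (name2) =>
    name2 === name ? value : env(name2);

let env = initEnv;
env = extend(env, "x", 4);
env = extend(env, "y", 2);

console.log(env("y")); // 2 -- found in the newest frame
console.log(env("x")); // 4 -- found one frame deeper
console.log(env("z")); // "Variable name not found." -- fell through to ρ.init
```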

It’s a really cheesy way of emphasizing that you can do Lisp in a full-on lambda-calculus way, but you probably shouldn’t. It’s also completely dependent upon the parent environment to reap memory when you’ve examined the tip of an expression and have retreated back toward the base of the expression tree to proceed down the next expression.

The lessons here are about the lambda calculus and about memory management. In the latter case, the lesson is how hard memory management is going to be if you want to do it the way the big boys do.  Garbage collection is hard.


Lisp In Small Pieces, Chapter 5.

Posted by Elf Sternberg as Uncategorized

I lied when I said I’d completed Chapter 5 of Lisp In Small Pieces. I went back and re-read it, and realized that I was going to have to do the exercises in the chapter if I wanted to understand what was going on well enough to get through chapter 6.

Chapter 5 wasn’t nearly as hard as I’d thought it was going to be. It also wasn’t much fun. No new interpreter came out of it. Instead, what I got was the same interpreter, with yet another layer of indirection.

Queinnec is making a point in chapter 5. “If you’re going to present at OOPSLA, you’ll need to know how the big boys write.” And the big boys write in Greek. The chapter is about “the meaning of a program,” and turns the core switch statement into a quasi-readable “the meaning of this expression is the meaning of this expression when…” with pattern matching for conditionals, abstractions, and sequences. The meanings themselves become repetitious definitions of expression, environment, memory, and continuation.

Oh, yeah, did I mention that absolutely everything in this is in continuation-passing style? Madness. It made sense, when I did it, but it was painful to work through all those changes and make it all work.

There’s a giggle where Queinnec explains that Greek is used because each letter becomes a shorthand for something: an addressing scheme, a naming scheme, a continuation, an expression, so it’s possible to fit the entire language definition on one page, leaving you nine pages to explain whatever it is that’s unique about your language. “Greek letters are used because most programming languages are limited to the ASCII character set.”

Obviously, this book needs to be brought into the 21st century. We have Unicode now. I was able to copy the Greek verbatim into my comments, e.g. f[y → z] = λx . if y = x then z else f(x) endif.  Note the lambda and the other Unicode characters supposedly out of my reach. Not only are they accessible to me, I’ve permanently encoded them into my keyboard. (Had to use that Windows key for something, after all.)

Translating the Greek into Scheme, and then into Coffeescript, my target language, was fun. No, really. When it was finally working, it was kinda nifty to see that it did in fact work. You end up building these enormous function stacks of “meaning,” at the bottom of which is the print command, which translates to “print what that all meant.” At that moment, all the functions built up trigger in the right order, the AST order, and the result is delivered to your screen. It’s big and ugly and weird, but it works.

Chapter six is about making that all go faster. But I needed a chapter 5 interpreter before I could begin.

It’s been two weeks since I finished chapter 5. The start of the school year for my daughter, and other life matters, have intervened. I’ve also been head deep in learning something esoteric for work, so I’ll have to go back and review that.


Using SplunkJS with SimpleXML Panels

Posted by Elf Sternberg as Uncategorized


Splunk’s SimpleXML is an XML file format that describes a custom dashboard with searches, inputs, and panels. There are a number of fantastic resources for building them, but I recently encountered an interesting problem. Those resources also discuss SplunkJS, a Javascript library that allows users to customize searches and visualizations far beyond what SimpleXML allows.

SplunkJS is usually used with raw HTML and CSS, but can be pulled into a SimpleXML file by using the script attribute in the SimpleXML opening <dashboard> or <form> tag. It’s easy to make a SplunkJS search and attach it to a SimpleXML visualization; it’s not so easy to make a SimpleXML search and attach it to a SplunkJS visualization. This document shows you how, and shows you how to fix a peculiarity that arises from creating a well-organized ecosystem of panels and dashboards.

In later versions of Splunk, SimpleXML has a new attribute for <panel>, ref, which allows you to define a panel in a single file and drop it into a number of different dashboards without having to cut-and-paste the panel code. In the process, SimpleXML mangles the names of searches and visualizations, and so finding and manipulating those searches has become difficult.

This example uses the Splunk Linux TA (Technology Add-on), so you should download and install that. What data you use isn’t really important. For our example, though, we’re going to create a single dashboard with a single independent panel that shows the list of processes running on a host, find that panel, find its search, find its title, and modify the title with the name of the longest-running process.

After installing Splunk (here’s the free version of the Enterprise Edition, limited to a half-GB of data per day) and getting it up and running, click on the App icon (the gear symbol) on the left sidebar. On the Applications list, click on “Create a New App”, and provide it with a name, a directory slug, and a version number.

Now it’s time to fire up your editor. We need to create three things. A dashboard, a panel, and a javascript file to perform the magic.

Literate Program

A note: this article was written with the Literate Programming toolkit Noweb. Where you see something that looks like <this>, it’s a placeholder for code described elsewhere in the document. Placeholders with an equal sign at the end of them indicate the place where that code is defined. The link (U->) indicates that the code you’re seeing is used later in the document, and (<-U) indicates it was used earlier but is being defined here.

The Dashboard Files

The Dashboard file is simple. We just want to pull in a panel. This goes into APP_HOME/default/data/ui/views/index.xml. Here, APP_HOME is the path to the directory slug where your app is stored. I install Splunk in /opt and I named my example “searchhandle,” thus the path is /opt/splunk/etc/apps/searchhandle/default/data/ui/views/.


<dashboard script="title.js">
  <label>A Dashboard with a portable Panel and a Managed Title</label>
  <description>A simple demonstration integrating SimpleXML and SplunkJS</description>
  <row>
    <panel ref="cputime" />
  </row>
</dashboard>

The panel file is also simple. It’s going to define a search and a table. It goes in APP_HOME/default/data/ui/panels/cputime.xml. Note that the filename must match the ref attribute. I’ve limited the search to the last hour, just to keep from beating my poor little laptop to death.

The indenting of the query is a little odd; this is the best compromise I have found between making the search readable in the XML, and making it readable if you examine it with Splunk’s search tool.


<panel>
  <title>Long-running processes</title>
  <search id="cputimesearch">
    <query>index=os source=ps | stats latest(cpu_time) by process 
| sort -latest(cpu_time)
| rename process as "Process", 
  latest(cpu_time) as "CPU Time"</query>
  </search>
  <table></table>
</panel>

The <dashboard> tag in our dashboard file has a script attribute. This is where we’ll put our logic for manipulating the title of our panel. It’s annoying that we have to put our script reference in the dashboard and not the panel. It’s possible to have a file named “dashboard.js” which will be loaded for every XML file in your app, and then have it selectively act on panels when they appear, but that seems like a half-hearted solution to the problem.

The Javascript

Javascript files go in the APP_HOME/appserver/static/ directory. I’ve named ours title.js.

Splunk uses the require facility to import files. In the prelude to any SplunkJS interface, you must start with the ready! import, which doesn’t allow the contents of this file to run until the Splunk MVC (Model View Controller) base library is loaded. We’re also loading the searchmanager and two utility libraries: underscore and jquery, both of which come with the SplunkJS UI.

The one thing we’re most concerned with is the registry, which is a central repository where all components of the current Splunk job’s client-side operations are indexed and managed.

The file’s outline looks like the below. Understanding is best served by reading the references from bottom to top: wait for the search, find the search, listen to the search, do something when the search triggers.


require([
    "splunkjs/mvc",
    "splunkjs/mvc/searchmanager",
    "underscore",
    "jquery",
    "splunkjs/mvc/simplexml/ready!"
], function(mvc, searchManager, _, $) {

    var registry = mvc.Components;

    <update the title with the data>

    <listen to the search for the data>

    <find the search>

    <wait for the search to be available>

});

In the outline, we took one of the items passed in, mvc.Components, and gave it a name, the registry. Waiting for the search to be available is as simple as listening to the registry:

<wait for the search to be available>= (<-U)

var handle = registry.on('change', findPanel);

Finding the search and attaching a listener to it is actually one of the two hardest parts of this code: first, because we have to find it at all, and the new panels layout makes that difficult; and second, because the change event mentioned above can happen multiple times, but we want to make sure we set up our listener only once.

Below, the function findPanel lists through all the Splunk managed objects on the page, and finds our search. It does this by looking for a registry name that matches the ID of our search. The panel layout mangles the name, attaching the prefix “panelXX_” where XX is some arbitrary index number. (In practice, the index number is probably deterministic, but that’s not useful or important if you’re going to be using this panel on multiple dashboards.) Underscore’s filter is perfect for finding out if our search is available. If it is, we disable the registry listener and proceed to the next step, sending it the search name.

<find the search>= (<-U)

var findPanel = function() {
    var panel = _.filter(registry.getInstanceNames(),
                         function(name) { return name.match(/panel\d+_cputimesearch/); });
    if (panel.length === 1) {
        registry.off('change', findPanel);
        setUpSearchListener(panel[0]);
    }
};

This is the most straightforward part of the code. Having found the search name, we then get the search manager, get its results manager, and then set up a listener to it that will update the title with the data.

Splunk searches manage the task of searching, but not the actual data. That happens in a Result, which updates regularly with the growing cache of data from the server to the browser.

This code skips a ton of details, mostly about listening to the search for failure messages. That’s okay. This is just an example, and it works 99% of the time anyway. Since we’re going to change the title to include the longest-running process, and our search is pre-sorted, we just need a count of one. This Result uses the same dataset as the actual visualization and puts no additional strain on the Splunk server or bandwidth between the server and the browser.

<listen to the search for the data>= (<-U)

var setUpSearchListener = function(searchname) {
    var searchmanager = registry.getInstance(searchname);
    var resultmanager = searchmanager.data("preview", {
        output_mode: "json",
        count: 1,
        offset: 0
    });
    resultmanager.on("data", updateTitle);
};

The last thing we do is update the title. (Remember, that’s our goal). The panel’s title is found in the .panel-head h3 child DOM object. Finding the panel is trickier, but Splunk gives us an attribute with the name of the panel’s filename, so jQuery can find it for us. There’s a guard condition to ensure that we actually have some data to work with.

The names of the fields correspond to the final names in the search. I’ve always found Splunk’s naming conventions to be a little fragile, but it works most of the time.

<update the title with the data>= (<-U)

    var updateTitle = function(manager, data) {
        if (!data || !data.results || !data.results.length) {
            return;
        }
        var topprocess = data.results[0];
        $("[data-panel-ref=cputime] .panel-head h3")
            .text("Longest Running Process: " + topprocess["Process"] +
                  " (" + topprocess["CPU Time"] + ")");
    };


One last detail: You want to be able to get to this page.

To do that, open the file at: APP_HOME/default/data/ui/nav/default.xml and replace the line for “search” with this:

<update navigation>=

<view name="index" default='true' />
<view name="search" />

Now restart Splunk.

And that’s it. Put it all together, and you’ve got yourself a working application in which SplunkJS can tap into SimpleXML searches and exploit their data, even if that search is defined in an independent panel.

This code is available at my github at Splunk with SimpleXML and Javascript.


Dear Gods, I’m not even sure why I should even bother, but the C++ experiments I’ve conducted recently were so much fun I’ve decided to put TOXIC (Terabytes of XML, Indexed and Compressed) and Twilight (A basic GraphDB with local indexing and aggressive caching, using RDF triples as index keys) back into my projects list.  I’m not even sure why.  It doesn’t seem like a very smart thing to distract me with yet more shiny.  But these would be fun, fun shiny.

Recently, while I was at Beer && Coding, one of the others came in with a problem that they’d been given by a potential employer. They’d hoped that we’d be able to help finish it. Nobody did in the time allotted, but I got pretty far with my Scheme version. However, Scheme wasn’t in the list of legal target languages.

The problem stated was:

Given a number (assume base 10) less than 10,000, write a program in
C++ that will reverse the digits of that number, calculate the
original number to the power of the new number, and print it out.
You may not use Boost, GMP, or any library other than that provided
by the C++ Standard Library.

I don’t know C++. I haven’t ever written C++ professionally, and I haven’t actually looked at C++ since 1999 or so. As a professional, I’m aware of what’s going on in the zeitgeist, and at my job at Spiral Genetics I interacted with two very talented C++ developers a lot, so I was aware of things like the emerging C++ Standard Library and RAII and so forth. I didn’t know what they meant, but I had heard of them. I’ve also been aware of the emerging standards in C++11 and C++14, mostly thanks to Slashdot, Hacker News, and their ilk (don’t read the comments, don’t ever read the comments), so I’d heard about auto_ptr and C++11 lambdas and the like.

It took about an hour of googling to get up to speed on things like namespaces, containers, for_each, lambdas, and the like. I really like the new unique_ptr construction. That’s very nice.

My basic solution degrades to 4th-grade mathematics: Break the multiplicand up into a list of single digits, multiply each digit with the multiplier, then redistribute the values up the tens, hundreds, etc., etc. This solution is not particularly fast or space-efficient, but it has the virtue of being comprehensible by any ten-year-old.

As usual, I’ve provided a test suite, as well as a pair of utility functions for converting the list to a string, or an unsigned long. The latter only works with very small results. The executable, “cheapgmp”, works as specified in the problem statement.
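The schoolbook approach described above can be sketched compactly. This is not the C++ from the repository; it’s my own illustration in Javascript, with numbers represented as lists of digits, least significant first, and all names mine.

```javascript
// Reverse the digits of a base-10 number, e.g. 12 -> 21.
function reverseNumber(n) {
    return Number(String(n).split('').reverse().join(''));
}

// Multiply a digit list (least significant digit first) by a plain
// multiplier, then redistribute the carries up the tens, hundreds, etc.
function mulDigits(digits, multiplier) {
    const out = [];
    let carry = 0;
    for (const d of digits) {
        const v = d * multiplier + carry;
        out.push(v % 10);
        carry = Math.floor(v / 10);
    }
    while (carry > 0) { out.push(carry % 10); carry = Math.floor(carry / 10); }
    return out;
}

// base ** exp by repeated schoolbook multiplication; returns a decimal string.
function powDigits(base, exp) {
    let acc = [1];
    for (let i = 0; i < exp; i++) acc = mulDigits(acc, base);
    return acc.slice().reverse().join('');
}

const n = 12;
console.log(powDigits(n, reverseNumber(n))); // 12 ** 21, arbitrary precision
```

Not fast, not space-efficient, but comprehensible by any ten-year-old, which was the point.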

The source code is, of course, available on Github.


Pattern Matching And Lisp

Posted by Elf Sternberg as programming

No, I don’t mean regular expressions.  I mean that thing you see in “modern” programming languages where you’ll have two functions of the same name that take different arguments and do different things.  We use it all the time.  The plus symbol, “+“, often has two separate meanings, and is often processed by two different operations:

add(number, number) -> mathematical addition
add(string, string) -> textual concatenation

Now, most programming languages provide this differentiation by convention.  C++ is fairly unusual in that it permits it explicitly; you can use the same name for a function with different inputs (the arguments), and the C++ compiler will connect any calls to that function name correctly depending upon the arguments you provide at compile time.

But the modern languages like Haskell do this explicitly and without any special needs.  You can re-use function names all day, and as long as every function has unique input and output types, Haskell will associate the right functions in the right way.  (I think Haskell is annoying in that it doesn’t even matter what order you make these declarations in, but some people seem to like it.)

It turns out this behavior is, for programming, really old.  Like, 1960s-era old.  The λ-calculus that McCarthy turned into Lisp back in 1960 described it.  McCarthy didn’t know how to turn pattern matching into a general-purpose feature, so we ended up with “if” statements and conditionals and all the rest to decide these things by hand, but the original math seemed to require automatic pattern matching.  Take the “sequence” operation: this is when a function has a list of things to do, several statements in a row, and returns the value of the last expression.  The math for this reads:

Ε+[[π]]ρκσ = Ε[[π]]ρκσ

Ε+[[π π+]]ρκσ = Ε[[π]] ρ (λεσ1 . Ε+[[π+]]ρκσ1) σ

The left side of the equation denotes “the meaning of…”  In this case, line one is “The meaning of a single statement is the meaning of a single statement after we’ve taken into account the heap and the stack.”  And line two is “The meaning of a single statement with more statements following is the meaning of the first statement, which we then discard to derive the meaning of the rest of the statements, all the while taking into account the heap and the stack.”  There’s actually more to that, as the environment and memory can change from statement to statement, but that’s already handled by our understanding of what ρ and σ mean.

In a common language like python or javascript, we would have an ‘if’ statement to say “If this is a single statement we’re done and return, else process the first and iterate.”   But in Haskell you could just write those explicitly, and it would work; the compiler knows exactly what you mean, and does the right thing, because there’s no ambiguity there at all.
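Here’s a sketch of those two equations in Javascript, with the evaluator for a single statement stubbed out (every name here is mine: `env` stands for ρ, `kont` for κ, `store` for σ). The destructuring-plus-ternary stands in for the pattern match.

```javascript
// Stub for Ε[[π]]: "evaluating" a statement just continues with it as its own value.
const meaning = (expr) => (env, kont, store) => kont(expr, store);

// Ε+ dispatches on the shape of the statement list, as the equations do:
// a single statement means itself; a statement followed by more means
// "evaluate the first, discard its value, continue with the rest."
const meaningSequence = ([pi, ...rest]) => (env, kont, store) =>
    rest.length === 0
        ? meaning(pi)(env, kont, store)                                  // Ε+[[π]]
        : meaning(pi)(env, (val, store1) =>                              // Ε+[[π π+]]
              meaningSequence(rest)(env, kont, store1), store);

// The final continuation prints the value of the last statement.
meaningSequence([1, 2, 3])({}, (v, s) => console.log(v), {}); // prints 3
```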

I just wish I could write them in English, which is my native tongue, and not have to translate from Greek in my head.

Chapter 4 of Lisp In Small Pieces introduces side effects. It also introduces an (excruciatingly simplified) taste of what managing memory feels like, with a monotonically increasing integer representing memory we’ve requested, and a corresponding identity in the environment.

More interesting is the abandonment of the Object-Oriented approach in Chapter 3 for a completely functional approach in Chapter 4. Everything is a function with enclosure. When you want a number, you have to send the message (number sValue) to extract it. You can get typing information with (object sType). The “sValue” and “sNumber” are, in my code, simple Javascript objects of type Symbol, around which I’m trying to formalize symbols and quotations, so that going forward it’s easy to distinguish a Symbol as a unique type– doing so may remove an entire class of difficulty in the read() function.

Chapter 4 was supposed to also teach about memory management and variable “boxing,” but in fact it only contains a very cheap pair of stacks: one for the environment, the key of which is a name and the value of which is a number, and one for memory, the key of which is the number, and the value of which is the object: a symbol, a number, a boolean function or an arbitrary function. It makes a strong correspondence between this Lisp and the λ-calculus, and even shows how, with a few small extensions, the λ-calculus can accommodate assignments and similar side effects, although not necessarily externalities, like I/O. Extending this to a real memory management system is, I hope, not entirely left up to the reader as an exercise.
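The “cheap pair of stacks” scheme, as I understand it, reduces to something like the following Javascript sketch (names are mine, not the book’s): the environment maps names to integer addresses, and memory maps those addresses to values, which is exactly what makes assignment expressible.

```javascript
let nextAddress = 0;
const environment = new Map();  // name -> address (the ρ side)
const memory = new Map();       // address -> value (the σ side)

// Binding a name allocates the next integer "address" and stores the value there.
function define(name, value) {
    const addr = nextAddress++;
    environment.set(name, addr);
    memory.set(addr, value);
}

// Lookup is the two-step σ(ρ ν): name to address, address to value.
function lookup(name) {
    if (!environment.has(name)) throw new Error("Unbound: " + name);
    return memory.get(environment.get(name));
}

// Assignment (set!) mutates the store, never the environment.
function assign(name, value) {
    memory.set(environment.get(name), value);
}

define("x", 10);
assign("x", 42);
console.log(lookup("x")); // 42
```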

I’m still trying to figure out how to write this monoidically, such that I can “lift” (see: The Elevated World) all of my functions out of the “standard Lisp lists and things” world to the “lisp lists and things encapsulated with metadata about compilation,” which would enable me to add debugging, source code, and sourcemap information to the resultant product. Maybe I need to do a few more monoid/monad tutorials, the more hands-on ones, to wrap my head around lifting.

The source code for Chapter 4 of Lisp In Small Pieces, Coffeescript Edition is, as always, available on Github.

Chapter 5 basically reiterates Chapter 4, only it does so entirely in Greek. I’m only half-kidding; it’s mostly an attempt to formally described the interpreter covered in Chapter 4 using the formal language of an extended λ-calculus. It doesn’t have an interpreter of its own, except in a purely theoretical, head-in-the-clouds description of a λ-calculus-derived language written mostly in Greek symbols.  So there’s not going to be any source code associated with it.

In Chapter 3 of Lisp In Small Pieces, you are encouraged to write a lisp interpreter using continuations. Here’s what I learned as I implemented the chapter 3 examples in Javascript:

A continuation is a representation of the incomplete state of the program. For any evaluation being done, the continuation at that moment holds everything else that must be accomplished in order for the program to finish.

It sounds odd, but it’s actually an easily understandable feature of any development environment with first-class functions and working closures, both of which Javascript has. Imagine the instruction print 4 * 2. At the very top, that breaks into “Evaluate four times two then print the result.” The part after ‘then’ is the final continuation.

So it breaks down to:

  • Wrap the “print something” as a FinalContinuation. Pass that continuation to the evaluate function with a pointer to the rest of the AST (Abstract Syntax Tree) to evaluate.
  • Evaluate what the ‘*’ means. This retrieves a native function that expects two numbers and returns a product. Wrap the FinalContinuation with a new EvFunc continuation.
  • Evaluate what ‘4’ means. Continue on to…
  • Evaluate what ‘2’ means. Continue on to…
  • Apply the function. This continuation has been built by the previous three. Inside the source code, you’ll see EvFunc, which calls EvArgs and passes in Apply, a continuation to run after the arguments have been evaluated. Each instance of “EvArgs” creates a “GatherArgs”, which appends the discovered arg into a list, and then passes the ‘Apply’ continuation forward. When EvArgs discovers no more arguments, it finally calls the ‘Apply’ continuation, which performs the primitive operation ‘4 * 2’, and then Apply calls its continuation, which is the FinalContinuation, which prints the value passed by Apply.
  • As that was the FinalContinuation, the program terminates.
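The steps above can be collapsed into a miniature sketch. This is not the chapter 3 code; it’s my own Javascript compression of the EvFunc/EvArgs/Apply dance, with the names and structure simplified.

```javascript
// Evaluate an AST like ['*', 4, 2] in continuation-passing style.
function evaluate(ast, kont) {
    if (typeof ast === "number") return kont(ast);   // literals continue immediately
    const [fn, ...args] = ast;
    const primitives = { '*': (a, b) => a * b };
    // The EvArgs/GatherArgs loop: evaluate each argument, gathering the
    // results, then Apply: run the primitive and continue with its value.
    const evalArgs = (rest, gathered) =>
        rest.length === 0
            ? kont(primitives[fn](...gathered))      // Apply
            : evaluate(rest[0], (v) => evalArgs(rest.slice(1), gathered.concat([v])));
    return evalArgs(args, []);
}

// The FinalContinuation: print what that all meant.
evaluate(['*', 4, 2], (v) => console.log(v)); // prints 8
```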

Rather than stepping through the program line by line, we step through continuation by continuation. Each line of a program becomes its own continuation, encapsulating the current state of the system, as well as all possible future states– at least abstractly. For a while, this was really hard for me to understand, but now that I’ve done the exercise, I see that the notion “the rest of the program” doesn’t mean the rest of the syntax tree all laid out and ready to trigger, but instead means “What we’ve done before,” (the current continuation), “What we want to achieve and what we haven’t processed yet that needs doing to achieve it,” (the continuation we’re about to build), and “How we bridge the two after processing this node of the AST” (the current evaluation).

Each continuation becomes a handler of its own momentary responsibility in the interpretation process and a wrapper around all the future responsibilities that have thus far been discovered during interpretation.

That seemed like a lot of work, and I wasn’t entirely sure of the benefits of it. On the other hand, continuations do allow for some features of programming that are much harder without them: blocks of code with early return, catch/throw, and finally.

Blocks with early return were easy. When you enter a block, the interpreter creates a new continuation: what to do after the block. Then it creates another continuation: what to do inside the block. It then resumes the inside one. If the block ends, or if return is used, the results are delivered as the value to be processed by the after continuation. Blocks with labels look up the stack where the prior continuations were stored, and when a match is found, the program resumes using that continuation.
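A sketch of that block/early-return machinery in Javascript, using an exception as the escape continuation. This is a common encoding, not the book’s implementation; `block` and `exit` are my names.

```javascript
// Run `body`, giving it an `exit` function that resumes the "after the
// block" continuation immediately with a value.
function block(body) {
    const token = {};                         // identifies this block's exit
    const exit = (value) => { throw { token, value }; };
    try {
        return body(exit);                    // "what to do inside the block"
    } catch (e) {
        if (e && e.token === token) return e.value;  // our early return
        throw e;                              // some outer block's unwind
    }
}

const result = block((exit) => {
    for (const n of [1, 5, 9, 12]) {
        if (n > 8) exit(n);                   // early return: first n > 8
    }
    return -1;                                // normal fall-through value
});
console.log(result); // prints 9
```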

This is important: the stack’s purpose, which up to this point was strictly to maintain the identity and value of variables in a given scope, has a new responsibility: maintaining the integrity of the interpretation process itself. There is a new thing on the stack that is not a Symbol/Value pair! Or rather, it is a Symbol/Value pair, but not one you can dereference and manipulate on your own. Eventually, we learn that the classic call/cc (call with current continuation) does allow you to access it and run it whenever you want.

Catch/throw is more complicated. Throw statements can appear anywhere in the source code and abstract syntax tree (AST), and aren’t necessarily containerized within a catch block. When they trigger, a corresponding catch continuation has to be looked up and triggered. A catchLookup occurs up the collection of continuations (not the environment, note) until it gets a match or the program dies with an uncaught throw.

A protect is even more important. In Javascript and Python, protect is usually called ‘finally’, and it means the collection of statements that must be executed within the current continuation-managing context. So when the body of the protected statement ends in any way, we look up the collection of current continuations (remember, they’re wrapping each other up to the BottomCont, creating a sort of stack-that’s-not-a-stack) until we find the continuation in which the protect was created, then continue with evaluating the final body, then continue on with execution. Because any continuation can be wrapped this way, the base continuation has to be modified, and both the ‘block’ and ‘catch’ handlers have to also be modified to scan for protected evaluations.

This is fun. And kinda deep. It goes into way more detail than your usual “let’s write an interpreter!” that you find on the Internet, even if one of my side-excursions was to write the λisperator interpreter.

Even this code is twonky. Symbols are not handled well. There’s an arbitrary distinction between initial symbols and created symbols. It’s not as bad as the λisperator interpreter where the continuation designations were kinda arbitrary.

The oddest thing about this code, and it’s endemic in all of these “write an interpreter in Scheme” textbooks, is that it rides hard on top of the scheme garbage collection. When the environment stack pops down, or when a continuation is executed and unwraps, the wrapper is simply discarded, and the assumption is that garbage collection just works at the interpreter’s interpreter level. You start to notice it a lot as you go through the exercise.

We all have superpowers. One of my friends once said that my superpower was the ability to look at any piece of software and, within minutes, be able to describe with frightening accuracy the underlying data structures and fundamental internal APIs used to manipulate it. The other day I ran headlong into Airtable, and the underlying data structure was pretty obvious. The performance is amazing, but the underlying data is, well, simple.

A spreadsheet is a Cartesian grid of cells. But the interaction between the cells is a graph, in which some cells contain values, and other cells contain expressions that are dependent upon those values. A hypersheet extends this notion to include the idea that some cells contain, or are driven by, data stored elsewhere (and usually stored using an RDF triple).

So: An Airtable is nothing more than a graph database, backed by Postgres, in which triggers written in a convenient subset of a popular language (let’s call it Javascript) can be written by the community to facilitate the filtering and processing of hypertext content on a cell-by-cell basis.

The trick of a spreadsheet is in extending it conveniently. But there’s already a book that teaches you how to do cutting, copying, and pasting of content. It’s not a fantastic book; it’s too centered on its own interests to abstract spreadsheet notions well, and it’s too concerned with C# (although the F# asides are fascinating). It does cover the issue of relative-and-absolute ranges and how to cut and paste them without losing track of what they refer to, so that’s good enough.

Seriously: That’s about it. I’ve linked to every paper and book you need, every idea, to re-implement Airtable. All it takes is some engineering. And let’s be honest here: if you get the infrastructure for Airtable right, you get how to write Wunderlist, Trello, and Evernote for free.

Now, be fair: I have no idea if Airtable is implemented this way.  For all I know, this facile analysis might give the engineers there one heck of a giggle.  But if I was going to implement something like Airtable, these are the resources with which I would begin: A document store of databases, a graph database of relationships, an abstracted RDBMS (probably Postgres) of content, a decent event manager using push as its paradigm for updating cells (RxJS, using React, or possibly just straight up Om), and a lot of interesting knowledge gleaned from papers on hypersheets and functional spreadsheets, with each cell decorated with an about field that identifies the uniform resource name of the object that cell represents.  Processing could be done client-side or server-side, depending upon the nature of the cell and the frequency with which it needs updating (RSS and Semantic Web objects would be updated less frequently), and it would be technically challenging but not impossible to discern which was which with a proper constraints-based system.

Oh, by the way, if you get the typeof/about/property triple correct for cell representation, you’re basically a few engineering puzzles away from replicating The Grid, as well.

