Will there ever be a Flying Shuttle for Software?

Posted by Elf Sternberg as chat

Fred Brooks once famously wrote that there was no silver bullet for software development. Once a project and its tooling were agreed upon, software development took the time that it took. Brooks’s insight was that software developers spend their days more or less at the limits of their intellectual capacity, and no amount of tooling or management could expand that limit. Since a developer is usually fully engaged with any given task, adding a second developer to that task will actually slow production down: the two developers now have the overhead of communicating the nature of the task to each other, and each may be blocked while waiting for the other to finish a significant subtask.

The term for this is “No Silver Bullet,” and the book by that title has gone on to be a perennial best-seller, usually purchased by harried software developers and left anonymously on the desks of clueless project managers.

Treadle-based handlooms for weaving cloth have existed for almost two thousand years, and their current form is unchanged from the one that emerged in Germanic and Italian forms a thousand years ago. A loom has a frame of vertical threads in two alternating sets; each press of the foot pedal pulls one set apart from the other, and the user strings a spool of thread called a “shuttle” across this horizontally, then cycles the foot pedal to close the lifted threads down, then lift the other set up, giving the weaver that alternating over-under sequence that holds cloth together. Skilled users were known to “throw” the shuttle across the loom field, but cloth could only be woven as wide as one person could reasonably reach.

In 1733, a man named John Kay built a narrow wooden track, put the shuttle into the track, and with a piece of string and a handle, “jerked” the shuttle back and forth across the loom field. A few improvements later, he had made a loom three times as wide as existing looms, and one that could be operated twice as fast. Kay’s “flying shuttle” replaced the thrown shuttle almost instantly, and the industrial revolution had its first major component.

The more I write software, the more I have this sensation that something is very wrong with the way we write software. I listen to developers describe their projects and my overwhelming thought is, “Didn’t someone do that already?” I can’t begin to count the number of times I have implemented the same small algorithms over and over. Consider Stack Overflow, in which a question like “How do I find a substring?” has several different code snippets that dozens of people will now simply cut and paste into their own code. Programmers get paid like princes these days mostly to know these things, glue them together in the right order for our corporate masters, and make them work profitably.

If there’s No Silver Bullet, could there be a Flying Shuttle waiting for us? I don’t have an answer, but I fear that there is: there’s an answer that glues everything from the hardware to the UI together into a meaningful whole, that answers the questions people want answered, and does the things people want done. I suspect it’s already here; we just haven’t identified it for what it is yet.

So, Spectrum IEEE has a “The sky is falling! The sky is falling!” article claiming that in 2016 tech layoffs have been nasty and that in 2017 it’s going to get even nastier. This is one of many articles on this theme, but it’s a little disheartening to see it in Spectrum. Worse, none of the articles I’ve read on this theme list the skills that are going to be out-of-date. Which skills? What disciplines?

In 2008, I was laid off after 8 years at a large company, and I’d been using the same tools for those 8 years. As a front-end developer for dev-ops shops, my skills were woefully out-of-date: We’d been using Sencha (JS) and Webware (PY), with some Python 2 Python-to-C libraries. I knew nothing about what the cool kids were doing. I sat down and in a few days taught myself Django and jQuery; I rebooted my SQL knowledge from my 90s-era experience with Oracle and taught myself the ins and outs of PostgreSQL.

And then, in the bottom of the recession, I took shit contracts that paid very little (in one mistake, nothing) but promised to teach me something. I worked for a Netflix clone startup; I traded my knowledge of video transcoding for the promise of learning AWS. I worked for a genetic engineering startup, trading my knowledge of C++ for the promise of learning Node, Backbone, SMS messaging, and credit card processing; a textbook startup, trading my knowledge of LaTeX for the promise of learning Java; an advertising startup trading my basic Django skills to learn modern unit testing; a security training startup, trading my knowledge of assembly language in order to learn Websockets.

The market improved. I never stopped learning. I gave speeches at Javascript and Python meet-ups. Recruiters sought me out. I’ve been at another big company for four years now.

Will things go to hell in March? I don’t care. I have the one skill that matters.


PacMan doesn’t need AI

Posted by Elf Sternberg as Design, programming

The other day, I was reading through the course syllabus for a second-year AI class, as one does, when I noticed that the assignment for the sixth week was to turn in a working version of PacMan. Which is kind of weird, because the actual algorithm for PacMan involves more or less zero AI. It involves something else, and one of my favorite words: stigmergy.

Alright, so, here’s the algorithm in a nutshell: PacMan is played on a 29-by-26 square grid of cells. Everything else is special effects. There is a clock cycle: every cycle, the characters move from one square of the grid to another. If PacMan and a ghost share the same cell in a cycle, PacMan loses a life. There’s an animation engine running to make it look smoother than it is, but that’s the basic game.

The grid is actually three different grids layered together: One grid constrains movement by providing the walls. One grid tracks the dots that have been eaten. (The actual end-of-round tracking is done with a counter.)

The last grid is the stigmergy grid: every clock cycle, PacMan moves forward in a direction. The cell he just left is given a number: 255. Every clock cycle, the stigmergy grid is scanned for these numbers, and they’re reduced according to some formula until they reach zero. A ghost wandering the maze has a few rules: when it reaches a cell that has more than one neighbor, it chooses a direction based on a formula, and part of that formula includes adding in the stigmergy number of the neighboring cells. Blue ghosts use a reverse strategy; “dead” ghosts use a simple vector-weight strategy to go back to the center room.

In short, the ghosts are following PacMan’s scent, in much the same way ants follow a trail laid down by other ants.
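Here is a minimal Python sketch of the scent-following idea described above. It is not the arcade game's actual code: the decay amount, the random tie-breaking jitter, and every function name are assumptions made for illustration. Only the value 255 and the "follow the scent, or flee it when blue" behavior come from the description.

```python
import random

SCENT_MAX = 255  # value stamped on the cell PacMan just left (from the text)
DECAY = 8        # hypothetical per-cycle decay; the real formula is not given


def leave_scent(scent, cell):
    """Mark the cell PacMan just left with the maximum scent."""
    x, y = cell
    scent[y][x] = SCENT_MAX


def decay_scent(scent):
    """Reduce every nonzero scent value toward zero, once per clock cycle."""
    for row in scent:
        for x, value in enumerate(row):
            if value:
                row[x] = max(0, value - DECAY)


def open_neighbors(walls, cell):
    """Return the adjacent cells that are inside the grid and not walls."""
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [
        (cx, cy)
        for cx, cy in candidates
        if 0 <= cy < len(walls) and 0 <= cx < len(walls[0]) and not walls[cy][cx]
    ]


def choose_ghost_move(walls, scent, cell, rng, frightened=False):
    """Pick a ghost's next cell: chase the strongest scent, or flee it."""
    options = open_neighbors(walls, cell)
    if not options:
        return cell  # boxed in; stay put

    def score(candidate):
        cx, cy = candidate
        # Jitter in [0, 1) only breaks ties between equal scent values.
        value = scent[cy][cx] + rng.random()
        return -value if frightened else value

    return max(options, key=score)
```

The ghost never looks at PacMan's position, only at the trail he left in the grid, which is the whole point of stigmergy.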

There’s also a clock-cycle counter that causes the ghosts to reverse themselves from time to time, but that’s the basic gist of it. Unfortunately, the random number generator is seeded with the same number every level, so it became possible to master the game and play infinitely long. As smooth as the game looks, you actually have half a second of leeway time between moves, which is well within the average video gamer’s skill to master. Ms. PacMan fixed the seeding issue, and the game is significantly harder to play for a long time.

That’s it. You could implement PacMan in a few hundred lines of Javascript and HTML. Some animated CSS using the FLIP trick would be awesome. There’s no magic, and certainly no AI about it.

My latest contribution to the world is Git Lint, a plug-in for git that allows you to pre-configure your linters and syntax checkers, and then run them all at once on only the things you’ve changed. About half of us still live in the command line, and I like being able to set-and-forget tools that make me a better developer.

Here are a few things I’ve learned along the way about Python projects.

1. Use A Project Template

Project templates provide a means to magically produce a lot of the boilerplate you’re going to be producing anyway. I’m fond of Cookiecutter. While Git Lint started life as a single Bash script (later, a Hy script, and now a Python module), at some point I needed much more than just that: I needed documentation and testing. Up-to-date templates provide you with up-to-date tools: Tox, Travis, Sphinx, PyTest, Flake8, Twine, and a Makefile come pre-packaged with Cookiecutter’s base template, and that’s more than enough to launch most projects.

2. Setup.py is a beast

Getting setup.py to conform to my needs was a serious pain in the neck. It still doesn’t work correctly when installing to Mac OSX because the Python libraries and the manual (man pages) are in two different locations. If I’m building a command line tool, I always try to provide man pages. It’s usually the first place I look.

The manual pages also didn’t show up reliably in the build process; I had to force it by adding it explicitly to the manifest, even though it included the docs tree by default.

Setup.py and man pages are NOT friends.

Getting the build to include man pages, which I require for any command-line utility, was truly a pain in the neck, and now every upload to pypi has a manual step where I figure out if I have a man page to deliver or not. It’s truly painful.

3. Sphinx is a pain in the neck — but it’s worth it for Github Pages

Sphinx, the documentation tool for Python, uses RST (reStructured Text), which has just about the worst imaginable syntax for external links I’ve ever wrestled with. Inconsistencies about mixing links and styles drove me out of my mind.

On the other hand, I now have what I consider to be a solid idiom for generating Github Pages (gh-pages) from Sphinx documentation. A branch named “gh-pages” that contains your documentation will automatically be converted into a documentation tree on Github Pages (github.io), and you can see the results for Git Lint. This tree looks completely different from your development trees, so don’t get them confused, merge, or rebase them!

Simple generation of gh-pages

If you check out the Makefile, you’ll see the idiom clearly: it checks out a complete copy of itself into a sub directory, builds the documentation, copies it back to the parent directory, and then fixes all the links (because Github Pages really doesn’t like underscores in file and directory names). There’s irony in that it uses Perl to do the fixing– it’s just what I knew, it was fast, and I always have both Python and Perl installed.

This, by the way, points to another issue: always use the same virtual environment wherever you work. My Macbook and my Linux box had different versions of Sphinx on them, and the resulting generated pages were different on both boxes, making git report “everything’s changed!” when I went to fix a single typo in a link somewhere (I told you I hated those links).

It might be worth it to bag sphinx as a docker feature, or ensure that the version is locked down in your virtual environment.

4. Tox is amazing

Working with tox allowed me to be reassured that my code ran correctly every time, the first time. It did not catch other critical issues with installation, like the man page issue mentioned above, and that was painful to manage, but it did everything else.

5. Git Porcelain Zero is ridiculous

If you’re not familiar with git --porcelain, it’s an argument that many of the status-oriented git commands have that changes the output to a stable, machine-readable form meant to be consumed by other tools. Git Lint uses it a lot.

But the git --porcelain command doesn’t have any other guarantees: it doesn’t guarantee filename sanity, or unicode compatibility. For that, there’s git --porcelain -z, which produces a report in which everything is null-terminated so weird filenames can be consumed. This would be fine if the output were columnar, but it’s not always. The most egregious example I found was git status --porcelain -z, which is usually three columns, but if there’s an ‘R’ in the first column, then it’s four columns– ‘R’ means the operation is ‘rename,’ and the fourth column is the original name.

Since the -z argument makes both the cell and the line terminators null, you have to parse positionally. And if you’re parsing positionally and the number of positions can change, well, that’s context-sensitive parsing. And it’s ridiculous to have to put a context-sensitive parser into a small project like this. There was only one exceptional case here, so it’s a small issue, but inconsistencies like this really bother me.
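To make the positional parsing concrete, here is a small Python sketch of a parser for the output of git status --porcelain -z. It is not Git Lint's actual code, and the function names are made up for illustration. It relies on the documented format: each entry is a two-character status, a space, and a path, and a rename or copy entry is followed by one extra NUL-terminated field holding the original path.

```python
import subprocess


def parse_status_z(raw):
    """Parse bytes from `git status --porcelain -z`.

    Returns a list of (status, path, original_path) tuples. Paths stay as
    bytes because filenames are not guaranteed to be valid UTF-8.
    """
    fields = raw.split(b"\0")
    entries = []
    i = 0
    while i < len(fields):
        field = fields[i]
        i += 1
        if not field:
            continue  # the trailing NUL leaves an empty final field
        status = field[:2].decode("ascii")
        path = field[3:]
        original = None
        # The context-sensitive part: renames (and copies) consume one
        # extra field, which holds the path the file used to have.
        if ("R" in status or "C" in status) and i < len(fields):
            original = fields[i]
            i += 1
        entries.append((status, path, original))
    return entries


def changed_files(repo="."):
    """Run git in `repo` and return the parsed status entries."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "-z"],
        cwd=repo,
        check=True,
        stdout=subprocess.PIPE,
    )
    return parse_status_z(result.stdout)
```

The parser cannot simply split the output into fixed-size records; it has to read the status of each entry before it knows how many fields that entry occupies.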

6. Git lint is amazing

Now that I’ve actually used my little beastie, I can’t tell you how happy I am with it. As a full-stack developer with Python, C++, XML, HTML, CSS, Javascript, and some in-house stuff I can’t discuss, being able to check the entire toolchain without caring about what I’m checking, just set and forget, makes me extremely happy.

All in all, this was one of those projects where I learned a lot about everything: git, python, unit testing, documentation, github, jekyll, reStructuredText, Cookiecutter, PyPi. All this knowledge poured into one small project.

There’s been a bit of chatter on the topic of being an old geek. As most people here know, code quality in the small is one of my favorite topics, and I realized after reading an article this morning that the two topics are actually significantly linked.

Kent Dodds’s Why Users Care About How You Write Code hammered home something that took me a long time to learn. A lifetime, so to speak. There’s a mantra that I’ve heard inside every enterprise I’ve ever been in: “Customers don’t care what language you use or your company’s code style guides or your build system. They care about the experience.” But Dodds’s observation is this: if your system has poor abstractions, or is abstracted in the wrong way, then certain requests for future adaptations of your code are going to be difficult. Dodds’s case is that “the experience of the user” is more than just what happens when they sit down with your software: it’s about how fast you can innovate without risking the stability, security, or reliability of the product. It’s about having a relationship with the user that spans months or years, all the while your software is growing, adapting to new missions, and improving.

Knowing that these pitfalls exist is something that only comes with experience. Recognizing technical debt before it grows into a monster that eats up your development cycles is something that comes from doing this job for a long time.

Every startup that says greybeard geeks aren’t a “cultural fit” is buying into the idea that it can outrun technical debt. That it doesn’t need to mind its long-term, multi-release relationship with its users. Or that technical debt is something to be managed, something middle-management is going to deal with, and you can hire young developers and burn them out, and it’ll all be fine.

It won’t be fine.  Experience matters.  And more to the point, experience matters most to your customers, because without it, the experience your customers will have with your software will be as inconsistent and callow as the developers you hire.


The limits of my abilities…

Posted by Elf Sternberg as chat

In the cellar was a tunnel scarcely ten yards long, that had taken him a week to dig. I could have dug that much in a day, and I suddenly had my first inkling of the gulf between his dreams and his powers. — H.G. Wells, The War of the Worlds

The past week, work has been slow so I’ve had a little time to work on updating my git pre-commit hook. For a while, I actually had a “hack” wrapped around it to run it from the command line, so I could see what was failing before I tried to commit it. I realized as I was working that I was basically writing a lint hook, and have since changed the project’s name to reflect that: git-lint.

The problem is that working on projects like this and polyloader has taught me that the gulf between my dreams and my powers is enormous. It took me a week to refactor pre-commit into something with actual command line arguments, an external configuration file, and policies to implement, as well as adding the ‘dry run’ and ‘sort order’ capabilities– things the pre-commit version doesn’t really need. Obviously, I’d like to know and do a lot more with my non-professional development life. But finding the time is hard, and frankly, when I’m done working on code at work I really don’t have the brains left to write, draw, or code at home.

I remain committed to a few basic ideas: that there’s too much code in the world; that 99% of what we do is translating from formats that are human-comfortable to those that are machine-ready and back; that we can and should make as much of that work declarative; and that even interpreted languages should invest heavily in pre-processors to remove new scopes where none are needed, inline where possible, and exploit the CPU to the best of any human ability.

I know, I’m not helping by writing more of it.

I need to get rich enough to stay home and hack all day. That’s the answer. And I do; just ask my long-suffering wife, who bemoans my willingness to spend all Saturday in front of the computer, geeking out.

University computer science libraries have fallen into a sad and tragic state. I went by the University of Washington Engineering Library. When I was at Isilon, that library was a kind of miracle; if we needed to know anything interesting going on in the world of data management, you could go to the library and find a raftload of interesting papers, digest them over the weekend, and be ready with some new trick for Distributed Dynamic RAID by Monday. It was a thrilling place to enter.

And you could make photocopies of the really interesting stuff.

I went in there recently and discovered that the stacks haven’t changed. I mean that literally. Most of the interesting CS journals have moved entirely on-line, and there wasn’t a single collected journal available dated after 2007. There was a wall labeled “Computer Manuals” that had books covering the initial industrial release of SQL. There was a general damp, fusty smell to everything.

There was one lone machine against the wall where you could survey and, if you had the time, read all the papers the world had waiting, tens of thousands of articles, conference submissions, books, precis, even patents. But you couldn’t print anything out and, since I’m no longer a student, I couldn’t mail copies to myself.

There’s more interesting stuff happening in the world now than there was a decade ago, but the academic CS journals are working ever harder to lock it up and “protect” it from the prying eyes of industry. And that’s a damned shame.

I can’t remember where I found it, but there was a brilliant explanation of how functional code maps value. Remember, in a functional program, the basic notation is x → y, that is, for every function, it maps value x to another value y. Things like map() map an array to another array, while reduce() maps a single thing (an array) to another single thing (a value). How does functional programming encode other things?

Well, there’s

x → y
x is mapped to y
x → y∪E
x is mapped to y or Error (Maybe)
x → P(y)
x is mapped to all possible values of y (Random Number Generators)
x → (S → y ⨯ S)
x is mapped to a function that takes a state and returns a value and a new state (State)
x → Σy
x is mapped to the set of all real-world consequences (IO)

The other day I realized that there’s one missing from this list:

x → ♢y
x is mapped to y eventually (Promises)

I’m not sure what to do with this knowledge, but it’s fun to realize I actually knew one more thing than my teacher.  Note that the first case, x → y, really does cover all sum (union) and product (struct) types, which tells me that the ML-style languages’ internal type discrimination features are orthogonal to their encapsulation of non-linear mappings.

The really weird thing is to realize that the last four are all order-dependent.  They’re all about making sure things happen in the correct sorted order (and temporal order, if that matters).  That leads me to think more about compiler design…
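As a rough illustration, most of the shapes in that list can be written down as Python type hints. This is only a sketch: every function name here is invented, Optional stands in for "y or Error," a set of candidate results stands in for P(y), and the IO case is left out because Python's type hints do not track side effects.

```python
import asyncio
from typing import Callable, Optional, Set, Tuple


# x → y: a plain function from one value to another.
def double(x: int) -> int:
    return x * 2


# x → y ∪ E: either a result or the absence of one (Maybe).
def safe_div(x: int, divisor: int) -> Optional[float]:
    return None if divisor == 0 else x / divisor


# x → P(y): every value the answer could take.
def square_roots(x: float) -> Set[float]:
    if x < 0:
        return set()
    root = x ** 0.5
    return {root, -root}


# x → (S → y ⨯ S): a function that still needs a state, and hands back
# a value together with the next state (State).
def number_item(x: str) -> Callable[[int], Tuple[str, int]]:
    def run(counter: int) -> Tuple[str, int]:
        return f"{counter}: {x}", counter + 1

    return run


# x → ♢y: calling this returns an awaitable that yields y eventually (Promises).
async def eventually_increment(x: int) -> int:
    await asyncio.sleep(0)  # give control back to the event loop once
    return x + 1
```

For example, number_item("loom")(1) returns the labeled value along with the counter to hand to the next call, which is where the ordering dependence shows up: the second call cannot run until the first has produced its state.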


Menders, Makers, Mentors

Posted by Elf Sternberg as chat

Andrea Goulet is giving me an existential crisis. The CEO of a software development consultation shop, she recently wrote an article called Menders vs. Makers, and something happened this week that makes me think, maybe I’m in the wrong line of work. I’m starting to suspect I’m a mender in a business that only values makers.

This week, I was working on a code base that provided a hierarchical tag editor for an inventory system. I had recently added a new feature that made it possible to see individual elements of the tag system on the Collection page; you no longer had to go visit a single object to see if it had, for example, a location tag; you could just say on the Collection page, “Show me all the objects that have a location tag, and add a new column, location.”

Now that we were able to see the tags, a new problem was found: it wasn’t possible to delete tags. Odd nobody had noticed that before. Since I was the last person in that code base, it was my duty to fix it. Down into the legacy code I went.

The tagging code was, well, intermingled. Validating the tags, determining the changes between the version on the client and the version on the server, writing those changes back, were all in a single gigantic Backbone sync method involving empty arrays, for loops, and concat methods. I spent about four hours, during which I:

  • Replaced all for loops with map / reduce / filter
  • Separated the model validation into its own method
  • Used Underscore’s intersection / union / difference functions to create instruction sets for deleting and adding to the tag system
  • Used Backbone’s set([_], (void 0), {unset: true}) method to delete the tags, rather than hammer the event bus with a series of change events in an each loop.

I struggled a lot to make sure I was using names that explained what each thing did.

In short, I did with my code what I did with my writing: try to make every line a pleasure to read, something that told a story about what was happening and what was going to happen next. I hope when someone sees overlappingTags = _.intersection(newTags, restrictedTagNames), it’s obvious what’s happening, and it should create anticipation that soon there will be a line that checks to see if overlappingTags has anything in it and, if it does, reports an error with the offending tags.
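The set-based approach can be sketched in a few lines. The original code was JavaScript using Underscore and Backbone; the version below swaps in Python's built-in sets to show the same idea, and the function name, argument names, and return shape are all hypothetical.

```python
def plan_tag_changes(server_tags, client_tags, restricted_tag_names):
    """Work out which tags to add and delete so the server matches the client.

    Raises ValueError if the client is trying to use a restricted tag name.
    """
    server = set(server_tags)
    client = set(client_tags)

    # Validation first: report every offending tag at once.
    overlapping_tags = client & set(restricted_tag_names)
    if overlapping_tags:
        raise ValueError(
            "restricted tags: " + ", ".join(sorted(overlapping_tags))
        )

    return {
        "add": sorted(client - server),     # on the client, not yet on the server
        "delete": sorted(server - client),  # on the server, removed on the client
        "keep": sorted(client & server),    # unchanged
    }
```

Calling plan_tag_changes(["location", "color"], ["color", "size"], ["owner"]) yields one tag to add (size), one to delete (location), and one to keep (color), with no loops or accumulator arrays in sight.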

I’ve always had fun doing stuff like that, turning unreadable mash into clarity. Even my recent bragging project, Polyloader, is actually a fix for the “All Python on the filesystem ends in .py” bug that sorta firewalls Python syntax from the rest of the language universe.

I’ve found this industry doesn’t really like menders. Code editors, people who go in after the fact and apply measures both aesthetic and qualitative to the code they see, are often seen as nothing but agency overhead by managers.

On the other hand, I’ve yet to meet another developer who resented menders. They like menders; they want to learn from menders how to make code better. Menders tend to be older, tend to know more, tend to be broadly learned and strongly opinionated. Nothing “just gets thrown there.” It has to be fixed, it has to work, it has to be right. And I’ve yet to meet a software developer who didn’t want to get it right. Often, they just don’t know how, or nobody’s ever told them how.

Let’s show them how.


Programmers need a class in aesthetics.

Posted by Elf Sternberg as chat

Sometimes it’s a little hilarious to read the back-and-forth of academics. My favorite is this exchange from Roman R. Redziejowski and Brian Ford over packrat parsing. Redziejowski writes

PEG is not good as a language specification tool. The most basic property of a specification is that one can clearly see what it specifies. And this is, unfortunately, not true for PEG.

To which Ford responds,

Such permissiveness can create unexpected syntactic subtleties, of course, and caution and good taste are in order: a powerful syntax description paradigm also means more rope for the careless language designer to hang himself with.

No points for complaining that Ford ends his sentence with a preposition.

This exchange highlights an issue in the programming language community that stands out for me. There’s a debate raging between two camps, with Google Go at one pole and Haskell at the other. Google Go is fundamentally an ugly language, one the designers admit up front is meant to make mediocre programmers productive, to constrain them from hurting themselves while making them capable of producing working code. And while it’s fine for that, consider the Microsoft “wizards” of the mid-1990s that pumped out huge blocks of C++ that nobody, not even the template designers, could understand; when it comes to Go, that’s where we’re headed. On the other hand, Haskell is fundamentally a beautiful language that’s really, really hard to understand; you have to immerse yourself in decisions where you, yourself describe the constraints with precision, with care, with taste.

Ira Glass has a speech, On Storytelling, in which he says, about being creative,

We get into it because we have good taste, but there’s like a gap.

The first couple years that you’re making stuff, what you’re making isn’t so good. It’s trying to be good, it has ambition to be good, but it’s not quite that good.

But your taste, the thing that got you into the game, your taste is still killer. And your taste is so good that you can tell that what you’re making is kind of a disappointment to you, you know what I mean?

The thing is, this is true of storytelling, of drawing, of any creative endeavor. A lot of programmers don’t get into programming because they view it as a creative endeavor. They view it as puzzle solving. They view it as engineering. They view it as a way to make money fast.

They have no taste.

Often, they don’t want to have taste. They want to get the job done and get paid. “Taste” slows them down and gets in the way. Aesthetic decisions about code layout and arrangement, they believe, are irrelevant to getting the job done.

This isn’t true, of course; Tasteless Go is still as unmaintainable as tasteless C++. It’s possible to write aesthetically horrifying Haskell. Let’s not even talk about Perl.

I believe this is the fundamental dividing line between Go, C, and C++ on the one side, and Rust, Clojure, and Haskell on the other. The whole point of Go is to make programmers with no interest in taste or aesthetics write programs that work. Maintainability is secondary.

Which goes back to my tweet above. Java and Go programmers want to write the first kind. Haskell and Lisp programmers and their descendants love to write the second type. But my experience with reading and writing in a variety of languages convinces me we frequently end up at the third with no help for it.

The solution is to teach aesthetics. To teach people that readability and maintainability matter more than just getting the job done.  That if it doesn’t make you feel good the day after you wrote it, re-write it.

After all, sometimes your code will live much longer than you expect.
