The Death’s Head Semantic In Python: Is It Really Necessary?

Posted by Elf Sternberg as programming, python, Rust

I’ve been thinking a lot about where the death’s head symbol, ☠, appears in the Python semantic analysis. The Python language, underneath all the churn and symbols, is only about 40 semantics in size (see: Python, The Full Monty), and most of those are fairly well-defined.

The problem lies in this simple example:

def f(y):
    if y > 9:
        x = True
    return x

This is where the ☠ symbol appears in the semantic specification. For the given environment Γ, the value of x is bool | ☠, meaning that it’s possible this function terminates the program. If you pass a value of 9 or less, x is never bound, and the return statement raises an UnboundLocalError that kills the program unless something upstream catches it.
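To make the failure concrete, here is a minimal sketch that runs the function above on both sides of the branch. The wrapper `call_f` is a hypothetical helper added only for illustration; it reports the exception's name instead of letting the program die.

```python
def f(y):
    if y > 9:
        x = True
    return x  # x is unbound whenever the branch above was skipped


def call_f(y):
    """Return f(y), or the name of the exception it raised (illustrative helper)."""
    try:
        return f(y)
    except UnboundLocalError as exc:
        return type(exc).__name__


print(call_f(10))  # True
print(call_f(3))   # UnboundLocalError
```

UnboundLocalError is a subclass of NameError, so a handler for either one catches it.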

This particular case is easy to identify. But Python’s class slots are stringly defined, and in some cases indistinguishable from hash tables. It’s entirely possible to pass an object to a function, have the function manipulate the object’s fields, and then return to the caller an object with unexpectedly missing fields, which trigger the ☠ semantic. This becomes even more problematic when we consider that those manipulations may come from tainted external data.
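The object case can be sketched the same way. Everything named here (`Account`, `scrub`, `read_balance`, `MISSING`) is hypothetical and exists only to illustrate the point: attribute names arrive as plain strings, possibly from outside the program, and the caller gets back an object that no longer has the field it expects.

```python
MISSING = object()  # stand-in for the death's head in this sketch


class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


def scrub(obj, field_names):
    """Delete every attribute named in field_names (strings, possibly external)."""
    for name in field_names:
        if hasattr(obj, name):
            delattr(obj, name)
    return obj


def read_balance(account):
    """Return account.balance, or MISSING if the attribute has vanished."""
    try:
        return account.balance
    except AttributeError:
        return MISSING


acct = Account("elf", 100)
before = read_balance(acct)       # 100
scrub(acct, ["balance"])          # the list could just as easily come from a request
after = read_balance(acct)        # MISSING
```

Nothing in the signature of `scrub` tells the caller that the object may come back with fewer fields than it went in with.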

The problem is that literally all values in Python come with an implicit v|☠ as their bound value when a scope is entered. It is impossible for a Python program to be provably correct. The external source of names is a significant problem, as unlike Perl, Python lacks a language-based mechanism for tracking tainted data. This has led to some fantastic magic, such as Django’s ORM, but it also leads to fragile coding practices.

Python manages this with a robust try/except mechanism that developers learn to just wrap around broken functions. That seems to be “good enough,” and certainly Python’s popularity is a sign that Python got something right. I just can’t help but feel that there has to be a way to get one step further: to secure Python against its ☠ deathwish and retain most of Python’s flexibility. I wonder how many of Python’s 100 most popular libraries are dependent upon the v|☠ behavior, and how much of that can be identified and automatically fixed by static analysis.
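As a rough idea of what the easiest part of that static analysis might look like, here is a deliberately crude sketch built on the standard library’s ast module. The function `maybe_unbound` is a hypothetical name, not an existing tool. It flags names that are read in a function but only ever assigned inside a nested statement (a branch, loop, or try block). It ignores statement order, keyword-only and star arguments, and whether every branch assigns the name, so it will produce false positives and miss real cases.

```python
import ast


def maybe_unbound(source):
    """Map each function name to the names it reads but only binds in nested blocks."""
    report = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        always = {a.arg for a in func.args.args}  # positional parameters are always bound
        sometimes = set()                         # bound only inside if/for/while/try/...
        loaded = set()                            # names that are read anywhere
        for stmt in func.body:
            top_level = isinstance(stmt, ast.Assign)
            for node in ast.walk(stmt):
                if not isinstance(node, ast.Name):
                    continue
                if isinstance(node.ctx, ast.Store):
                    (always if top_level else sometimes).add(node.id)
                elif isinstance(node.ctx, ast.Load):
                    loaded.add(node.id)
        suspects = (sometimes - always) & loaded
        if suspects:
            report[func.name] = sorted(suspects)
    return report


SAMPLE = '''
def f(y):
    if y > 9:
        x = True
    return x

def g(y):
    x = False
    if y > 9:
        x = True
    return x
'''

print(maybe_unbound(SAMPLE))  # {'f': ['x']}
```

The version of the function with a default assignment, `g`, passes cleanly, which hints at what an automatic fix could look like for the simple cases. The stringly-named attribute case is much harder, since the names are not visible in the source at all.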

I’m still deep into learning Rust, but I’m going to come out of it sometime soon. It’s been really slow, and I’ve discovered that I can’t let it go for more than a day or two or what I’ve learned really starts to fade. And it’s not something I get paid to work on, so it’s really hard to find the time, even a little, to refresh the memory cells and keep it from fading.

My objectives at this point are pretty far out. Once I get a couple of Rust projects, minor ones, under my belt, I’m going to, as the expression goes, Go For It: I’m going to write my goddamn Lisp. I’m going to follow Thorsten Ball’s Writing an Interpreter In Go, only I’m going to write it in Rust. I’m going to use Matt Might’s Parsing with Derivatives for the front end, Hy’s keywords and symbols as my starting point for syntax, The Full Monty as my core semantics, and maybe a real-time garbage collector just for fun. Then I’m going to start taking Python semantics away and see just how much of the Python standard library works without requiring the death’s head, and how much performance I can tweak out of Python when I don’t have to live with it.

And then, just to make sure that at least one tickmark on the cruel Language Design Checklist isn’t ticked, I’ve got a few things I intend to write in it. Just to see how far I can get. Yeah, it’s gonna be a PLOT (Programming Language for Old Timers), but so what? It has to be more fun than hacking Go.

February 2018