The little-endian web?

Apr 24th, 2012

Here’s the deal: typed arrays are not fully portable. On most browsers, this code will print 1:

var a1 = new Uint32Array([1]);
var a2 = new Uint8Array(a1.buffer);
console.log(a2[0])

But the typed arrays spec doesn’t specify a byte order. So a browser on a big-endian system (say, a PowerPC-based console like the Xbox 360 or PS3) is allowed to print 0. In short: casting an ArrayBuffer to different types is unportable by default. It’s up to web developers to canonicalize bytes for different architectures.
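
For example, here’s one way an application can feature-test the platform’s byte order, reusing the cast from above (a minimal sketch; isLittleEndian is just an illustrative name):

var isLittleEndian = (function() {
    // Write a known 32-bit value and check which byte comes first.
    var a1 = new Uint32Array([1]);
    var a2 = new Uint8Array(a1.buffer);
    return a2[0] === 1; // true on little-endian, false on big-endian
})();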

Now, we could just require typed arrays to be little-endian, once and for all. After all, almost all platforms are little-endian these days. The few big-endian platforms could automatically reorder bytes on every typed array access. But this would have to be made to work with WebGL, which sends application-generated buffers straight to the GPU. On a big-endian architecture, little-endian-encoded ArrayBuffer data would need to be translated every time it was sent to or read back from the GPU. Technically, this might be possible, but there’s really no evidence that it would have acceptable performance.
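
To get a feel for the cost, here’s a sketch of the kind of translation a big-endian engine would need to do before handing a buffer to the GPU (swap32 is hypothetical, and it already assumes the engine knows the buffer holds 32-bit elements, which in general it doesn’t):

function swap32(buffer) {
    // Byte-swap every 32-bit word in place.
    var bytes = new Uint8Array(buffer);
    for (var i = 0; i < bytes.length; i += 4) {
        var b0 = bytes[i], b1 = bytes[i + 1];
        bytes[i] = bytes[i + 3];
        bytes[i + 1] = bytes[i + 2];
        bytes[i + 2] = b1;
        bytes[i + 3] = b0;
    }
}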

On the other hand, can we really trust that web applications will write portable code? Imagine a hashing algorithm that builds an internal ArrayBuffer and casts it to different types. If the code isn’t written portably, it’ll break on a browser implementing big-endian typed arrays.

This leaves big-endian browsers with a nasty decision: either emulate little-endian typed arrays to protect against unportable application logic, and suffer the complexity and performance costs of translating data back and forth to the GPU, or just hope that not too many web pages break. Or perhaps surface an annoying decision to users: do you want to run this application in fast mode or correct mode?

For now, we should let browser vendors on big-endian systems make that decision, and not force the decision through the spec. If they end up all choosing to emulate little-endian, I’ll be happy to codify that in the standards. As I understand it, TenFourFox can’t support WebGL, so there the best decision is probably to emulate little-endianness. On an Xbox, I would guess WebGL performance would be a higher priority than web sites using internal ArrayBuffers. But I’m not sure. I’d say this is a decision for big-endian browsers to make, but I would greatly welcome their input.

In the meantime, we should do everything we can to make portability more attractive and convenient. For working with I/O, where you need explicit control over endianness, applications can use DataView (there’s a sketch at the end of this post). For heterogeneous data, there’ll be ES6 structs. Finally, I’d like to add a way to give ArrayBuffers and typed arrays an explicit endianness:

var buffer = new ArrayBuffer(1024, "little"); // a little-endian buffer
var a1 = new Uint32Array(buffer);
a1[0] = 1;
var a2 = new Uint8Array(buffer);
a2[0]; // must be 1, regardless of system architecture

With the endianness specified explicitly, you can still easily write portable logic even when casting — without having to canonicalize bytes yourself. Emscripten and Mandreel could benefit from this increased portability, for example, and I think crypto algorithms would as well. I’ll propose this extension to Khronos and TC39, and discuss it with JavaScript engine implementors.
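
For comparison, DataView already gives you explicit control today, one access at a time: every get and set takes a flag choosing little-endian order. A minimal sketch:

var buffer = new ArrayBuffer(4);
var view = new DataView(buffer);
view.setUint32(0, 1, true); // write 1 as little-endian, explicitly
view.getUint8(0);           // 1, on any architecture
view.getUint32(0, true);    // 1, read back as little-endian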

Homoiconicity isn’t the point

Apr 17th, 2012

I’ve never really understood what “homoiconic” is supposed to mean. People often say something like “the syntax uses one of the language’s basic data structures.” That’s a category error: syntax is not a data structure, it’s just a representation of data as text. Or you hear “the syntax of the language is the same as the syntax of its data structures.” But S-expressions don’t “belong” to Lisp; there’s no reason why Perl or Haskell or JavaScript couldn’t have S-expression libraries. And every parser generates a data structure, so if you have a Python parser in Python, then is Python homoiconic? Is JavaScript?

Maybe there’s a more precise way to define homoiconicity, but frankly I think it misses the point. What makes Lisp’s syntax powerful is not the fact that it can be represented as a data structure; it’s that it’s possible to read it without parsing.

Wait, what?

It’s hard to explain these concepts with traditional terminology, because the distinction between reading and parsing simply doesn’t exist for languages without macros.

Parsing vs reading: the compiler’s view

In almost every non-Lispy language ever, the front end of every interpreter and compiler looks pretty much the same:

[figure: the traditional parsing pipeline (source text → parser → AST)]

Take the text, run it through a parser, and you get out an AST. But that’s not how it works when you have macros. You simply can’t produce an AST without expanding macros first. So the front end of a Lispy language usually looks more like:

[figure: the macro pipeline (source text → reader → intermediate syntax tree → macro expander → AST)]

What’s this intermediate syntax tree? It’s an almost entirely superficial understanding of your program: it basically does paren-matching to create a tree representing the surface nesting structure of the text. This is nowhere near an AST, but it’s just enough for the macro expansion system to do its job.
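
To make that concrete, here’s a toy reader in JavaScript (purely illustrative; read and readForm are made-up names). All it does is match parens and build nested arrays:

function read(src) {
    // Tokenize: space out the parens, then split on whitespace.
    var tokens = src.replace(/[()]/g, " $& ").trim().split(/\s+/);
    var pos = 0;
    function readForm() {
        var tok = tokens[pos++];
        if (tok !== "(") return tok; // an atom
        var form = [];
        while (tokens[pos] !== ")")
            form.push(readForm());
        pos++; // consume the ")"
        return form;
    }
    return readForm();
}

read("(print (+ 1 2))"); // => ["print", ["+", "1", "2"]]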

Parsing vs reading: the macro expander’s view

If you see this statement in the middle of a JavaScript program:

for (let key in obj) {
    print(key);
}

you know for sure that it’s a ForInStatement, as defined by the spec (I’m using let because… ES6, that’s why). If you know the grammar of JavaScript, you know the entire structure of the statement. But in Scheme, we could implement for as a macro. When the macro expander encounters:

(for (key obj)
  (print key))

it knows nothing about the contents of the expression. All it knows is the macro definition of for. But that’s all it needs to know! The expander just takes the two subtrees, (key obj) and (print key), and passes them as arguments to the for macro.

Parsing vs reading: the macro’s view

Here’s a simple for macro, written in Racket:

(define-syntax-rule (for (x e1) e2)
  (for-each (λ (x) e2) e1))

This macro works by pattern matching: it expects two sub-trees, the first of which can itself be broken down into two identifier nodes x and e1, and it expands into the for-each expression. So when the expander calls the macro with the above example, the result of expansion is:

(for-each (λ (key) (print key)) obj)
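
In terms of the toy JavaScript reader from earlier, that expansion step is nothing but tree surgery (expandFor is a hypothetical illustration, not how Racket actually implements it):

function expandFor(form) {
    // form = ["for", [x, e1], e2], matching the macro's pattern
    var x = form[1][0], e1 = form[1][1], e2 = form[2];
    return ["for-each", ["λ", [x], e2], e1];
}

expandFor(read("(for (key obj) (print key))"));
// => ["for-each", ["λ", ["key"], ["print", "key"]], "obj"]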

The power of the parenthesis

If you’ve ever wondered why Lisp weirdos are so inexplicably attached to their parentheses, this is what it’s all about. Parentheses make it unambiguous for the expander to understand what the arguments to a macro are, because it’s always clear where the arguments begin and end. It knows this without needing to understand anything about what the macro definition is going to do. Imagine trying to define a macro expander for a language with syntax like JavaScript’s. What should the expander do when it sees:

quux (mumble, flarg) [1, 2, 3] { foo: 3 } grunch /wibble/i

How many arguments does quux take? Is the curly-braced argument a block statement or an object literal? Is the thing at the end an arithmetic expression or a regular expression literal? These are all questions that can’t be answered in JavaScript without knowing your parsing context — and macros obscure the parsing context.

None of this is to say that it’s impossible to design a macro system for languages with non-Lispy syntax. My point is just that the power of Lisp’s (Scheme’s, Racket’s, Clojure’s, …) macros comes not from being somehow tied to a central data structure of the language, but rather from the expander’s ability to break up a macro call into its separate arguments and then let the macro do all the work of parsing those arguments. In other words, homoiconicity isn’t the point; read is.

Synchronous module loading in ES6

Mar 29th, 2012

One of the great features of ES6 modules is the direct style module loading syntax:

import map from "underscore.js";
... map(a, f) ...

This makes it as frictionless as possible to grow or refactor your code into multiple modules, and to pull third-party modules into an existing codebase. It also provides a common module format that can be shared between the browser and JS servers like Node.

But a script written in this direct style has to load its dependencies before it can execute. That is, it’s a synchronous module load. In the context of a script tag, this would make it all too easy to block page rendering on I/O:

<script>
import $ from "jquery.js";
$('myelement').style({ 'background-color': 'yellow' })
</script>

Throwing this syntax into the browser like this would be an invitation to jank. Thanks to insight from Luke Hoban, I think we have the right approach to this for ES6, which is in fact similar to our approach to avoiding turning eval into a synchronous I/O operation.

In previous versions of ECMAScript, there’s only one syntactic category of program that you can evaluate, called Program in the grammar. In ES6, we’ll define a restricted version of the syntax to be used in synchronous settings, which makes it illegal to do synchronous loads. Within a blocking script, the only access to modules is via the dynamic loading API:

<script>
System.load("jquery.js", function($) {
    $('myelement').style({ 'background-color': 'yellow' })
});
</script>

This eliminates the footgun, and all of your modules can themselves use the synchronous loading syntax. For example, if jquery.js wants to use a module — say, a data structure library — it can go ahead and load it synchronously:

// jquery.js
import Stack from "utils.js";
... new Stack() ...

But still, this restriction at the top level loses the convenience of directly importing modules from scripts. The thing is, in an asynchronous context, there’s nothing wrong with doing a synchronous load. So just like the asynchronously loaded jquery.js can use the synchronous syntax, we can also allow it in a defer or async script:

<script async>
import $ from "jquery.js";
$('myelement').style({ 'background-color': 'yellow' })
</script>

This allows the full flexibility and expressiveness of ES6 embedded in HTML, without any hazard of blocking page rendering for I/O.

The eval function for ES6 will work the same way, disallowing synchronous loading syntax in the grammar it recognizes, to prevent turning it into a synchronous API. We’ll also add an asynchronous version of eval that, like script async, recognizes the full grammar.
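
In other words, under the restricted grammar, code like this would simply be rejected by ordinary eval (a sketch of the intent; the exact error is up to the spec):

eval('import $ from "jquery.js"; $("myelement")');
// => SyntaxError: the synchronous loading syntax isn't recognized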

Two years at MoCo

Jan 10th, 2012

If I remember right, today is my two year anniversary working full time at Mozilla. And it works out to about six years of working with Mozilla and TC39. I could stop and get sentimental, but there’s work to do.

Why coroutines won’t work on the web

Dec 14th, 2011

The topic of coroutines (or fibers, or continuations) for JavaScript comes up from time to time, so I figured I’d write down my thoughts on the matter. I admit to having a soft spot for crazy control-flow features like continuations, but they’re unlikely ever to make it into ECMAScript. With good reason.

The big justification for coroutines in JavaScript is non-blocking I/O. As we all know, asynchronous I/O leads to callback APIs, which lead to nested lambdas, which lead to… the pyramid of doom:

range.on("preheat", function() {
    pot.on("boil", function() {
        rice.on("cooked", function() {
            dinner.serve(rice);
        });
    });
});

Whereas, if you look at the README for node-fibers, you’ll see this pleasant-looking example:

Fiber(function() {
    console.log('wait... ' + new Date);
    sleep(1000);
    console.log('ok... ' + new Date);
}).run();

That looks pretty sweet. It’s a synchronous version of setTimeout that doesn’t block the main thread. This seems like a nice combination of the sequential style of synchronous code but with the responsiveness of non-blocking I/O. Why wouldn’t we want something like this in ECMAScript?

Coroutines are almost as pre-emptive as threads

Part of the beauty of JavaScript’s event loop is that there’s a very clear synchronization point for reaching a stable state in your programs: the end of the current turn. You can go ahead and leave things in a funky intermediate state for as long as you like, and as long as you stitch everything back up in time for the next spin of the event loop, no other code can run in the meantime. That means you can be sure that while your object is lying in pieces on the floor, nobody else can poke at it before you put it back together again.

Once you add coroutines, you never know when someone might call yield. Any function you call has the right to pause and resume you whenever it wants, even after any number of spins of the event loop. Now any time you find yourself modifying state, you start worrying that calling a function might interrupt some code you intended to be transactional. Take something as simple as swapping a couple of fields of an object:

var tmp = obj.foo;
obj.foo = obj.bar;
obj.bar = munge(tmp);

What happens if munge does a yield and only resumes your code after a few other events fire? Those events could interact with obj, and they’d see it in this intermediate state where both obj.foo and obj.bar are the same value, because obj.bar hasn’t yet been updated.

We’ve seen this movie before. This is just like Java’s threads, where any time you’re working with state, you have to worry about who might try to touch your data before it reaches a stable point. To be fair, life is actually far worse in Java, where almost every single basic operation of the language can be pre-empted. But still, with coroutines, every function call becomes a potential pre-emption point.

Host frames make coroutines unportable

And then there’s the implementation problem. JavaScript engines all use a stack, so coroutines would have to be able to save a stack on the heap and restore it back onto the stack later. But what if JavaScript code calls into code implemented in the host language (usually C++)? Some engines implement functions like Array.prototype.forEach in C++. How would they handle code like this?

Fiber.run(function() {
    array.forEach(function(x) {
        console.log('wait: ' + x);
        sleep(1000);
        console.log('ok: ' + x);
    });
});

Other languages with coroutines take different approaches. Lua allows implementations to throw an error if user code tries to suspend host activations. This would simply be unportable, since different engines implement different parts of their standard libraries in C++.

The Scheme community tends to demand a lot from their continuations, so they expect functions like for-each and map to be suspended. This could mean either forcing all the standard libraries to be self-hosted, or using more complicated implementation strategies than traditional stacks.

Simply put: browser vendors are not going to do this. Modern JS engines are extraordinary feats of engineering, and rearchitecting their entire stack mechanism is just not realistic. Then when you consider that these changes could hurt performance of ordinary function calls, well… end of discussion.

Shallow coroutines to the rescue

OK, back to the pyramid of doom. It really does kind of suck. I mean, you could name and lift out your functions, but then you break up the sequential flow even worse, and you get a combinatorial explosion of function arguments for all those upvars.

This is why I’m excited about generators. Generators are a lot like coroutines, with one important difference: they only suspend their own function activation. In ES6, yield isn’t a function that anyone can use; it’s a built-in operator that only a generator function can use. With generators, calling a JS function is as benign as it ever was. You never have to worry that a function call might yield and stop you from doing what you were trying to do.
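
Here’s what that looks like with the ES6 function* syntax (a minimal sketch):

function* ticks() {
    var n = 0;
    while (true)
        yield n++; // suspends only ticks' own activation
}

var g = ticks();
g.next().value; // 0
g.next().value; // 1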

But it’s still possible to build an API similar to node-fibers. This is the idea of task.js. The fibers example looks pretty similar in task.js:

Task(function() {
    console.log('wait... ' + new Date);
    yield sleep(1000);
    console.log('ok... ' + new Date);
}).run();

The big difference is that the sleep function doesn’t implicitly yield; instead, it returns a promise. The task then explicitly yields the promise back to the task.js scheduler. When the promise is fulfilled, the scheduler wakes the task back up to continue. Hardly any wordier than node-fibers, but with the benefit that you can always tell when and what you’re suspending.
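
For example, sleep can be as simple as a promise wrapper around setTimeout (a sketch using the now-standard Promise constructor for illustration; task.js shipped with its own promise library):

function sleep(ms) {
    return new Promise(function(resolve) {
        setTimeout(resolve, ms); // fulfill the promise after ms milliseconds
    });
}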

Coroutines no, generators yes

Coroutines are not going to happen in JavaScript. They would break one of the best features of JavaScript: the simplicity of the event loop execution model. And the demands they would place on current engines for portability are simply unrealistic. But generator functions are easy to add to existing engines, they have none of the portability issues of coroutines, and they give you just enough power to write non-blocking I/O in a synchronous style without being “threads lite.”