What's missing from NodeJS

Every now and then I find myself thinking about this. What got me to write it all down was this thread on the mailing list. Some guy who enjoys pain took the trouble to write an E4X extension to Node (it really should be an extension to V8, but never mind that). Then another guy replied:

“We aren't interested in language features, we're building a platform and not concerning ourselves with the language and vm allows us to build and iterate on this platform significantly faster than if we were building a language.”  (Mikeal Rogers)

“We aren't interested in language features”! I'm not personally a fan of E4X and don't care much about having that in the language, but I do think that JavaScript needs to evolve. When I say “needs” I mean right now: it's not good enough.

How could you not be interested in language features? After all, you write in that language, and it's the single most important thing once you get past accidental complexity.

First, the good news

As I re-read this I realize it might sound like I'm bashing Node. That's far from my intention. Node is a great thing and I'm just trying to analyze, objectively, what its current problems are. So before going to the bad news, let's summarize the good news:

  • It's incredibly fast. The V8 team did a hell of a job. I was baffled when I noticed that my UglifyJS (which I also ported to Common Lisp) runs faster in Node than in SBCL. I think it's a good benchmark (I mean, I hate “hello world”-type statistics). I'm pretty sure the CL version could be improved considerably, but you know, the JS version is faster with no additional effort. V8 is a race car.

  • It has a small and clean core API. No bells and whistles, just what you need. Writing a socket server or client is a snap. You have access to most POSIX functions, and most of them have both asynchronous and synchronous versions. There's a “crypto” module which covers all your cryptographic needs. There's even a module system (CommonJS) which I'm not entirely happy with, but does the job. And decent documentation that fits on one page.

  • It has great momentum. New recruits are jumping in every day and this is great, especially if they come from PHP or Java, because they are introduced to important features from functional languages. I still remember the day when I understood closures—I suddenly felt a hundred times more powerful! Like the hacker in me was dormant for years and just woke up ready to deal with problems I couldn't think of before.

    It's important that programmers get exposed to these concepts. Many don't even try advanced functional languages simply because they aren't mainstream. JavaScript can help spread some essential knowledge, and NodeJS+V8 make a great vehicle.

So no, I'm not bashing Node/JavaScript. Au contraire, I have some cool stuff in the works for it. That being said, let's now look at what's bad about it. Note that most of the following relates to the language, not to Node itself, and that's precisely why the reply I quoted above seems idiotic.

Accidental complexity

I took a look at the official list of NodeJS modules and it's not surprising to find a gazillion Web frameworks and tons of libs to ease “asynchronous programming”. Also there are like 28.5 Redis clients (the 0.5 is mine, fortunately not listed there) and lots of “Redis clones” which provide no more value than a plain hash table.

Some of those projects were written for fun, and others for learning. But many, probably, were written out of necessity. Now when you see a dozen “control flow” libraries that deal with “asynchronous programming” you begin to wonder if this is a serious problem. And it is. If you want to fetch 3 values from a database and print their sum, you need to do something like this:

get_val("foo", function(foo){
    get_val("bar", function(bar){
        get_val("baz", function(baz){
            print(foo + bar + baz);
        });
    });
});

The above code looks wrong. It works just fine and it's even fast, but it's not how you want to write it. If your sanity is like mine, you'd prefer to write it like this:

print(get_val("foo") + get_val("bar") + get_val("baz"));

or even this:

function add(a, b) { return a + b; }

function add_values(names) {
    return names.map(get_val).reduce(add, 0);
}

print( add_values([ "foo", "bar", "baz" ]) );

This version is a bit lengthier, but more generic. Let me try to write the generic version in asynchronous style too, so that we get an idea of how evil this stuff is:

// we're still gonna use that
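function add(a, b) { return a + b; }

function add_values(names, next) {
    var count = names.length;
    var pending = count; // callbacks that haven't returned yet
    var results = [];
    var i = 0;
    // each callback closes over its own i, so results stay in order
    while (i < count) get_val(names[i], (function(i){
        return function(ret) {
            results[i] = ret;
            // counting down a separate variable keeps the loop bound intact
            if (--pending == 0)
                next(results.reduce(add, 0));
        };
    })(i++));
}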

add_values([ "foo", "bar", "baz" ], print);

Would anyone like to write code like this? It's definitely not impossible (in fact, it might even give you a feeling of “gosh, I'm so smart!”) but hold on a sec: after all, you only had to add a bunch of values.

This is one serious type of “accidental complexity” that Noders have to deal with. It's “accidental” because it's not related to the problem at hand; it's just required to make the tool happy. They might need to write a simple blog engine, or a complex accounting system, but either way they have to deal with this issue all the way through, because you see, in Node, everything has to be asynchronous. So once you get past the initial hype you understand that, even if it might result in a more scalable system (which I doubt, if we're not talking about “hello world” benchmarks), implementing your stuff in this style will be quite painful. Then you start writing yet another control flow abstraction, only to realize later that you can't really abstract this out.
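For illustration, here is a minimal sketch of the kind of helper those “control flow” libraries tend to converge on. Everything in it is an assumption made for the example: `fetch_all` is a hypothetical name, and `get_val` is faked with an in-memory table and `setTimeout` rather than a real database client.

```javascript
// Hypothetical stand-in for a database lookup: answers asynchronously
// from an in-memory table. Not a real client API.
var TABLE = { foo: 1, bar: 2, baz: 3 };
function get_val(name, callback) {
    setTimeout(function () { callback(TABLE[name]); }, 0);
}

// Hypothetical helper: start fetch(name, callback) for every name at once
// and pass the results, in the original order, to next() when the last
// callback has returned.
function fetch_all(names, fetch, next) {
    var pending = names.length;
    var results = new Array(names.length);
    if (pending === 0) return next(results); // nothing to wait for
    names.forEach(function (name, i) {
        fetch(name, function (value) {
            results[i] = value;
            if (--pending === 0) next(results);
        });
    });
}

function add(a, b) { return a + b; }

fetch_all(["foo", "bar", "baz"], get_val, function (values) {
    console.log(values.reduce(add, 0)); // prints 6
});
```

The bookkeeping moves into the helper, but the caller still has to hand over a callback, so the continuation-passing shape of the program does not go away.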

No Macros

If JavaScript had macros, at least, you could work around this accidental complexity in a more elegant way. Provided we had macros we could for example write the first version like this:

get_val(foo,
  get_val(bar,
    get_val(baz,
      print(foo + bar + baz))));

[ Incidentally, do the parens remind you of something? ;-) ]

get_val would be a macro that receives as its first argument the name of the variable to fetch from the DB (it can be a symbol, no need to quote it) and as its second a statement to execute after fetching the value, taking care to define that name as a local variable first. (That statement could be a block containing multiple statements, if needed.)

That would become bearable. Macros write code for you. They are a very good way to hide various types of accidental complexity with no loss in performance and with a clear gain in readability and maintainability. Having macros, you could add support for E4X without even touching the core.
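To make “macros write code for you” concrete, here is a rough sketch of the expansion such a get_val macro would have to perform. It is not a macro system: a real one would transform the syntax tree at compile time, while this hypothetical `expand_get_val` function just pastes strings together. The name and the string-based approach are assumptions for the sake of the example.

```javascript
// Hypothetical expander: given the variable names and the final statement,
// produce the nested-callback source that the macro form would stand for.
function expand_get_val(names, body) {
    // Build from the innermost statement outwards.
    return names.reduceRight(function (inner, name) {
        return 'get_val("' + name + '", function(' + name + '){ ' + inner + ' });';
    }, body);
}

var source = expand_get_val(["foo", "bar", "baz"], "print(foo + bar + baz);");
console.log(source);

// The generated text is ordinary JavaScript, so it can be compiled and run.
// Both arguments below are stand-ins: a synchronous fake lookup and a
// print function built on console.log.
var run = new Function("get_val", "print", source);
run(
    function (name, callback) { callback({ foo: 1, bar: 2, baz: 3 }[name]); },
    function (value) { console.log(value); } // prints 6
);
```

The generated text is the same pyramid of callbacks as the first example; the difference is that nobody has to write it or read it by hand.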

I've spent 2-3 weekends trying to write a macro system on top of the JS parser that I have in UglifyJS, but finally I gave up. The syntax of JavaScript is too complicated for this, and my time is quite limited.

No Threads

Multi-threaded programming is more rewarding than people seem to think. Even if you reshape your brain so that continuation-passing everywhere starts making sense, every now and then you might need to solve a “hard” problem. Like, for example, rendering a template to HTML. That's not quite a “hard” problem, but it keeps the CPU busy. So while the CPU is busy running a single template, no other clients are served, and this is true no matter what kind of quad-core CPU you might have, because stuff in V8 happens in a single thread.
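The effect is easy to reproduce. In the sketch below, `render_template` is a hypothetical stand-in that simply burns CPU for a given time, and the timer callback plays the role of another client waiting to be served; neither is taken from a real application.

```javascript
// Stand-in for CPU-bound work such as rendering a template:
// spins for roughly `ms` milliseconds without yielding.
function render_template(ms) {
    var end = Date.now() + ms;
    while (Date.now() < end) { /* burn CPU */ }
}

var start = Date.now();
var served = false;

// Stand-in for "another client": a callback that is ready to run right away.
setTimeout(function () {
    served = true;
    console.log("other client served after " + (Date.now() - start) + " ms");
}, 0);

render_template(200);

// The callback cannot run until this script returns to the event loop.
console.log("served while rendering? " + served); // prints "served while rendering? false"
```

The timer was due immediately, yet it only fires once the busy loop has finished, which is what every other connection experiences while a single request hogs the thread.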

Practically you have a single option here: execute multiple Node processes so that when one is busy, another one can step in to take an incoming connection. If you can hack in C++, you could extend Node to use threads and launch multiple VM-s (this seems to be allowed in V8; what is not allowed is to have multiple threads doing stuff in the same VM).

Separate VM-s are quite like separate processes: they can't share anything, not even global variables, so there isn't much benefit. I suppose you could write code to pass things around between VM-s, since you can control everything in C++, but you'd have to be very adventurous. Basically as long as this thing is not in the core, even if someone writes it, nobody will use it because there is a high risk that an upgrade will break it.

Why I don't use Node

This time last year I was using Perl for my server-side needs. I still do (I mean, even this very website is served by Perl) but I no longer program in Perl (it's just unfortunate that this old blog code is Good Enough that I don't have any incentive to write another one yet). Had I tried Node back then, maybe I would have loved it. JavaScript was always high on my "love" languages list.

Many times I've been thinking about replacing Perl. I looked seriously at Ruby and Python a few times. They seem to be more pleasant to work with than Perl, but after some study I decided that the benefits don't justify the cost.

But then I tried Lisp.

Lisp has macros and threads (well, threads aren't, technically, related to Lisp, but most serious implementations have them) so the two issues I mentioned above don't exist. On the topic of “accidental complexity” Lisp seems to do better than any other language out there. I could almost say that there is no such thing. The language works for me, rather than against me. I don't need to fight it. When there are problems, I manipulate the language to make it more appropriate for dealing with them; everything fits together so nicely that in the end, it seems like Lisp has been designed for my problem, rather than my problem reshaped to fit Lisp. (Yeah, Graham said this first).

I feel kinda small when I think about the sheer number of features that these guys stuffed into a specification almost 20 years ago. Today, most other languages don't even come close. This page exemplifies some of the key Lisp features.

People moved from assembly language to C for a good reason, and then they moved from C to Perl or Python for the same good reason: it's way cheaper and easier to add horsepower than to add manpower. What many don't realize is that these are small, incremental steps closer to Lisp. But Lisp was here all along. Those who tried it and were serious enough to get past the initial “this stuff is weird” feeling usually never look back.

I guess I've hit the point of no return. I don't want to think of working in a weaker language now. I do a lot of JavaScript, of course, but I wish it had at least two features that I have in Lisp, and those are macros and threads.

And this guy says they're not interested in language features.

Comments

  • By: Nick Fitzgerald | Nov 23 (22:09) 2010 | Parenscript?

    Just making sure: you are aware of Parenscript, right? It might not make sense to use Parenscript to use Node with macros (I did, though, because Hunchentoot is a bitch to setup behind apache), but assuming you are doing web dev, you will probably need to write some JS, and it seems like Parenscript would be a good fit if you are very familiar with both Lisp and JS.

    Anyways, I found that while I was working on TryParenscript.com, I ended up almost adding in thread-like macros based on top of setTimeout:

    Check out 'once-when-> and 'always-when-> in this file: https://github.com/fitzgen/tryparenscript.com/blob/master/ps-helpers.lisp

    I ended up "porting" them to pure JS with functions instead of blocks, because I thought it might be useful for the rest of the JS world (and gave them better names): http://fitzgeraldnick.com/weblog/35/

    Anyways, I'm sure this type of thing is nothing new for you, but I thought it might add to the conversation in some small way :)

    _Nick_

    • By: mishoo | Nov 28 (17:23) 2010 | RE: Parenscript?

      Hey Nick,

      Yeah sure I know about Parenscript.  It's really good stuff, but somehow I never got to use it for serious work.  The biggest reason is that I already have tons of libs written in JS, which I need to use, and mixing the two can be messy and hard to debug.

      So far I'm using it when I need to generate small snippets of JS from server-side—it's more convenient than using a general-purpose template engine or (concatenate 'string). :-)

  • By: Jacob Rothstein | Nov 28 (11:23) 2010 | Sibilant

    Hi Mihai, just wanted to mention that I've been hacking intently at sibilant, which is a lisp that compiles to js, somewhat similar to parenscript.  Sibilant was written in js initially and has since been rewritten in itself.  The language is defined largely as macros, which can be added at any point.  Check out the in-browser compiler at http://sibilantjs.info or the code at http://github.com/jbr/sibilant.  Macros are pretty easy in JS if you replace all of the syntax with parens :.  I'd love to hear your feedback.

    • By: mishoo | Nov 28 (17:21) 2010 | RE: Sibilant

      Hey, that's pretty cool!

      Just wondering how you do macros, as I noticed that there's no backtick in the samples.  I know that macros are supposed to always return an AST, but how do you differentiate stuff that needs to be evaluated at compile time?

  • By: Robert Schultz | Dec 16 (17:07) 2010 | RE: What's missing from NodeJS

     
    I'm a big fan of Node.js.

    I totally agree that the complexity introduced with the asynchronous nature of Node.js is a significant challenge.

    However I disagree that this should be solved by adding macro support.

    I like having clean, readable code. Once I started doing serious Node.js, this was a difficult thing to achieve.

    I looked into many Node.js modules designed to help solve this complexity and fell in absolute love with 'Step'
    https://github.com/creationix/step

    It would turn the code above into 'something' like this:

    (function()
    {
        var self = this;

        Step(
            function getFoo()
            {
                get_val("foo", this);
            },
            function getBar(err, foo)
            {
                self.foo = foo;
                get_val("bar", this);
            },
            function getBaz(err, bar)
            {
                self.bar = bar;
                get_val("baz", this);
            },
            function printResults(err, baz)
            {
                print(self.foo + self.bar + baz);
            });
    })();

    Basically it calls each function in sequence. You pass 'this' as the callback to any async calls you make and it passes the result to your next function when it has the result.

    It also supports doing things in parallel:

    (function()
    {
        Step(
            function getFooBarBaz()
            {
                get_val("foo", this.parallel());
                get_val("bar", this.parallel());
                get_val("baz", this.parallel());
            },
            function printResults(err, foo, bar, baz)
            {
                print(foo + bar + baz);
            });
    })();

    Here it would run all three get_vals at the same time and then only call your next function after all three have returned.

    Now tell me that isn't just all kinds of sexy!

    Tell me that doesn't get you excited :)

    • By: mishoo | Dec 16 (22:20) 2010 | RE[2]: What's missing from NodeJS

          print(get_val("foo") + get_val("bar") + get_val("baz"));

      ^^ that's sexy.  The naked truth.  If you look at the line above, you *know* what it does.  If you look at your last sample, you have to mentally execute it (knowing what Step is in between) to figure it out.

      A smart man said that perfection in design is achieved not when there isn't anything more to add, but when there is nothing you can take away.  The same applies to writing programs.

      About parallelization, I prefer to leave it on the hardware.  True, the line above would then become something like:

          print(p_add("foo", "bar", "baz"));

      assuming you had something as simple and ubiquitous as "threads" (which you don't) and the program would still be trivial to understand.

  • By: Alexander | Dec 30 (09:32) 2010 | RE[3]: What's missing from NodeJS

    Hello Mihai

    I'm looking for a freelancer to develop certain JS modifications for CKEditor. If you are interested in this project, please contact me to get further details.

    Hope to hear from you,

    Alexander

  • By: Akshay | Apr 10 (17:49) 2011 | RE: What's missing from NodeJS

    You should check out coffeescript.
    I fell in love with it after starting server-side JavaScript.
    You can find it at http://jashkenas.github.com/coffee-script/

  • By: Isaac Z. Schlueter | Apr 22 (19:06) 2011 | RE: What's missing from NodeJS

    Of course, many of us in the Node.js community are very interested in the JavaScript language, its features and evolution.  It's just that JavaScript *the language* is not developed as part of the Node.js project.

    Node is not about adding language features to JavaScript, it's about a vision for a very specific IO paradigm, and uses v8 as the language engine.  We certainly aren't *removing* language features that v8 implements.  It's more a separation of concerns than a lack of concern.

    You should participate on the es-discuss mailing list, where the future of EcmaScript is debated ad nauseam: https://mail.mozilla.org/listinfo/es-discuss

    By the way, UglifyJS is gorgeous.

  • By: Doeke Zanstra | Apr 23 (14:11) 2011 | RE: What's missing from NodeJS

    You say you need threads, but what you describe you need can also be accomplished with the web worker api (http://www.whatwg.org/specs/web-workers/curr… OK, it's not implemented in NodeJS yet, but it's coming...

Page info
Created: 2010/11/23 00:07
Modified: 2010/11/23 21:04
Author: Mihai Bazon
Comments: 11
Tags: javascript, lexical closures, lisp, node.js, programming, v8