Development diary…

I’ve decided to start keeping a development diary here. Mostly this was motivated by a desire to have a forum for rubber duck debugging, as that’s otherwise hard to get when working remotely (my long-suffering non-programmer fiancée can only take so much!). But it’d also be nice to have a better sense of my own velocity and where things have taken me. That this also gives the rest of the community a bit more insight into what’s getting attention is an extra bonus.

Today I…

  • Got @othiym23 to look at my npm-package-arg changes that integrate hosted-git-info, which in turn will allow the multi-stage install to both fetch package data from hosted repos via a shortcut (direct fetch of the package.json via HTTP, rather than having to clone via git), and also to not treat github as special, adding support for gitlab and bitbucket.
  • … and that led to a request for new npm-package-arg documentation generally, and, well, perhaps the curse of knowledge has made it hard for me to see, as it feels well documented to me already. =D That said, I did manage to make some improvements, digging into my own realize-package-specifier for inspiration.
  • Made an issue to track adding website support for multiple hosted git repos.
  • Read @izs’s excellent talk on corporate OSS.
  • Got the first issue on require-inject and prepared a fix for it. This ended up being a little more involved than I was expecting: it turns out that the require object is unique per module, but that require.cache is shared. This means assigning something to require.cache (eg require.cache = {}) does nothing outside the current module, whereas mutating it DOES. Anyway, we’ve got new tests now and a new release, so all’s good.
  • TIL my somewhat eccentric habit of writing /[.]/ or /[/]/ instead of /\./ or /\// dates back to Damian Conway’s Perl Best Practices.
  • Finally started writing code to execute the multi-stage install’s action list, which at this point means decomposing things like “install” into their component parts, eg, “fetch”, “extract”, “gyp-build”, “lifecycle scripts”, “move to final resting place”. (I say finally, because LAST week I was keen on getting to this and I didn’t and then THIS week I was keen on getting to it too. Alas, unforeseen prerequisites…)
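
The require.cache behaviour from the require-inject item above can be sketched roughly like this. It is a minimal illustration, not the actual require-inject fix: the temporary module and all variable names are made up for the demo, and it assumes a plain CommonJS script.

```javascript
// Demo of "reassigning require.cache vs. mutating it" (CommonJS only).
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical throwaway module, written to a temp dir just for this demo.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'thing.js');
fs.writeFileSync(file, 'module.exports = {};');

const first = require(file);
const sharedCache = require.cache; // keep a handle on the shared cache object

// Reassigning only replaces the property on THIS module's require function;
// the loader keeps using the original shared object.
require.cache = {};
const afterReassign = require(file);
const reassignDidNothing = afterReassign === first;

// Mutating the shared object is seen by the loader: the next require
// re-executes the file and returns a fresh exports object.
delete sharedCache[require.resolve(file)];
const afterDelete = require(file);
const mutationWorked = afterDelete !== first;

console.log({ reassignDidNothing, mutationWorked });

// Clean up the temp dir.
fs.rmSync(dir, { recursive: true, force: true });
```

The practical upshot for a tool that wants to clear or stub modules is to delete or set keys on the existing cache object rather than swap in a new one.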

Multi-stage installs and a better npm

(Originally posted to the npm blog, the source is on Github.)

Hi everyone! I’m the new programmer at npm working on the CLI. I’m really excited that my first major project is going to be a substantial refactor to how npm handles dependency trees. We’ve been calling this thing multi-stage install but really it covers more than just installs.

Multi-stage installs will touch and improve all of the actions npm takes relating to dependencies and mutating your node_modules directory. This affects install, uninstall, dedupe, shrinkwrap and, obviously, dependencies (including optionalDependencies, peerDependencies, bundledDependencies and devDependencies).

The idea is simple enough: Build an in-memory model of how we want the node_modules directories to look. Compare that model to what’s on disk, producing a list of steps to change the disk version into the memory model. Finally, we execute the steps in the list.
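
As a rough sketch of that idea (not npm’s actual code), here is a toy version where a tree is flattened to a map of location to version. The names diffTrees and execute, the action names, and the data are all assumptions made up for illustration.

```javascript
// Toy model: a "tree" is a Map of node_modules location -> version.
// Compare what is on disk with the ideal tree and emit a list of steps.
function diffTrees(onDisk, ideal) {
  const actions = [];
  for (const [where, version] of ideal) {
    if (!onDisk.has(where)) {
      actions.push({ action: 'add', where, version });
    } else if (onDisk.get(where) !== version) {
      actions.push({ action: 'update', where, version });
    }
  }
  for (const [where, version] of onDisk) {
    if (!ideal.has(where)) {
      actions.push({ action: 'remove', where, version });
    }
  }
  return actions;
}

// Executing is a separate pass over the finished list. Here each step is
// only logged; a real installer would fetch, extract, build and so on.
function execute(actions, log = console.log) {
  for (const step of actions) {
    log(`${step.action} ${step.where}@${step.version}`);
  }
}

const onDisk = new Map([['a', '1.0.0'], ['b', '1.0.0'], ['d', '1.0.0']]);
const ideal = new Map([['a', '1.0.0'], ['b', '2.0.0'], ['c', '0.1.0']]);

const actions = diffTrees(onDisk, ideal);
execute(actions);
```

Because the whole list exists before anything is executed, problems can be reported before the disk is touched.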

The refactor gives several needed improvements: It gives us knowledge of the dependency tree and what we need to do prior to touching your node_modules directory. This means we can give simple errors, earlier, much improving the experience of this failure case. Further, deduping and recursive dependency resolution are then easy to include. And by breaking down the actual act of installing new modules into functional pieces, we eliminate the opportunity for many of the race conditions that have plagued us recently.

Breaking changes: The refactor will likely result in a new major version as we will almost certainly be tweaking lifecycle script behavior. At the very least, we’ll be running each lifecycle step as its own stage in the multi-stage install.

But wait, there’s more! The refactor will make implementing a number of oft-requested features a lot easier– some of the issues we intend to address are:

  • Progress bars! #1257, #5340
  • Automatic/intrinsic dedupe, across all module source types #4761, #5827
  • Errors if we can’t find compatible versions MUCH earlier, before any changes to your node_modules directory have happened #5107
  • Better diagnostics when peerDependencies produce impossible to resolve scenarios.
  • Better use of bundledDependencies
  • Recursively resolving missing dependencies #1341
  • Better shrinkwrap #2649
  • Fixes some icky edge cases #3124, #5698, #5655, #5400
  • Better shrinkwrap support, including updating of shrinkwrap file when you use --save on your installs and uninstalls #5448, #5779
  • Closer to transactional installs #5984

So when will you get to see this? I don’t have a timeline yet– I’m still in the part of the project where everything I look at fractally expands into yet more work. You can follow along with progress on what will be its pull request.

If you’re interested in that level of detail, you may also be interested in reading @izs’s and @othiym23’s thoughts.

Abraxas– A node.js Gearman client/worker/admin library

Abraxas is an end-to-end streaming Gearman client and worker library for Node.js. (Server implementation coming soon.)

Standout features:

  • Support for workers handling multiple jobs at the same time over a single connection. This is super useful if your jobs tend to be bound by external resources (eg databases).
  • Built streaming end-to-end from the start, due to being built on gearman-packet.
  • Almost all APIs support natural callback, stream and promise style usage.
  • Support for the gearman admin commands to query server status.
  • Delayed background job execution built in, with recent versions of the C++ gearmand.

Things I learned on this project:

  • Nothing in the protocol stops clients and workers from sharing the same connection. This was imposed by arbitrary library restrictions.
  • In fact, the plain text admin protocol can be included cleanly on the same connection as the binary protocol.
  • Nothing stops workers from handling multiple jobs at the same time, except, again, arbitrary library restrictions.
  • The protocol documentation is out of date when compared to the C++ gearmand implementation– notably, SUBMIT_JOB_EPOCH has been implemented. I’ve begun updating the protocol documentation here:

Because everything is a stream, you can do things like this:

Or as a promise:

Or as a callback:

Or mix and match:

MySQL ROUND considered harmful

MySQL’s ROUND has different behavior for DECIMALs than it does for FLOATs and DOUBLEs.

This *is* documented. The reason for this is not discussed but it’s important. ROUND operates by altering the type of the expression to have the number of decimal places that it was passed. And this matters because the type information associated with a DOUBLE will bleed… it taints the rest of the expression:

We’re going to start with some simple SQL:

Here 2.5 is a DECIMAL(2,1) and 605e-2 a DOUBLE, and the result is a DOUBLE. That’s all well and good…

But let’s try rounding 605e-2.

So… what’s going on here? The round part of the expression shouldn’t have changed its value. And in fact, it hasn’t, calling ROUND(605e-2,2) returns 6.05 as expected. The problem here is that the type of ROUND(605e-2,2) is DOUBLE(19,2) and when that’s multiplied by 2.5 the resulting expression is still DOUBLE(19,2). But the number of decimals on a float is for display purposes only– internally MySQL keeps full precision… we can prove that this way:

So yeah… MySQL lets you increase precision with ROUND– Postgres is looking mighty fine right now.

Survey of node.js Gearman modules

Here’s a brief survey of node.js Gearman modules. I’ll have some analysis based on this later.

Module           Github                     Author                        Last        Tests  Docs  Client  Worker  Multi  Streams  Errors  Timeouts
gearman          gofullstack/gearman-node   smith, gearmanhq              2011-05-02  4
gearman-stream   Clever/gearman-stream      azylman, templaedhel          2014-03-21  0
  Previously named gearman_stream, uses gearman-coffee
gearnode         andris9/gearnode           andris                        2013-02-25  1
gearmanode       veny/GearmaNode            veny                          2014-03-20  4
nodegears        enmand/nodegears           enmand                        2013-12-07  1
que              vdemedes/que               vdemedes                      2012-07-02  0
  Uses node-gearman
gearman-js       mreinstein/gearman-js      mreinstein                    2013-11-03  4
gearman2         sazze/gearman-node         ksmithson                     2013-09-17  0
  Fork of gearman with no changes except name
node-gearman     andris9/node-gearman       andris                        2013-08-13  2
node-gearman-ms  nachooya/node-gearman-ms   nachooya                      2013-11-18  0
  Fork of node-gearman
gearman-coffee   Clever/gearman-coffee      rgarcia, azylman, jonahkagan  2013-03-19  2
                 magictoolbox/node-gearman  oleksiyk                      2012-12-03  0

Using cron with “every”

The “every” command is a tool that I wrote, inspired by the unix “at” command.  It adds commands to your crontab for you, using an easier to remember syntax.  You can find it on github, here:

I was reminded because of this article on cron for perl programmers who are unix novices:

Here’s how you’d write their examples using “every”:

What’s more, there’s no need to specify the path to Perl, because unlike using crontab straight up, it will maintain your path.  Even better, you can use relative paths to refer to your script, eg:

This works because every ensures that it executes from the place you set it up.  Just like “at” it uses all of the same context as your normal shell.

Education clearly doesn’t help reporters

“Education clearly pays. Despite recent questioning of the value of university degrees, more than two thirds of the top one per cent had a university degree, compared to 20.9 per cent of the total population.”

No, that’s not what that says at all. It says that the wealthy value degrees, not that degrees make one wealthy.  It says that degrees are something that wealthy people do, but if you just get a degree thinking it will make you wealthy, you’re as confused as the cargo-cults that would build bamboo airports in hopes of attracting a supply plane.

Easy ad-hoc publishing on a machine with Apache


This is just a little hack of mine to make it trivial for me to reflect any directory on my server as a website, either with a name I specify or a hash. It’s handy for all sorts of things; I initially created it to give myself an easy way to view remote coverage reports that are generated as HTML. It’s also a nice way to view HTML docs bundled with a package, or any other random HTML you come across.

How it works

As part of setup, we create a file-based Apache rewrite map that rewrites slugs off of our domain based on rules from a text file. These text files are super simple, just the slug followed by a space and then what to rewrite to.

With the setup out of the way, we have a very simple shell script that uses Perl to figure out the absolute path from your relative one and uses openssl to generate a hash from that. It uses the hash as the slug if you don’t specify one.  Once it’s appended these to the rewritemap file it tells you what your new URL is.
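
The real thing is a shell script using Perl and openssl, but the logic can be sketched in a few lines. This is only an illustration under assumptions: the function name rewriteMapLine is invented, SHA-1 is a guess at the hash (the post doesn’t say which one is used), and the rewrite target is assumed to be the absolute directory path.

```javascript
const crypto = require('crypto');
const path = require('path');

// Build one rewrite-map line: "<slug> <what to rewrite to>".
// If no slug is given, fall back to a hash of the absolute path.
function rewriteMapLine(dir, slug) {
  const target = path.resolve(dir); // absolute path from a relative one
  const name = slug || crypto.createHash('sha1').update(target).digest('hex');
  return `${name} ${target}`;
}

const named = rewriteMapLine('coverage/html', 'coverage');
const hashed = rewriteMapLine('coverage/html');

console.log(named);
console.log(hashed);
```

Appending either line to the map file is all it takes to publish the directory under that slug.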

The example in the repo obviously isn’t generic, it refers to a host I control, but that’s easily editable.  This is less software package and more stupid sysadmin hack.

Android Ports for Corporate Firewalls

Beyond the standard 80 and 443 to handle web traffic, Android also needs 5222 (Jabber) and 5228 (allegedly Google Marketplace, but needed for a phone to fully connect to the network and have functioning Google Talk).

Mail is likely needed too, of course, with SMTP on 25 and 465, POP on 110 and 995, and IMAP on 143 and 993. For some setups you may also need LDAP on 389 and 636. Exchange needs 135 and, in some esoteric configurations, NNTP on 119 and 563.

AnyEvent::Capture – Synchronous calls of async APIs

I’ve been a busy little bee lately, and have published a handful of new CPAN modules. I’ll be posting about all of them, but to start things off, I bring you: AnyEvent::Capture

It adds a little command to make calling async APIs in a synchronous, but non-blocking manner easy. Let’s start with an example of how you might do this without my shiny new module:

The above is not an uncommon pattern when using AnyEvent, especially in libraries, where your code should block, but you don’t want to block other event listeners. AnyEvent::Capture makes this pattern a lot cleaner:

The AnyEvent::DBus documentation provides another excellent example of just how awkward this can be:

With AnyEvent::Capture this would be:

We can also find similar examples in the Coro documentation, where rouse_cb/rouse_wait replace condvars:

Even still, for the common case, AnyEvent::Capture provides a much cleaner interface, especially as it will manage the guard object for you.

Whatever fills my mind…