10 Tips for Coding with Node.js #2: How to Avail & Beware of the Ecosystem

By: David Mark Clements

Welcome to part two of our ten-part series of recommended practices and tips to help you on your journey with Node.js. To refresh your memory, here are the ten tips under consideration:

  1. Develop debugging techniques
  2. Avail and beware of the ecosystem
  3. Know when (not) to throw
  4. Reproduce core callback signatures
  5. Use streams
  6. Break out blockers
  7. Deprioritize synchronous code optimizations
  8. Use and create small single-purpose modules
  9. Prepare for scale with microservices
  10. Expect to fail, recover quickly

In this post, we’ll be talking about making use of the extensive npm ecosystem, how to evaluate modules and how to avoid or protect against some common pitfalls.

The Ecosystem

$ npm install what-you-need

According to Module Counts, the Node ecosystem is the largest and one of the fastest growing of its peers. There are over 125,000 modules on npm, growing on average by 228 new modules per day. Compare this to Maven Central, Java's package repository, which hosts just short of 98,000 packages, growing at a rate of 60 packages per day. Similarly, Rubygems.org has around 96,500 packages, growing at a rate of 57 packages per day. These statistics should be qualified: the npm repository is not only for Node modules; there are client-side JavaScript (and indeed a tiny proportion of bash/sh script) modules on npm too. However, it is safe to say that the vast majority of modules on npm are intended for use with Node.

The Small Core Strategy

Since its inception, the Node.js project has insisted on the smallest core possible, providing only the essential primitives required to build apps with file system, networking and command line abilities. Need WebSockets? Use a third party module that attaches to a core HTTP server. Need to support different routes? Maybe use an HTTP framework. Want to use a template language? Take your pick. I believe the small core strategy has contributed massively to the success of Node.js and to the growth of the ecosystem. It allows the core team to focus on quality, security and performance. In fact, this very approach resulted in the Heartbleed bug being removed from Node.js a year before Heartbleed went public. The team didn't know it was a bug; it was simply decided that this feature of OpenSSL was superfluous.
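For instance, here is a minimal sketch of the WebSockets case, attaching the third-party ws module to a core HTTP server (the port and messages are arbitrary):

var http = require('http');
var WebSocketServer = require('ws').Server;

// A plain core HTTP server, as provided by Node itself
var server = http.createServer(function (req, res) {
  res.end('hello from core http');
});

// The third-party ws module attaches to the core server
var wss = new WebSocketServer({ server: server });
wss.on('connection', function (socket) {
  socket.send('welcome');
});

server.listen(8080);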

Availing of the Ecosystem

When starting a new project I break up the logic into the smallest pieces I can. Each of these pieces is a problem to solve. A problem-solving attempt should begin with a search of npm. I always like to make sure, before I write any code, that a perfectly good existing solution isn't already available. A module should be focused in purpose, its API should be obvious and its source code should be small enough to fit in my head. If such a module solves one of the problems of a project, then the research time has paid for itself. If no module exists, there could be modules that solve part of the problem. This might mean the problem needs to be broken down further. There might also be a partial, unfinished module, and perhaps that can be the basis for a solution, or at least give some insight.
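For example, if one of those pieces is parsing CSV data, the search might begin with (search terms purely illustrative):

$ npm search csv parser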


Caveat Utilitor

The great thing about npm is that anyone can publish to npm: all they have to do is register an account and run npm publish. This also helps to explain the massive growth of the ecosystem. However, the scary thing about npm is that anyone can publish to npm. The laissez-faire approach to ecosystem management has been fundamental to rapid growth. The trade-off is an increased burden of discovery and evaluation on the module user. For rapid prototyping, there's no doubt that 125,000 modules at our fingertips is an amazing thing. But somewhere before going live, someone has to check that the modules we're using are production worthy.

Module Evaluation Tools

There have been some community initiatives to amortize these efforts. For instance, the Node Security Project issues security advisories and has an accompanying command line tool called nsp. The retire module performs a similar role by making a project aware of out-of-date modules. There's also Node Zoo, which pulls in a variety of metrics to provide confidence rankings for a module. Ultimately though, it's down to us to ensure the packages we use are safe and fit for purpose. For an example of manual module evaluation, one of our nearForm architects, Guy Ellis, wrote a piece on his approach to selecting a package.
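For instance, both tools can be installed globally and run from a project's root directory (exact flags and output vary between versions):

$ npm install -g nsp retire
$ nsp check
$ retire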

Dependents

This is one of the most powerful metrics – it carries a similar weight to word of mouth. The module pages of the npm website detail the dependents at the bottom of the page; however, the npm command line tool doesn't provide a way to retrieve these dependents. So I have put together a small command line tool for doing just that:

$ npm -g install npm-dependents

We can see how many published modules are depending on express by running npm-dependents express, which will output something like this:

[Screenshot: default behavior of npm-dependents]

Or, we can list out all modules depending on seneca with npm-dependents seneca --list.

[Screenshot: using the --list flag with npm-dependents]

When checking how many dependents a module has, there are a few things to bear in mind. It's more powerful than download stats, because it means the module has gone beyond being played with to being part of another tool. However, for command line tools this metric should be discounted: unless a CLI tool is a build tool, it's unlikely to be used as a dependency of other packages.

Don’t Be Blinded by Popularity

To balance the previous section: just because something is popular, it doesn't mean it's right for every case. Sometimes a module is less popular because it's extremely niche, but it may be just the very thing that's needed. Additionally, just because a module or framework is popular doesn't mean we should assume it has every sane behaviour that a framework or module ought to have. For example, the wildly popular express framework does not set secure defaults; it favours rapid development over production security, leaving the latter as an exercise for the user. If you're interested in more information on server hardening, see the helmet package (sketched below), the Kraken framework, or get in touch with us.
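As an illustration, here is a minimal sketch of hardening an express app with helmet's default middleware (the exact set of headers applied depends on the helmet version):

var express = require('express');
var helmet = require('helmet');

var app = express();

// helmet() applies a set of sensible security-related HTTP headers
app.use(helmet());

app.get('/', function (req, res) {
  res.send('hello');
});

app.listen(3000);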

Shiny Websites

Another smokescreen when evaluating modules can be a super-awesome-shiny website. If a module under evaluation is actually a framework maintained either by a large company or an active community, then a polished site is neither a red flag nor a green light. However, for small independent modules curated by one to three developers, a README.md file on GitHub is sufficient; in fact, it's reassuring. If a small team has produced an amazing website to accompany a recently released module, it should raise a code quality concern. Shiny websites don't carry a strong correlation to good code quality – sometimes quite the opposite.

Who Wrote It?

As we explore the ecosystem, we naturally begin to recognize prominent module authors. This is a healthy thing: learn who to trust and use their modules when you can.

Review the Source

Going through the source code of a third-party module is often a great educational exercise. Reading all the source code of every module and its sub-dependencies can be a daunting challenge, but quickly scanning the source for red lights can be a good way to catch potential issues. One thing to look out for is the use of eval, whether through calling eval directly or using new Function. Using eval on user input, on the server side, is very dangerous; it also has performance implications. Understanding the context is vital. For instance, some template engines use eval (for example dust, jade, Angular…). If we're using a template engine, we have to be okay with trusting the engine to thoroughly clean user input, and we need to understand the flow of data into and out of the eval. Another thing to look out for (also context-dependent) is whether the dependency makes proper use of streams, or expects to buffer all data and then process it. In these cases, the question must be asked: what's the largest possible amount of data that could be passed through this module? Buffering and then synchronously processing data is a recipe for disaster with Node.js.
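As a contrived sketch of the buffering red flag (big.log and handle are hypothetical stand-ins for a module's input and processing):

var fs = require('fs');

function handle(chunk) {
  // stand-in for whatever processing the module performs
}

// Red flag: the whole input is buffered into memory before processing
fs.readFile('big.log', function (err, data) {
  if (err) throw err;
  handle(data); // the entire file is held in memory at once
});

// Preferable: streaming keeps memory usage bounded regardless of input size
fs.createReadStream('big.log')
  .on('data', handle)
  .on('error', console.error);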

Shrinkwrapping

$ npm shrinkwrap

When installing a module with npm install --save, the version number is added to the package.json as ^1.0.0 (assuming the current version is 1.0.0). The caret (^) is an instruction to npm, telling it to install the latest minor version; it is the equivalent of setting the version number to 1.x.x (see the semver sketch at the end of this section). This means that when the dependencies for a package are installed, they may be different versions to those originally installed and tested during development.

During development this is desired behaviour: we want bug fixes (that is, increases to the final, patch part of the version number) and backwards-compatible API improvements (increases to the middle, minor number). It is not something we want in most production scenarios, where we want our dependencies to stay static and to upgrade them manually.

The npm shrinkwrap command will do a deep crawl of all dependencies, generating a shrinkwrap.json file. For instance, here is the first dependency in the shrinkwrap.json file generated for the npm-dependents module:

{
  "name": "npm-dependents",
  "version": "1.0.1",
  "dependencies": {
    "JSONStream": {
      "version": "0.10.0",
      "from": "JSONStream@*",
      "resolved": "https://registry.npmjs.org/JSONStream/-/JSONStream-0.10.0.tgz",
      "dependencies": {
        "jsonparse": {
          "version": "0.0.5",
          "from": "jsonparse@0.0.5",
          "resolved": "https://registry.npmjs.org/jsonparse/-/jsonparse-0.0.5.tgz"
        },
        "through": {
          "version": "2.3.6",
          "from": "through@>=2.2.7 <3.0.0",
          "resolved": "https://registry.npmjs.org/through/-/through-2.3.6.tgz"
        }
      }
    },
    …snip…

A gist of the full shrinkwrap.json file can be found here. When a module comes with a shrinkwrap.json file, npm will ignore the package.json file, using shrinkwrap.json to install specific versions of all dependencies and sub-dependencies.
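Returning to the caret: its semantics can be sanity-checked with the semver module, which npm itself uses to interpret version ranges:

var semver = require('semver');

console.log(semver.satisfies('1.4.2', '^1.0.0')); // true: minor and patch updates match
console.log(semver.satisfies('2.0.0', '^1.0.0')); // false: a new major version does not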

Keep a Filtered Cache Repository

Once a list of vetted modules is compiled, isolating those modules from the public npm repository can be a useful way to share validation work across a team, or simply to make it easier for a single developer to work on several projects with pre-validated modules. This is a huge topic in itself and outside the scope of this post; however, sinopia can be a good place to start. Sinopia acts as a private repository which also fetches and caches modules from npm. Essentially the idea is to npm install all validated modules, then disable proxying to the public repo.
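As a rough sketch of that workflow (some-vetted-module is a placeholder and 4873 is sinopia's default port; disabling proxying itself happens in sinopia's config.yaml, so consult its documentation for the exact setting):

$ npm install -g sinopia
$ sinopia &
$ npm set registry http://localhost:4873
$ npm install some-vetted-module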

Production Ready Modules

To finish, here is a list of modules we at nearForm believe are production ready. Whilst we have confidence in these modules and their authors, bugs can still easily slip in between version releases, and this list shouldn’t be a replacement for the due diligence required when using any module in production.

require        author           description
async          caolan           Async patterns
bl             rvagg            Binary parsing
browserify     substack         Browser distribution
bunyan         trentm           Logging
chai           jakeluer         Assertions
debug          tjholowaychuk    Debug printer
dockerode      apocas           Docker management
duplexify      mafintosh        Stream utilities
event-stream   dominictarr      Stream utilities
express        tjholowaychuk    Server framework
glob           isaacs           Glob matching
grunt          cowboy           Build system
gulp           contra           Build system
hapi           hueniverse       Server framework
hyperquest     substack         Lighter HTTP client
istanbul       gotwarlost       Test coverage analysis
JSONStream     dominictarr      Stream utilities
levelup        rvagg            LevelDB
lodash         jdalton          Faster functional patterns
minimatch      isaacs           Glob matching
minimist       substack         Command line options
mocha          tjholowaychuk    Unit testing
moment         timrwood         Date manipulation
mongodb        christkv         MongoDB
mysql          felixge          MySQL
nconf          indexzero        Configuration
needle         tomas            Lighter HTTP client
nodemailer     andris9          Email client
passport       jaredhanson      Login and authentication
pg             brianc           Postgres
pump           mafintosh        Stream utilities
redis          mjr              Redis
request        mikeal           HTTP client
restify        mcavage          REST API builder
socket.io      rauchg           Realtime
split2         matteo.collina   Stream utilities
tape           substack         Unit testing
through2       rvagg            Stream utilities
underscore     jashkenas        Functional patterns
ws             einaros          WebSockets
xml2js         leonidas         XML to JavaScript

Conclusion

In conclusion, time spent researching modules is not wasted time: it may save future headache and heartache. When evaluating a module, look for key indicators, but always be aware of the larger context. Know that all that glitters is not gold: a shiny website or lots of GitHub stars doesn't necessarily indicate code quality. The best way to evaluate a module is to read the code. If it takes longer than an hour to understand, the module is probably too big (unless you're looking for a framework or utility library).

That's it for this post; I hope you found it helpful. If you know of any other modules that you've found to be useful, feel free to talk about them in the comments. The next tip will be 'Know when (not) to throw'. In the meantime, post any comments or questions in the comment section below, and subscribe to this blog to be notified as soon as follow-on posts are published.

Want to work for nearForm? We're hiring.




By: David Mark Clements

David Mark Clements is a JavaScript and Node.js specialist based in Northern Ireland. He is nearForm's lead trainer and training content curator. From a very early age he was fascinated with programming. He first learned BASIC on one of the many Ataris he had accumulated by the age of nine. David learned JavaScript at age twelve, moving into Linux administration and PHP as a teenager. Node has become a prominent member of his toolkit due to its versatility, vast ecosystem, and the cognitive ease that comes with full-stack JavaScript. David is author of Node Cookbook (Packt), now in its second edition.
  • Jason

    NPM has some built-in module eval tools, one of which I like better than dependent counts:

    npm v express users | wc -l

    This will tell you the number of users with packages depending on express.

    And for download stats there’s always:

    curl -s https://api.npmjs.org/downloads/point/last-week/express

    Which you can make nicer with the now ubiquitous json tool by trentm.

  • Sven Slootweg

    I’m not sure why request is in the recommended module list – it has some severe code quality issues, and they lead to unpredictable bugs with some regularity.

  • floatdrop

    Hyperquest has issues (with https options, for example); consider using https://github.com/sindresorhus/got instead.

  • David Mark Clements

    @Jason that's awesome, love learning ways to quickly check the stats

    @Sven Slootweg – the thing that's really working in request's favour is the active developer community around it. It did lay dormant for a while, but Mikeal has done a great job engaging with the community, so I expect it to mature progressively. Having said that, the hyperquest module is included as a lightweight alternative – particularly for use in, say, a server environment. I would expect request to be used more in a tooling context at present.

    @floatdrop at first glance sindresorhus' module looks viable, I'll take a closer look, thanks!

  • GeGe

    Strange you point out the "Shiny Websites" topic. I contribute to OSS primarily by writing these "shiny" websites, because my husband believes in keeping his git repo README as condensed and focused on the API of the module as possible. However, I've convinced him that sometimes more examples and tutorials need to come into play, in order to reach a broader audience… and that is my job. We work in parallel, but my "shiny website" with the tutorials I provide has no bearing on the quality of the project. I'm not sure why you are drawing any line at all between them, really. Bad code is bad, no matter the reason, and judging a book by its cover for any reason is nonsensical. I've never heard a developer say, DAMMIT I WAS FOOLED BY THE SHINY WEBSITE AGAIN!!!! If that is ever the case, then I hope it was my website because, well, that means I'm just that awesome.

  • David Mark Clements

    @GeGe thanks for your input – much appreciated

    I love the way you and your husband have complementary skill sets, that's a scenario I didn't cover.

    The point being made is based on the observation that visual stimulation can have an effect on perception, for instance the halo effect (https://en.wikipedia.org/wiki/Halo_effect).

    The proverb you mention "don't judge a book by its cover" exists because there *is* an association between the contents of the book and a propensity to judge it by its cover.

    I think developers are less prone to this than most; however, I can hold my hand up and say I have spent time working with a library that I found to be suboptimal, and I had chosen it over other libraries because it had a more professional-looking site. Nevertheless, a good site doesn't necessitate poor code, you simply have to understand the context – and the context for your husband is that he gets to have his cake and eat it too. But I'm not jelly