Pretty Diff - JavaScript Style Guide

Cheney's Law: "Technicians willing to increase productivity by way of convention contrarily become sufficiently productive increasing conventions."

Godwin's Law of Programming: "Provided a discussion on software development, particularly JavaScript, grows longer the probability of comparing any development effort to writing directly in Assembly or machine code grows increasingly larger."

Warning: This style guide is strict. If you disagree with it then do not use it, but certainly do not complain about it.

Table of Contents

  1. White Space
  2. Comments
  3. Declarations
  4. Data
  5. Loops
  6. Classes
  7. Prototypes
  8. Structure
  9. Things to Avoid
  10. Architecture
  11. Libraries
  12. Named Patterns

White Space

Indentation is composed of 4 consecutive spaces and there are no exceptions. For everything else I follow the rules provided by jspretty.js.

Comments

Comments are no more than 72 characters wide from the leftmost margin so that there is less risk of alteration if the code is printed. I typically use the /* star type comments at the extreme top of a code file to provide opening information about the library, license, authors and so forth. After the first line of code starts I never use star comments again. The problem with star comments is that they cannot be nested and I may need to use them in the code when debugging. Comments should be used to describe and annotate the structure of your code, the intentions and objectives of the coder, and why the logic operates in a certain way. The what and how are irrelevant and should not be mentioned in the comments, because they are painfully clear from reading the code.

Declarations

The very first thing that should occur, at only the topmost scope, is the "use strict" pragma. After that one and only exception, variables are always declared at the top of their current scope. If more than one variable must be declared then those variables are declared as a comma separated list using only a single var keyword. All variables are declared with a value in an attempt to type each variable before it is used, reducing the risk of unintentional type recasting later. All functions are declared as variables so as to defeat hoisting. Where performance matters all functions should be given a function name so as to allow profiling.
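
A minimal sketch of these declaration rules follows. The function sumPrices and its variables are hypothetical names chosen only for illustration; the layout is one reading of the rules above, not output produced by jspretty.js.

```javascript
"use strict";
// Hypothetical example: sumPrices is an illustrative name only.
// The function is assigned to a variable so hoisting never applies,
// and it keeps a name so a profiler can report it.
var sumPrices = function sumPrices(prices) {
    // One var keyword, a comma separated list, and every variable
    // receives a value that fixes its intended type before use.
    var a     = 0,
        b     = prices.length,
        total = 0;
    for (a = 0; a < b; a += 1) {
        total += prices[a];
    }
    return total;
};
```

Calling sumPrices([1, 2, 3]) returns 6.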

It is important to always give artifacts of structure a valid and easily understandable variable name. This includes most functions, most data collections, and certain primitives used in closure between clearly separated areas. Everything else gets a one letter variable name, because for everything else the name is largely irrelevant and likely misleading.

Data

For data collections in execution I only use arrays. I find that object literals are less flexible to navigate in a loop and are slower to operate upon dynamically. For data in transit I use JSON. For data at rest in localStorage I use a custom formatting convention and custom parsing means as dictated by a given project, because the native JSON object is not reliably available in IE8 (it is missing when that browser runs in its compatibility or quirks modes). I plan to migrate to JSON for data in storage once IE8 stops being supported.

Loops

I only use for loops or do while loops. I never use for in loops as they tend to be dramatically slower and less flexible. Do while loops are faster than for loops, but they are also less flexible and less safe, as they can become an endless loop if some expected condition is never achieved. I tend to use do while loops when some incremental change must be performed based upon an expected result.
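
The two loop forms can be sketched as below. Both functions, countEven and halveUntilBelow, are hypothetical names invented for this illustration. The guard before the do while loop is an assumption about how one might limit the endless loop risk described above; it is not a rule stated in this guide.

```javascript
"use strict";
// Hypothetical examples: both function names are illustrative only.
var countEven = function countEven(items) {
        // A for loop suits a walk across a known number of indexes.
        var a     = 0,
            b     = items.length,
            count = 0;
        for (a = 0; a < b; a += 1) {
            if (items[a] % 2 === 0) {
                count += 1;
            }
        }
        return count;
    },
    halveUntilBelow = function halveUntilBelow(value, limit) {
        // A do while loop suits an incremental change repeated until
        // an expected result appears. The guard rejects input for
        // which that result would never appear.
        var count = 0;
        if (isFinite(value) === false || limit <= 0 || value < limit) {
            return 0;
        }
        do {
            value = value / 2;
            count += 1;
        } while (value >= limit);
        return count;
    };
```

With these definitions halveUntilBelow(100, 10) returns 4, because 100 must be halved four times (50, 25, 12.5, 6.25) before it falls below 10.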

Classes

I avoid constructor type formations at all costs. If I am forced to use a constructor to solve a problem then something has gone terribly wrong and I have failed to understand the problem I am attempting to solve. Avoidance of constructors is easily accomplished if the keywords new and this are limited or eliminated from the code. I only use the new keyword to create a variable from a global object, such as RegExp or Date. I only use this to provide a back reference from a DOM node to eliminate unnecessary DOM queries.
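
One possible way to hold state without a constructor is a closure, sketched below. The names counter, increment, and epoch are hypothetical and exist only for this illustration; this is one reading of the rule and not the only way to satisfy it.

```javascript
"use strict";
// Hypothetical example: counter and epoch are illustrative names.
var counter = function counter(start) {
        // The running total lives in this function's scope, so no
        // constructor, no new, and no this are required to keep it.
        var total     = start,
            increment = function counter_increment() {
                total += 1;
                return total;
            };
        return increment;
    },
    // The one sanctioned use of new: a value from a global object.
    epoch   = new Date(0),
    next    = counter(5);
```

Each call to next() returns the following number, starting at 6, and a second call to counter creates an independent total.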

Prototypes

I do not use prototypes in my code as I cannot enforce separation of concerns upon code that receives the benefit of cascading inheritance. Furthermore, if I needed to access some sort of functionality then I would access it directly with an explicit reference rather than needing some method to be supplied to me via an underlying inheritance chain. Prototypes also require use of the this keyword at their instantiation and I really deplore that keyword.

Structure

In writing applications perhaps the most important design criterion is a separation of concerns. This separation should apply to data collections as well as the logic. When done correctly a proper separation of concerns will feature very many functions, as generally separated design concerns each contain progressively more specific design considerations that must be separated from each other.

Things to Avoid

Large dynamically populated strings should be built by pushing new parts into an array with the push method. Accomplishing this objective with string concatenation is substantially slower.
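
The array approach can be sketched as follows. The function buildList is a hypothetical name used only for illustration, and the sketch makes no measurement of its own about speed.

```javascript
"use strict";
// Hypothetical example: buildList is an illustrative name only.
var buildList = function buildList(items) {
    var a      = 0,
        b      = items.length,
        output = ["<ul>"];
    for (a = 0; a < b; a += 1) {
        // Each fragment is pushed as its own array index.
        output.push("<li>");
        output.push(items[a]);
        output.push("</li>");
    }
    output.push("</ul>");
    // The string is assembled once, at the very end.
    return output.join("");
};
```

Calling buildList(["a", "b"]) returns "<ul><li>a</li><li>b</li></ul>".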

Do not pass functions as arguments of other functions. First of all this is hard to read and unnecessarily complicated to debug due to unnecessary coupling. Worse is that it is certainly insecure if the function passed in contains arguments of user specified input.

Do not be afraid of the DOM. It is a stable API. A few years ago it was painfully slow to access in rapid succession, but those problems are largely gone. Using large libraries to solve problems with DOM access does not solve your problem and instead buries your problem under an avalanche of abstracted nonsense, making the original problem harder to find.

Write to innerHTML at your own risk. A one time write to the DOM using innerHTML is substantially faster to execute for large input than building out each DOM node manually. There is great safety in using the DOM methods, whereas that safety is completely absent when using the innerHTML property. I use innerHTML in my own code to substantially increase performance, but it comes at a cost of error checking and error correction to avoid polluting the document with broken code and flawed data.

Some parsers and IDEs can become confused as to when a complex regular expression actually ends. I recommend encapsulating all regular expression literals in parentheses if they are used outside of methods or if they are not used with a method.
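
A short sketch of the parentheses convention follows. The variable names pattern and stripped, and the sample string, are hypothetical. The expression is chosen because its escaped slashes and stars are the kind of content that can mislead a weak parser.

```javascript
"use strict";
// Hypothetical example: pattern and stripped are illustrative names.
// The literal is not an argument of a method here, so it is wrapped
// in parentheses to make its start and end unmistakable.
var pattern  = (/\/\*[\s\S]*?\*\//g),
    stripped = "a /* note */ b".replace(pattern, "");
```

Here stripped holds the sample string with the star comment removed and the surrounding spaces left in place.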

Architecture

The two most wonderful qualities of JavaScript are that functions are first-class objects and that functions have lexical scope. It is because of these qualities that much of the prior discussed rubbish can be avoided outright. Hidden within these qualities is an implied architecture. In more traditional object-oriented programming (OOP) code architectures always have some common qualities: explicit formulation, atomicity, and containment. While each of these qualities benefits code reuse they also produce unnecessary burdens.

OOP code is very explicit. This means the programmer must state how the various particles of an application are linked. The primary benefit is absolute control of the code, which provides the programmer the opportunity to decide which things reference each other in various contexts. Unfortunately, absolute control runs counter to automation. The entire essence of automation is to release control of a process into a means that operates a series of steps without required intervention at each step and without regard to losses or gains in potential systems efficiency.

In JavaScript this manner of absolute control is costly, thereby doing more harm than good. The substantial loss in processing efficiency is due to an increased number of references. JavaScript is a very high level language, which means it is very far away from the hardware in the technology stack. Modern JIT engines close this gap substantially, and yet even in a compiled state IO operations to memory are still costly at execution time. Please note that I am speaking to frequency of memory access and not size of memory consumption. Therefore improvements to execution speed in JavaScript primarily come down to two factors: reducing the number of references and properly using operators upon those references.

Atomicity is the form in which OOP code is primarily written, although such expression is not required. Atomicity refers to the granular separation of code units into small uniquely stated particles. Atomicity is wonderfully beneficial to code reuse in that the more granular code parts become, and the more available those parts are to reference from a common location, the more freedom the developer has in linking various pieces. Atomicity then both enables and is enabled by explicit control over the code.

One way to think of atomicity is to imagine that a complex framework is composed of various minor pieces, such as little functions or a collection of objects each with a collection of openly available properties. These pieces can be put together in many different ways with little cause for disassembly. The jQuery code is a strong example of atomicity, which allows their code to be quickly examined with clarity. A more tangible example would be constructing a sophisticated building using only Lego toys. Atomicity is therefore fully at odds with structure, since any structure is devised only at execution time from the explicit links of references.

OOP code, at least outside of JavaScript, has block scope. Block scope is very handy for enabling a more expressive freedom within atomic code units, particularly when those code units are objects. JavaScript prior to ECMAScript 6 does not have block scope, so that entire expressive nature is lost and substantially diminishes the value of OOP code models with regard to both programmer expression and execution efficiency. JavaScript has only function scope. This means variables are either global or declared within a function without exception. With regard to an OOP perspective this is horribly limiting.
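
The difference between function scope and block scope can be shown with a small sketch. Both function names are hypothetical, and the second function relies on the let keyword from ECMAScript 6, so it will not parse in older engines.

```javascript
"use strict";
// Hypothetical example: both function names are illustrative only.
var functionScoped = function functionScoped() {
        var a = 1;
        if (a === 1) {
            // var ignores the block, so this is the same variable.
            var a = 2;
        }
        return a;
    },
    blockScoped    = function blockScoped() {
        let a = 1;
        if (a === 1) {
            // let creates a second variable limited to this block.
            let a = 2;
        }
        return a;
    };
```

functionScoped() returns 2 because only one variable exists in the function, while blockScoped() returns 1 because the inner declaration never touches the outer one.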

Function scope in JavaScript is lexical, which means each function is an atomic particle taking ownership of its internal pieces and simultaneously is provided access to all particles in the scope where the function is declared. This is perhaps the most powerful and expressive quality of JavaScript in that the lexical nature of JavaScript combined with the nature of function scope allows opportunities for implied architectures; that is, architectures not formed from explicitly stated references.

Implied architectures are challenging items to discuss because even though many programmers program for the sake of automation, many find difficulty in automating the very expression of programs. Implied architectures are a somewhat automated means of expression in that I do not have to explicitly state how various particles are linked via reference if the link between the various particles is automatically created. Since lexical function scope means that code inside a function can access code in scopes containing that function, a structure is already silently present to supply a link between the various code particles. This means that as a programmer I must be willing to sacrifice some degree of atomicity for structure, and since the structure is already silently present in the inner workings of the language I need fewer explicit references to accomplish identical tasks.
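
A minimal sketch of such an implied link follows. The names report, formatRow, output, and separator are hypothetical and chosen only for illustration. The inner function reaches its collaborators through lexical scope alone; nothing is passed to it beyond the row and nothing is looked up on an object.

```javascript
"use strict";
// Hypothetical example: every name here is illustrative only.
var report = function report(rows) {
    var output    = [],
        separator = ", ",
        // formatRow reads separator and writes to output through
        // the containing scope. No reference to either is stated
        // at the call site or stored on an object.
        formatRow = function report_formatRow(row) {
            output.push(row.join(separator));
        },
        a         = 0,
        b         = rows.length;
    for (a = 0; a < b; a += 1) {
        formatRow(rows[a]);
    }
    return output.join("\n");
};
```

Because formatRow is declared inside report, it sits exactly as high in the scope chain as its single caller requires and no higher.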

The benefits of an implied architecture are that the total size of code is dramatically reduced since many explicit references are no longer necessary and the code is structured. Fully atomic code is easy to quickly gloss over in a cursory glance but is challenging and time consuming to actually analyze because everything is an explicitly named reference that defines how the particles of code come together to form the program. This means the structure of a fully atomic code base is not visually obvious. An implied architecture reliant upon some underlying organizational means is inherently structured, to some degree, and therefore far easier to analyze as a wholly structured unit.

The greatest performance hit in JavaScript with regard to lexical scope is scope depth. References are most quickly resolved if declared locally. References declared one scope higher are a bit slower to resolve, and so forth as the scope chain is crawled. This is partially why I do not use prototypes. Prototypes are not resolved until the entire scope chain is first crawled, which makes prototypes more expensive to access than any variable. The benefit of prototypes is that they are a memory conserving feature much like classes in C++ or Java. Memory conservation is rarely beneficial in JavaScript since JavaScript applications are almost never persistent and garbage collection is forced upon the interpreter. This performance hit has the benefit of reinforcing the nature of a well organized structure in that functions should reside high enough in the scope chain to allow reuse where needed, but as low as possible to prevent unnecessary scope traversal.

An additional benefit in using a purely lexical based structure for application architecture is that this is how HTML and XML work. These markup languages have absolutely no atomicity. They have a single root tag and everything within fits into some kind of defined and well regulated structure, which means the user does not have explicit control either. More important still is that XML, and HTML to a lesser extent, are purely lexical. This is certainly provable when declaring a namespace upon a tag and noting the structural access provided to descendant elements of this tag. It is convenient to use a programming approach in application code that closely mirrors the meta data relationships provided to the data instance, such as context. JavaScript is most frequently written to access markup via some API, such as a framework or the DOM. The convenience herein is that the structure of the markup provides insight into access and exploitation of the data from within the application, thereby potentially simplifying application access. Likewise the application can serve as an enabler to understanding how better to extend the markup dynamically by providing a different perspective on structure refinement under similar conventions and formulations.

Libraries

Libraries are best viewed as an atomic component of the code that just happens to reside in some separate location for reuse independent of a particular project or application instance. As a result of this consideration libraries should be applied to a given project exactly the same as any function or object: by assignment to a variable at the appropriate location in the application structure. Where and how to include libraries into an application is the greatest single challenge to writing structured code in JavaScript because JavaScript does not have a native IO or file system component. Fortunately, there are some tools to bridge this gap, such as RequireJS and CommonJS.

Named Patterns

Named patterns are a raging trend in programming and particularly in JavaScript. I do not bother to learn the names of named patterns, because if I did I fear I may be forced to contend with communicating about such foolishness. I believe resources and constraints upon a particular programming task make each task relatively unique and that programming is about efficiently solving problems. If I wanted to become really good at optimizing generic cookie-cutter solutions I would abandon programming and learn manufacturing engineering. Perhaps this madness is better illustrated by context of a tool-building factory factory factory pattern.