Pretty Diff - Guide, Using jsscope to understand scope, inheritance, and scope chains in JavaScript

Introduction

Scope is perhaps one of the most challenging concepts to understand for experienced programmers who are new to JavaScript. This is primarily because JavaScript is an object oriented language that offers poly-instantiation similar to Java or C++, but unlike those languages its scope follows a separate and unrelated model.

Since functional programming has always been natively available in JavaScript, I never find myself needing to use object oriented programming techniques in this language. I am able to program faster by never needing to use constructors, bind, call, apply, or Object.create. I only use the this keyword in callbacks and the new keyword to access global objects that require it. It is possible to program large decomposable applications in JavaScript without OOP, and doing so will increase your programming speed, reduce the complexity of your code, and shrink the lines of code in your applications.

No matter what language you work in, programming in a functional style provides benefits. You should do it whenever it is convenient, and you should think hard about the decision when it isn't convenient. — John Carmack

Key Points

Functions provide scope.
Closure is crossing scope boundaries.
Variables are resolved through a succession of scopes called the scope chain.
Where in the scope chain variables are declared structures a functional application.
The Pretty Diff application is an example of functional programming and decomposition.
Leaky functions allow for manually controlled side effects.
ECMAScript 6 will introduce block scope with the let keyword.

JSScope

Each of the following code samples is prepared using the jsscope feature of Pretty Diff's jspretty.js library. This feature beautifies the supplied JavaScript and then outputs the code as formatted HTML. It also colors the background of functions and variables to indicate scope depth, which is explained in further detail below.

The jsscope feature can be accessed in code by providing a value of true to the Pretty Diff option jsscope or by selecting the option Scope analysis output from the webtool. It is also possible to link directly to a jsscope operation by providing an address parameter of jsscope and a parameter of s that links to a code sample. Example:

http://prettydiff.com/?m=beautify&s=http://prettydiff.com/lib/jspretty.js&jsscope

Function Scope

Many languages offer block scope. Prior to ECMAScript 6, JavaScript offered only global scope and function scope. I will speak to block scope with ECMAScript 6 much later.

Functions are powerful. They can be treated like dumb object literals and assigned properties. Functions can store instructions, return values, and provide a limited scope to variables. Variables are instantiated with the var keyword.

A simple code sample showing a single function containing a single variable and a global variable.

var globalNumber = 10,
    outer = function () {
        var a = 20;
    };

In the above code sample we can see a function assigned to the reference outer in the global scope and containing a single variable named a. We can also see a second global variable named globalNumber. Variable a resides in the scope provided by its containing function, outer.

Functions are private spaces, somewhat like a top secret military base. You can see out to the rest of the world from inside, but the outside cannot peer inside. In the case of our code example the global scope cannot see inside the function's scope, which means variable a does not exist in the global scope where globalNumber is declared. Since variable a does not exist in the global space it cannot be used in this area for any operation. It simply doesn't exist and cannot be found.
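A minimal sketch of this one-way visibility, reusing the names from the sample above. The result variable and the try/catch wrapper are illustrative additions and are not part of the original sample.

```javascript
var globalNumber = 10,
    outer = function () {
        var a = 20;
    },
    result = "";

// typeof is safe to use on a name that was never declared in this scope
result = typeof a; // a only exists inside outer

try {
    // reading an undeclared name directly throws a ReferenceError
    globalNumber = globalNumber + a;
} catch (error) {
    result = result + " / " + error.name;
}
// result is now "undefined / ReferenceError" and globalNumber is still 10
```

The failed addition never completes, so globalNumber keeps its original value.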

The scope of a function can see outside. We know that variable a cannot be accessed in the lower function scope from global, so instead let's think about this in the opposite direction. Because the scope of a function can see outside it can access the global scope. The variables a and globalNumber can be accessed together to perform an addition operation that provides a value of 30, but only from inside the function.

A simple code example of closure.

var globalNumber = 10,
    outer = function () {
        var a = 20;
        return a + globalNumber;
    };
outer(); //returns 30

Closure

Most simply speaking, closure is the process of crossing a scope boundary to access a resource. In the second code sample we had to leave the local function scope to find and access the variable globalNumber. In computer science terms this idea of scope compartmentalization is called lexical scope. To understand why this concept is so frustrating to experienced programmers we need to examine a more complex code sample.

A third code sample showing a function nested inside the outer function.

var globalNumber = 10,
    outer = function () {
        var a = 20,
            inner = function () {
                var b = 30;
                return a + b + globalNumber;
            };
        return inner();
    };
outer(); //returns 60

In the third code sample we can see three scopes. All the discussed logic about private and public access still applies. The scope of a function can see out, but things cannot see inside.

When writing object oriented programming in a language like Java public and private states of objects must be manually declared by the code author. In the case of this sample code there are private and public areas, but these states are automatically applied by the nature of functions and their position relative to each other.

Object oriented programming is powerful because it allows a code author to define an object and conditionally extend that object at a later time only when needed. This means there must be some high level generic object that gradually becomes more specific as children are attached. This is a top down model of organization where vague definitions start in a high level and expect to be consumed by lower level code. This also means the consuming code must specify the object it wishes to extend, which provides some manual effort on behalf of the code author and allows tighter control of code behavior.

The functional approach is often challenging for programmers experienced with object oriented programming. So much of the manual linking, referencing, and extending are instantly irrelevant as they are automatically known as a component of the code architecture. While OOP is a top down model of programming the functional approach is very much a bottom up model. This bottom up model is called the scope chain.

The functional programming paradigm is trending up because it requires less effort on the part of the programmer and in some cases executes much faster. Since functional programming is architecturally based instead of reference based, it reduces risk and maintenance in the code substantially. It's not completely wonderful though: applications written with the functional programming model tend to consume far more memory.

Scope Chain

The scope chain is the process a language uses to resolve non-local references in the lexical inheritance model of programming. Every time JavaScript encounters a variable reference it attempts to resolve this reference in the local scope. If the reference is declared in the local function then it is easy to find and the program knows the scope to which the reference is bound. Looking at the third code sample, variable globalNumber is used in the scope of function inner, but is not declared there. To resolve this reference the application must step into the containing scope, function outer, to find the reference. It still cannot be found, so the application continues looking into the next higher scope, global. Now the reference can be resolved.
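The lookup described above can be sketched as follows. The names value, other, fromInner, and fromOther are hypothetical. Reusing one name in two scopes is not discussed above; it is included only to show that resolution stops at the first scope where the name is declared.

```javascript
var value = "global",
    outer = function () {
        var value = "outer",
            inner = function () {
                // inner declares no value of its own, so the lookup
                // steps up one scope and stops at outer's declaration
                return value;
            };
        return inner();
    },
    other = function () {
        // no local declaration and no containing function,
        // so the lookup continues up to the global scope
        return value;
    };

var fromInner = outer(), // "outer"
    fromOther = other(); // "global"
```

The same name resolves differently depending only on where the function is written, not on where it is called.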

This process of gradually stepping up the higher scopes is JavaScript's primary scope chain. A second hidden scope chain exists for prototypes. In earlier versions of the language prototypes were always faster to access, because even though the prototype resolution chain is not accessed until the primary scope chain is exhausted their existence was always cached in memory, much like classes in Java and C++. In modern execution of the language all references are cached so the only observable benefit to continued use of prototypes is to allow object oriented programming styles instead of purely functional programming styles.
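A small sketch of prototype resolution, assuming the second chain refers to how property lookups on an object fall through to its prototype. The object names are hypothetical, and Object.create is used here only to build the example, even though it is avoided elsewhere in this guide.

```javascript
var base = { shared: "from base" },
    child = Object.create(base),
    found = "",
    owned = false;

child.local = "from child";

// shared is not an own property of child, so the lookup
// continues to the prototype object, base
found = child.shared; // "from base"

// confirms the value was found on the prototype, not on child itself
owned = Object.prototype.hasOwnProperty.call(child, "shared"); // false
```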

Structure and Code Reuse

To reduce clutter in your application, references should always be defined as locally as possible, which means in the lowest scope available. If a reference is defined locally in a function, but needs to be accessed from a function not available to the current scope, then move the reference declaration into a higher scope to enable access. By balancing the need for access against the need to localize references, the placement of variables in your application will be natural without extensive manual effort.

I commonly use arrays as closures. This is particularly helpful when writing language parsers. An array stores all the data about a code sample into fragments I call tokens. A variety of analysis needs to be performed against this data. By declaring the array high in a parsing library it can be accessed by a variety of functions each containing a different type of analysis. This is helpful because I can access the array regardless of scope depth no differently than a local variable and the changes made to the array are made in the scope where the array is declared. There are no objects to extend and no instructions to control reuse or access.
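The shared array idea might look something like the sketch below. This is not code from Pretty Diff: the parser function, its whitespace tokenizer, and the two analysis functions are hypothetical stand-ins chosen to keep the example short.

```javascript
var parser = function (source) {
    // tokens is declared once, high in the library, so every
    // function below reaches it through closure
    var tokens = [],
        tokenize = function () {
            var parts = source.split(/\s+/),
                index = 0;
            for (index = 0; index < parts.length; index += 1) {
                if (parts[index] !== "") {
                    tokens.push(parts[index]);
                }
            }
        },
        countWords = function () {
            var index = 0,
                total = 0;
            for (index = 0; index < tokens.length; index += 1) {
                if (/^[A-Za-z]+$/.test(tokens[index])) {
                    total += 1;
                }
            }
            return total;
        },
        uppercaseWords = function () {
            var index = 0;
            for (index = 0; index < tokens.length; index += 1) {
                // changes land in the array declared in parser's scope
                tokens[index] = tokens[index].toUpperCase();
            }
        };
    tokenize();
    uppercaseWords();
    return {
        tokens: tokens,
        words: countWords()
    };
};

var parsed = parser("var a = 20 ;");
// parsed.tokens is ["VAR", "A", "=", "20", ";"] and parsed.words is 2
```

None of the three inner functions receives the array as an argument or returns it; they all read and write the single declaration in the containing scope.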

Using Pretty Diff as an Example

Even after the concepts of functional programming are taught they tend to remain challenging to experienced programmers from an architectural perspective. If programmers are formally educated to learn programming in the context of object oriented code then application architecture is known primarily in that fashion. This frustration is often evident when a programmer attempts to learn this model of programming in the context of existing models that are more emotionally comforting, which results in working harder than necessary and understanding less than expected. Functional programming is actually massively easier to understand with regards to code architecture because it is rather primitive and the code architecture handles all the challenging concepts automatically, but as a programmer you must be willing to concede that it is okay to give up absolute control of where references are consumed.

The Pretty Diff application is written in a purely functional form. I will use this as an example of functional programming architecture. I recommend opening the prettydiff.js file in a code editor that allows code folding on blocks and functions, because the application file is rather large. After the file contents are in your code editor fold absolutely everything so that you can see code that looks like less than 200 lines.

The Pretty Diff application is a single function named prettydiff. When I write code in JavaScript I always declare my variables at the very top of their function. This is not required, but it makes the code much easier to understand at a quick glance. So, let's examine the variables at the top of the prettydiff function.

Code sample exemplifying the variables declared in the outermost scope of the Pretty Diff application.

var prettydiff = function prettydiff(api) {
    "use strict";
    var startTime = (function startTime() {
            var date = new Date(),
                time = date.getTime();
            return time;
        }()),
        jsxstatus = false,
        summary = "",
        charDecoder = function init_charDecoder() {
            return;
        },
        csspretty = function init_csspretty() {
            return;
        },
        csvbeauty = function init_csvbeauty() {
            return;
        },
        csvmin = function init_csvmin() {
            return;
        },
        diffview = function init_diffview() {
            return;
        },
        jspretty = function init_jspretty() {
            return;
        },
        markupmin = function init_markupmin() {
            return;
        },
        markup_beauty = function init_markup_beauty() {
            return;
        },
        core = function core(api) {

We can see a couple of variables defined as literals at the top, but absolutely everything else appears to be empty functions. I do this to conform to rules on JSLint, which expects a reference to be declared before it is used and several of my massively monolithic libraries reference each other. To summarize I have a single function named prettydiff that contains only a few variables of which most are massively self-contained libraries.
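The excerpt does not show where the placeholders are replaced, so the sketch below assumes they are reassigned later in the same function. The names app, isEven, and isOdd are hypothetical; they stand in for two libraries that reference each other.

```javascript
var app = function (input) {
    "use strict";
    // placeholders: both names exist before either body mentions the other
    var isEven = function init_isEven() {
            return;
        },
        isOdd = function init_isOdd() {
            return;
        };
    // the real definitions replace the placeholders (assumed step)
    isEven = function isEven(n) {
        return (n === 0) ? true : isOdd(n - 1);
    };
    isOdd = function isOdd(n) {
        return (n === 0) ? false : isEven(n - 1);
    };
    return isEven(input);
};

var fourIsEven = app(4),  // true
    threeIsEven = app(3); // false
```

Because both references are declared at the top, each real definition can mention the other without using a name before it is declared.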

I have written the code in a way that the libraries can be completely independent of the larger Pretty Diff application. The csspretty library is being used in the atom-beautify project, for instance. This ability to instantly pull a function out of a larger project and execute it independently exemplifies the decomposable nature of functional programming. There is no ceremony or boilerplate that must go with code when a function is subdivided into an independent application, but there must be some internalized planning in the function for it to receive input independent of its containing application.

Each of the libraries is divided into a few large functions that act as primary sub-components. These sub-components then contain functions of their own, and so on.

Leaky Functions

Through a clever use of closure it is possible to easily cheat the public/private nature of functional programming. This is what I call leaky functions. Look back at the code sample from the Pretty Diff application and notice the variable summary. This variable is declared outside of the libraries. When used inside the library functions it has access to all of the logic and references that are otherwise hidden in those functions. I use this variable to fully access the library code, perform some deep analysis of how code is parsed, and then leak a report of my analysis. Because of the leaky function concept I can cheat the bottom-up model of functional programming to access the logic that I need and only when I need it. This demonstrates the simple and expressive power of closures.
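A reduced sketch of a leaky function follows. Only the variable name summary comes from the Pretty Diff sample; application, library, and the word-counting analysis are hypothetical and much simpler than the real report.

```javascript
var application = function (source) {
    var summary = "",
        library = function (input) {
            // private to library: nothing outside can read these
            var words = input.split(" "),
                longest = "",
                index = 0;
            for (index = 0; index < words.length; index += 1) {
                if (words[index].length > longest.length) {
                    longest = words[index];
                }
            }
            // the leak: a report built from private data is written
            // into the variable declared in the containing scope
            summary = words.length + " words, longest is " + longest;
            return input.toUpperCase();
        },
        output = library(source);
    return {
        output: output,
        summary: summary
    };
};

var report = application("scope chains resolve references");
// report.summary is "4 words, longest is references"
```

The library still returns only its normal output. The report escapes as a side effect, and only because summary was deliberately declared one scope higher.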

ECMAScript 6 and let

The next version of JavaScript called ECMAScript 6 will feature a new way to declare references with the let keyword. The var keyword is limited to function scope, but the let keyword will enable block scope. A block of code is generally anything wrapped in curly braces: loops, conditions, and of course functions. References declared with the let keyword will still follow the bottom-up lexical model that the var keyword follows.
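The difference between the two keywords can be sketched as below, assuming an engine that already supports let. The function and variable names are hypothetical.

```javascript
var scopeDemo = function () {
    "use strict";
    var types = [];
    if (types.length === 0) {
        var functionScoped = "var";
        let blockScoped = "let";
        // inside the block both references exist
        types.push(typeof blockScoped);
    }
    // var ignores the block, so the reference is still available here
    types.push(typeof functionScoped);
    // let ended with the block, so the name no longer exists here
    types.push(typeof blockScoped);
    return types;
};

var scopeTypes = scopeDemo();
// scopeTypes is ["string", "string", "undefined"]
```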

The benefit is that references of extremely limited use can be declared where they are used instead of needing a function. This benefit does not conflict with anything previously discussed about functional programming, which also means the let keyword can effectively replace all uses of the var keyword. ECMAScript 6 is not final at the time of this writing, and I am waiting to see how its use alters my perceptions of functional programming in JavaScript.

Faulty Criticisms

These are actual criticisms to this approach I have heard. I wish I were making this up.

"This style of coding will produce a big messy file." This style of coding may result in a large code file if the given application is large, but large and messy are entirely unrelated, just as with literature and novels. This approach to writing code will likely help to make code less messy, because the declarations are bound to the architecture that defines the depth of scope for the application. Most JavaScript applications tend to become large messy files as the result of build systems and application pre-processors, like Grunt, Gulp, or Browserify. The functional approach commonly results in substantially less code since there are fewer declarations and fewer logical conventions in the code.
"Everything can access the same variables, which will be a conflict of too many things accessing the variables at once." JavaScript execution is so far always synchronous and single threaded. Without events and other externalizing APIs it is impossible for collisions to occur from different things attempting to access a variable at the same time. Even with events and externalizing APIs this is incredibly unlikely to occur and even more unlikely to cause a problem.
"This is not the computer science definition of inheritance." Inheritance is an English language word that has existed since Middle English, when the language was trying to merge higher French vocabulary with the common Germanic vocabulary regularly in use. All uses of this word in programming subscribe to the proper dictionary definition. Perhaps this statement was meant to describe this use of inheritance as foreign to formalized concepts of object oriented programming, which assumes one believes this is the limit of computer science education.

If you believe anything in this article is in error or need of improvement please open a bug. I have thick skin and appreciate a healthy dose of criticism.