Pretty Diff - Documentation

Explore some samples. For any questions, comments, requests, or feedback please join the Pretty Diff mailing list or chat on Gitter. Find Pretty Diff on GitHub.

About This Application

Introduction

This tool was originally created to compare minified code by attaching a beautifier and minifier to a file comparison tool. Over the years it has grown into a set of custom language parsers capable of a variety of language analysis. This application is 100% vanilla JavaScript and is API independent.

License

@source http://prettydiff.com/prettydiff.js

@documentation - English http://prettydiff.com/documentation.php

@licstart The following is the entire license notice for Pretty Diff.

This code may not be used or redistributed unless the following conditions are met:

If each and all these conditions are met use, extension, alteration, and redistribution of Pretty Diff and its required assets is unlimited and free without author permission.

@licend The above is the entire license notice for Pretty Diff.

Informational Guides

  1. Using jsscope to understand scope, inheritance, and scope chains in JavaScript
  2. Ignoring specified tags from markup beautification
  3. So its kind of like recursive command line diff, but in JavaScript
  4. Processing the JSX format from Facebook's React
  5. Saving colorful JavaScript code samples
  6. Conforming to popular style guides with the styleguide option
  7. What Pretty Diff can do to auto-correct some sloppiness in JavaScript.
  8. Brief overview of the prettydiff.js code architecture

Unrelated Guides

  1. The DOM Explained, Quick and Simple
  2. A/B Testing for the Web
  3. Explaining Closure To A Child

Known Issues

  1. Webkit browsers, such as Google Chrome and Apple Safari, do not support beautification of white space characters or layout in textarea elements. As a result the web interface offers a slightly degraded experience for these browsers in order to prevent corruption of output. Please see these bugs for more information: 51168 and 90739.
  2. Markup beautification will output flawed data if less-than characters, "<", are included in sections of content and are not escaped or wrapped in script/style tags. This error occurs regardless of whether the less-than characters embedded within content are quoted.

Execution

Web Tool URI Parameters

  1. c - This parameter receives the name of a supported color scheme.
  2. d - This parameter receives a URI as a value that points to the diff code source. If the value of this URI contains ampersand characters, &, or question mark characters, ?, please escape the ampersand characters to %26 and the question mark characters to %3F.
  3. jscorrect - The presence of this parameter sets the jscorrect parameter to boolean "true". No value is required.
  4. jsscope - This parameter forcefully applies the jsscope feature of the JSPretty library and does not require a value.
  5. l - This parameter receives a value of markup, html, auto, javascript, js, css, csv, or text. This bypasses all other language settings and determinations thereby forcefully applying the language against the supplied value to this parameter. The value of "html" is identical to the value "markup" except that it forces the option "Presume SGML type HTML" for all modes while "markup" unsets this option. The values "javascript" and "js" are treated equally.
  6. m - This parameter receives a value of beautify, minify, or diff. This parameter sets the mode of the tool.
  7. s - This parameter receives a URI as a value that points to a code source. If the value of this URI contains ampersand characters, &, or question mark characters, ?, please escape the ampersand characters to %26 and the question mark characters to %3F. The tool executes this code automatically on page load for beautify and minify modes, but for diff mode only if a source is also provided with the d parameter.
  8. codemirror - If this parameter is present with a value of false the CodeMirror application library will be discarded in favor of HTML textarea elements.

The parameters are optional and are provided solely for portability. The parameters may occur in any order. Examples:

  1. http://prettydiff.com/?l=html&s=http://google.com/&m=beautify
  2. http://prettydiff.com/?s=http://www.amazon.com/Definitive-XML-Schema-Priscilla-Walmsley/dp/0130655678/ref=sr_1_1%3Fie=UTF8%26qid=1312890971%26sr=8-1&html&m=beautify
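
The escaping rule for the s and d values can be sketched in a few lines of JavaScript. This is only an illustration: the helper names escapeSourceUri and buildToolUrl and the example.com address are assumptions and are not part of Pretty Diff.

```javascript
// Hypothetical helper, not part of Pretty Diff: escapes the two characters
// the documentation asks to be escaped inside the s and d values.
function escapeSourceUri(uri) {
    return uri.replace(/&/g, "%26").replace(/\?/g, "%3F");
}

// Builds a web tool address from a plain object of parameters.
// A value of true emits the bare parameter name (for example jsscope).
function buildToolUrl(params) {
    var pairs = Object.keys(params).map(function (key) {
        var value = params[key];
        if (value === true) {
            return key;
        }
        if (key === "s" || key === "d") {
            value = escapeSourceUri(String(value));
        }
        return key + "=" + value;
    });
    return "http://prettydiff.com/?" + pairs.join("&");
}

var url = buildToolUrl({
    m: "beautify",
    l: "javascript",
    s: "http://example.com/app.js?v=2&min=1",
    jsscope: true
});
console.log(url);
```

Only the values of s and d are escaped, so the ampersands that separate the tool's own parameters stay intact.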

Pretty Diff Function

Overview

Pretty Diff is an application written entirely in JavaScript and expressed as a single function named 'prettydiff()'. This application was originally written as a means to algorithmically diff two similar pieces of code regardless of minification and other white space differences. The result is a fast diff engine offering many options and access to the world's most advanced markup beautification algorithm.

While the Pretty Diff application is expressed as a single function it contains a few libraries. These libraries are a diff engine and a few language parsers for providing beautification and minification. The libraries can be used independently of Pretty Diff. Browse the libraries in the local lib directory.

The Pretty Diff application is completely environment agnostic. It can run on the command line, web browser, or any other environment. All that is required is an appropriate API. Feel free to write your own or use the ones provided in the local API directory.

There is one exception in the Pretty Diff application to complete environmental isolation. At this time the charDecoder.js code is entirely reliant upon DOM access. This library transforms character entity references into literal characters and vice versa. Pretty Diff does not contain a Unicode character map, so this library is reliant upon execution in a web browser, but it does degrade gracefully in other environments.

The Pretty Diff function receives input from a single argument. This argument is a really big object literal that specifies the code to process and options on how to process it. Read about the various options in the Pretty Diff API section of this document.

Pretty Diff API (Options)

  1. api

    • Description
      Used internally to decide if some small JavaScript functions need to be included with report output based upon the operating environment.
  2. braceline

    • Description
      In JavaScript a new line is inserted after opening curly braces and before closing curly braces.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Brace Lines (JavaScript Only)
  3. bracepadding

    • Description
      Inserts a space after the start of a container and before the end of the container in JavaScript if the contents of that container are not indented, such as conditions, function arguments, and escaped sequences of template strings.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Brace Padding (JavaScript Only)
  4. braces

    • Description
      Sets the style of indentation during JavaScript beautification. The default value "knr" sets a JSLint compliant beautification scheme and the other value "allman" puts opening curly braces on their own line.
    • Type
      string
    • Accepts
      knr, allman
    • Default
      knr
    • As labeled in the HTML tool
      Style of Indent (JavaScript Only)
  5. color (node-local.js only)

    • Description
      Specifies which color scheme to apply to the output file.
    • Type
      string
    • Default
      white
    • Accepted values
      canvas, default, shadow, white
  6. comments

    • Description
      Determines whether comments should be indented. This property is only used in beautification mode.
    • Type
      string
    • Accepted values
      indent, noindent
    • Default
      indent
    • As labeled in the HTML tool
      Indent Comments
  7. conditional

    • Description
      Retain Internet Explorer conditional HTML comments during minification of HTML.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      IE Comments (HTML Only)
  8. content

    • Description
      Determines if string literals in JavaScript and content in markup should be normalized to a literal value of text prior to a diff operation. This property is only used if mode is set to diff.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Ignore Content (Markup / JavaScript)
  9. context

    • Description
      In the diff mode a numeric value sets the number of matching (equivalent) lines to precede and follow each line containing a difference to provide code context. An empty or non-numeric value returns a diff report with all lines of code.
    • Type
      string or number
    • Default
      no value (empty string)
    • As labeled in the HTML tool
      Context size (optional)
  10. correct

    • Description
      A limited attempt to automatically correct certain stylistic code problems that JSLint complains about. This option will insert missing semicolons, insert missing curly braces, convert some instances of "--" and "++" operators to "-=" and "+=" respectively, and convert "new Object()" and "new Array()" into "{}" and "[]" respectively.
    • Type
      boolean
    • Default
      false
    • Informational Guide
      jscorrect option
    • As labeled in the HTML tool
      Fix Sloppy Code (JavaScript Only)
  11. csvchar

    • Description
      Stores the string value used as a data separator for the "csv" language. Any string is accepted, but this property is ignored unless the value of the lang property is "csv".
    • Type
      string
    • Default
      , (comma)
    • As labeled in the HTML tool
      Character separator
  12. diff

    • Description
      Code sample to compare the source code sample against. This property is required when mode is set to diff, but is otherwise ignored.
    • Type
      string
    • Default
      none
    • As labeled in the HTML tool
      New Text
  13. diffcli

    • Description
      An option only available from the node-local.js API file for Node.js. This option will output a list of differences in color to the console. If more than one file is compared it will indicate which files are deleted or new. If the output option is not specified, node-local.js will convert the diffcli option to a value of true. If the value of diffcli is true and the context option is omitted, the context option will be given a value of 2. For additional information please read the diffcli guide.
    • Type
      boolean
    • Default
      false
    • Informational Guide
      Using diffcli
    • As labeled in the HTML tool
      not in the HTML tool
  14. diffcomments

    • Description
      Retain code comments so that code and comments can be compared by the diff process.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Code Comments
  15. difflabel

    • Description
      Sets a label describing the value of diff code sample.
    • Type
      string
    • Default
      new
    • As labeled in the HTML tool
      New label (optional)
  16. diffview

    • Description
      Determines if the diff report should be expressed as a side-by-side comparison or a single column inline view.
    • Type
      string
    • Accepted values
      sidebyside, inline
    • Default
      sidebyside
    • As labeled in the HTML tool
      Diff View Type
  17. elseline

    • Description
      If the "else" keyword should be pushed onto a new line in JavaScript beautification.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Else on New Line (JavaScript Only)
  18. force_indent

    • Description
      Allows every piece of code and content in a markup language to be indented without regard for the creation of white space tokens or code semantics.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Force Indentation (Markup Only)
  19. help (node-local.js only)

    • Description
      Displays documentation to the command line or console.
  20. html

    • Description
      Forces markup code to be interpreted as HTML. HTML mode identifies certain tags as singletons by tag name even if they are not closed with "/>" syntax. It also ignores beautification on "<pre>" elements and tolerates "<li>" elements that do not have a closing "</li>" tag. This option will be automatically assigned a value of true if the lang option is provided a value of "auto" and the code can be identified as HTML.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Presume HTML (Markup Only)
  21. inchar

    • Description
      Stores the character literal used for an indentation. A single indentation is the result of this value repeated the number of times specified in the insize option.
    • Type
      string
    • Default
      (a single space)
    • As labeled in the HTML tool
      Indentation character
  22. inlevel

    • Description
      Pads JavaScript and markup beautification with additional indentation. Useful when submitting code to a Markdown format that identifies code by a padding of 4 spaces on each code line.
    • Type
      number
    • Default
      0
    • As labeled in the HTML tool
      Code padding (Markup / JavaScript)
  23. insize

    • Description
      Stores the number of times the inchar value must repeat to comprise a single indentation.
    • Type
      number
    • Default
      4
    • As labeled in the HTML tool
      Indentation size
  24. jsscope

    • Description
      Produce HTML output for JavaScript beautification that colors variables based upon their scope of declaration. Using colors this feature highlights inheritance, scope depth, and closure. The value "none" turns this feature off. The value "html" creates a formatted HTML code sample for displaying HTML code samples on web pages. The value "report" creates the code for a complete HTML file. For additional information please read the jsscope guide and the jshtml guide.
    • Type
      string
    • Accepted values
      none, html, report
    • Default
      none
    • Informational Guide
      jsscope features and saving colorful code as HTML
    • As labeled in the HTML tool
      Scope Analysis (JavaScript Only)
  25. lang

    • Description
      Tells the diff program which language it is receiving. The value "auto" allows the application to determine between CSS, JavaScript, and Markup without human effort. If the auto value cannot determine the language it will default to a value of text if the mode is diff or it will default to a value of JavaScript for other modes.
    • Type
      string
    • Accepted values
      auto, css, csv, javascript, markup, text
    • Default
      auto
    • As labeled in the HTML tool
      Code type
  26. langdefault

    • Description
      If the lang option is set to a value of "auto" the value of langdefault determines what the default language should be in the case where a language cannot be detected from the code sample.
    • Type
      string
    • Accepted values
      css, csv, javascript, markup, text
    • Default
      javascript (dom.js - HTML tool), text (node-local.js and prettydiff.wsf)
    • As labeled in the HTML tool
      Auto detect default
  27. mode

    • Description
      The operation to be performed.
    • Type
      string
    • Accepted values
      beautify, diff, minify, parse
    • Default
      diff
    • As labeled in the HTML tool
      Function
  28. obfuscate

    • Description
      Converts reference names into smaller names during JavaScript minification.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Obfuscation (JavaScript)
  29. objsort

    • Description
      Sorts properties of objects in JavaScript and/or CSS. The accepted values determine to which language this option should be applied.
    • Type
      string
    • Accepted values
      all, css, js, none
    • Default
      js
    • As labeled in the HTML tool
      Property Sorting (CSS / JavaScript)
  30. output (node-local.js only)

    • Description
      Determines the location of where files should be saved.
    • Type
      string
  31. preserve

    • Description
      Retain empty lines in either JavaScript or CSS like languages. Consecutive empty lines will be converted to a single empty line.
    • Type
      string
    • Accepted values
      all, css, js, none
    • Default
      js
    • As labeled in the HTML tool
      Empty Lines (CSS / JavaScript)
  32. quote

    • Description
      A diff-only, language-independent option to normalize single quote characters to double quote characters.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Diff Quotes
  33. quoteconvert

    • Description
      Convert the quote characters delimiting strings from either double or single quotes to the other. Applies to JavaScript and CSS and to attributes in markup.
    • Type
      string
    • Accepted values
      double, single, none
    • Default
      none
    • As labeled in the HTML tool
      Quotes (Markup / JavaScript)
  34. readmethod (node-local.js only)

    • Description
      Determines how input should be received. The value auto changes to directory, file, or screen depending on the source type. The value directory will read all files from a single directory. The value file reads the contents of a single file. The value filescreen reads the contents of a file but outputs the result to the console. The value screen will look to the console for code input and outputs to the console. The value subdirectory will recursively read files in subdirectories.
    • Type
      string
    • Default
      screen
    • Accepted values
      auto, directory, file, filescreen, screen, subdirectory
  35. report (node-local.js only)

    • Description
      Determines if a meta data report should be generated.
    • Type
      boolean
    • Default
      true
  36. semicolon

    • Description
      If semicolon characters at the end of a line should be removed prior to a diff operation.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Trailing Semicolons
  37. source

    • Description
      A code sample to operate upon.
    • Type
      string
    • Default
      none
    • As labeled in the HTML tool
      Base Text (diff), Beautification input (beauty), Minification input (minify)
  38. sourcelabel

    • Description
      Sets a label describing the value of source property in the diff report.
    • Type
      string
    • Default
      Base
    • As labeled in the HTML tool
      Base label (optional)
  39. space

    • Description
      Inserts a space following a function keyword for anonymous functions in JavaScript beautification.
    • Type
      boolean
    • Default
      true
    • As labeled in the HTML tool
      Function Space (JavaScript Only)
  40. style

    • Description
      Whether CSS and JavaScript code should be indented according to the surrounding markup or if they should be indented starting from 0. This property is only applied to markup code containing CSS and JavaScript.
    • Type
      boolean
    • Default
      true
    • As labeled in the HTML tool
      Indent Style/Script (Markup Only)
  41. styleguide

    • Description
      Provides a packaged set of option configurations for JavaScript interpretation to more closely conform to popular style guides. In the face of a conflict between something configured to the value of a styleguide option setting and a separately specified option the styleguide overrides. For additional information please see the styleguide guide.
    • Type
      string
    • Accepted values
      airbnb, crockford, google, grunt, jquery, mediawiki, yandex, none
    • Default
      none
    • Informational Guide
      styleguide definitions
    • As labeled in the HTML tool
      Style Guide (JavaScript Only)
  42. titanium

    • Description
      If the JavaScript parser should parse Titanium Style Sheets instead of JavaScript. This option may be set explicitly, but is primarily used internally from language detection or if the value of the lang option is "tss".
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      (absent)
  43. topcoms

    • Description
      If minification should include into the output all comments at the top of JavaScript or CSS input before any code.
    • Type
      boolean
    • Default
      false
    • As labeled in the HTML tool
      Top Comments (CSS / JavaScript)
  44. varword

    • Description
      If a single var should be used to declare a list of variables or if a var keyword should be used per variable. A value of each will convert comma separated variable lists into separate statements each starting with a var keyword. The value list will convert consecutive variable statements to a comma separated list. The value none omits this option.
    • Type
      string
    • Accepted values
      each, list, none
    • Default
      none
    • As labeled in the HTML tool
      Variable Lists (JavaScript)
  45. vertical

    • Description
      If lists of assignments or properties should be vertically aligned for faster and easier reading. The accepted values determine to which language this option should be applied.
    • Type
      string
    • Accepted values
      all, css, js, none
    • Default
      js
    • As labeled in the HTML tool
      Vertically Align (CSS / JavaScript)
  46. wrap

    • Description
      In a markup document this option sets how many columns wide text content may be before wrapping onto a new line. In JavaScript this option determines the maximum length of a string literal and line comment before being broken into + separated fragments. The value 0 disables text wrapping. A negative value combines + separated string literals into a single string.
    • Type
      number
    • Default
      80
    • As labeled in the HTML tool
      Wrap text (JavaScript, Markup)
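
As a rough sketch of how the options above fit together, the following JavaScript merges a caller's options over a handful of the documented defaults and derives one indentation unit from inchar and insize. The names documentedDefaults, withDefaults, and indentUnit are hypothetical and are not part of Pretty Diff; only the default values come from the list above.

```javascript
// Defaults copied from the option list in this document. The object name
// and the helper functions are hypothetical and only illustrate the shape.
var documentedDefaults = {
    mode: "diff",
    lang: "auto",
    inchar: " ",
    insize: 4,
    inlevel: 0,
    wrap: 80,
    braces: "knr",
    comments: "indent",
    diffview: "sidebyside",
    quoteconvert: "none",
    styleguide: "none",
    varword: "none"
};

// Returns a new options object: caller values win over documented defaults.
function withDefaults(options) {
    var merged = {};
    Object.keys(documentedDefaults).forEach(function (key) {
        merged[key] = documentedDefaults[key];
    });
    Object.keys(options).forEach(function (key) {
        merged[key] = options[key];
    });
    return merged;
}

// One indentation is inchar repeated insize times.
function indentUnit(options) {
    return new Array(options.insize + 1).join(options.inchar);
}

var options = withDefaults({
    source: "var a=1;",
    mode: "beautify",
    lang: "javascript",
    inchar: "\t",
    insize: 1
});
console.log(JSON.stringify(indentUnit(options)));
```

An object shaped like options is what would be handed to the prettydiff function as its single argument.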

Practices of Pretty Diff

Pretty Diff hopes to encourage conventions of efficiency, but not at cost to recursion, regression testing, or altered functionality. For example consider the situation of minifying markup. In a typical scenario the practice of code minification is to remove all code comments and all white space characters not absolutely necessary for syntax interpretation. If the most impactful form of minification is exercised upon markup the functionality of the code is certainly changed. Consider these two examples:

  1. <p>This is a paragraph with a text field. <input type="text"/></p>
  2. <p>This is a paragraph with a text field.<input type="text"/></p>

The difference between the two examples above is a single space character between the period and the input tag. In markup, white space characters are tokenized by default when the code is parsed for output, and rarely is this default challenged. Tokenized white space means each white space character is converted to a space character and then sequential space characters are collapsed to a single space character. This means the presence of some white space characters is completely trivial, while the presence of others is not. A single space character separating words of content is not trivial if it is an isolated space. In the above sample the space separating the input tag from the content is also not trivial since it alters how tokenized content is interpreted.
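
A minimal sketch of this behavior in JavaScript, assuming a single regular expression is an adequate stand-in for how a renderer tokenizes white space in content. The function name tokenizeWhiteSpace is hypothetical.

```javascript
// Collapses every run of white space into one space, roughly how rendered
// markup content treats it. A simplification for illustration only.
function tokenizeWhiteSpace(text) {
    return text.replace(/\s+/g, " ");
}

var withSpace = "<p>This is a paragraph with a text field. <input type=\"text\"/></p>";
var withoutSpace = "<p>This is a paragraph with a text field.<input type=\"text\"/></p>";
var padded = "<p>This is a paragraph with a text field. \n\t  <input type=\"text\"/></p>";

// Extra white space in a run is trivial: it collapses to the same result.
console.log(tokenizeWhiteSpace(padded) === withSpace); // true
// The isolated space is not trivial: removing it changes the result.
console.log(tokenizeWhiteSpace(withSpace) === tokenizeWhiteSpace(withoutSpace)); // false
```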

If markup code were fully minified then all white space characters outside of syntax containers, such as tags, would be removed, thereby making the content illegible. Markup can still be correctly minified, but only when rendering of tokenized white space is fully considered. The opposite of this problem is accidental addition of white space characters from flawed beautification schemes. Consider the following two examples:

  1. <p>This is a statement with a <a href="#">hyperlink</a>.</p>
  2. <p>     This is a statement with a         <a href="#">             hyperlink         </a>     . </p>

The difference between the two prior examples is that the second example introduces white space tokens where none exist in the first example. In the first example there are no characters between the opening <p> tag and the text or between the opening <a> tag and the text, while this is not true of the second example. Therefore these two statements are not similar enough for a logical comparison. A well-crafted beautifier, or pretty printer, takes these differences into account: it may alter the entirety of a code base for easier reading, but not at the cost of changing how the code is parsed by a given interpreter.

Code must never be minified if it cannot be automatically recovered into an easily readable form and must never be beautified if such beautification changes how the code is parsed. This is the importance of regression. The most extreme form of minification is referred to as obfuscation. Obfuscation removes all code comments and all white space characters not absolutely required for syntax compliance, but goes one step further and changes all variable and command names to the fewest available character length. Pretty Diff considers the practice of obfuscation to be harmful as its practice eliminates the possibility of regression. Without the possibility for regression recursive practices are improbable.

An instantiation of a pattern where the pattern's presence is available in the given instance without regard for multiplication is said to be idempotent. A recursive practice is the ability to replicate an action where the replication does not harm the potential for further replication, or the idempotent nature of a pattern, upon or resulting from that action. In the case of Pretty Diff, code that begins unminified should be capable of being minified, beautified, minified again, and so on without harm or difference to the functional integrity of the supplied code. Any process that prevents such recursive practices, such as obfuscation, is harmful and must be avoided.
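
The round trip described here can be sketched with JSON, which is used only as a stand-in because its minified and beautified forms are available through the standard JSON object. The minify and beautify functions below are illustrative and are not Pretty Diff's own.

```javascript
// JSON stands in for a language whose minified and beautified forms can be
// recovered from each other. These helpers are not Pretty Diff functions.
function minify(code) {
    return JSON.stringify(JSON.parse(code));
}
function beautify(code) {
    return JSON.stringify(JSON.parse(code), null, 4);
}

var original = "{ \"name\": \"sample\",\n  \"list\": [1, 2, 3] }";
var small = minify(original);
var cycled = minify(beautify(minify(beautify(original))));

// Repeated cycles end in the same minified text as a single pass.
console.log(cycled === small); // true
// Beautifying already beautified code changes nothing.
console.log(beautify(beautify(original)) === beautify(original)); // true
```

Renaming identifiers, as obfuscation does, would break this property because the original names could not be recovered on the way back.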

Comments are the regression exception. There is no way to efficiently reduce code while retaining comments and documentation. Pretty Diff strongly recommends that documentation be separated from production code into either a redundant development version or into a separated documentation archive so that it can be preserved apart from the production code.

There is one extremely limited exception to functional interference observed by Pretty Diff. The Cascading Style Sheets language provides a syntax and vocabulary that are limited and fully known. Therefore functional changes can be, and are, applied to CSS code during minification because in this one narrow instance there is no harm to regression. Superior minification can be performed by applying minor functional changes to the code which can be easily and intelligently reversed without error or prior knowledge of the code sample.

Option Comment

The Pretty Diff option comment is similar in convention to the JSLint option comment. In the case where multiple Pretty Diff option comments are present in a document only the first will be processed. If in diff mode an option comment is present in the diff code but not in the source code, then the option comment from the diff code will be processed. In order for the option comment to be recognized it must start with /*prettydiff.com and end with */. The options are listed in this comment separated by commas, each as a colon-separated name and value pair. The options match the exact value definitions for the Pretty Diff application properties above, and options that allow arbitrary values must have their values enclosed in either single or double quotes. The options can be listed in any order. The option comment should be separated from other comments to prevent any possibility of corrupted interpretation. These are examples of appropriate option strings:
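
The rules in the paragraph above can be sketched as a small reader for such a comment. Everything here is an assumption built from that prose alone: the sample comment, the function name parseOptionComment, and the way values are converted are illustrative and may not match how Pretty Diff actually reads its option comment.

```javascript
// Hypothetical reader for the first /*prettydiff.com ... */ comment in a
// code sample. Built from the prose description only, not from Pretty Diff.
function parseOptionComment(code) {
    var opener = "/*prettydiff.com",
        start = code.indexOf(opener),
        end = (start < 0) ? -1 : code.indexOf("*/", start),
        options = {},
        // name : "quoted value" | 'quoted value' | bare value
        pair = /([A-Za-z_]+)\s*:\s*("[^"]*"|'[^']*'|[^,\s]+)/g,
        match,
        value;
    if (start < 0 || end < 0) {
        return options;
    }
    code = code.slice(start + opener.length, end);
    while ((match = pair.exec(code)) !== null) {
        value = match[2];
        if (value === "true" || value === "false") {
            value = (value === "true");
        } else if (/^-?\d+$/.test(value)) {
            value = Number(value);
        } else if (/^["']/.test(value)) {
            value = value.slice(1, -1); // strip the surrounding quotes
        }
        options[match[1]] = value;
    }
    return options;
}

// Assumed sample; only the first option comment is read.
var sample = "/*prettydiff.com mode:\"beautify\", insize:2, csvchar:\",\", correct:true */\n" +
    "var a = 1;\n" +
    "/*prettydiff.com mode:\"minify\" */";
console.log(JSON.stringify(parseOptionComment(sample)));
```

Quoted values are matched before bare values so that a quoted comma, as in the csvchar value, is not mistaken for a separator.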

Input and Output

The function outputs an array of two indexes. The first array index is always the processed data and the second array index contains some metadata. In the case of the "beautify" and "minify" operations the first index of the output array is the processed source code as text and the second array index is the code report, as seen generated on the client side tool, formatted as HTML. The diff operation returns an HTML table of the actual diff output as the first array index and some minor metadata about the number of errors in the second array index, with both indexes formatted as HTML and neither comprising a complete HTML document. To form the diff output of the prettydiff function into a single HTML document I supply the following extra code from the various files in the api directory:

  1. output = prettydiff(...);
  2. heading = '<?xml version="1.0" encoding="UTF-8" ?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"><head><title>Pretty Diff</title><link rel="canonical" href="http://prettydiff.com/" type="application/xhtml+xml"/><meta http-equiv="Content-Type" content="application/xhtml+xml;charset=UTF-8"/><meta name="robots" content="index, follow"/><meta name="DC.title" content="Pretty Diff - The difference tool"/><link rel="icon" type="image/x-icon" href="http://prettydiff.com/images/favicon.ico"/><link rel="meta" href="http://prettydiff.com/labels.rdf" type="application/rdf+xml" title="ICRA labels"/><meta http-equiv="pics-Label" content='(pics-1.1 "http://www.icra.org/pics/vocabularyv03/" l gen true for "http://prettydiff.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 1) gen true for "http://www.prettydiff.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 1))'/><meta name="author" content="Austin Cheney"/><meta name="description" content="Pretty Diff tool can minify, beautify, or diff between minified and beautified code.This tool can even beautify and minify HTML."/><meta name="distribution" content="Global"/><meta http-equiv="Page-Enter" content="blendTrans(Duration=0)"/><meta http-equiv="Page-Exit" content="blendTrans(Duration=0)"/><meta http-equiv="content-style-type" content="text/css"/><meta http-equiv="content-script-type" content="text/javascript"/><meta name="google-site-verification" content="qL8AV9yjL2-ZFGV9ey6wU3t7pTZdpD4lIetUSiNen7E"/><link rel="stylesheet" type="text/css" href="http://prettydiff.com/diffview.css" media="all"/></head><body><h1>Pretty Diff - The difference tool</h1><span class="clear"></span><div id="diffoutput">';
  3. return heading + output[1] + "</div>" + output[0] + "</body></html>";
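
The same assembly can be sketched as a reusable function. This is only an illustration: the function name wrapDiffOutput, the shortened heading, and the sample array are assumptions, and the shortened heading omits the stylesheet link that the full heading above includes.

```javascript
// Hypothetical wrapper: joins the two indexes of a diff result into one
// document, report first and diff table second, as in the listing above.
function wrapDiffOutput(output, title) {
    var heading = "<!DOCTYPE html><html><head><title>" + title +
        "</title></head><body><h1>" + title +
        "</h1><div id=\"diffoutput\">";
    return heading + output[1] + "</div>" + output[0] + "</body></html>";
}

// Stand-in for the two-index array a diff operation returns.
var sampleOutput = [
    "<table><tr><td>diff rows</td></tr></table>",
    "<p>1 difference</p>"
];
console.log(wrapDiffOutput(sampleOutput, "Pretty Diff"));
```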

Executing with Node.js

Node.js is a command line run time. Node.js execution could be the result of a one-time execution or part of a scripted automation process. To execute the prettydiff function with Node.js the following code needs to be added after the prettydiff function. Nothing else needs to occur for compatibility or integration with Node.js.

if (typeof exports === "object" && exports !== null) { exports.api = function (x) { "use strict"; return prettydiff(x); }; }

Some Node.js API files are supplied as part of the Pretty Diff project. An actively maintained API for use on the local file system is supplied as api/node-local.js.

Executing with Windows Script Host (WSH)

Windows Script Host allows for a JavaScript run time in Windows environments from a command line with output directly returned to the command line or in a debugger window. To execute JavaScript with WSH a file is needed to supply the JavaScript function call, convert arguments from the command line into a JavaScript compatible format, and request dependencies.

An actively maintained API is provided at api/prettydiff.wsf. A WSF file follows basic XML syntax and may allow multiple operations of different languages to execute in tandem so long as each operation is confined to a job tag. The named elements in the example file are used to intercept arguments supplied via command line.

The example file would be operated for HTML compatibility using the following command: cscript prettydiff.wsf /source:"my_source_file.js" /html:true /mode:"beautify"

Writing the output of a WSH task into a file would require an additional ActiveX instruction in the WSF file or would require the automation of script execution in the context of the PowerShell language.

Beautification

JavaScript

JavaScript beautification uses the jspretty library. JavaScript is beautified in a manner that conforms to the rules of JSHint and JSLint. jspretty counts unnecessary uses of the new keyword in its report summary. A use is considered unnecessary unless it immediately precedes one of these global objects:

"ActiveXObject", "ArrayBuffer", "AudioContext", "Canvas", "CustomAnimation", "DOMParser", "DataView", "Date", "Error", "EvalError", "FadeAnimation", "FileReader", "Flash", "Float32Array", "Float64Array", "FormField", "Frame", "Generator", "HotKey", "Image", "Iterator", "Intl", "Int16Array", "Int32Array", "Int8Array", "InternalError", "Loader", "Map", "MenuItem", "MoveAnimation", "Notification", "ParallelArray", "Point", "Promise", "Proxy", "RangeError", "Rectangle", "ReferenceError", "Reflect", "RegExp", "ResizeAnimation", "RotateAnimation", "SQLite", "ScrollBar", "Set", "Shadow", "StopIteration", "Symbol", "SyntaxError", "Text", "TextArea", "Timer", "TypeError", "URL", "Uint16Array", "Uint32Array", "Uint8Array", "Uint8ClampedArray", "URIError", "WeakMap", "WeakSet", "Web", "Window", "XMLHttpRequest"
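
The counting rule can be sketched roughly as follows. This is not jspretty's actual code: the flat token array is a hypothetical input, the function name is invented for illustration, and the allowed list is abbreviated to a few names from the list above.

```javascript
"use strict";

// Abbreviated copy of the list above. The full list is longer.
var allowedAfterNew = new Set(["Date", "Error", "Map", "Promise", "RegExp", "Set", "XMLHttpRequest"]);

// Counts "new" tokens that are not immediately followed by an allowed name.
// The token array is a hypothetical input, not jspretty's internal data.
function countUnnecessaryNew(tokens) {
    var count = 0;
    tokens.forEach(function (token, index) {
        if (token === "new" && !allowedAfterNew.has(tokens[index + 1])) {
            count += 1;
        }
    });
    return count;
}

// Tokens for: var a = new Date(), b = new Array();
var sample = ["var", "a", "=", "new", "Date", "(", ")", ",", "b", "=", "new", "Array", "(", ")", ";"];
console.log(countUnnecessaryNew(sample)); // 1
```

Only the use before Array is counted, because Array does not appear in the list.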

Summary

An unassigned anonymous function creates the summary report and assigns its output to a variable named summary. Variable summary is not declared within jspretty. It is used as a closure from a higher scope so that it may be externally available and yet still access all the internals of jspretty.

CSS

CSS beautification uses the csspretty library, a minimalist parser that is capable of quick extension and deep analysis. csspretty is written to supply extended support for the conventions of the SCSS and LESS grammars.

CSV

CSV typically stands for comma separated values, but in this tool it stands for character separated values. The csvbeauty library takes a sequence of characters and splits the input upon that supplied sequence onto new lines. Previously existing line breaks, if they were quoted, are converted to a space contained by braces: { }. Unquoted line breaks are converted into two consecutive line breaks. If the final character(s) match the user supplied character sequence, after charDecoder processing, then those characters are converted into {|} so that csvmin will know a character sequence must exist at the extreme end of input. Escaped double quote characters, escaped using the formal CSV method of immediately preceding the character with an extra double quote character, are converted into a single double quote character to improve legibility.

CSV beautification uses charDecoder to decode Unicode character entities. The charDecoder library accepts any combination of HTML decimal Unicode entities and Unicode hexadecimal entities. HTML decimal entities must begin with an ampersand and pound character '&#', be immediately followed by between one and six decimal digits, and be immediately terminated by a semicolon ';'. For example, &#44; satisfies these rules and represents a comma.

The Unicode hexadecimal entities must begin with a lowercase u and plus character 'u+', be immediately followed by a four or five digit hexadecimal value, and be immediately terminated by a plus character. Hexadecimal values shorter than four digits must be padded with as many leading 0 characters as necessary to reach four digits. For example, u+002c+ satisfies these rules and represents a comma.

Please be aware that charDecoder is reliant upon the interpreting application's HTML character rendering engine to map entity values to character maps, which means if the browser does not support the entity supplied the browser will return a generic character marker instead of the intended character. The content will then be separated according to the rendered sequence value, which means a generic character marker will be used in the separation instead of the character referenced by the supplied entity. In summary, if your browser has limited support for Unicode characters you must expect equally limited results when using entity references.
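
The two entity formats described above can be recognized with a pair of regular expressions. The sketch below is not the charDecoder library: it swaps in String.fromCodePoint in place of the browser's rendering engine, and the function name decodeEntities is hypothetical.

```javascript
"use strict";

// Decodes HTML decimal entities (&#NNN;) and Unicode hexadecimal
// entities (u+XXXX+). String.fromCodePoint is used here as a
// substitute for the browser rendering that charDecoder relies upon.
function decodeEntities(input) {
    return input
        // "&#", one to six decimal digits, then ";"
        .replace(/&#(\d{1,6});/g, function (match, decimal) {
            return String.fromCodePoint(Number(decimal));
        })
        // "u+", four or five hexadecimal digits, then "+"
        .replace(/u\+([0-9a-fA-F]{4,5})\+/g, function (match, hex) {
            return String.fromCodePoint(parseInt(hex, 16));
        });
}

console.log(decodeEntities("a&#44;b"));   // a,b
console.log(decodeEntities("au+002c+b")); // a,b
```

Because the conversion happens in the JavaScript engine rather than in a renderer, this sketch does not reproduce the generic character marker behavior described above.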

Markup

Markup beautification uses the markup_beauty library, which operates upon a pattern based logic of referential integrity. This means decisions are made through exposure to the pattern as established so far. Unfortunately, this requires defined logic to consider all possible combinations of patterns.

The markup beautification is based upon syntax conventions only and absolutely not upon vocabulary. The two exceptions are that the contents of a script tag are presumed to be JavaScript if the tag does not contain a type attribute or the type attribute contains one of these values: text/javascript, text/ecmascript, application/javascript, application/x-javascript, application/ecmascript. When the contents of a script tag are presumed as JavaScript they are beautified accordingly. The contents of a style tag are presumed to be CSS if the tag contains a type attribute with a value of text/css or if the type attribute is not present, and those contents are beautified as CSS. The presumed CSS and JavaScript do not inherit indentation from the markup. Since the beautification is not based upon vocabulary any language that uses angle brackets for delimiters should work assuming the conditions of the next paragraph are met. The supplied markup does not have to be valid or well formed by any means.
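
The rule for script and style contents can be sketched as a small lookup. The function name presumedLanguage and its return labels are hypothetical, and exact matching after trimming and lowercasing is an assumption about how the attribute value is compared.

```javascript
"use strict";

var javascriptTypes = [
    "text/javascript",
    "text/ecmascript",
    "application/javascript",
    "application/x-javascript",
    "application/ecmascript"
];

// tagName is "script" or "style". typeValue is the value of the type
// attribute, or undefined when the attribute is absent.
function presumedLanguage(tagName, typeValue) {
    var type = (typeValue === undefined) ? undefined : typeValue.trim().toLowerCase();
    if (tagName === "script" && (type === undefined || javascriptTypes.indexOf(type) > -1)) {
        return "javascript";
    }
    if (tagName === "style" && (type === undefined || type === "text/css")) {
        return "css";
    }
    return "other";
}

console.log(presumedLanguage("script", undefined));       // javascript
console.log(presumedLanguage("script", "text/template")); // other
console.log(presumedLanguage("style", "text/css"));       // css
```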

Content in the markup is classified by whether or not it begins or ends with any whitespace. If content does begin and/or end with whitespace then new line characters are added and the content is indented. This means tags that butt up directly against content are treated as an extension of that content and are not indented. Singleton tags are expected to be terminated as XML singleton tags, which means a forward slash character immediately before the closing angle bracket. If a singleton tag is not properly closed the beautifier believes the tag to be a start tag, which expects an end tag. Singleton tags may represent an indication of content in the form of media or form controls, and so they are indented in the same manner as content.

PHP tags are expected to open with "<?php" and XML parsing declarations are expected to open with "<?xml". Tags that begin with only "<?" are not supported, and so they are believed to be start tags missing a closing tag. This short form is discouraged by PHP, even if tolerated, and will generate errors in an XML parser. I don't support this and neither should you.

Start tags expect to receive an end tag. End tags will be indented exactly like their starting pair unless they are directly next to content and the same is true for start tags. The beautification logic is smart enough to compensate and correct itself in adjustment for start tags or end tags that are not indented due to content.

The markup_beauty function also supports nested tags. Some server side processing languages, such as JSTL, use an XML based tag syntax for application processing and allow HTML and XML tags to be embedded directly. The following tag is an example of something that can be beautified: <c:out value="<strong>variable text output</strong>"/>. The only limitation is that the nested tags must be quoted in either double or single quotes.

Markup Summary

The markup_beauty function contains, at its end, an unassigned anonymous function that creates the summary report and assigns its output to a variable named summary. Variable summary is not provided a scope by markup_beauty, because it is meant to be supplied as a closure to markup_beauty. This summary variable must be provided a scope by the consuming application or it will become an implied global, or an undeclared variable error in strict mode.
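
The arrangement described above can be sketched as follows. The markup_beauty function shown here is a greatly simplified stand-in (hypothetical), and the report text is invented for illustration. Only the scoping pattern is the point.

```javascript
"use strict";

// Declared by the consuming application so that the variable does not
// become an implied global or cause an error in strict mode.
var summary = "";

// Simplified stand-in for markup_beauty (hypothetical).
function markup_beauty(source) {
    var tagCount = (source.match(/<[^>]+>/g) || []).length;

    // Unassigned anonymous function at the end of the library. It writes
    // its report to the summary variable found in the outer scope.
    (function () {
        summary = "Tags found: " + tagCount;
    }());

    return source.trim();
}

var result = markup_beauty(" <p>text</p> ");
console.log(result);  // <p>text</p>
console.log(summary); // Tags found: 2
```

If the declaration of summary were removed, the assignment inside the anonymous function would throw a ReferenceError under strict mode.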

The markup_summary function creates a report of the number of parts comprising the markup, the weight of each of those parts, and a score that uses a math formula to compute a performance rating which reinforces reliance upon structure and elaboration of content. This function also displays each HTML element that makes an HTTP request.

Minification

JavaScript

The custom built jspretty library is used to analyze and minify JavaScript code. It is also capable of providing some minor auto-correction, such as inserting missing semicolons and curly braces.

CSS

The custom built csspretty library is used to analyze and minify CSS code.

CSV

CSV typically stands for comma separated values, but in this tool it stands for character separated values. The csvmin library reverts all changes inflicted by the csvbeauty library.

CSV minification uses charDecoder to decode Unicode character entities. The charDecoder library accepts any combination of HTML decimal Unicode entities and Unicode hexadecimal entities. HTML decimal entities must begin with an ampersand and pound character '&#', be immediately followed by between one and six decimal digits, and be immediately terminated by a semicolon ';'. For example, &#44; satisfies these rules and represents a comma.

The Unicode hexadecimal entities must begin with a lowercase u and plus character 'u+', be immediately followed by a four or five digit hexadecimal value, and be immediately terminated by a plus character. Hexadecimal values shorter than four digits must be padded with as many leading 0 characters as necessary to reach four digits. For example, u+002c+ satisfies these rules and represents a comma.

Please be aware that charDecoder is reliant upon the interpreting application's HTML character rendering engine to map entity values to character maps, which means if the browser does not support the entity supplied the browser will return a generic character marker instead of the intended character. The content will then be separated according to the rendered sequence value, which means a generic character marker will be used in the separation instead of the character referenced by the supplied entity. In summary, if your browser has limited support for Unicode characters you must expect equally limited results when using entity references. csvmin does not revert any changes supplied by the charDecoder library.

Markup

Markup is minified using markupmin. This library does little more than collapse each run of whitespace characters into a single space character and remove comments. It does, however, preserve whitespace inside ASP and PHP tags and preserve SSI tags. It also assumes the contents of a script tag are JavaScript and minifies them accordingly, and assumes the contents of style tags are CSS and minifies them as such.
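
The two core operations can be approximated with regular expressions. This is a rough sketch and not the markupmin library: it ignores ASP tags, PHP tags, script contents, and style contents entirely, the function name minifyMarkup is hypothetical, and it assumes comments open with "<!--" and SSI tags open with "<!--#".

```javascript
"use strict";

function minifyMarkup(source) {
    return source
        // Remove comments, but keep SSI tags (assumed to start with "<!--#").
        .replace(/<!--(?!#)[\s\S]*?-->/g, "")
        // Collapse each run of white space into a single space character.
        .replace(/\s+/g, " ");
}

console.log(minifyMarkup("<div>\n    <!-- note -->\n    <p>text</p>\n</div>"));
// <div> <p>text</p> </div>
```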

Pretty Diff

Diff Code

The diff engine uses diffview, originally by Snowtide Informatics Systems. diffview has been almost entirely rewritten so that JavaScript arrays are used to store the dynamic output instead of DOM objects. This change has resulted in a faster and more extensible application. charcomp is the function used to highlight per character differences.

Diff Process

JavaScript code is beautified with the jspretty library and then compared. CSV is first minified with csvmin, then beautified with csvbeauty, and then compared. CSS is beautified with the csspretty library and then compared. Markup is beautified using the markup_beauty library and then compared. Plain text is compared without any minification or beautification. If code that needs to be compared is not compatible with the other processes then use the plain text mode.

Parsing

Overview of Parsing

Most parsers generate an Abstract Syntax Tree. The libraries used in Pretty Diff instead generate parallel arrays. This means multiple separate arrays are generated with a one to one relation between the values at any given index.

The parse data is exposed if the mode option has a value of parse. The language CSV is not currently supported in parse mode.

The parse data is always returned as an object with two or three properties. This is handy if Pretty Diff is embedded directly in some application and the parse data needs to be immediately available. In other instances the generated object will need to be converted to a string using the JSON.stringify method.
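
To illustrate the shape of such data, the sketch below builds a hand-written object with two parallel arrays and then converts it to a string. The property names token and types and the type labels follow the JavaScript parsing description in this documentation, but the values are hypothetical and were not produced by Pretty Diff.

```javascript
"use strict";

// Hypothetical parse data for the statement: var a = 1;
var parsed = {
    token: ["var", "a", "=", "1", ";"],
    types: ["word", "word", "operator", "literal", "separator"]
};

// The same index in each array describes the same piece of code.
parsed.token.forEach(function (token, index) {
    console.log(index + ": " + token + " (" + parsed.types[index] + ")");
});

// Convert to a string when the data must leave the application.
var serialized = JSON.stringify(parsed);
console.log(serialized);
```

Keeping the arrays the same length is what makes the index a reliable key across them.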

JavaScript Parsing

For JavaScript and similar languages, such as React JSX, the jspretty library returns an object containing two arrays associated with the property names token and types. The token array contains the code parsed into atomic fragments called tokens. The types array contains a categorical label for each token that identifies the group to which it belongs.

The jspretty library creates some tokens that are not present in the code source. In the case of missing semicolons and missing curly braces the jspretty library will create these tokens in the proper location. The created tokens are pseudo tokens in this form: "x;", "x{", and "x}". If the correct option is used the pseudo tokens are converted to actual code tokens. When the correct option is not used the pseudo tokens will be present in the parsed output, but are removed from the output when the mode option is not "parse".
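
The handling of pseudo tokens can be sketched as a pass over the parallel arrays. This is an illustration only: the function name resolvePseudoTokens, the sample data, and the type given to the pseudo token are all assumptions rather than jspretty internals.

```javascript
"use strict";

var pseudoTokens = {"x;": ";", "x{": "{", "x}": "}"};

// Returns new parallel arrays. When correct is true the pseudo tokens
// become real tokens, otherwise they are dropped from both arrays.
function resolvePseudoTokens(parsed, correct) {
    var result = {token: [], types: []};
    parsed.token.forEach(function (token, index) {
        var isPseudo = Object.prototype.hasOwnProperty.call(pseudoTokens, token);
        if (isPseudo && !correct) {
            return;
        }
        result.token.push(isPseudo ? pseudoTokens[token] : token);
        result.types.push(parsed.types[index]);
    });
    return result;
}

// Hypothetical parse data for: a = 1 (the semicolon is missing in the source)
var parsed = {
    token: ["a", "=", "1", "x;"],
    types: ["word", "operator", "literal", "separator"]
};
console.log(resolvePseudoTokens(parsed, true).token.join(" "));  // a = 1 ;
console.log(resolvePseudoTokens(parsed, false).token.join(" ")); // a = 1
```

Both arrays are rebuilt together so that they stay parallel after a pseudo token is dropped.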

Here are the supported types:

  • comment — Describes JavaScript code comments. This describes all block comments (/*) and most inline comments (//).
  • comment-inline — Describes JavaScript inline comments that follow code tokens on the same line of code.
  • end — Describes all closing square braces, closing curly braces, and closing parentheses.
  • literal — Describes string and number types. Also used for the literal portion of ES6 template strings, Unix shebang, embedded ASP, PHP, and SSI code.
  • markup — Describes blocks of XML code for the React JSX language.
  • method — Describes an open parenthesis that immediately follows a word token.
  • operator — Describes all arithmetic characters, comparison characters, and syntax characters not described elsewhere. A subtraction character immediately preceding a number and not following a word, end, or literal type is joined with the number in type literal and not described as an operator.
  • regex — Describes a regular expression instance.
  • separator — Describes commas, semicolons, and periods that are not present to describe a decimal point in a number.
  • start — Describes all opening square braces, all opening curly braces, and open parenthesis characters not described by type method.
  • word — Describes all references and keywords. In the jspretty library a word token may contain any character that is not described by another type and is not a white space character, which is more flexibility than the JavaScript language allows.

CSS Parsing

The csspretty library parses CSS and similar languages, such as SCSS (Sass) and Less. In parse mode an object containing two parallel arrays is generated, and the arrays are assigned to the properties token and types.

The types generated by csspretty are:

  • colon — Describes a colon that separates properties from values and variables.
  • comment — Describes block comments (/*), which are the only valid comment type in CSS, and inline comments that are used in SCSS and Less.
  • comment-inline — Describes an inline comment (//) that follows code on the same line of code.
  • end — Describes an end curly brace.
  • property — Describes a CSS presentation property.
  • propvar — Describes a SCSS or Less variable used where a value type is expected.
  • pseudo — Describes a method following a colon type. This is primarily used for the SCSS "extend" method.
  • selector — Describes a reference name which precedes a start type. The csspretty library preserves selectors as a single token and does not break them down into individual items of syntax or reference.
  • semi — Describes a semicolon that separates properties.
  • start — Describes an opening curly brace.
  • value — Describes something assigned to a property.

Markup Parsing

All markup type languages are parsed with the markup_beauty library, which returns an object containing three parallel arrays. The three arrays are assigned to the property names token, typea, and typeb. Two categories of types are present to account for context dependent type classifications.

The HTML5 specification allows list item elements to contain text content without an end tag. A start tag without a matching end tag would disrupt beautification, so Pretty Diff attempts to identify these instances by use of a pseudo tag: </prettydiffli>. If the correct option is used the pseudo tags are immediately converted to </li> tags, otherwise the Pretty Diff pseudo tag is removed immediately prior to generating output. The pseudo token logic is only evaluated if the input is HTML.

Types available in typea array:

  • T_tag_start — start tags, tags delimited by angle braces only
  • T_tag_end — end tags, tags that start with "</"
  • T_singleton — singleton tags, tags that end with "/>"
  • T_asp — any tag that begins and ends with one of these delimiter pairs: "<%" and "%>", "[%" and "%]", "{@" and "@}", "{{{" and "}}}", "{{" and "}}"
  • T_php — any tag that begins with "<?" and ends with "?>"
  • T_ssi — any tag that begins with "<!--#" and ends with "-->"
  • T_xml — any tag that begins with "<?xml" and ends with "?>"
  • T_sgml — any tag that begins with "<!" where the following character is not "-"
  • T_comment — any tag that begins with "<!--" and ends with "-->"
  • T_ignore — any tag containing the attribute "data-prettydiff-ignore" or tags named "pre" if in HTML mode
  • T_script — tags named "script"
  • T_style — tags named "style"
  • T_content — text node types
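
A delimiter based classification along these lines could look like the following sketch. It is not taken from markup_beauty: the function name classifyTag is hypothetical, only the "<%" form of T_asp is tested, the T_ignore, T_script, and T_style types are omitted, and comments and SSI tags are assumed to open with "<!--" and "<!--#" respectively.

```javascript
"use strict";

// Order matters: the more specific prefixes are tested first.
function classifyTag(tag) {
    if (tag.charAt(0) !== "<") {
        return "T_content";
    }
    if (tag.indexOf("<?xml") === 0) {
        return "T_xml";
    }
    if (tag.indexOf("<?") === 0) {
        return "T_php";
    }
    if (tag.indexOf("<%") === 0) {
        return "T_asp";
    }
    if (tag.indexOf("<!--#") === 0) {
        return "T_ssi";
    }
    if (tag.indexOf("<!--") === 0) {
        return "T_comment";
    }
    if (tag.indexOf("<!") === 0) {
        return "T_sgml";
    }
    if (tag.indexOf("</") === 0) {
        return "T_tag_end";
    }
    if (tag.slice(-2) === "/>") {
        return "T_singleton";
    }
    return "T_tag_start";
}

console.log(classifyTag("<br/>"));           // T_singleton
console.log(classifyTag("<!DOCTYPE html>")); // T_sgml
console.log(classifyTag("</div>"));          // T_tag_end
```

Testing "<?xml" before "<?" and "<!--#" before "<!--" keeps the shorter prefixes from capturing the more specific tags.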

Types available in typeb array:

  • start — T_tag_start or T_asp types identified as template start blocks
  • end — T_tag_end or T_asp types identified as template end blocks
  • singleton — T_singleton tags
  • parse — T_xml and T_sgml types
  • comment — T_comment type tags
  • external — T_content types immediately following either T_script or T_style tags if the script or style tags have no type attribute or a type attribute with a particular value
  • mixed_both — content that begins and ends with white space
  • mixed_start — content that begins and does not end with white space
  • mixed_end — content that does not begin with white space but ends with white space
  • content — content that does not begin and does not end with white space
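
The four content classifications at the end of this list depend only on leading and trailing white space, which can be sketched as below. The function name classifyContent is hypothetical and the code is not taken from markup_beauty.

```javascript
"use strict";

// Returns the typeb label for a text node based on its surrounding white space.
function classifyContent(text) {
    var startsWithSpace = (/^\s/).test(text);
    var endsWithSpace = (/\s$/).test(text);
    if (startsWithSpace && endsWithSpace) {
        return "mixed_both";
    }
    if (startsWithSpace) {
        return "mixed_start";
    }
    if (endsWithSpace) {
        return "mixed_end";
    }
    return "content";
}

console.log(classifyContent(" text "));  // mixed_both
console.log(classifyContent("\ntext"));  // mixed_start
console.log(classifyContent("text "));   // mixed_end
console.log(classifyContent("text"));    // content
```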

Component Files

A list of code components, author information, and dates of revision.
Component | Author(s) | Summary | Revised
charDecoder.js | Austin Cheney | The function that decodes Unicode character entities for csvbeauty.js and csvmin.js. |
codemirror.css | CodeMirror team | CSS for the CodeMirror editor. |
codemirror.js | CodeMirror team | JavaScript for the CodeMirror editor. |
csspretty.js | Austin Cheney | CSS parser. |
csvbeauty.js | Austin Cheney | The function that beautifies character sequence values. |
csvmin.js | Austin Cheney | The function that minifies character sequence values. |
diffview.css | Austin Cheney | The CSS that powers everything to do with the form, diff output, and this documentation. |
diffview.js | Chas Emerick (Original), Austin Cheney (Major Revision) | Builds the HTML diff output. |
documentation.xhtml | Austin Cheney, Maria Ramos (Español) | This documentation page. |
dom.js | Austin Cheney | A supplemental JavaScript file providing DOM access and interaction with the web tool. |
jspretty.js | Austin Cheney, Harry Whitfield (QA support) | Beautifies JavaScript code. |
markup_beauty.js | Austin Cheney | Beautifies markup code. |
markupmin.js | Austin Cheney | Minifies markup code. |
node-local.js | Austin Cheney | An API for processing the local command line JavaScript with Node.js |
prettydiff.com.xhtml | Austin Cheney | Actual Pretty Diff tool HTML file. |
prettydiff.js | Austin Cheney | Actual Pretty Diff application code. |
prettydiff.wsf | Austin Cheney | Pretty Diff API for Windows Script Host. |

Please send comments, feedback, and requests to [email protected].