Saturday, December 08, 2018

Understanding JavaScript Modules As A TypeScript User

This post is the first in a series of posts that I will be writing about modules in TypeScript. It continues the process of jotting down the various things I learnt while building ip-num. The previous post in this vein was Declaration Files in TypeScript: An Introduction, in which I shared how I came to wrap my head around declaration files in TypeScript.

Since the topic of modules is quite an expansive one, instead of jamming everything I need to write into one blog post, I will be spreading it out across various posts that will be published in the coming weeks/months.

This post is to get the ball rolling on that, and it will focus mainly on introducing the concept of modules as we have them within the JavaScript ecosystem.

Since TypeScript is a superset of JavaScript, and JavaScript is still its only transpilation target, one really cannot get away from having to understand how things work in JavaScript land in order to deeply understand how the various aspects of TypeScript fit together.

Modules are just one example of this.

This post is divided into the following sections:

  • Modules in JavaScript before ES6 - this introduces CommonJS and AMD.
  • Modules in JavaScript with ES6 - this introduces the native module support in JavaScript.
  • How does everything fit together with TypeScript - this introduces how TypeScript compiles to the various module systems.
  • Summary aka tl;dr - this wraps up the key points in a couple of bullet points.

Modules In JavaScript Before ES6

Until recently, there was no native support for modules in JavaScript. Web applications were built across multiple JavaScript files that were then included in the web page using the <script> tag.

Anyone with the tiniest knowledge of JavaScript will know the <script> tag, as it is the mechanism that allows the inclusion of JavaScript code inside an HTML page.

Given two JavaScript files:

// in message.js

let msg = "hello world"

// in greeter.js

let greeter = (themsg) => {
  console.log(themsg);
}

greeter(msg)

You can now include these two JavaScript files in an HTML file via the script tag:

<!-- in index.html -->

<script type="text/javascript" src="message.js"></script>
<script type="text/javascript" src="greeter.js"></script>

And when you load the page, you will see "hello world" printed in the console.

But there are a couple of problems with this setup. The first problem is that there is no explicit indication of the dependency between message.js and greeter.js. That is, there is no explicit way for greeter.js to indicate that it depends on msg coming from message.js.

The second problem is the use of the global scope to share data between message.js and greeter.js. message.js has to declare msg in the global scope in order for it to be accessible to greeter.js.

This leads to a very fragile setup. For one, message.js has to be included in the index.html file before greeter.js; including the files in any other order breaks things. The other issue is the use of the global scope, which is never a good idea. It increases the chances of name collisions as more variables and files get added to a project.
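To make the name collision problem a bit more concrete, here is a small sketch. It is only an illustration: the globalScope object is a stand-in I am assuming for the page's global scope, and analytics.js is a hypothetical third script. It mirrors what happens when two scripts both assign to the same global name (for example via var).

```typescript
// Stand-in for the global scope that every script on the page shares.
// (Assumption for illustration: a plain object instead of the real global scope.)
const globalScope: { msg?: unknown } = {};

// message.js puts its message in the global scope
globalScope.msg = "hello world";

// analytics.js (hypothetical) happens to pick the same name for something else
globalScope.msg = { event: "page-view" };

// greeter.js runs last and no longer finds the string it expected
const seenByGreeter = globalScope.msg;
console.log(typeof seenByGreeter);
```

Nothing warns either script that the name was already taken. The last writer simply wins, and greeter.js ends up with a value it was never written to handle.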

Obviously, this was not a desirable situation to be in. In order to confidently build complex applications, a more robust and less fragile solution had to be invented: one that does not pollute the global namespace, that makes dependencies between JavaScript files explicit, and that does not require manually stitching script tags together in HTML pages.

And this is exactly what the JavaScript community did. The interesting thing, though, is that instead of one solution, two solutions to the lack of modules in JavaScript came to be: one targeting server-side use cases, and the other targeting usage within browser environments.

The solution that emerged for server-side environments came to be known as CommonJS, while the one targeting the browser environment became known as AMD, which stands for Asynchronous Module Definition.

These two module systems allow for essentially the same thing: the ability to structure our JavaScript software into separate, self-contained pieces (referred to as modules), and to explicitly describe dependencies between these modules without having to pollute the global namespace.

What do they look like in practice?

I will quickly highlight what these two module systems look like when used in practice. Nothing in depth will be presented in this post; for an in-depth exploration, check the links listed in the reference sections.

How to CommonJS

The two key concepts to keep in mind with CommonJS are module.exports/exports and the require function.

module.exports
module.exports (and the exports variable, which is a reference to module.exports) is used to declare the pieces of functionality in a file (read: module) that should be available for use in another JavaScript file (read: another module).

These pieces of functionality could be objects, variables, functions, etc.
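The "exports is a reference to module.exports" part is easy to gloss over, so here is a minimal sketch of what it implies. The toyModule and toyExports names are stand-ins I made up for illustration, they are not Node.js objects; the point is only how two names for the same object behave.

```typescript
// Stand-ins for what a CommonJS runtime hands to a file (names are hypothetical).
type Exports = { greet?: () => string; other?: number };

const toyModule: { exports: Exports } = { exports: {} };
let toyExports: Exports = toyModule.exports; // second name for the same object

// Adding a property through either name changes the one shared object.
toyExports.greet = () => "hi";
const visibleViaModule = "greet" in toyModule.exports;

// Reassigning the variable only rebinds the local name;
// toyModule.exports still points at the original object.
toyExports = { other: 1 };
const reassignmentVisible = "other" in toyModule.exports;

console.log(visibleViaModule, reassignmentVisible);
```

This is why, when the whole exported value needs to be replaced (say, with a single function), the assignment is made to module.exports rather than to exports. The posts listed in the CommonJS references go into this in more detail.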

require function
The require function is used to bring the pieces of functionality that have been exported from one file into another.

We can use these two constructs to convert the example above into one that uses CommonJS.

Basically what we want to do is this:
  1. Define a message variable in message.js and export it
  2. Define the greeter function in greeter.js and export it
  3. In index.js, bring in both the message and the greeter function as dependencies and use them
This would look like:
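What follows is a minimal sketch of those three steps, written so that it runs as a single file. Each of the three files is represented as a function that receives module, exports and require, which is roughly how a CommonJS runtime presents them to a file. The files table and the loadModule helper are hypothetical stand-ins for the file system and for the real loader; only the function bodies correspond to what you would write in message.js, greeter.js and index.js.

```typescript
// A toy, in-memory stand-in for CommonJS (NOT Node.js's real loader).
interface ToyModule {
  exports: any;
}
type RequireFn = (path: string) => any;
type FileBody = (module: ToyModule, exports: any, require: RequireFn) => void;

const logged: string[] = []; // records what greeter prints, for inspection

const files: Record<string, FileBody> = {
  // message.js: step 1, define the message and export it
  "./message": (module) => {
    const msg = "hello world";
    module.exports = msg;
  },

  // greeter.js: step 2, define the greeter function and export it
  "./greeter": (module) => {
    const greeter = (themsg: string): void => {
      logged.push(themsg);
      console.log(themsg);
    };
    module.exports = greeter;
  },

  // index.js: step 3, bring both in as dependencies and use them
  "./index": (_module, _exports, require) => {
    const msg = require("./message");
    const greeter = require("./greeter");
    greeter(msg);
  },
};

const loaded = new Map<string, ToyModule>();

// Hypothetical loader: runs a "file" once and hands back its module.exports.
function loadModule(path: string): any {
  const cached = loaded.get(path);
  if (cached) return cached.exports;

  const body = files[path];
  if (!body) throw new Error(`Cannot find module '${path}'`);

  const module: ToyModule = { exports: {} };
  loaded.set(path, module);
  body(module, module.exports, loadModule);
  return module.exports;
}

loadModule("./index");
```

Notice that index.js states its dependencies explicitly through require, and that nothing is placed in the global scope: msg and greeter are only reachable by whoever requires them.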



CommonJS is a specification that is implemented by Node.js. Although Node.js might be considered the most popular implementation, other implementations do exist. See Are there implementations of CommonJS besides Node.js?

CommonJS References 
Writing Modular JavaScript With AMD, CommonJS & ES Harmony
The difference between module.exports and exports
Understanding module.exports and exports in Node.js
Node.js module.exports vs. exports

How to AMD

The main concepts to keep in mind with AMD are the define function and the require function. The define function is used to define a module while specifying its required dependencies, if any.

The require function is used to write top level code that depends on other predefined AMD modules.

We can use these two constructs to convert the examples above into one that uses AMD. Basically what we want to do is this:
  1. Define a messages variable in message.js using AMD's define
  2. Define a greeter function in greeter.js using AMD's define
  3. In script.js use AMD's require to bring in message and greeter, and use them.
  4. Include script.js in the index.html
This would look like this:
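Below is a minimal sketch of steps 1 to 3, again arranged to run as a single file. The define and require functions here are a toy stand-in I wrote for illustration, not RequireJS: they are synchronous and keep everything in memory. Because there are no real files, each module is given an explicit id; with a real AMD loader the id is normally derived from the file name. Only the calls to define and require are meant to show what the AMD code looks like.

```typescript
type Factory = (...deps: any[]) => any;

function runAmdSketch(): string[] {
  const printed: string[] = []; // records what greeter prints, for inspection
  const factories = new Map<string, { deps: string[]; factory: Factory }>();
  const instances = new Map<string, any>();

  // Hypothetical loader internals: turn a module id into its value,
  // resolving the module's own dependencies first.
  function resolve(id: string): any {
    if (!instances.has(id)) {
      const entry = factories.get(id);
      if (!entry) throw new Error(`Module not defined: ${id}`);
      instances.set(id, entry.factory(...entry.deps.map((dep) => resolve(dep))));
    }
    return instances.get(id);
  }

  // Toy version of AMD's define: register a module and its dependencies.
  function define(id: string, deps: string[], factory: Factory): void {
    factories.set(id, { deps, factory });
  }

  // Toy version of AMD's require: run top-level code once dependencies are resolved.
  function require(deps: string[], callback: Factory): void {
    callback(...deps.map((dep) => resolve(dep)));
  }

  // message.js: step 1
  define("message", [], function () {
    return "hello world";
  });

  // greeter.js: step 2
  define("greeter", [], function () {
    return function greeter(themsg: string): void {
      printed.push(themsg);
      console.log(themsg);
    };
  });

  // script.js: step 3
  require(["message", "greeter"], function (message: string, greeter: (m: string) => void) {
    greeter(message);
  });

  // index.html: step 4 is not shown; it would load an AMD runtime and script.js.
  return printed;
}

const amdOutput = runAmdSketch();
```

As with CommonJS, the dependencies are declared up front (the array passed to define and require) and the values are handed to the callback as arguments instead of being read from the global scope.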



Note that, just like CommonJS, AMD is also a specification. But unlike CommonJS, which can be used right away in Node.js, to use AMD in the browser you also have to include a runtime implementation of the AMD specification. This is needed to be able to process the module definitions.

RequireJS is such an implementation and runtime provider of AMD.

To use RequireJS for this example, index.html includes https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js

Note that, just as there are various implementations of CommonJS, there exist other implementations of AMD apart from RequireJS, although I think RequireJS is probably the oldest implementation of the spec.

AMD References 
Writing Modular JavaScript With AMD, CommonJS & ES Harmony
How to Get Started With RequireJS


Modules In JavaScript With ES6

With ES6, JavaScript now has native support for modules. This means that modules can be defined using just JavaScript without the need for CommonJS or AMD.

The native module definition is done using a syntax that is different from what we have in CommonJS and AMD, and since it is a native feature, it does not require any external library to provide a runtime. All that is needed is a browser that supports JavaScript modules, a requirement which should no longer be hard to come by since most modern browsers now support them.

When writing TypeScript, the syntax you will be using to define modules is the native JavaScript module syntax. The native module syntax can appear confusing, because there are slightly different variants of it. Because of this, I will be dedicating another post to this subject. But below is an example of what using ES6 modules looks like.
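Here is a minimal sketch, written as TypeScript. The file names are hypothetical, and both exports live in one file only so that the sketch runs on its own; the importing side is shown in a comment. Only the named export/import variant is shown.

```typescript
// message-and-greeter.ts (hypothetical file name)
// `export` marks what other modules are allowed to import.
export const msg: string = "hello world";

export function greeter(themsg: string): string {
  console.log(themsg);
  return themsg; // returned as well, so the result is easy to inspect
}

// Another file would bring these in with a matching `import`:
//
//   // index.ts (hypothetical)
//   import { msg, greeter } from "./message-and-greeter";
//   greeter(msg);

// Called here directly so that the sketch runs as a single file.
const printed = greeter(msg);
```

Compared to CommonJS and AMD, there is no exports object or define call to go through: export and import are part of the language syntax itself.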




The takeaway from this section is that, when writing TypeScript, you will not be using CommonJS or AMD syntax to define your modules. You will be using the native JavaScript module syntax.


How does everything fit together with TypeScript

At this point, we have a brief overview of the landscape of module systems in JavaScript, and of how the lack of a native module system led to the creation of CommonJS and AMD.

It has also been noted that JavaScript now has native support for modules, and that while writing TypeScript, the syntax used to define modules is essentially the syntax specified by the native module system, and not AMD or CommonJS.

How does this then tie in with TypeScript?

The part TypeScript plays is helping us convert the TypeScript code we write to JavaScript using the module system of our choosing. So, we write our code using the native module syntax, and when we then transpile to JavaScript, we have the option to specify which module system the generated JavaScript code should use.

This can be configured either by passing a configuration flag to the TypeScript compiler or by using the tsconfig.json file, which houses the configuration we want applied to the compiler. The supported module options are "None", "CommonJS", "AMD", "System", "UMD", "ES6", "ES2015" or "ESNext".

I will probably do another post in the future where I disambiguate these various options, but for now we will only focus on the three options we are familiar with from this post: CommonJS, AMD and ES6.

The content of a tsconfig.json could be

{
    "compilerOptions": {
        "module": "CommonJS", // module configuration
        "noImplicitAny": true,
        "removeComments": true,
        "preserveConstEnums": true,
        "sourceMap": true
    },
    "files": [
        "main.ts"
    ]
}

For the full list of configuration options available in tsconfig.json, check the documentation.

To give an idea of the effect the module option has on the generated JavaScript, I have listed below the generated JavaScript files when the module option is set to "CommonJS", "AMD", and "ES6".

This is done given the following files:

// greet.ts
export function greet() {
    return `Hello, world`;
}

And

// main.ts

import { greet } from "./greet";
let messsage = greet()
console.log(messsage);

With a tsconfig.json

{
    "compilerOptions": {
        "module": "CommonJS",
        "noImplicitAny": true,
        "removeComments": true,
        "preserveConstEnums": true,
        "sourceMap": true
    },
    "files": [
        "main.ts"
    ]
}

Module set to CommonJS

When the module property in tsconfig.json is set to "CommonJS" and the tsc command is executed to compile the files, the output would be:

// generated greet.js
"use strict";
exports.__esModule = true;
function greet() {
    return "Hello, world";
}
exports.greet = greet;

// generated main.js
"use strict";
exports.__esModule = true;
var greet_1 = require("./greet");
var messsage = greet_1.greet();
console.log(messsage);

You will notice the CommonJS-specific exports object and require function in the generated JavaScript.

Module set to AMD

When the module property in tsconfig.json is set to "AMD" and the tsc command is executed to compile, the output would be:

// generated greet.js

define(["require", "exports"], function (require, exports) {
    "use strict";
    exports.__esModule = true;
    function greet() {
        return "Hello, world";
    }
    exports.greet = greet;
});


// generated main.js

define(["require", "exports", "./greet"], function (require, exports, greet_1) {
    "use strict";
    exports.__esModule = true;
    var messsage = greet_1.greet();
    console.log(messsage);
});


Module set to ES6

When the module property in tsconfig.json is set to "ES6" and the tsc command is executed to compile, the output would be:

// generated greet.js

export function greet() {
    return "Hello, world";
}


// generated main.js

import { greet } from "./greet";
var messsage = greet();
console.log(messsage);


You will notice that when the module property is set to ES6, the generated JavaScript is almost the same as the TypeScript that got transpiled. This essentially shows that the module system used in TypeScript is the same one as defined in native JavaScript.

Note that in a real-life scenario, the generated JavaScript files would probably be processed further by module loaders or module bundlers in order to be able to run them. For example, if the target module is AMD, you might need to use RequireJS to be able to run the code in the browser, or if the target module is CommonJS, you might have to use something like Webpack to bundle the code into a format that can be run in the browser.

Understanding the module loading/bundling landscape is another important aspect of being able to use TypeScript effectively, so I will probably do a separate post on it in the coming weeks.

Summary aka tl;dr

  • Modules in programming languages allow software to be built as separate, self-contained pieces of functionality. These pieces can then depend on each other and, by doing so, be composed to build out the full functionality of a program. Modules are not unique to JavaScript or TypeScript.
  • Historically, JavaScript lacked such a module system. Pieces of functionality were spread across script files, which were then brought into an HTML page using the <script> tag.
  • The lack of a module system, and the use of the script tag to assemble the various portions of a web application, has various issues. One is that the different files use the global namespace to communicate. Another is that putting the various script tags together is a fragile process, as the order of inclusion can break the dependency chain.
  • The JavaScript community had to come up with ways to work around the absence of a module system in JavaScript. Two solutions emerged: CommonJS, which was geared towards usage of JavaScript in the backend via Node.js, and AMD, which was geared towards usage of JavaScript within the browser environment.
  • CommonJS defines module.exports (or exports, which references module.exports) as the means to signify functionality that should be accessible to the outside world. It defines the require function to bring in functionality made accessible by other modules.
  • AMD makes use of the define function to define the functionality within a module that should be accessible to the outside world, and uses the require function to bring in other modules.
  • JavaScript has had native modules since ECMAScript 6 (ES6).
  • When writing TypeScript, you basically define your modules using a syntax that is more or less the native module syntax introduced in ES6.
  • You can then configure the TypeScript compiler to use a different module system in the JavaScript it generates. The options include None, AMD, CommonJS, UMD and System. You choose the module system depending on the environment in which you want to run the JavaScript code emitted by TypeScript. For example, if the JavaScript will be executed server side in Node.js, you will probably want to use CommonJS.
  • You can also use module loaders and module bundlers to further process the output of the TypeScript compiler. Exploring module loaders/bundlers will be done in a follow-up post.


