Learning JavaScript (2016)
Chapter 20. Node
Up until 2009, JavaScript was almost exclusively a browser scripting language.1 In 2009, frustrated by the state of server-side options, a Joyent developer named Ryan Dahl created Node. Node’s adoption was meteoric, and it even achieved success in the notoriously slow-to-adopt enterprise markets.
For those who liked JavaScript as a language, Node made it possible to use the language for tasks traditionally relegated to other languages. For web developers, the appeal is stronger than just the choice of language. Being able to write JavaScript on the server means a consistent language choice—no mental context-switching, a reduced reliance on specialists, and (perhaps most importantly) the ability to run the same code on the server and the client.
While Node was introduced to enable web application development, its jump to the server inadvertently enabled other nontraditional uses, such as desktop application development and system scripting. In a sense, Node allowed JavaScript to grow up and join the party.
Node Fundamentals
If you can write JavaScript, you can write Node applications. That’s not to say that you can simply take any browser-based JavaScript program and run it on Node: browser-based JavaScript uses APIs that are specific to the browser. In particular, in Node, there is no DOM (which makes sense: there’s no HTML). Likewise, there are APIs that are specific to Node that don’t exist in the browser. Some, like operating system and filesystem support, are not available in the browser for security reasons (can you imagine the damage hackers could do if they could delete your files from the browser?). Others, such as the ability to create a web server, simply aren’t very useful in a browser.
It’s important to understand what’s JavaScript, and what’s part of an API. A programmer who has always written browser-based code might reasonably assume that window and document are simply part of JavaScript. However, those are APIs provided in the browser environment (which were covered in Chapter 18). In this chapter, we’ll cover the APIs provided in Node.
If you haven’t already, make sure Node and npm are installed (see Chapter 2).
Modules
Modules are a mechanism for packaging and namespacing code. Namespacing is a way to prevent name collisions. For example, if Amanda and Tyler both write a function called calculate, and you simply cut and paste their functions into your program, the second one will replace the first. Namespacing allows you to somehow refer to “Amanda’s calculate” and “Tyler’s calculate.” Let’s see how Node modules solve this problem. Create a file called amanda.js:
function calculate(a, x, n) {
    if(x === 1) return a*n;
    return a*(1 - Math.pow(x, n))/(1 - x);
}

module.exports = calculate;
And a file called tyler.js:
function calculate(r) {
    return 4/3*Math.PI*Math.pow(r, 3);
}

module.exports = calculate;
We could legitimately make the argument that Amanda and Tyler were both lazy in naming their functions something so nondescript, but we’ll let it slide for the sake of this example. The important line in both of these files is module.exports = calculate. module is a special object that Node makes available to implement modules. Whatever you assign to its exports property will be what is exported from the module. Now that we’ve written a couple of modules, let’s see how we use them in a third program. Create a file called app.js, and we’ll import these modules:
const amanda_calculate = require('./amanda.js');
const tyler_calculate = require('./tyler.js');
console.log(amanda_calculate(1, 2, 5)); // logs 31
console.log(tyler_calculate(2)); // logs 33.510321638291124
Note that the names we chose (amanda_calculate and tyler_calculate) are totally arbitrary; they are just variables. The value each receives is whatever the corresponding module assigned to module.exports, which is what require returns.
The mathematically inclined reader may have already recognized these two calculations: Amanda is providing the sum of the first n terms of the geometric series a + ax + ax^2 + ... + ax^(n-1), and Tyler is providing the volume of a sphere of radius r. Now that we know this, we can shake our heads at Amanda and Tyler’s poor naming practices, and choose appropriate names in app.js:
const geometricSum = require('./amanda.js');
const sphereVolume = require('./tyler.js');
console.log(geometricSum(1, 2, 5)); // logs 31
console.log(sphereVolume(2)); // logs 33.510321638291124
Modules can export a value of any type (even a primitive, though there’s little reason for that). Very commonly, you want your module to contain not just one function, but many, in which case you could export an object with function properties. Imagine that Amanda is an algebraist who is providing us many useful algebraic functions in addition to a geometric sum:
module.exports = {
    geometricSum(a, x, n) {
        if(x === 1) return a*n;
        return a*(1 - Math.pow(x, n))/(1 - x);
    },
    arithmeticSum(n) {
        return (n + 1)*n/2;
    },
    quadraticFormula(a, b, c) {
        const D = Math.sqrt(b*b - 4*a*c);
        return [(-b + D)/(2*a), (-b - D)/(2*a)];
    },
};
This results in a more traditional approach to namespacing—we name what’s returned, but what’s returned (an object) contains its own names:
const amanda = require('./amanda.js');
console.log(amanda.geometricSum(1, 2, 5)); // logs 31
console.log(amanda.quadraticFormula(1, 2, -15)); // logs [ 3, -5 ]
There’s no magic here: the module is simply exporting an ordinary object with function properties (don’t let the abbreviated ES6 syntax confuse you; they’re just functions). This paradigm is so common that there’s a shorthand syntax for it, using a special variable simply called exports. We can rewrite Amanda’s exports in a more compact (but equivalent) way:
exports.geometricSum = function(a, x, n) {
    if(x === 1) return a*n;
    return a*(1 - Math.pow(x, n))/(1 - x);
};

exports.arithmeticSum = function(n) {
    return (n + 1)*n/2;
};

exports.quadraticFormula = function(a, b, c) {
    const D = Math.sqrt(b*b - 4*a*c);
    return [(-b + D)/(2*a), (-b - D)/(2*a)];
};
NOTE
The exports shorthand only works for exporting objects; if you want to export a function or some other value, you must use module.exports. Furthermore, you can’t meaningfully mix the two: use one or the other.
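To see why mixing them fails, consider this sketch (a hypothetical mixed.js): exports starts out as a reference to the same object module.exports points to, so reassigning module.exports severs that link and discards anything attached to exports.
const exports_example = undefined; // (illustrative file: mixed.js -- don't do this!)
exports.add = function(a, b) { return a + b; };      // added to the original
                                                     // exports object...
module.exports = function(a, b) { return a * b; };   // ...which is now discarded

// a consumer sees only the multiply function:
//   const mixed = require('./mixed.js');
//   mixed(2, 3);    // 6
//   mixed.add;      // undefined -- the named export was lost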
Core Modules, File Modules, and npm Modules
Modules fall into three categories: core modules, file modules, and npm modules. Core modules are reserved module names that are provided by Node itself, such as fs and os (which we’ll discuss later in this chapter). File modules we’ve already seen: we create a file that assigns to module.exports, and then require that file. npm modules are just file modules that are located in a special directory called node_modules. When you use the require function, Node determines the type of module (listed in Table 20-1) from the string you pass in.
Table 20-1. Module types

| Type | String passed to require | Examples |
|------|--------------------------|----------|
| Core | Doesn't start with /, ./, or ../ | require('fs'), require('os'), require('http'), require('child_process') |
| File | Starts with /, ./, or ../ | require('./debug.js'), require('/full/path/to/module.js'), require('../a.js'), require('../../a.js') |
| npm | Not a core module and doesn't start with /, ./, or ../ | require('debug'), require('express'), require('chalk'), require('koa'), require('q') |
Some core modules, such as process and buffer, are global, are always available, and do not require an explicit require statement. The core modules are listed in Table 20-2.
Table 20-2. Core modules

| Module | Global | Description |
|--------|--------|-------------|
| assert | No | Used for testing purposes. |
| buffer | Yes | For input/output (I/O) operations (primarily file and network). |
| child_process | No | Functions for running external programs (Node and otherwise). |
| cluster | No | Allows you to take advantage of multiple processes for performance. |
| crypto | No | Built-in cryptography libraries. |
| dns | No | Domain name system (DNS) functions for network name resolution. |
| domain | No | Allows grouping of I/O and other asynchronous operations to isolate errors. |
| events | No | Utilities to support asynchronous events. |
| fs | No | Filesystem operations. |
| http | No | HTTP server and related utilities. |
| https | No | HTTPS server and related utilities. |
| net | No | Asynchronous socket-based network API. |
| os | No | Operating system utilities. |
| path | No | Filesystem pathname utilities. |
| punycode | No | Encoding of Unicode using a limited ASCII subset. |
| querystring | No | Utilities for parsing and constructing URL querystrings. |
| readline | No | Interactive I/O utilities; primarily used for command-line programs. |
| smalloc | No | Allows for explicit allocation of memory for buffers. |
| stream | Yes | Stream-based data transfer. |
| string_decoder | No | Converts buffers to strings. |
| tls | No | Transport Layer Security (TLS) communication utilities. |
| tty | No | Low-level TeleTYpewriter (TTY) functions. |
| dgram | No | User Datagram Protocol (UDP) networking utilities. |
| url | Yes | URL parsing utilities. |
| util | No | Internal Node utilities. |
| vm | No | Virtual (JavaScript) Machine: allows for metaprogramming and context creation. |
| zlib | No | Compression utilities. |
It is beyond the scope of this book to cover all of these modules (we will discuss the most important ones in this chapter), but this list gives you a starting point to look for more information. Detailed documentation for these modules is available in the Node API documentation.
Finally, there are npm modules. npm modules are file modules with a specific naming convention. If you require some module x (where x is not a core module), Node will look in the current directory for a subdirectory called node_modules. If it finds it, it will look for x in that directory. If it doesn’t find it, it will go up to the parent directory, look for a module called node_modules there, and repeat the process until it finds the module or reaches the root. For example, if your project is located in /home/jdoe/test_project, and in your application file, you call require('x'), Node will look for the module x in the following locations (in this order):
- /home/jdoe/test_project/node_modules/x
- /home/jdoe/node_modules/x
- /home/node_modules/x
- /node_modules/x
For most projects, you’ll have a single node_modules directory in the application root. Furthermore, you shouldn’t add or remove things from that directory manually; you’ll let npm do all the heavy lifting. Still, it’s useful to know how Node resolves module imports, especially when it comes time to debug problems in third-party modules.
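If you’re ever unsure which copy of a module a require call would pick up, require.resolve reports the resolved path without actually loading the module. A quick diagnostic sketch ('debug' here is just an example of an installed npm module):

// logs something like /home/jdoe/test_project/node_modules/debug/...;
// throws if the module can't be found in any of the locations above
console.log(require.resolve('debug'));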
For modules that you write yourself, do not put them in node_modules. It will work, but the point of node_modules is that it’s a directory that can be deleted at any time and re-created by npm from the dependencies listed in package.json (see Chapter 2). You can, of course, publish your own npm module, and manage that module with npm, but you should avoid editing things directly in node_modules!
Customizing Modules with Function Modules
Modules most commonly export objects, and sometimes a single function. There’s another very common pattern: a module that exports a function that’s intended to be invoked immediately. It’s the return value of that function (which can be a function itself) that’s intended to be used (in other words, you don’t use the function that’s returned; you invoke that function and use whatever it returns). This pattern is used when the module needs to be customized somehow or receive information about the enclosing context. Let’s consider the real-world npm package debug. When you import debug, it takes a string that will be used as a log prefix so logging for different parts of your program can be distinguished. It’s used like this:
const debug = require('debug')('main');   // note that we immediately call the
                                          // function that the module returns

debug("starting");   // will log "main starting +0ms"
                     // if debugging is enabled
TIP
To enable debugging with the debug library, set an environment variable called DEBUG. For our example, we would set DEBUG=main. You can also set DEBUG=* to enable all debug messages.
It’s clear from this example that the debug module returns a function (because we immediately call it as a function)…and that function itself returns a function that “remembers” the string from the first function. In essence, we have “baked in” a value to that module. Let’s see how we might implement our own debug module:
let lastMessage;

module.exports = function(prefix) {
    return function(message) {
        const now = Date.now();
        const sinceLastMessage = now - (lastMessage || now);
        console.log(`${prefix} ${message} +${sinceLastMessage}ms`);
        lastMessage = now;
    };
};
This module is exporting a function that is designed to be called right away so that the value for prefix can be baked into the module. Note we also have another value, lastMessage, which is the timestamp of the last message that was logged; we use that to calculate the time between messages.
This brings us to an important point: what happens when you import a module multiple times? For example, consider what happens if we import our home-grown debug module twice:
const debug1 = require('./debug')('one');
const debug2 = require('./debug')('two');

debug1('started first debugger!');
debug2('started second debugger!');

setTimeout(function() {
    debug1('after some time...');
    debug2('what happens?');
}, 200);
You might expect to see something like this:
one started first debugger! +0ms
two started second debugger! +0ms
one after some time... +200ms
two what happens? +200ms
But what you will actually see is this (plus or minus a few milliseconds):
one started first debugger! +0ms
two started second debugger! +0ms
one after some time... +200ms
two what happens? +0ms
As it turns out, Node only ever imports any given module once per run of an application. So even though we import our debug module twice, Node has “remembered” that we imported it before, and used the same instance. Thus, even though debug1 and debug2 are separate functions, they both share a reference to lastMessage.
This behavior is safe and desirable. For reasons of performance, memory usage, and maintainability, it’s better for modules to only ever be included once.
TIP
The way we’ve written our home-grown debug module is similar to the way its npm namesake works. However, if we did want multiple debug logs that had independent timing, we could always move the lastMessage timestamp into the body of the function that the module returns; then it will receive a new, independent value every time a logger is created.
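For example, here is a sketch of that variant: because lastMessage is now declared inside the exported function, each logger gets its own closure, and therefore its own timer.

module.exports = function(prefix) {
    let lastMessage;    // now per-logger, not shared across the module
    return function(message) {
        const now = Date.now();
        const sinceLastMessage = now - (lastMessage || now);
        console.log(`${prefix} ${message} +${sinceLastMessage}ms`);
        lastMessage = now;
    };
};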
Filesystem Access
Many introductory programming books cover filesystem access because it’s considered a critical part of “normal” programming. Poor JavaScript: up until Node, it wasn’t in the filesystem club.
The examples in this chapter assume your project root is /home/<jdoe>/fs, which is a typical path on a Unix system (replace <jdoe> with your username). The same principles apply on a Windows system (where your project root might be C:\Users\<John Doe>\Documents\fs).
To create a file, use fs.writeFile. Create a file in your project root called write.js:
const fs = require('fs');

fs.writeFile('hello.txt', 'hello from Node!', function(err) {
    if(err) return console.error('Error writing to file.');
});
This will create a file in the directory you’re in when you run write.js (assuming you have sufficient privileges in that directory, and there isn’t a directory or read-only file called hello.txt already). Whenever you invoke a Node application, it inherits its current working directory from the shell you run it from (which may be different from the directory where the script file lives). For example:
$ cd /home/jdoe/fs
$ node write.js # current working dir is /home/jdoe/fs
# creates /home/jdoe/fs/hello.txt
$ cd .. # current working dir is now /home/jdoe
$ node fs/write.js # creates /home/jdoe/hello.txt
Node provides a special variable, __dirname, which is always set to the directory in which the source file resides. For example, we can change our example to:
const fs = require('fs');

fs.writeFile(__dirname + '/hello.txt', 'hello from Node!', function(err) {
    if(err) return console.error('Error writing to file.');
});
Now write.js will always create hello.txt in /home/<jdoe>/fs (where write.js is located). Using string concatenation to join __dirname and our filename isn’t very platform-agnostic; this could cause problems on a Windows machine, for example. Node provides platform-independent pathname utilities in the module path, so we can rewrite this module to be more friendly on all platforms:
const fs = require('fs');
const path = require('path');

fs.writeFile(path.join(__dirname, 'hello.txt'),
    'hello from Node!', function(err) {
        if(err) return console.error('Error writing to file.');
    });
path.join will join directory elements using whatever directory separator is appropriate for the operating system, and is generally a good practice.
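For example (the output shown assumes a Unix system; on Windows, path.join would use backslashes):

const path = require('path');

console.log(path.join('fs', 'subdir', 'hello.txt'));   // logs fs/subdir/hello.txt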
What if we want to read the contents of that file back in? We use fs.readFile. Create read.js:
const fs = require('fs');
const path = require('path');

fs.readFile(path.join(__dirname, 'hello.txt'), function(err, data) {
    if(err) return console.error('Error reading file.');
    console.log('Read file contents:');
    console.log(data);
});
If you run this example, you may be unpleasantly surprised at the result:
Read file contents:
<Buffer 68 65 6c 6c 6f 20 66 72 6f 6d 20 4e 6f 64 65 21>
If you convert those hex codes to their ASCII/Unicode equivalents, you’ll find it is indeed hello from Node!, but the program as it stands is not very friendly. If you don’t tell fs.readFile what encoding was used, it will return a buffer, which contains raw binary data. Although we didn’t explicitly specify an encoding in write.js, the default string encoding is UTF-8 (a Unicode encoding). We can modify read.js to specify UTF-8 and get the result we expect:
const fs = require('fs');
const path = require('path');

fs.readFile(path.join(__dirname, 'hello.txt'),
    { encoding: 'utf8' }, function(err, data) {
        if(err) return console.error('Error reading file.');
        console.log('File contents:');
        console.log(data);
    });
All of the functions in fs have synchronous equivalents (that end in “Sync”). In write.js, we can use the synchronous equivalent instead:
fs.writeFileSync(path.join(__dirname, 'hello.txt'), 'hello from Node!');
And in read.js:
const data = fs.readFileSync(path.join(__dirname, 'hello.txt'),
    { encoding: 'utf8' });
With the synchronous versions, error handling is accomplished with exceptions, so to make our examples robust, we would wrap them in try/catch blocks. For example:
try {
    fs.writeFileSync(path.join(__dirname, 'hello.txt'), 'hello from Node!');
} catch(err) {
    console.error('Error writing file.');
}
WARNING
The synchronous filesystem functions are temptingly easy to use. However, if you are writing a webserver or networked application, remember that Node’s performance derives from asynchronous execution; you should always use the asynchronous versions in those cases. If you are writing a command-line utility, it is usually not an issue to use the synchronous versions.
You can list the files in a directory with fs.readdir. Create a file called ls.js:
const fs = require('fs');

fs.readdir(__dirname, function(err, files) {
    if(err) return console.error('Unable to read directory contents');
    console.log(`Contents of ${__dirname}:`);
    console.log(files.map(f => '\t' + f).join('\n'));
});
The fs module contains many more filesystem functions; you can delete files (fs.unlink), move or rename files (fs.rename), get information about files and directories (fs.stat), and much more. Consult the Node API documentation for more information.
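As a taste, here is a sketch that uses fs.stat to report on the hello.txt file we created earlier (size, mtime, and isDirectory are all part of the Stats object that fs.stat provides):

const fs = require('fs');
const path = require('path');

fs.stat(path.join(__dirname, 'hello.txt'), function(err, stats) {
    if(err) return console.error('Error getting file information.');
    console.log(`Size: ${stats.size} bytes`);
    console.log(`Last modified: ${stats.mtime}`);
    console.log(`Is a directory: ${stats.isDirectory()}`);
});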
Process
Every running Node program has access to a variable called process that allows it to get information about (and control) its own execution. For example, if your application encounters an error so severe that it’s inadvisable or senseless to continue executing (often called a fatal error), you can immediately stop execution by calling process.exit. You can also provide a numeric exit code, which is used by scripts to determine whether or not your program exited successfully. Conventionally, an exit code of 0 indicates “no error,” and a nonzero exit code indicates an error.

Consider a script that processes .txt files in a subdirectory data: if there are no files to process, there’s nothing to do, so the program exits immediately, but it’s not an error. On the other hand, if the subdirectory data doesn’t exist, we’ll consider this a more serious problem, and the program should exit with an error. Here’s how that program might look:
const fs = require('fs');

fs.readdir('data', function(err, files) {
    if(err) {
        console.error("Fatal error: couldn't read data directory.");
        process.exit(1);
    }
    const txtFiles = files.filter(f => /\.txt$/i.test(f));
    if(txtFiles.length === 0) {
        console.log("No .txt files to process.");
        process.exit(0);
    }
    // process .txt files...
});
The process object also gives you access to an array containing the command-line arguments passed to the program. When you execute a Node application, you can provide optional command-line arguments. For example, we could write a program that takes multiple filenames as command-line arguments and prints out the number of lines of text in each file. We might invoke the program like this:
$ node linecount.js file1.txt file2.txt file3.txt
The command-line arguments are contained in the process.argv array.2 Before we count the lines in our files, let’s print out process.argv so we know what we’re getting:
console.log(process.argv);
Along with file1.txt, file2.txt, and file3.txt, you’ll see a couple of extra elements at the beginning of the array:
[ 'node',
  '/home/jdoe/linecount.js',
  'file1.txt',
  'file2.txt',
  'file3.txt' ]
The first element is the interpreter, or program that interpreted the source file (node, in our case). The second element is the full path to the script being executed, and the rest of the elements are any arguments passed to the program. Because we don’t need this extra information, we’ll just use Array.slice to get rid of it before counting the lines in our files:
const fs = require('fs');

const filenames = process.argv.slice(2);
let counts = filenames.map(f => {
    try {
        const data = fs.readFileSync(f, { encoding: 'utf8' });
        return `${f}: ${data.split('\n').length}`;
    } catch(err) {
        return `${f}: couldn't read file`;
    }
});
console.log(counts.join('\n'));
process also gives you access to environment variables through the object process.env. Environment variables are named system variables that are primarily used for command-line programs. On most Unix systems, you can set an environment variable by typing export VAR_NAME="some value" (environment variables are traditionally all caps). On Windows, you use set VAR_NAME=some value. Environment variables are often used to configure the behavior of some aspect of your program (without your having to provide the values on the command line every time you execute the program).
For example, we might want to use an environment variable to control whether or not our program logs debugging information or “runs silently.” We’ll control our debug behavior with an environment variable DEBUG, which we’ll set to 1 if we want to debug (any other value will turn debugging off):
const debug = process.env.DEBUG === "1" ?
    console.log :
    function() {};

debug("Visible only if environment variable DEBUG is set!");
In this example, we create a function, debug, that is simply an alias for console.log if the environment variable DEBUG is set, and a null function—a function that does nothing—otherwise (if we left debug undefined, we would generate errors when we tried to use it!).
In the previous section, we talked about the current working directory, which defaults to the directory you execute the program from (not the directory where the program exists). process.cwd tells you what the current working directory is, and process.chdir allows you to change it. For example, if you wanted to print out the directory in which the program was started, then switch the current working directory to the directory where the program itself is located, you could do this:
console.log(`Current directory: ${process.cwd()}`);
process.chdir(__dirname);
console.log(`New current directory: ${process.cwd()}`);
Operating System
The os module provides some platform-specific information about the computer on which the app is running. Here is an example that shows the most useful information that os exposes (and their values on my cloud-based dev machine):
const os = require('os');

console.log("Hostname: " + os.hostname());            // prometheus
console.log("OS type: " + os.type());                 // Linux
console.log("OS platform: " + os.platform());         // linux
console.log("OS release: " + os.release());           // 3.13.0-52-generic
console.log("OS uptime: " +
    (os.uptime()/60/60/24).toFixed(1) + " days");     // 80.3 days
console.log("CPU architecture: " + os.arch());        // x64
console.log("Number of CPUs: " + os.cpus().length);   // 1
console.log("Total memory: " +
    (os.totalmem()/1e6).toFixed(1) + " MB");          // 1042.3 MB
console.log("Free memory: " +
    (os.freemem()/1e6).toFixed(1) + " MB");           // 195.8 MB
Child Processes
The child_process module allows your app to run other programs, whether it be another Node program, an executable, or a script in another language. It’s beyond the scope of this book to cover all of the details of managing child processes, but we will consider a simple example.
The child_process module exposes three primary functions: exec, execFile, and fork. As with fs, exec and execFile have synchronous versions (execSync and execFileSync); fork has no synchronous counterpart. exec and execFile can run any executable supported by your operating system. exec invokes a shell (which is what underlies your operating system’s command line; if you can run it from the command line, you can run it from exec). execFile allows you to execute an executable directly, which offers slightly improved memory and resource use, but generally requires greater care. Lastly, fork allows you to execute other Node scripts (which can also be done with exec).
NOTE
fork does invoke a separate Node engine, so you’re paying the same resource cost you would with exec; however, fork gives you access to some interprocess communication options. See the official documentation for more information.
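For instance, here’s a sketch of that interprocess communication (worker.js is a hypothetical Node script that calls process.on('message', ...) and process.send(...)):

const fork = require('child_process').fork;

const child = fork('worker.js');      // worker.js is hypothetical
child.on('message', function(msg) {
    console.log('message from child:', msg);
});
child.send({ command: 'start' });     // worker.js receives this in its
                                      // 'message' handler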
Because exec is the most general, and the most forgiving, we’ll use it in this chapter.
For demonstration purposes, we’ll execute the command dir, which displays a directory listing (while Unix users are more familiar with ls, dir is aliased to ls on most Unix systems):
const exec = require('child_process').exec;

exec('dir', function(err, stdout, stderr) {
    if(err) return console.error('Error executing "dir"');
    stdout = stdout.toString();    // convert Buffer to string
    console.log(stdout);
    stderr = stderr.toString();
    if(stderr !== '') {
        console.error('error:');
        console.error(stderr);
    }
});
Because exec spawns a shell, we don’t need to provide the path to the dir executable. If we were invoking a specific program that isn’t generally available from the system’s shell, we would need to provide its full path.
The callback that gets invoked receives two Buffer objects for stdout (the normal output of a program) and stderr (error output, if any). In this example, since we don’t expect any output on stderr, we check first to see if there was any error output before printing it out.
exec takes an optional options object, which allows us to specify things like the working directory, environment variables, and more. See the official documentation for more information.
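For example, here’s a sketch passing a couple of common options (cwd and env are real options; the values here are illustrative):

const exec = require('child_process').exec;

exec('dir', {
    cwd: '/home/jdoe',                                     // working directory
    env: Object.assign({}, process.env, { DEBUG: '1' }),   // extra env variable
}, function(err, stdout, stderr) {
    if(err) return console.error('Error executing "dir"');
    console.log(stdout.toString());
});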
NOTE
Note the way we import exec. Instead of importing child_process with const child_process = require('child_process'), and then calling exec as child_process.exec, we simply alias exec right away. We could do it either way, but the way we’ve done it is quite common.
Streams
The concept of a stream is an important one in Node. A stream is an object that deals with data—as the name implies—in a stream (the word stream should make you think of flow, and because flow is something that happens over time, it makes sense that it would be asynchronous).
Streams can be read streams, write streams, or both (called duplex streams). Streams make sense whenever the flow of data happens over time. Examples might be a user typing at a keyboard, or a web service that has back-and-forth communication with a client. File access, too, often uses streams (even though we can also read and write files without streams). We’ll use file streams to demonstrate how to create read and write streams, and how to pipe streams to one another.
We’ll start by creating a write stream and writing to it:
const fs = require('fs');

const ws = fs.createWriteStream('stream.txt', { encoding: 'utf8' });
ws.write('line 1\n');
ws.write('line 2\n');
ws.end();
TIP
The end method optionally takes a data argument that is equivalent to calling write. Thus, if you’re sending data only once, you can simply call end with the data you want to send.
Our write stream (ws) can be written to with the write method until we call end, at which point the stream will be closed, and further calls to write will produce an error. Because you can call write as many times as you need before calling end, a write stream is ideal for writing data over a period of time.
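Per the preceding tip, if all of the data is available at once, the example above could be collapsed into a single call to end (a minimal sketch):

const fs = require('fs');

const ws = fs.createWriteStream('stream.txt', { encoding: 'utf8' });
ws.end('line 1\nline 2\n');    // write and close in one step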
Similarly, we can create a read stream to read data as it arrives:
const fs = require('fs');

const rs = fs.createReadStream('stream.txt', { encoding: 'utf8' });
rs.on('data', function(data) {
    console.log('>> data: ' + data.replace('\n', '\\n'));
});
rs.on('end', function(data) {
    console.log('>> end');
});
In this example, we’re simply logging the file contents to the console (replacing newlines for neatness). You can put both of these examples in the same file: you can have a write stream writing to a file and a read stream reading it.
Duplex streams are not as common, and are beyond the scope of this book. As you might expect, you can call write to write data to a duplex stream, as well as listen for data and end events.
Because data “flows” through streams, it stands to reason that you could take the data coming out of a read stream and immediately write it to a write stream. This process is called piping. For example, we could pipe a read stream to a write stream to copy the contents of one file to another:
const fs = require('fs');

const rs = fs.createReadStream('stream.txt');
const ws = fs.createWriteStream('stream_copy.txt');
rs.pipe(ws);
Note that in this example, we don’t have to specify encoding: rs is simply piping bytes from stream.txt to ws (which is writing them to stream_copy.txt); encoding only matters if we’re trying to interpret the data.
Piping is a common technique for moving data. For example, you could pipe the contents of a file to a webserver’s response. Or you could pipe compressed data into a decompression engine, which would in turn pipe data out to a file writer.
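As a sketch of that last idea, the core zlib module provides transform streams that can sit in the middle of a pipe chain (this assumes the stream.txt file from the earlier examples):

const fs = require('fs');
const zlib = require('zlib');

// read stream.txt, compress it, and write the result to stream.txt.gz
fs.createReadStream('stream.txt')
    .pipe(zlib.createGzip())
    .pipe(fs.createWriteStream('stream.txt.gz'));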
Web Servers
While Node is being used in many applications now, its original purpose was to provide a web server, so we would be remiss not to cover this usage.
Those of you who have configured Apache—or IIS, or any other web server—may be startled at how easy it is to create a functioning web server. The http module (and its secure counterpart, the https module) exposes a createServer method that creates a basic web server. All you have to do is provide a callback function that will handle incoming requests. To start the server, you simply call its listen method and give it a port:
const http = require('http');

const server = http.createServer(function(req, res) {
    console.log(`${req.method} ${req.url}`);
    res.end('Hello world!');
});

const port = 8080;
server.listen(port, function() {
    // you can pass a callback to listen that lets you know
    // the server has started
    console.log(`server started on port ${port}`);
});
NOTE
Most operating systems prevent you from listening on the default HTTP port (80) without elevated privileges, for security reasons. As a matter of fact, you need elevated privileges to listen on any port below 1024. This is easy to get around: if you have sudo access, you can run your server with sudo to gain elevated privileges and listen on port 80 (as long as nothing else is listening on that port). For development and testing purposes, it’s common to listen on ports above 1024; numbers like 3000, 8000, 3030, and 8080 are commonly picked because they’re memorable.
If you run this program and visit http://localhost:8080 in a browser, you will see Hello world!. On the console, we’re logging all requests, which consist of a method (sometimes called a verb) and a URL path. You might be surprised to see two requests each time you go to that URL in the browser:
GET /
GET /favicon.ico
Most browsers will request an icon that they can display in the URL bar or tab; the browser will do this implicitly, which is why we see it logged on the console.
At the heart of Node’s web server is the callback function that you provide, which will respond to all incoming requests. It takes two arguments: an IncomingMessage object (often abbreviated req) and a ServerResponse object (often abbreviated res). The IncomingMessage object contains all information about the HTTP request: what URL was requested, any headers that were sent, any data sent in the body, and more. The ServerResponse object contains properties and methods to control the response that will be sent back to the client (usually a browser).

If you saw that we called res.end and wondered if res is a write stream, go to the head of the class. The ServerResponse object implements the writable stream interface, which is how you write data to the client. Because the ServerResponse object is a write stream, it’s easy to send a file: we can just create a file read stream and pipe it to the HTTP response. For example, if you have a favicon.ico file to make your website look nicer, you could detect this request and send this file directly:
const server = http.createServer(function(req, res) {
    if(req.method === 'GET' && req.url === '/favicon.ico') {
        const fs = require('fs');
        fs.createReadStream('favicon.ico')
            .pipe(res);    // this replaces the call to 'end'
    } else {
        console.log(`${req.method} ${req.url}`);
        res.end('Hello world!');
    }
});
This is a minimal web server, though not a very interesting one. With the information contained in the IncomingMessage object, you can expand this model to create any kind of website you wish.
If you’re using Node to serve websites, you’ll probably want to look into using a framework such as Express or Koa that will take some of the drudgery out of building a web server from scratch.
TIP
Koa is something of a successor to the very popular Express, and it’s no coincidence: both are the work of TJ Holowaychuk. If you’re already familiar with Express, you will feel right at home with Koa—except that you get to enjoy a more ES6-centric approach to web development.
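To give you a sense of the savings, here’s a sketch of the same “Hello world” server written with Express (assuming you’ve installed it with npm install express):

const express = require('express');

const app = express();
app.get('/', function(req, res) {
    res.send('Hello world!');    // Express sets content-type and other
                                 // response headers for you
});
app.listen(8080, function() {
    console.log('server started on port 8080');
});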
Conclusion
We’ve scratched the surface of the most important Node APIs here. We’ve focused on the ones you’ll probably see in almost any application (such as fs, Buffer, process, and stream), but there are many more APIs to learn about. The official documentation is comprehensive, but can be daunting for the beginner. Shelley Powers’ Learning Node is a good place to start if you’re interested in Node development.
1. There were attempts at server-side JavaScript before Node; notably, the Netscape Enterprise Server supported server-side JavaScript as early as 1995. However, server-side JavaScript didn’t start to gain traction until the 2009 introduction of Node.
2. The name argv is a nod to the C language. The v is for vector, which is similar to an array.