Web Workers - HTML5 APIs - HTML5, JavaScript and jQuery (Programmer to Programmer) - 2015

Part IV HTML5 APIs

Lesson 33 Web Workers

JavaScript has always been a single-threaded programming language. Essentially this means that the language is only capable of performing a single operation at a time, and therefore only capable of utilizing a single CPU on the underlying hardware.

This was not a problem when JavaScript was first created because the vast majority of devices that ran browsers supported only a single CPU. Over the last 10 years there has been a major change in hardware, however, and the vast majority of devices now support multiple CPUs. Even smart phones typically support up to four CPUs.

Note

The terms “processor” and “CPU” are essentially interchangeable and refer to the hardware responsible for executing the instructions provided by software. Many CPUs support multiple cores: A multi-core processor essentially contains multiple independent units capable of executing instructions on the same CPU. The terms “core” and “CPU” will therefore be used interchangeably in this lesson.

A programming language is considered multi-threaded if it is capable of specifying multiple sets of instructions that can be run in parallel. Each set of instructions is encapsulated inside a thread.

The key reason that multi-threaded programming languages are important is performance. If you envisage a device with four CPUs, it is capable of executing four sets of instructions in parallel. If a programming language is not multi-threaded, however, it is only capable of using 25 percent of the overall processing power at any point in time. The same software written in a multi-threaded programming language may therefore execute up to four times faster.

Note

Multi-threaded software can still execute on a device with a single CPU. In this case the operating system is responsible for providing each thread a share of the processor.

There are good reasons why JavaScript does not support multiple threads. Programming with multiple threads can cause issues because two different threads may perform operations simultaneously that impact the same underlying data. This is particularly true with the DOM: If two threads were to simultaneously update the DOM it may be very difficult for the browser to determine what the outcome should be.

JavaScript Event Model

In order to adapt to the changing face of hardware, HTML5 has introduced an important API called web workers that can be used to create multi-threaded JavaScript programs.

Before looking at web workers, it is worth investigating how best to write responsive web pages within the single-threaded model. These techniques are useful because, as you will see, there are some limitations on web workers; therefore, they will not always be a viable option.

Imagine that the user presses a button onscreen, and this causes the browser to execute a set of instructions that will take 10 seconds to complete. For instance, the button may call the following function, which attempts to find the highest random number from one billion possibilities:

function findLargest() {
    var max = 0;
    for (var i = 0; i <= 1000000000; i++) {
        max = Math.max(max, Math.random());
    }
    console.log(max);
}

While this processing is occurring (which takes approximately 10 seconds on my computer) the web browser will be completely unresponsive. If the user clicks buttons, nothing appears to happen: The button will not even change appearance. This typically causes the user to click and click and click. If that was not bad enough, once the processing finishes, all those clicks will suddenly fire, which can cause havoc.

The code that is executed when the user clicks a button is wrapped in an event and placed on a queue. The JavaScript engine is then responsible for processing the code in each event in the order the events were created, not processing the next event until the previous one has completed.

In order to observe this, create the following web page or download it from the book's website (it is called longandshort.html):

<!DOCTYPE html>
<html lang="en">
<body>
    <a href="#" onclick="findLargest()">Long operation</a>
    <a href="#" onclick="getDate()">Short operation</a>
    <script>
    function findLargest() {
        var max = 0;
        for (var i = 0; i <= 1000000000; i++) {
            max = Math.max(max, Math.random());
        }
        console.log(max);
    }
    function getDate() {
        console.log("The time is " + new Date());
    }
    </script>
</body>
</html>

This page contains two links: one generates an event that runs for approximately 10 seconds, while the other generates an event that takes milliseconds. Open this page with the console open and perform the following:

1. Click the Long operation link.

2. Immediately click the Short operation link 12 times.

You should see the output shown in Figure 33.1.

Figure 33.1

Even if you finish clicking the Short operation link before the Long operation finishes, none of the clicks are processed until the Long operation completes.

This can create a terrible user experience. Any delay above approximately 200 milliseconds will be noticeable to users and will affect their impression of the web page.

What should you do if you need to perform processing that will take more than 200 milliseconds, and you cannot use web workers?

The ideal approach in this case is to take advantage of the setTimeout function. This function can be used to create an event that will execute at a defined time in the future. For instance, execute the following in the console:

setTimeout(function() {
    console.log('Testing');
}, 2000);

This will print “Testing” in the console 2 seconds after the code is executed. The first parameter is the code to execute, while the second parameter is the delay in milliseconds.

If you think about this in the context of the JavaScript event model, the function passed to setTimeout becomes an event and is added to the queue of events when the specified time is reached. Just like any other event, it will not actually execute until it reaches the front of the event queue.
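The point that a timer callback must wait its turn can be checked directly. The following is a minimal sketch that assumes only the standard timer functions; the 10 ms and 100 ms figures and the variable names are illustrative and not taken from the lesson. It schedules a callback for 10 milliseconds, keeps the thread busy for about 100 milliseconds, and records how late the callback actually runs.

```javascript
var scheduledAt = Date.now();
var observedDelay = null;

// Ask for the callback to run after 10 ms.
setTimeout(function () {
  observedDelay = Date.now() - scheduledAt;
  console.log('Callback ran after ' + observedDelay + ' ms');
}, 10);

// Busy-wait for roughly 100 ms. The timer expires during this loop,
// but its callback cannot run until the current code has finished.
while (Date.now() - scheduledAt < 100) {
  // deliberately empty
}
console.log('Busy loop finished');
```

The logged delay should be at least 100 ms rather than 10 ms, because the callback only reaches the front of the queue once the busy loop has returned.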

You can therefore split the algorithm up into separate portions and pass each to setTimeout in turn:

function findLargest() {
    var max = 0;
    var iterations = 0;
    function findLargestSub() {
        while (true) {
            iterations++;
            if (iterations === 1000000000) {
                console.log(max);
                break;
            } else if (iterations % 10000000 === 0) {
                setTimeout(findLargestSub, 10);
                break;
            } else {
                max = Math.max(max, Math.random());
            }
        }
    }
    findLargestSub();
}

The findLargest function now contains a sub-function called findLargestSub. The sub-function is essentially the same as the original function, except it processes a maximum of 10 million numbers.

If the processing has not completed after these 10 million numbers are processed, the sub-function halts and requests that it be invoked again with a 10-millisecond delay. Not only is there a delay, however, but the next portion of the algorithm will be placed at the end of the event queue, allowing any other events that have occurred a chance to complete.

If you make these changes and run the same operations again, you should notice that clicking the Short operation link produces an almost immediate response, even while the Long operation is processing in the background.
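The same chunking pattern can be factored into a reusable helper. What follows is a sketch under stated assumptions, not code from the book: runInChunks, its parameters, and the sizes used are hypothetical names and values chosen for illustration, and it uses a delay of 0 rather than 10 milliseconds. It calls step once per iteration, yields to the event queue with setTimeout after each chunk, and calls done when every iteration has run.

```javascript
// Hypothetical helper: runs step(i) for i = 0 .. total - 1 in chunks,
// yielding to the event queue between chunks.
// chunkSize is assumed to be a positive integer.
function runInChunks(total, chunkSize, step, done) {
  var next = 0;

  function runChunk() {
    var end = Math.min(next + chunkSize, total);
    for (; next < end; next++) {
      step(next);
    }
    if (next < total) {
      // Queue the next chunk behind any events that arrived meanwhile.
      setTimeout(runChunk, 0);
    } else {
      done();
    }
  }

  runChunk();
}

// Usage: find the largest of 100,000 random numbers, 10,000 per chunk.
var max = 0;
var stepsRun = 0;
var finished = false;
runInChunks(100000, 10000, function () {
  stepsRun++;
  max = Math.max(max, Math.random());
}, function () {
  finished = true;
  console.log('Largest value: ' + max);
});
```

Only the first chunk runs before runInChunks returns; the remaining chunks are interleaved with whatever other events are waiting, which is what keeps the page responsive.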

It can be difficult to write algorithms in this manner, however, which is one reason web workers are an attractive option.

Web Workers

The web worker specification is part of HTML5 and is widely supported by the major browsers. The web worker API allows you to create a JavaScript file that will execute on an entirely separate thread from the JavaScript event thread. This code can, however, be passed messages from the JavaScript event thread and provide results back when it completes. Figure 33.2 shows the basic pattern used by web workers.

Figure 33.2

The two outer boxes represent two operating system threads. The browser thread is responsible for executing all the JavaScript included in the page or imported scripts, while the web worker thread is responsible for executing code in a web worker file. The two threads can only communicate via messages routed by the web worker API.

Note

You might be wondering how web workers handle the potential issues mentioned earlier in this lesson if a web worker updates the DOM at the same time as the code in the main JavaScript thread. The web worker API has a convenient answer for that problem; it is not possible to access the document object from a web worker. If you need to update the DOM as a result of web worker processing, the web worker needs to pass the result back to the main JavaScript thread, which can then update the DOM.

Web workers actually have a number of limitations. In addition to the document object, they are unable to access localStorage or sessionStorage.

In order to see the benefit of web workers, you need a piece of code that runs for an extended period of time. In order to simulate this, you will create an array of one million random numbers, sort them, and display the lowest number:

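var result = [];
for (var i = 0; i < 1000000; i++) {
    result.push(Math.random());
}
// Sort numerically; without a comparator, sort() compares values as strings.
result.sort(function(a, b) { return a - b; });
console.log(result[0]);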

Depending on the speed of your computer, you may want to increase or decrease the quantity of numbers.
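When sorting numbers, it matters whether sort is given a comparator function. Without one, the array elements are converted to strings and compared character by character, and Math.random() occasionally produces values small enough to be written in exponent form, such as 2e-7. The following sketch uses three hand-picked values (chosen for illustration only) to show the difference.

```javascript
var values = [0.5, 0.25, 2e-7];

// Without a comparator, elements are compared as strings:
// "0.25" < "0.5" < "2e-7", so the smallest number ends up last.
var asStrings = values.slice().sort();

// With a numeric comparator, elements are compared by value.
var asNumbers = values.slice().sort(function (a, b) {
  return a - b;
});

console.log(asStrings); // 0.25, 0.5, 2e-7
console.log(asNumbers); // 2e-7, 0.25, 0.5
```

If the goal is the lowest number, the first element is only reliable when the sort is numeric.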

You will now create a web worker that accepts a parameter representing the number of random numbers to create and returns the smallest number. The code of a web worker needs to be created in a separate file, so create a file called random.js (in the same folder as contacts.js so that it is available from the web server), and add the following contents:

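self.addEventListener('message', function(msg) {
    var data = msg.data;
    var result = [];
    for (var i = 0; i < data; i++) {
        result.push(Math.random());
    }
    // Sort numerically; without a comparator, sort() compares values as strings.
    result.sort(function(a, b) { return a - b; });
    self.postMessage(result[0]);
}, false);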

When this web worker is loaded, it starts by adding an event listener that allows it to be notified when a message is available for it. As you will see, it does this by invoking addEventListener on an object called self.

The self object represents the global namespace, and is therefore the equivalent of the window object for conventional JavaScript code. In fact, if you type self at the console of an ordinary web page, it will return the window object. As mentioned, a web worker cannot access the window object, so inside a worker self refers to the worker's own global object, which implements the WorkerGlobalScope interface (for a worker created with new Worker, it is a DedicatedWorkerGlobalScope).

As you can see, when the web worker receives a message, it can extract the information from the data property of the message event; for the purposes of this example, data will be a number.

The web worker then performs any processing necessary. It can use any features of the JavaScript language it needs, including built-in libraries such as Math.

Once the web worker has a result, it can return it to the main browser thread using the postMessage function.
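The message passed in either direction does not have to be a bare number, because postMessage copies plain objects and arrays as well. As a hedged illustration, the sketch below has the worker accept an object and echo back an identifier so that the page can match each reply to its request. The id, count, and smallest property names and the handleRequest function are assumptions made for this example and are not part of the lesson's random.js. It also tracks the minimum directly instead of sorting.

```javascript
// Hypothetical request and response shapes (not from the lesson):
//   request:  { id: 7, count: 1000 }
//   response: { id: 7, smallest: 0.00042 }
function handleRequest(request) {
  var smallest = Infinity;
  for (var i = 0; i < request.count; i++) {
    smallest = Math.min(smallest, Math.random());
  }
  return { id: request.id, smallest: smallest };
}

// importScripts exists only inside a worker, so this wiring is
// skipped when the file is loaded anywhere else.
if (typeof importScripts === 'function') {
  self.addEventListener('message', function (msg) {
    self.postMessage(handleRequest(msg.data));
  }, false);
}
```

Keeping the calculation in an ordinary function, separate from the message listener, also makes it possible to try the logic outside a worker.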

You will now create a simple web page that allows a number to be entered into a form. When the form is submitted, the number will be passed to the web worker, and the result added to a table.

Create a web page called findnumbers.html in the same folder as contacts.html, with the following content (this is available on the book's website):

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Lowest number</title>
    <link rel="stylesheet" media="all" type="text/css" href="contacts.css">
    <script src="jquery-2.1.1.js"></script>
</head>
<body>
    <header>Find the lowest number</header>
    <form method="post" style="margin:30px">
        <div class="formRow">
            <label for="theNumber">Enter a number</label>
            <input required name="theNumber" type="number"
                class="validated" id="theNumber"/>
        </div>
        <div class="formRow">
            <input style="width:70px" type="submit"
                title="Find" value="Find"/>
        </div>
    </form>
    <section id="numberList" style="margin:30px">
        <table>
            <thead>
                <tr><th>Number entered</th><th>Result</th></tr>
            </thead>
            <tbody></tbody>
        </table>
    </section>
    <script>
    $('form input[type="submit"]').click(
        function(evt) {
            evt.preventDefault();
            var number = $('#theNumber').val();
            var row = $('<tr>').append('<td>'+number+'</td>').append('<td>'+0+'</td>');
            $('#numberList table tbody').append(row);
        });
    </script>
</body>
</html>

This has not been integrated with the web worker yet, but if you open it (from the web server) and submit some numbers, it should display as you see in Figure 33.3.

Figure 33.3

You will now add the code to construct a web worker, pass it your numbers, and listen for the result. Start by changing the JavaScript code as follows; I will then step through each new line:

$('form input[type="submit"]').click(
    function(evt) {
        evt.preventDefault();
        var number = $('#theNumber').val();
        var worker = new Worker('random.js');
        worker.addEventListener('message', function(evt) {
            var result = evt.data;
            var row = $('<tr>').append('<td>'+number+'</td>').append('<td>'+result+'</td>');
            $('#numberList table tbody').append(row);
        }, false);
        // Pass the radix explicitly so the input is always read as base 10.
        worker.postMessage(parseInt(number, 10));
    });

The code starts by constructing a web worker using the following line of code:

var worker = new Worker('random.js');

Notice that this passes a reference to the script you created earlier using a relative URL. Notice also that you did not import the random.js script at the top of the web page: It is the reference to it here that causes the script to be downloaded from the server.

Note

If you need the web worker to function while offline, you can also add web worker scripts to application cache manifest files.

Once the web worker has been constructed, you register an event listener with it so that you can hear when it posts a message back to the main browser thread. In this example, this will occur when the lowest random number is identified; therefore, you also have code to process this result.

The final line of code is where you actually post a message to the web worker, causing it to begin processing. In this case, you pass the number that was extracted from the form.
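The construct, listen, and post sequence described above can be gathered into one function. The sketch below is an assumption-laden variation rather than the book's code: requestSmallest, createWorker, onResult, and onError are hypothetical names. It relies on two standard parts of the Worker API that the lesson does not use, the error event and the terminate method, to report failures and to release the thread once the single reply has arrived. The worker is supplied by a factory function so that the sketch can be exercised without a browser.

```javascript
// createWorker is a hypothetical factory; in a page it could be
//   function () { return new Worker('random.js'); }
function requestSmallest(createWorker, count, onResult, onError) {
  var worker = createWorker();

  worker.addEventListener('message', function (evt) {
    // Only one reply is expected, so the worker can be shut down.
    worker.terminate();
    onResult(evt.data);
  }, false);

  worker.addEventListener('error', function (evt) {
    // Fired when the worker script throws an uncaught error.
    worker.terminate();
    onError(evt.message);
  }, false);

  // Posting the message is what starts the processing.
  worker.postMessage(count);
}
```

In the page from this lesson, the click handler could call this function with the parsed number and append the table row inside onResult.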

You should now be able to submit numbers in the form and see them being passed to the web worker. As you post progressively larger numbers, you will notice that there is a delay before the table is updated, but despite this, it is still possible to submit another number in the form.

You can try submitting several numbers at a time and then opening Task Manager on Windows or Activity Monitor on OS X. For instance, I submitted four numbers on a machine with four cores, and saw the results shown in Figure 33.4.

Figure 33.4

The four boxes on the right side show the utilization of the four CPUs over time. As you can see, as soon as I submitted four numbers, all four CPUs used 100 percent of their available resources. This ultimately meant that the four tasks completed close to four times quicker than they would have if they had run one after another on a single thread.

You should also notice that if you submit the numbers in quick succession, the results also arrive in quick succession, which indicates that they were processed in parallel.

It is also important to realize that even though my machine has only four cores, I could still submit 10 or 20 numbers simultaneously. In this case, each number would be processed on a separate thread, and it would be up to the operating system to provide a share of the CPUs to each thread.

Try It

In this Try It, you will convert the longandshort.html web page from earlier in this lesson to use web workers. This will ensure that it is always possible to click “Short operation” and receive immediate feedback in the console.

Lesson Requirements

To complete this lesson, you will need a text editor for writing code and Chrome for running the completed web page. You should also download longandshort.html from the book's website.

This example needs to execute inside a web server so the resources mentioned need to be added to the same folder as contacts.html, and the Mongoose web server needs to be running.

Step-by-Step

1. Create a web worker in a separate file and add code for it to listen for messages being posted to it. I have called mine find_number.js. The web worker will not read any data from the message, but the message will be used to signal to the web worker that it should begin processing.

2. Move the code from findLargest to the web worker, and when it finishes processing, post the result (the maximum number) back to the main browser thread using postMessage.

3. Change the logic of findLargest so that it constructs a web worker and posts a message to it. When the processing completes, it should receive the response from the web worker and print this to the console.

4. Ensure that the processing of the Long operation does not block the Short operation from executing immediately.
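One possible shape for the worker in steps 1 and 2 is sketched below. It is one way to approach the exercise, not the book's solution: the iterations parameter and the importScripts check are additions made so that the function can be tried with small values outside a worker. The page side of step 3 is shown as a comment because it needs a browser to run.

```javascript
// find_number.js: a sketch of the worker described in steps 1 and 2.

// The iteration count is a parameter only so the function can be tried
// with small values; the lesson's original loop runs one billion times.
function findLargest(iterations) {
  var max = 0;
  for (var i = 0; i < iterations; i++) {
    max = Math.max(max, Math.random());
  }
  return max;
}

// importScripts is defined only inside a worker, so the listener is
// registered only when this file is loaded by new Worker(...).
if (typeof importScripts === 'function') {
  self.addEventListener('message', function () {
    // The message carries no data; its arrival is the signal to start.
    self.postMessage(findLargest(1000000000));
  }, false);
}

// Page side (step 3), shown as a comment because it needs a browser:
//
//   function findLargest() {
//     var worker = new Worker('find_number.js');
//     worker.addEventListener('message', function (evt) {
//       console.log(evt.data);
//     }, false);
//     worker.postMessage(null);
//   }
```

With this arrangement the long loop runs on the worker thread, so clicks on the Short operation link are handled by the main thread without waiting.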

Reference

Please go to the book's website at www.wrox.com/go/html5jsjquery24hr to view the video for Lesson 33, as well as download the code and resources for this lesson.