APIs: Web Workers - JUMP START HTML5 (2014)

Chapter 27 APIs: Web Workers

HTML5 apps are written in JavaScript, and their most significant limitation is that the browser’s JavaScript runtime is single-threaded. You may have run tasks asynchronously in the past using functions such as setTimeout, setInterval, and our all-time favorite, XMLHttpRequest. In reality, those functions are asynchronous but not concurrent: all JavaScript tasks are queued and run one after the other. Web Workers give us a multi-threaded environment in which several threads can execute in parallel, offering true concurrency.
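The difference between asynchronous and concurrent can be seen in a short sketch. The timer below is scheduled with a delay of 0 ms, yet its callback cannot run until the busy loop has released the thread. The names log and busyWait and the 50 ms duration are illustrative assumptions, not part of any API.

```javascript
// Record the order in which things happen.
var log = [];

// Ask for a callback after 0 ms. The callback is only queued, not run.
setTimeout(function () {
  log.push('timer callback');
}, 0);

// Block the single JavaScript thread for roughly `ms` milliseconds.
function busyWait(ms) {
  var end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread in the meantime
  }
}

busyWait(50);
log.push('busy loop finished');

// The timer's delay expired long ago, but its callback has still not run:
// it must wait until the current task ends.
console.log(log.length); // 1
console.log(log[0]);     // busy loop finished
```

Had the busy loop run inside a Worker instead, the timer callback on the main thread would not have been held up.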

In this chapter, we’ll first discuss the purpose of Web Workers and how to use them, before having a look at some of their limitations and security considerations.

Introduction and Usage

The Web Worker API allows us to write applications where a computationally expensive script can run in the background without blocking the main UI thread. As a result, unresponsive script dialogs (shown in Figure 27.1), which appear when the main thread is blocked, can be a thing of the past.

Figure 27.1. An unresponsive script message

There are two kinds of Web Workers: dedicated workers and shared workers. The main difference is visibility. A dedicated worker is accessible only from the parent script that created it, whereas a shared worker can be accessed from any script of the same origin. At the time of writing, shared workers have limited browser support: Chrome 4.0+, Safari 5.0+, and Opera 10.6+. Neither Firefox nor IE supports shared workers.

We’ll be discussing dedicated workers in this book, as that’s what you’ll most probably use.

Note: Detecting Support

Before going any further, I just want to make sure that you detect the browser’s support for Web Workers. In the previous chapter, I showed an example where we detected the support for Web Workers using native JavaScript and Modernizr, so we’ll skip repeating it here.
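For readers who do not have the previous chapter to hand, a native check can be as small as the following sketch. The helper name supportsWebWorkers and the root variable are assumptions made for illustration; the check itself only tests whether the global object exposes a Worker constructor.

```javascript
// Returns true when the given global object exposes the Worker constructor.
function supportsWebWorkers(globalObject) {
  return typeof globalObject.Worker !== 'undefined';
}

// In a page the global object is `window`; fall back to `globalThis` elsewhere.
var root = typeof window !== 'undefined' ? window : globalThis;

if (supportsWebWorkers(root)) {
  console.log('Web Workers are available');
} else {
  console.log('No Web Worker support: run the task on the main thread instead');
}
```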

To use Web Workers, you just need to call the Worker constructor and pass the URI of the Worker script:

var worker=new Worker('myworker.js');

Note: Worker Script Path

Please note that the Worker script must have the same origin as the parent page. A relative path is resolved against the location of the page that creates the Worker.

Here, myworker.js is the Worker script that needs to be executed in the background. The script starts running as soon as the Worker is created. To communicate with the Worker, call postMessage on the worker object and pass it a message:

// pass a message to the worker
worker.postMessage('Hey, are you in the mood to start work?');

// the argument is required; pass null when there is nothing to say
worker.postMessage(null);

But communication doesn’t have to be unidirectional. Our Worker can also reply! To receive a message sent by our Worker, we attach an event listener to the worker object, like so:

// register a callback
worker.addEventListener('message', function (e) {
  alert('Got message from worker, ' + e.data);
}, false);

The handler function is also passed a message event object. This object has a property called data that contains the actual message sent. So in the above code, we accessed the e.data property to retrieve the data that was sent by our Worker.

And what happens to the message passed as an argument to postMessage() in our main script? It is delivered to the Worker script, which can receive it by registering the same kind of event listener on its own global scope. The following snippet shows how to do that:

myworker.js

// reply sent from the worker
self.addEventListener('message', function (e) {
  self.postMessage('Hey, I am doing what you told me to do!');
}, false);

Warning: Workers Are Sandboxed

Workers run in a sandboxed environment. This means that they’re unable to access everything a normal script can. For example, the global window object is not available inside myworker.js, so writing window.addEventListener instead of self.addEventListener throws a ReferenceError. Workers also have no access to the DOM. Why? More on that later.

If an error occurs in the Worker, an error event is fired on the worker object and its onerror handler, if set, is called. The following callback should be registered in the parent script:

// register an error callback
worker.addEventListener('error', function (e) {
  console.log(
    'Error occurred at line: ' + e.lineno + ' in file ' + e.filename
  );
}, false);

Passing JSON data

A postMessage call carries a single message; however, that message can be a complex object containing any number of items. Let’s modify our code to pass a JSON-style object:

parentScript.js

var worker = new Worker('myworker.js');

worker.addEventListener('message', function (e) {
  alert('Got answer: ' + e.data.answer + ' from: ' + e.data.answerer);
}, false);

worker.postMessage({'question': 'how are you?', 'askedBy': 'Parent'});

myworker.js

self.addEventListener('message', function (e) {
  console.log(
    'Question: ' + e.data.question +
    ' asked by: ' + e.data.askedBy
  );
  self.postMessage({
    'answer': 'Doing pretty good!',
    'answerer': 'Worker'
  });
}, false);

Note: Worker Data Is Copied

The data you pass to the Worker is copied, not shared: the receiver always gets its own copy of whatever was sent. The data is serialized just before it is sent and deserialized on the receiving side. You may wonder why it is implemented this way. Simple! To avoid threading issues, such as two threads modifying the same object at once.
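The copy semantics can be tried out without a Worker at all. The sketch below assumes a runtime that provides the global structuredClone() function, which applies the same kind of copying that postMessage() performs; where it is missing, the code falls back to a JSON round trip, which is only an approximation. The helper name copyLikePostMessage is hypothetical.

```javascript
// Copy a value roughly the way postMessage() does.
// Assumption: structuredClone() exists in this runtime. The JSON round trip
// is only an approximate fallback (for example, it drops functions and
// turns Date objects into strings).
function copyLikePostMessage(value) {
  if (typeof structuredClone === 'function') {
    return structuredClone(value);
  }
  return JSON.parse(JSON.stringify(value));
}

var original = { question: 'how are you?', tags: ['greeting'] };
var received = copyLikePostMessage(original);

// Changing the copy on the "receiving side" leaves the sender's object alone.
received.tags.push('edited by worker');

console.log(original.tags.length);  // 1
console.log(received.tags.length);  // 2
console.log(received === original); // false
```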

Web Worker Features

As noted previously, Web Workers run in a sandboxed environment. Their features include:

· read-only access to navigator and location objects

· timer functions such as setTimeout and setInterval, plus the XMLHttpRequest object, just like the main thread

· creating and starting subworkers

· importing other scripts through the importScripts() function

· ability to take advantage of AppCache

They have no access to:

· the window, parent, and document objects (use self or this in Workers for global scope)

· the DOM

Note: Why Workers Are Unable to Access the DOM

Browsers (and the DOM) operate on a single thread. That is deliberate: while your code runs, it can block or cancel other actions, such as a link being clicked, and code running on several threads at once could break that guarantee. The DOM is also not thread-safe. For these reasons, Worker threads have no access to the DOM, but that won't stop your Workers from modifying main page content. You can always pass a result back to the parent script and let the UI thread update the DOM content. I will show you how later in the chapter.

There is just one final issue before we move on to more advanced features: how to close a thread. To terminate a thread, just call worker.terminate() from the main script or self.close() from the Worker itself.
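As a minimal sketch of the first option, the helper below sends one message, passes the first reply to a callback, and then terminates the Worker from the main script. The name askOnce and the 'ping' message are assumptions for illustration; myworker.js is the Worker script used earlier in this chapter.

```javascript
// Post one message to a worker, hand the first reply to `onReply`,
// then shut the worker down so that its thread is released.
function askOnce(worker, message, onReply) {
  function handleMessage(e) {
    worker.removeEventListener('message', handleMessage, false);
    worker.terminate(); // stops the worker from the outside
    onReply(e.data);
  }
  worker.addEventListener('message', handleMessage, false);
  worker.postMessage(message);
}

// Usage in a page (skipped where there is no Worker constructor or document):
if (typeof Worker !== 'undefined' && typeof document !== 'undefined') {
  askOnce(new Worker('myworker.js'), 'ping', function (reply) {
    console.log('Worker said: ' + reply);
  });
}
```

The second option lives in the Worker script instead: calling self.close() right after self.postMessage() has the Worker shut itself down once it has replied.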

More Advanced Workers

Inline Workers

Everybody loves to do things on the fly. Since Web Workers run in a separate context, the Worker constructor expects a URI that specifies an external script file to run. But if you want to be really quick, you can create Inline Workers on the fly through blobs. Have a look at the following code:

var blob = new Blob([
  "onmessage = function(e) { self.postMessage(e.data); };"
]);
var worker = new Worker(window.URL.createObjectURL(blob));

worker.addEventListener('message', function (e) {
  alert('Got same Message: ' + e.data + ' from worker');
}, false);

worker.postMessage('Good Morning Worker!!');

In this snippet, we wrote the content of our Worker in a Blob. window.URL.createObjectURL essentially creates a URI (for example, blob:null/027b645d-be05-4f14-8866-e52604777608) that references the content of the Worker, and that URI is passed to the Worker constructor. Then we proceed as usual.

Since it’s inconvenient to put all the Worker content into a Blob constructor, we can alternatively put all the Worker code in a separate script tag inside the parent HTML. Then, at runtime, we pass that content to the Blob constructor.

In the following example, we write the Worker in the parent HTML page itself. Once the Worker starts, it will execute a function every second and return the current time to the main thread. The main thread will then update the div with the time (remember we talked about updating the DOM?):

parentPage.html

<!DOCTYPE html>
<html>
<head>
<!--
The following script won't be executed by the browser
because of its unrecognized type
-->
<script type="text/javascript-worker" id="jsworker">
  setInterval(function () {
    postMessage(getTime());
  }, 1000);

  function getTime() {
    var d = new Date();
    return d.getHours() + ":" + d.getMinutes() + ":" + d.getSeconds();
  }
</script>
<script>
  // build the Worker from the text of the script tag above
  var blob = new Blob(
    [document.querySelector("#jsworker").textContent]
  );
  var blobURI = window.URL.createObjectURL(blob);
  var worker = new Worker(blobURI);

  worker.addEventListener('message', function (e) {
    document.getElementById('currTime').textContent = e.data;
  }, false);
  // no postMessage() call is needed: the Worker starts its timer on load
</script>
</head>
<body>
<div id="currTime"></div>
</body>
</html>

This code is fairly self-explanatory. Once the Worker starts, it registers a function that runs every second and posts the current time. The main thread receives that value and updates the DOM. For better performance, the reference to #currTime can be looked up once and cached.

If you’re creating many blob URLs, it’s good practice to release them once you’re done (I’d recommend that you avoid creating too many blob URLs):

window.URL.revokeObjectURL(blobURI); // release the resource

Creating Subworkers Inside Workers

In your Worker files you can further create subworkers and use them. The process is the same. The main benefit is that you can divide your task between many threads. The URIs of the subworkers are resolved relative to their parent script’s location, rather than the main HTML document. This is done so that each Worker can manage its dependencies clearly.
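One way to divide a task is to split a numeric range into chunks and hand each chunk to its own subworker. The following is only a sketch under several assumptions: splitRange is a hypothetical helper, subworker.js is a hypothetical script that replies with a number for the chunk it receives, and the count of four subworkers is arbitrary. Browser support for nested workers has been uneven, so check it before relying on this pattern.

```javascript
// Split the half-open range [start, end) into at most `parts` chunks.
function splitRange(start, end, parts) {
  var chunks = [];
  var size = Math.ceil((end - start) / parts);
  for (var from = start; from < end; from += size) {
    chunks.push({ from: from, to: Math.min(from + size, end) });
  }
  return chunks;
}

// Inside a worker script: give each chunk to a subworker and add up the
// partial results. importScripts only exists inside a Worker, so this part
// is skipped everywhere else.
if (typeof importScripts === 'function') {
  self.addEventListener('message', function (e) {
    var chunks = splitRange(e.data.start, e.data.end, 4);
    var pending = chunks.length;
    var total = 0;
    if (pending === 0) {
      self.postMessage(total);
    }
    chunks.forEach(function (chunk) {
      // hypothetical file, resolved relative to this worker script
      var sub = new Worker('subworker.js');
      sub.addEventListener('message', function (reply) {
        total += reply.data;
        sub.terminate();
        pending -= 1;
        if (pending === 0) {
          self.postMessage(total);
        }
      }, false);
      sub.postMessage(chunk);
    });
  }, false);
}
```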

Using External Scripts within Workers

Your Workers have access to the importScripts() function, which lets them import external scripts easily. The following snippet shows how to do it:

// import a single script
importScripts('external.js');

// import two script files
importScripts('external1.js', 'external2.js');

Note: URIs are Relative to the Worker

When you import an external script from the Worker, the URI of the script is resolved relative to the Worker file location instead of the main HTML document.

Security Considerations

The Web Worker API follows the same-origin policy. This means that the argument to Worker() must have the same origin as the calling page.

For example, if my calling page is at http://xyz.com/callingPage.html, the Worker cannot be on http://somethingelse.com/worker.js. The Worker is allowed as long as its location starts with http://xyz.com. Similarly, an http page cannot spawn a Worker whose location starts with https://.
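The rule can be expressed as a comparison of origins (scheme, host, and port). The sketch below assumes a runtime that provides the standard URL constructor; the helper name isSameOrigin is hypothetical, and the URLs are the ones from the example above.

```javascript
// Returns true when `workerUrl`, resolved against `pageUrl`, has the same
// scheme, host, and port as the page.
function isSameOrigin(pageUrl, workerUrl) {
  var page = new URL(pageUrl);
  var worker = new URL(workerUrl, pageUrl);
  return page.origin === worker.origin;
}

var page = 'http://xyz.com/callingPage.html';

console.log(isSameOrigin(page, 'worker.js'));                          // true
console.log(isSameOrigin(page, 'http://somethingelse.com/worker.js')); // false
console.log(isSameOrigin(page, 'https://xyz.com/worker.js'));          // false
```

The browser performs this check itself when the Worker constructor is called; the helper only makes the comparison visible.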

Figure 27.2 shows the error thrown by Chrome when trying to go out of the origin:

Figure 27.2. Origin error in Chrome

Some browsers may throw security exceptions if you try to access the files locally (via the file:// protocol). If you get any such exceptions, serve your files from a local web server and open the page with a URL such as http://localhost/project/somepage.html.

Polyfills for Older Browsers

What if the browser does not support the Web Workers API? Several polyfills are available that support older browsers by simulating the behavior of Web Workers. The Modernizr page on GitHub has a list of such polyfills, but long-running code may fail with these implementations. In those cases, it may be necessary to offload some processing to the server via Ajax.

Conclusion

Web Workers can give your app a big boost in responsiveness, because heavy work runs on a separate thread instead of blocking the UI. All modern browsers, including IE10 and above, offer support for Web Workers.

Here are a few use cases that you can try to implement:

· long polling in the background and notifying the user about new updates

· pre-fetching and caching content

· performing computationally expensive tasks and long-running loops in the background

· a spell-checker that runs continuously in the background
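As a starting point for the third use case, here is a minimal sketch of a Worker script that counts prime numbers. The function name countPrimes and the shape of the reply object are assumptions for illustration. The parent script would create the Worker as shown earlier and send it a limit, for example worker.postMessage(5000000).

```javascript
// Count the primes up to and including `limit` with a sieve of Eratosthenes.
function countPrimes(limit) {
  if (limit < 2) {
    return 0;
  }
  var composite = new Uint8Array(limit + 1); // 0 means "not crossed out yet"
  var count = 0;
  for (var n = 2; n <= limit; n++) {
    if (composite[n] === 0) {
      count++;
      // cross out every multiple of n, starting at n * n
      for (var m = n * n; m <= limit; m += n) {
        composite[m] = 1;
      }
    }
  }
  return count;
}

// When loaded as a Worker script, answer each message with the count.
// importScripts only exists inside a Worker, so it doubles as a scope check.
if (typeof importScripts === 'function') {
  self.addEventListener('message', function (e) {
    self.postMessage({ limit: e.data, primes: countPrimes(e.data) });
  }, false);
}
```

Because the loop runs in the Worker, the page stays responsive while the count is computed, and the main thread only has to display the result when the reply arrives.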