doc: topic blocking vs non-blocking #5326

Closed
143 changes: 143 additions & 0 deletions doc/topics/blocking-vs-non-blocking.md
@@ -0,0 +1,143 @@
# Overview of Blocking vs Non-Blocking

This overview covers the difference between **blocking** and **non-blocking**
Contributor

Maybe we should clarify that blocking is synonymous with synchronous and non-blocking with asynchronous, as all four terms are used throughout this document.

Contributor Author

Good idea, I'll make the relationship explicit

Contributor Author

@tomgco I put this note at the top of the Comparing Code section prior to showing the code samples

calls in Node.js. This overview will refer to the event loop and libuv, but no
prior knowledge of those topics is required. Readers are assumed to have a
basic understanding of the JavaScript language and the Node.js callback pattern.

> "I/O" refers primarily to interaction with the system's disk and
Contributor

This is quoting, right? Source is missing.

Contributor Author

Not a quote, just a note

network supported by [libuv](http://libuv.org/).


## Blocking

**Blocking** is when the execution of additional JavaScript in the Node.js
process must wait until a non-JavaScript operation completes. This happens
Contributor

non-JavaScript operation

I think this needs to be clarified. For example,

while(1);

will block and it is a JavaScript operation.

Contributor Author

There is a clarification to this a couple sentences later

In Node.js, JavaScript that exhibits poor performance due to being CPU intensive rather than waiting on a non-JavaScript operation, such as I/O, isn't typically referred to as blocking.

because the event loop is unable to continue running JavaScript while a
**blocking** operation is executing.

In Node.js, JavaScript that exhibits poor performance due to being CPU intensive
rather than waiting on a non-JavaScript operation, such as I/O, isn't typically
referred to as **blocking**. Synchronous methods in the Node.js standard library
that use libuv are the most commonly used **blocking** operations. Native
modules may also have **blocking** methods.
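The effect described above can be observed directly. The following is a minimal sketch, not part of the original patch: it schedules a zero-delay timer and then performs a synchronous read. Reading `process.execPath` (the Node.js binary) is only an assumption chosen because that file is known to exist; any readable file would do.

```js
const fs = require('fs');

// Schedule a callback for the next turn of the event loop.
let timerFired = false;
setTimeout(() => { timerFired = true; }, 0);

// Blocking call: no other JavaScript, including the timer callback above,
// can run until this returns. process.execPath is just a convenient file.
const bytes = fs.readFileSync(process.execPath).length;

// The timer has not fired yet, because the event loop was never given
// a chance to run while the synchronous read was in progress.
const firedDuringRead = timerFired;
console.log(`read ${bytes} bytes, timer fired during read: ${firedDuringRead}`);
```

The timer callback only runs after the current synchronous code, including the read, has finished.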

All of the I/O methods in the Node.js standard library provide asynchronous
Contributor

standard library, right? Did you mean core modules?

versions, which are **non-blocking**, and accept callback functions. Some
methods also have **blocking** counterparts, which usually have names that end
Contributor

I believe that they will always be named Sync if a method is synchronous.

with `Sync`.
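As a rough illustration of the naming convention, the sketch below inspects the `fs` module at runtime and pairs each function whose name ends in `Sync` with the name obtained by dropping that suffix. It only reports what it finds on the installed Node.js version; it is not a claim that every method follows the pattern.

```js
const fs = require('fs');

// Collect exported functions whose names end in "Sync".
const syncNames = Object.keys(fs).filter(
  (name) => name.endsWith('Sync') && typeof fs[name] === 'function'
);

// For each one, check whether a function without the suffix also exists.
const pairs = syncNames.map((syncName) => {
  const asyncName = syncName.slice(0, -'Sync'.length);
  return { syncName, asyncName, hasAsync: typeof fs[asyncName] === 'function' };
});

console.log(pairs.slice(0, 5));
```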


## Comparing Code

Could we highlight: put this in Bold this is a synchronous file read -> this is a synchronous file read
then highlight: equivalent asynchronous example to equivalent asynchronous example
I think it's easier to notice.


Using the File System module as an example, this is a **synchronous** file read:

```js
const fs = require('fs');
const data = fs.readFileSync('/file.md'); // blocks here until file is read
```

And here is an equivalent **asynchronous** example:

```js
const fs = require('fs');
fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
});
```

The first example appears simpler than the second, but it has the disadvantage of the second line
Contributor

simpler

comparative degree used. Shouldn't this be

The first example appears simpler than the second

or simply

The first example appears simple

**blocking** the execution of any additional JavaScript until the entire file is
read. Note that in the synchronous version, if an error is thrown, it will need
to be caught or the process will crash. In the asynchronous version, it is up to
the author to decide whether an error should throw as shown.
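To make the error-handling difference concrete, here is a hedged sketch that reads a file assumed not to exist. The path under `os.tmpdir()` is hypothetical and chosen only for illustration.

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical path that is assumed not to exist on this machine.
const missing = path.join(os.tmpdir(), `no-such-file-${process.pid}.md`);

// Synchronous: the error is thrown, so it has to be caught with try/catch
// or it will crash the process.
let syncErrorCode = null;
try {
  fs.readFileSync(missing);
} catch (err) {
  syncErrorCode = err.code; // error code reported by the failed read
}

// Asynchronous: the error arrives as the first callback argument, and the
// author decides whether to throw, log, or recover.
fs.readFile(missing, (err, data) => {
  if (err) {
    console.error(`read failed: ${err.code}`);
    return;
  }
  console.log(data);
});
```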

Let's expand our example a little bit:

```js
const fs = require('fs');
const data = fs.readFileSync('/file.md'); // blocks here until file is read
console.log(data);
// moreWork(); will run after console.log
```

And here is a similar, but not equivalent, asynchronous example:

```js
const fs = require('fs');
fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
  console.log(data);
});
// moreWork(); will run before console.log
```

In the first example above, `console.log` will be called before `moreWork()`. In
the second example, `fs.readFile()` is **non-blocking**, so JavaScript execution
can continue and `moreWork()` will be called first. The ability to run
`moreWork()` without waiting for the file read to complete is a key design
choice that allows for higher throughput.
Contributor

throughput

Contributor Author

👍
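The ordering described in the two examples can be checked with a small runnable sketch. It is an illustration only: the scratch file under `os.tmpdir()`, the `order` array, and the `moreWork` stand-in are all hypothetical names introduced here so the code does not depend on `/file.md` existing.

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file so the sketch is self-contained.
const file = path.join(os.tmpdir(), `ordering-demo-${process.pid}.md`);
fs.writeFileSync(file, '# demo\n');

const order = [];
const moreWork = () => order.push('moreWork'); // stand-in for real work

// Blocking: the read finishes before the next statement runs.
fs.readFileSync(file);
order.push('sync read done');
moreWork();

// Non-blocking: the callback runs on a later turn of the event loop,
// so the moreWork() call below runs first.
const finished = new Promise((resolve) => {
  fs.readFile(file, (err) => {
    if (err) throw err;
    order.push('async read done');
    fs.unlink(file, () => resolve(order)); // remove the scratch file
  });
});
moreWork();

finished.then((result) => console.log(result));
```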



## Concurrency and Throughput

JavaScript execution in Node.js is single-threaded, so concurrency refers to the
event loop's capacity to execute JavaScript callback functions after completing
Contributor

capacity or capability?

Contributor Author

capacity IMO

Should we mention internal thread pool here?
IMO concurrency in node is more about ability to schedule work to be done and move on, no matter if it is offloaded to internal thread pool or just registered via iocp or linux counterpart, but callbacks are still performed sequentially one-by-one.

Contributor Author

@YuriSolovyov might be better to leave out internal thread pool here unless you think its exclusion makes the statement here false. I think explaining it in another topic where it can receive more depth would be very valuable.

I just wanted to somehow stress that callbacks are still called sequentially and not concurrently.
Also, 👍 to more deep and detailed topic about thread pool somewhere.

other work. Any code that is expected to run in a concurrent manner must allow
the event loop to continue running while non-JavaScript operations, such as
I/O, are in progress.
Contributor

This sentence has to be rephrased.
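On the point raised in this thread that callbacks are still executed one at a time, the following sketch may help. It schedules three timer callbacks and tracks how many are executing at once. The counters `active` and `maxActive` are hypothetical names used only for this illustration.

```js
let active = 0;     // callbacks currently executing
let maxActive = 0;  // highest overlap ever observed
const done = [];

const tasks = [1, 2, 3].map((id) => new Promise((resolve) => {
  setTimeout(() => {
    active += 1;
    maxActive = Math.max(maxActive, active);

    // Some synchronous work inside the callback.
    let sum = 0;
    for (let i = 0; i < 1e5; i++) sum += i;

    done.push(id);
    active -= 1;
    resolve();
  }, 0);
}));

// Each callback runs to completion before the next one starts,
// so the observed overlap never exceeds one.
const result = Promise.all(tasks).then(() => ({ maxActive, done }));
result.then((r) => console.log(r));
```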


As an example, let's consider a case where each request to a web server takes
50ms to complete and 45ms of that 50ms is database I/O that can be done
asynchronously. Choosing **non-blocking** asynchronous operations frees up that
45ms per request to handle other requests. This is an effective 90% difference
Contributor

Should we specifically mention the percentage gains?

Contributor Author

I feel like the percentage supports the explanation here and makes the impact clearer. Do you feel like it is misleading or have another objection?

Contributor

@jrit The percentage would vary from machine to machine and it would not be consistent. I am not sure mentioning the performance gain in percentage here would be a good idea

Member

+1 ... if we cannot guarantee that percentage then we should not assert it.

Contributor Author

Some possible resolutions are

  • make a vague statement like "a significant difference", which still implies some percentage change well above zero, but avoids the explicit claim
  • completely remove the last sentence of the paragraph to avoid implying any percentage improvement

To the extent this is a hypothetical example in an introductory topic, I favor the first option as providing some guidance toward the more common case.

in capacity just by choosing to use **non-blocking** methods instead of
**blocking** methods.
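The arithmetic behind the hypothetical 50ms request can be written out as a short sketch. The figures are the ones from the example, not measurements, and the non-blocking number is an idealized ceiling: it assumes the database can serve the overlapping requests and ignores scheduling overhead.

```js
const totalMs = 50;            // time per request, from the example
const ioMs = 45;               // portion spent waiting on database I/O
const cpuMs = totalMs - ioMs;  // portion that actually runs JavaScript

// Blocking: the thread is occupied for the full 50ms of every request.
const blockingPerSecond = 1000 / totalMs;

// Non-blocking: the thread is only occupied for the JavaScript portion,
// so this is an upper bound, not a guaranteed rate.
const nonBlockingCeiling = 1000 / cpuMs;

// Share of each request's time that is freed for other requests.
const freedFraction = ioMs / totalMs;

console.log({ blockingPerSecond, nonBlockingCeiling, freedFraction });
```

Actual gains depend on the workload and the machine, which is the concern raised in the review thread.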

The event loop is different from models in many other languages where additional
threads may be created to handle concurrent work. For an introduction to the
event loop, see [Overview of the Event Loop, Timers, and
`process.nextTick()`](https://github.com/nodejs/node/pull/4936).
Contributor

Well, PRs are WIP items. So quoting them is not a good idea, I think.

Contributor Author

This shouldn't be merged with this link as it is. My issue is that there should be a reference to the other topic on nodejs.org, but the PR hasn't even been merged into master yet. This was mentioned above when the PR was opened.



## Dangers of Mixing Blocking and Non-Blocking Code

There are some patterns that should be avoided when dealing with I/O. Let's look
at an example:

```js
const fs = require('fs');
fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
  console.log(data);
});
fs.unlinkSync('/file.md');
```

In the above example, `fs.unlinkSync()` is likely to run before
`fs.readFile()` completes, which would delete `file.md` before it is actually
read. A better way to write this, which is completely **non-blocking** and
guaranteed to execute in the correct order, is:


```js
const fs = require('fs');
fs.readFile('/file.md', (err, data) => {
  if (err) throw err;
  console.log(data);
  fs.unlink('/file.md', (err) => {
    if (err) throw err;
  });
});
```

Contributor

Use parens consistently with arrow functions.

The above places a **non-blocking** call to `fs.unlink()` within the callback of
`fs.readFile()`, which guarantees the correct order of operations.
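The same ordering can also be expressed with promises. The sketch below is an alternative to the patch's callback style, not a replacement for it. It assumes a Node.js version that provides `fs.promises`, and the `readThenRemove` helper and scratch file are hypothetical names introduced for illustration.

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read a file, then remove it, without blocking.
async function readThenRemove(file) {
  const data = await fs.promises.readFile(file, 'utf8');
  await fs.promises.unlink(file); // only starts after the read has completed
  return data;
}

// Usage with a scratch file so the sketch is self-contained.
const file = path.join(os.tmpdir(), `read-then-remove-${process.pid}.md`);
fs.writeFileSync(file, 'hello');

const result = readThenRemove(file);
result.then((data) => console.log(data));
```

Because each `await` waits for the previous operation to settle, the unlink cannot overtake the read.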


## Additional Resources

- [libuv](http://libuv.org/)
- [Overview of the Event Loop, Timers, and
`process.nextTick()`](https://github.com/nodejs/node/pull/4936)
- [About Node.js](https://nodejs.org/en/about/)