Fixes#26179.
The original error reported in that issue is fixed on canary, but in
local testing on my windows machine, `next build` would just hang
forever.
After some digging, what happens is that at some point in next build,
readFile promises (from `fs/promises` ) just never resolve, and so next
hangs.
It turns out the issue is saturating tokio's blocking task thread pool.
We previously limited the number of blocking threads to 32, and at some
point those threads are all in use and there's no thread available for
the file reads.
What's taking up all of those threads? The answer turns out to be
`tokio::process`. On windows, child process stdio uses the blocking
threadpool: https://github.com/tokio-rs/tokio/pull/4824. When you poll
the child's stdio on windows, it spawns a blocking task per poll, and
calls `std::io::Read::read` in the blocking context. That call can block
until data is available.
Putting it all together, what happens is that Next.js spawns `2 * the
number of CPU cores` deno child subprocesses to do work. We implement
`child_process` with `tokio::process`. When the child processes' stdio
get polled, blocking tasks get spawned, and those blocking tasks might
block until data is available. So if you have 16 cores (as I do), there
are going to be potentially >32 blocking task threadpool threads taken
just by the child processes. That leaves no room for other tasks to make
progress
---
To fix this, for now, increase the size of the blocking threadpool on
windows. 4 * the number of CPU cores should be enough to leave room for
other tasks to make progress.
Longer term, this can be fixed more properly when we handroll our own
subprocess code (needed for detached processes and additional pipes on
windows).
This is the release commit being forwarded back to main for 2.0.2
Co-authored-by: bartlomieju <bartlomieju@users.noreply.github.com>
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
when defining a custom runtime, it might be useful to define a custom
prompter - for instance when you are not relying on the terminal and
want a GUI prompter instead
This is the release commit being forwarded back to main for 2.0.1
Co-authored-by: bartlomieju <bartlomieju@users.noreply.github.com>
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
This came up on Discord as a question so I thought it's worth adding a
hint for this as it might be a common pitfall.
---------
Signed-off-by: Bartek Iwańczuk <biwanczuk@gmail.com>
Co-authored-by: David Sherret <dsherret@users.noreply.github.com>
Fixes#22995. Fixes#23000.
There were a handful of bugs here causing the hang (each with a
corresponding minimized test):
- We were canceling recv futures when `receiveMessageOnPort` was called,
but this caused the "receive loop" in the message port to exit. This was
due to the fact that `CancelHandle`s are never reset (i.e., once you
`cancel` a `CancelHandle`, it remains cancelled). That meant that after
`receieveMessageOnPort` was called, the subsequent calls to
`op_message_port_recv_message` would throw `Interrupted` exceptions, and
we would exit the loop.
The cancellation, however, isn't actually necessary.
`op_message_port_recv_message` only borrows the underlying port for long
enough to poll the receiver, so the borrow there could never overlap
with `op_message_port_recv_message_sync`.
- Calling `MessagePort.unref()` caused the "receive loop" in the message
port to exit. This was because we were setting
`messageEventListenerCount` to 0 on unref. Not only does that break the
counter when multiple `MessagePort`s are present in the same thread, but
we also exited the "receive loop" whenever the listener count was 0. I
assume this was to prevent the recv promise from keeping the event loop
open.
Instead of this, I chose to just unref the recv promise as needed to
control the event loop.
- The last bug causing the hang (which was a doozy to debug) ended up
being an unfortunate interaction between how we implement our
messageport "receive loop" and a pattern found in `npm:piscina` (which
angular uses). The gist of it is that piscina uses an atomic wait loop
along with `receiveMessageOnPort` in its worker threads, and as the
worker is getting started, the following incredibly convoluted series of
events occurs:
1. Parent sends a MessagePort `p` to worker
2. Parent sends a message `m` to the port `p`
3. Parent notifies the worker with `Atomics.notify` that a new message
is available
4. Worker receives message, adds "message" listener to port `p`
5. Adding the listener triggers `MessagePort.start()` on `p`
6. Receive loop in MessagePort.start receives the message `m`, but then
hits an await point and yields (before dispatching the "message" event)
7. Worker continues execution, starts the atomic wait loop, and
immediately receives the existing notification from the parent that a
message is available
8. Worker attempts to receive the new message `m` with
`receiveMessageOnPort`, but this returns `undefined` because the receive
loop already took the message in 6
9. Atomic wait loop continues to next iteration, waiting for the next
message with `Atomic.wait`
10. `Atomic.wait` blocks the worker thread, which prevents the receive
loop from continuing and dispatching the "message" event for the
received message
11. The parent waits for the worker to respond to the first message, and
waits
12. The thread can't make any more progress, and the whole process hangs
The fix I've chosen here (which I don't particularly love, but it works)
is to just delay the `MessagePort.start` call until the end of the event
loop turn, so that the atomic wait loop receives the message first. This
prevents the hang.
---
Those were the main issues causing the hang. There ended up being a few
other small bugs as well, namely `exit` being emitted multiple times,
and not patching up the message port when it's received by
`receiveMessageOnPort`.
Testing once again if the crates are being properly released.
---------
Co-authored-by: bartlomieju <bartlomieju@users.noreply.github.com>
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
Test run before Deno 2.0 release to make sure that the publishing
process passes correctly.
---------
Co-authored-by: bartlomieju <bartlomieju@users.noreply.github.com>
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
This commit adds a suggestion with information and hint how
to resolve situation when user tries to run an npm package
with Node-API addons using global cache (which is currently not
supported).
Closes https://github.com/denoland/deno/issues/25974