Core: UDS Socket Handling Rework #2482

ikolomi · 2024-10-21T09:42:34Z

1. Introduced a user-land mechanism for ensuring singleton behavior of the socket, rather than relying on OS-specific semantics. This addresses the issue where macOS and Linux report different errors when the socket path already exists.

2. Simplified the implementation by removing unnecessary abstractions, including redundant connection retry logic.
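As a rough illustration of what a user-land singleton check could look like, here is a minimal sketch. It assumes the registry is a process-wide set of socket paths behind an RwLock; the name INITIALIZED_SOCKETS appears in the diff, but its actual type and the helper functions try_claim_socket and release_socket are hypothetical and not taken from the PR.

```rust
use std::collections::HashSet;
use std::sync::{OnceLock, RwLock};

// Assumed shape of the registry: a set of socket paths that already have a listener.
static INITIALIZED_SOCKETS: OnceLock<RwLock<HashSet<String>>> = OnceLock::new();

fn sockets_db() -> &'static RwLock<HashSet<String>> {
    INITIALIZED_SOCKETS.get_or_init(|| RwLock::new(HashSet::new()))
}

/// Hypothetical helper: returns true if the caller claimed the path and should
/// bind and listen, false if a listener for this path is already registered.
pub fn try_claim_socket(socket_path: &str) -> bool {
    let mut guard = sockets_db()
        .write()
        .expect("Failed to acquire sockets db write guard");
    // HashSet::insert returns false when the value was already present.
    guard.insert(socket_path.to_string())
}

/// Hypothetical helper: unregisters the path, returning true if it was registered.
pub fn release_socket(socket_path: &str) -> bool {
    let mut guard = sockets_db()
        .write()
        .expect("Failed to acquire sockets db write guard");
    guard.remove(socket_path)
}
```

Because the decision is made under a lock inside the process, it does not depend on which error the OS reports when the socket path already exists.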

Issue link

This Pull Request is linked to issue (URL): [https://github.com//issues/2433]

Checklist

Before submitting the PR make sure the following are checked:

This Pull Request is related to one issue.
Commit message has a detailed description of what changed and why.
Tests are added or updated.
CHANGELOG.md and documentation files are updated.
Destination branch is correct - main or release
Commits will be squashed upon merging.


eifrah-aws · 2024-10-21T12:54:56Z

glide-core/src/socket_listener.rs

+
+                    // signal initialization success
+                    init_callback(Ok(socket_path_cloned.clone()));
+                    let _ = tx.send(true);


The only reason tx.send can fail is that the receiving end has been dropped - we should handle this.

What do you mean by handle? We could log it, maybe, but it's so unimportant that I would not bother.
Otherwise, it is safer to proceed to the accept loop, since it might serve other clients.

Logging an error is one way of handling it. Ignoring the return value of send(..) seems a bit weird to me.

The log would be redundant because this cannot happen - the rx will not be closed.
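For reference, the logging option discussed in this thread could look roughly like the sketch below. The channel type used in the PR is not shown in the excerpt, so std::sync::mpsc is an assumption here; signal_init_success is a hypothetical helper, and eprintln! stands in for the project's log_error.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical helper: sends the "initialized" signal and reports whether
/// the receiver was still alive to get it.
pub fn signal_init_success(tx: &Sender<bool>) -> bool {
    match tx.send(true) {
        Ok(()) => true,
        Err(err) => {
            // send fails only when the receiving end has been dropped.
            // The caller can still proceed to the accept loop afterwards.
            eprintln!("init signal not delivered, receiver dropped: {err}");
            false
        }
    }
}
```

This keeps the behavior of continuing after a failed send, while no longer discarding the result silently.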

eifrah-aws · 2024-10-21T12:58:23Z

glide-core/src/socket_listener.rs

@@ -924,23 +806,109 @@ pub fn start_socket_listener_internal<InitCallback>(
    init_callback: InitCallback,
    socket_path: Option<String>,
 ) where
-    InitCallback: FnOnce(Result<String, String>) + Send + 'static,


This function should return a Result. This will cut the code below in half.

I don't want to change the design at this point.

eifrah-aws · 2024-10-21T13:02:00Z

glide-core/src/socket_listener.rs

+                        match listener_socket.accept().await {
+                            Ok((stream, _addr)) => {
+                                local_set_pool
+                                    .spawn_pinned(move || listen_on_client_stream(stream));
+                            }
+                            Err(err) => {
+                                log_error(
+                                    "listen_on_socket",
+                                    format!("Error accepting connection: {err}"),
+                                );
+                                break;
+                            }
+                        }
+                    }


You should consider making this method return a Result.

The highlighted lines above can then become:

let stream = listener_socket.accept().await?.0;
local_set_pool.spawn_pinned(move || listen_on_client_stream(stream));

Won't using .await? make it impossible to log "Error accepting connection"?

I assume you mean that we will lose the context in which the error occurred (e.g. "accept error"). We can use map_err for this, or, more elegantly, we can create our own set of errors.

For example, let's say we have defined a new result: MyResult

use thiserror::Error;

#[derive(Error, Debug)]
pub enum MyError {
    #[error("Accept error. {0}")]
    AcceptError(String),
}

// we can now use this
listener_socket.accept()
    .await
    .map_err(|e| MyError::AcceptError(e.to_string()))?
    .0;

really all this for what?
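To make the trade-off in this thread concrete, here is a self-contained sketch of the map_err approach that uses only the standard library instead of thiserror. ListenerError and take_stream are hypothetical names, and the accept result is passed in as a plain io::Result so the example does not need a real socket.

```rust
use std::fmt;
use std::io;

/// Hypothetical error type that keeps the context of where the failure happened.
#[derive(Debug)]
pub enum ListenerError {
    AcceptError(String),
}

impl fmt::Display for ListenerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ListenerError::AcceptError(msg) => write!(f, "Accept error. {msg}"),
        }
    }
}

impl std::error::Error for ListenerError {}

/// Takes the result of an accept call, keeps the stream, and tags any
/// failure as an accept error before propagating it with `?`.
pub fn take_stream<S, A>(accepted: io::Result<(S, A)>) -> Result<S, ListenerError> {
    let (stream, _addr) =
        accepted.map_err(|e| ListenerError::AcceptError(e.to_string()))?;
    Ok(stream)
}
```

With this shape the "accept" context survives propagation, so the caller can log it once at the top level rather than inside the loop.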

barshaul · 2024-10-21T13:17:01Z

glide-core/src/socket_listener.rs

+                    drop(listener_socket);
+                    let _ = std::fs::remove_file(socket_path_cloned.clone());
+
+                    // no more listening on socket - update the sockets db
+                    let mut sockets_write_guard = INITIALIZED_SOCKETS
+                        .write()
+                        .expect("Failed to acquire sockets db write guard");
+                    sockets_write_guard.remove(&socket_path_cloned);
+                    Ok(())


If a new client is created between the time you drop the listener_socket (when we break from the accept loop) and the time we remove the path from sockets_write_guard, the new client will see that a socket listener already exists and return its path, although it is no longer available. How do we handle that? It would probably make the wrapper fail when it tries to connect to a bad socket path.

I think it does not differ from the previous behavior, which I find odd, but I do not want to change the design at this time.
The weak point of the design is that the init callback is called upon the socket creation. Thus, if the loop terminates by failing to accept the connection - the callback might be called with a success but the connection will not be accepted.

With the previous implementation, what you are describing might happen like this:

1. Client A is created and starts thread_a, which binds and accepts on the socket.

2. Client B creation starts and creates thread_b, which detects an existing socket that it could connect to.

3. thread_a fails to accept the connection and terminates.

4. Client B's init callback is called with a success; thread_b does not continue (since it detected a connectable socket), but the creation eventually fails since the connection cannot be established.

To summarize - the new implementation does not degrade the situation, and even improves it by explicitly calling remove_file and updating the user-land socket db when the accept loop terminates (in the original implementation the socket file would remain, causing subsequent creations to fail).
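One way to picture the cleanup described here is a guard that unregisters the path and removes the socket file whenever the listener scope ends. This is only a sketch under assumptions: SocketRegistration is a hypothetical type, the registry is assumed to be a shared set of paths, and the PR itself performs these steps inline rather than through a guard. It does not close the window raised in the review; it only makes the cleanup run on every exit path.

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};

/// Hypothetical guard tying a registry entry and a socket file to one scope.
pub struct SocketRegistration {
    path: String,
    registry: Arc<RwLock<HashSet<String>>>,
}

impl SocketRegistration {
    /// Registers the path; returns None if a listener is already registered for it.
    pub fn register(registry: Arc<RwLock<HashSet<String>>>, path: &str) -> Option<Self> {
        let inserted = registry
            .write()
            .expect("Failed to acquire sockets db write guard")
            .insert(path.to_string());
        if inserted {
            Some(SocketRegistration { path: path.to_string(), registry })
        } else {
            None
        }
    }
}

impl Drop for SocketRegistration {
    fn drop(&mut self) {
        // Unregister first so new clients stop being handed this path.
        if let Ok(mut guard) = self.registry.write() {
            guard.remove(&self.path);
        }
        // Best-effort removal of the socket file; a missing file is ignored.
        let _ = std::fs::remove_file(&self.path);
    }
}
```

Holding the guard for the lifetime of the accept loop means that a break on an accept error, an early return, or a panic unwinding through the scope all lead to the same cleanup.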

Yury-Fridlyand · 2024-10-21T19:02:31Z

Add a changelog please


ikolomi added the bug (Something isn't working) and Core changes (used to label a PR with significant changes that should trigger a full matrix test run) labels Oct 21, 2024

ikolomi added this to the 1.2 milestone Oct 21, 2024

ikolomi requested a review from eifrah-aws October 21, 2024 09:42

ikolomi requested a review from a team as a code owner October 21, 2024 09:42

ikolomi force-pushed the issue_2433 branch from 99f91f7 to feb8ab5 October 21, 2024 09:47

ikolomi force-pushed the issue_2433 branch from feb8ab5 to 58e5dd0 October 21, 2024 09:49

ikolomi force-pushed the issue_2433 branch from 58e5dd0 to 4c501c6 October 21, 2024 09:56

ikolomi force-pushed the issue_2433 branch from 4c501c6 to cd4748b October 21, 2024 10:09

eifrah-aws reviewed Oct 21, 2024


barshaul reviewed Oct 21, 2024


Yury-Fridlyand requested a review from jonathanl-bq October 21, 2024 15:09

ikolomi force-pushed the issue_2433 branch from cd4748b to 9e0318e October 22, 2024 09:20

ikolomi changed the title from "Glide-core UDS Socket Handling Rework" to "Core: UDS Socket Handling Rework" Oct 22, 2024

ikolomi force-pushed the issue_2433 branch from 9e0318e to e97177a October 22, 2024 09:48

eifrah-aws self-requested a review October 22, 2024 10:32

eifrah-aws approved these changes Oct 22, 2024


ikolomi force-pushed the issue_2433 branch from e97177a to 615c05d October 22, 2024 13:07

ikolomi merged commit 3642038 into release-1.2 Oct 22, 2024
42 checks passed

ikolomi deleted the issue_2433 branch October 22, 2024 13:47
