Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to allocate cache manager with custom refill socket #378

Closed
wants to merge 1 commit into from

Conversation

ievenbach
Copy link
Contributor

@ievenbach ievenbach commented Apr 24, 2024

Cache managers use two sockets: one for cache refill operation,
and another one for notifications.

In order to simulate NETLINK events by reading data from files,
we need to be able to overwrite callbacks for both sockets.

This new function allows us to set up refill socket any way we want.
It does have requirement that the refill socket be blocking.

Copy link
Owner

@thom311 thom311 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to simulate NETLINK events by reading data from files,
we need to be able to overwrite both sockets.

Passing a nl_cb argument for the internal socket doesn't really overwrite the socket. Wouldn't it be better to have instead a struct nl_sock *sync_sk argument?

extern int nl_cache_mngr_alloc_cb(struct nl_sock *,
int, int,
struct nl_cache_mngr **,
struct nl_cb *sync_cb);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new ABI must be listed in libnl-3.sym file.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new entry MUST be added to the bottom of the file. See the comment at the top of the file.

@ievenbach
Copy link
Contributor Author

In order to simulate NETLINK events by reading data from files,
we need to be able to overwrite both sockets.

Passing a nl_cb argument for the internal socket doesn't really overwrite the socket. Wouldn't it be better to have instead a struct nl_sock *sync_sk argument?

The intent is not to overwrite the socket. I want as much of the functionality as possible intact. Instead, I want to provide a callback that overwrites the send/recv.

Why we need a separate socket at all is a different question.

@ievenbach ievenbach force-pushed the aurora/cache-mgr-cb branch 2 times, most recently from 08241df to 1b1ee7d Compare April 24, 2024 17:29
@ievenbach
Copy link
Contributor Author

In order to simulate NETLINK events by reading data from files,
we need to be able to overwrite both sockets.

Passing a nl_cb argument for the internal socket doesn't really overwrite the socket. Wouldn't it be better to have instead a struct nl_sock *sync_sk argument?

The intent is not to overwrite the socket. I want as much of the functionality as possible intact. Instead, I want to provide a callback that overwrites the send/recv.

Why we need a separate socket at all is a different question.

Bah. I should read my own commit messages better. Fixed the description.

@thom311
Copy link
Owner

thom311 commented Apr 24, 2024

Why we need a separate socket at all is a different question.

That is the real question. Having two sockets (one for async events and the other for sync fetches) unavoidably means that it's not possibly to bring the messages in a consistent state. This can only correctly work, by using one socket for everything, where all events (sync and async) are in order.

Anyway. That is out of scope for this PR and would be a large work. I don't think anybody is going to fix that.

@thom311
Copy link
Owner

thom311 commented Apr 24, 2024

The intent is not to overwrite the socket. I want as much of the functionality as possible intact. Instead, I want to provide a callback that overwrites the send/recv.

I see.

Since this is quite a special method, I still think it would be better to just accept a full struct nl_sock *sync_sock argument. For this more elaborate use-case, it provides the most flexibility, as the user can prepare the socket in any way the want. It's only mildly more cumbersome for the caller. Since struct nl_sock is not ref-counted, the function necessarily will take ownership of what you pass in.

Also, note that the result argument is an out-argument. Let's keep that the last parameter of the function.

@ievenbach ievenbach force-pushed the aurora/cache-mgr-cb branch from 1b1ee7d to 505da87 Compare April 24, 2024 21:47
@ievenbach ievenbach requested a review from thom311 April 25, 2024 11:28
@ievenbach ievenbach changed the title Allow to allocate cache manager with custom callback on refill socket Allow to allocate cache manager with custom refill socket Apr 25, 2024
goto errout;
}
if ((err = nl_connect(mngr->cm_sync_sock, protocol)) < 0)
goto errout_free_sync_sock;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not how for mngr->cm_sock, we call connect and non-blocking above. That means, for the sk argument, the caller is not expected to do that. That may be a questionable choice, but it is probably better to handle his consistently. Meaning: leave the nl_connect(cm_sync_sock) call here, also for the externally provided sk_blocking.

lib/cache_mngr.c Outdated
* Same as \f nl_cache_mngr_alloc, but sets custom refill socket
* Note: ownership of the sk_blocking passes to the cache manager
*/
int nl_cache_mngr_alloc_ex(struct nl_sock *sk, struct nl_sock *sk_blocking, int protocol, int flags,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sk_blocking is internally assigned to cm_sync_sock. While it's unclear which is the best name, they probably should be consistent. I think the parameter should be called sync_sk.

lib/cache_mngr.c Outdated
if (!mngr)
if (!mngr) {
// No manager yet, to free it in case of error
nl_socket_free(sk_blocking);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is inconsistent too what happens with sk. On failure, the sockets are left alone. The caller would need to handle them in that case. That is also missing in nl_cache_mngr_alloc() (but that code will go away).

lib/cache_mngr.c Outdated
nl_socket_free(sk_blocking);
return err;
}
return nl_cache_mngr_alloc_ex(sk, sk_blocking, protocol, flags, result);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as said, the logic of nl_connect() should not be on the caller.

Hence, most of this code is still in nl_cache_mngr_alloc_ex().

Allow NULL as sk_blocking (sync_sk), and then just do

int nl_cache_mngr_alloc(struct nl_sock *sk, int protocol, int flags,
			struct nl_cache_mngr **result)
{
    return nl_cache_mngr_alloc_ex(sk, NULL, protocol, flags, result);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was how I made it originally. However I noticed that presents a problem: we allow passing in two sockets, but cannot differentiate between errors caused by either. Which means we would need extra steps to be taken, to disown them internally in case of error, so that caller could release these.
I just found it easier to be inconsistent with socket assumptions.
I can fix the assumption about the second socket as well to be consistent.


NL_DBG(1, "Allocated cache manager %p, protocol %d, %d caches\n",
mngr, protocol, mngr->cm_nassocs);

*result = mngr;
return 0;

errout_free_sync_sock:
nl_socket_free(mngr->cm_sync_sock);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like sk, the new parameter sync_sk/sk_blocking should be optional/nullable. If the user did not specify a blocking socket, in case of error the internally allocated one will still be needed to released.

You need to right logic, so that all internally allocated data is correctly freed on every error path. Use _nl_auto_nl_socket instead of explicit free for that.

@thom311
Copy link
Owner

thom311 commented Apr 26, 2024

Note also NL_ALLOCATED_SOCK, where ownership is only assumed for internally allocated socket. You would need something similar for sync-sock.

@ievenbach ievenbach force-pushed the aurora/cache-mgr-cb branch from 505da87 to d80a836 Compare April 29, 2024 12:15
@ievenbach
Copy link
Contributor Author

Note also NL_ALLOCATED_SOCK, where ownership is only assumed for internally allocated socket. You would need something similar for sync-sock.

I rewrote the patch to follow this pattern.

@ievenbach ievenbach requested a review from thom311 April 29, 2024 12:18
Cache managers use two sockets: one for cache refill operation,
and another one for notifications.

In order to simulate NETLINK events by reading data from files,
we need to be able to overwrite callbacks for both sockets.

This new function allows us to set up refill socket any way we want.
It does have requirement that the refill socket be blocking.
@ievenbach ievenbach force-pushed the aurora/cache-mgr-cb branch from d80a836 to 5425c00 Compare April 29, 2024 12:39
thom311 added a commit that referenced this pull request May 6, 2024
@thom311
Copy link
Owner

thom311 commented May 6, 2024

merged, with two follow-up changes.

Thank you!

@thom311 thom311 closed this May 6, 2024
@ievenbach
Copy link
Contributor Author

merged, with two follow-up changes.

Thank you!

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants