Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vault sharing notification is not sent #824

Open
CMCDragonkai opened this issue Oct 14, 2024 · 22 comments
Open

Vault sharing notification is not sent #824

CMCDragonkai opened this issue Oct 14, 2024 · 22 comments
Assignees
Labels
bug Something isn't working

Comments

@CMCDragonkai
Copy link
Member

Describe the bug

Sharing a vault to a node ID causes this log in the agent logs:

WARN:polykey.PolykeyAgent.NotificationsManager:Could not send VaultShare notification to vqsq2ca7vjbsiv3paletajdduecm0v36j00jbnbg5eg084bm46g6g: ErrorPolykeyRemote
WARN:polykey.PolykeyAgent.task v0pocjo3thlo01bik4q0di1273s:Failed - Reason: ErrorPolykeyRemote

There's no notifications sent.

To Reproduce

  1. Try sharing a vault and observe the agent logs.

Expected behavior

I'm able to connect and share the vault, and the remote target can clone the vault. But notifications is not sent, so this sounds like a trivial bug. It should just sent without any other problems.

Screenshots

Platform (please complete the following information)

version          	1.14.0-1-1
sourceVersion    	1.14.0
stateVersion     	1
networkVersion   	1
versionMetadata  	

Notify maintainers

@tegefaulkes

@CMCDragonkai CMCDragonkai added the bug Something isn't working label Oct 14, 2024
Copy link

linear bot commented Oct 14, 2024

@CMCDragonkai
Copy link
Member Author

@CDeltakai do systemctl status ... to get the logs specific to receiving notifications.

@CMCDragonkai
Copy link
Member Author

We need a better regression testing here.

@CMCDragonkai
Copy link
Member Author

There's no error on the remote side that shows up.

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Oct 14, 2024

I did a quick test on my side. and got this when trying to share the vault.

ErrorPolykeyRemote
    at toError (/home/brian/workspace/Polykey-CLI/node_modules/polykey/src/network/utils.ts:581:29)
    at Object.transform (/home/brian/workspace/Polykey-CLI/node_modules/@matrixai/rpc/src/utils.ts:452:19)
    at ensureIsPromise (node:internal/webstreams/util:185:19)
    at transformStreamDefaultControllerPerformTransform (node:internal/webstreams/transformstream:515:18)
    at transformStreamDefaultSinkWriteAlgorithm (node:internal/webstreams/transformstream:565:10)
    at Object.write (node:internal/webstreams/transformstream:360:14)
    at ensureIsPromise (node:internal/webstreams/util:185:19)
    at writableStreamDefaultControllerProcessWrite (node:internal/webstreams/writablestream:1112:5)
    at writableStreamDefaultControllerAdvanceQueueIfNeeded (node:internal/webstreams/writablestream:1227:5)
    at writableStreamDefaultControllerWrite (node:internal/webstreams/writablestream:1101:3) {
  data: {},
  cause: ErrorNotificationsPermissionsNotFound
      at f.receiveNotification (/nix/store/gfwlbhw0b2fkh40kbb1sakwmpd53ln66-polykey-cli-0.10.0/lib/node_modules/polykey-cli/dist/polykey.js:2246:59829)
      at async withF (/nix/store/gfwlbhw0b2fkh40kbb1sakwmpd53ln66-polykey-cli-0.10.0/lib/node_modules/polykey-cli/node_modules/@matrixai/resources/dist/utils.js:17:16)
      at async handle (/nix/store/gfwlbhw0b2fkh40kbb1sakwmpd53ln66-polykey-cli-0.10.0/lib/node_modules/polykey-cli/dist/polykey.js:2131:215244)
      at async wrapperDuplex (/nix/store/gfwlbhw0b2fkh40kbb1sakwmpd53ln66-polykey-cli-0.10.0/lib/node_modules/polykey-cli/dist/polykey.js:2124:127100)
      at async outputGen (/nix/store/gfwlbhw0b2fkh40kbb1sakwmpd53ln66-polykey-cli-0.10.0/lib/node_modules/polykey-cli/dist/polykey.js:2124:126196)
      at async Object.pull (/nix/store/gfwlbhw0b2fkh40kbb1sakwmpd53ln66-polykey-cli-0.10.0/lib/node_modules/polykey-cli/dist/polykey.js:2124:126394) {
    data: {},
    cause: undefined,
    timestamp: 2024-10-14T04:55:06.381Z,
    exitCode: 67
  },
  timestamp: 2024-10-14T04:55:06.390Z,
  exitCode: 67,
  metadata: {
    localHost: '::',
    localPort: 36611,
    remoteHost: '::ffff:1.40.225.41',
    remotePort: 34140,
    command: 'notificationsSend'
  }
}
WARN:polykey.PolykeyAgent.NotificationsManager:Could not send VaultShare notification to vqsq2ca7vjbsiv3paletajdduecm0v36j00jbnbg5eg084bm46g6g: ErrorPolykeyRemote
WARN:polykey.PolykeyAgent.task v0pockgl5eho01823hj3jvapvrs:Failed - Reason: ErrorPolykeyRemote

After getting @CDeltakai to trust my node I was able to share a new vault with him without error.

This is designed behavior where you just outright reject notifications from vaults where you don't trust them first. However the feedback on the problem here is really bad. We need to include all the errors in the cause chain when reporting errors like this.

Along with sending notifications specifically, if we reject a notification due to lack of trust, There should be a logger message for it as well.

@tegefaulkes tegefaulkes self-assigned this Oct 15, 2024
Copy link
Contributor

I'm going to address this in two parts.

  1. I'm going to improve the details provided when logging out these errors. In the case where we have nested errors with causes I want to print out the whole cause chain description.
  2. Specifically for this, we need to take the ErrorPolykeyRemote we're getting and wrap that in a more descriptive error stating that the notification was rejected by the receiving node.

Copy link
Contributor

Ok, so I've made the error logging better here and improved the related tests in the CLI.

Besides the fact that why the notification failed wasn't clear. This was intended behaviour. The vaults share command itself doesn't care if the notification fails. The only reason we got any feedback that the notification failed is because background tasks, regardless if the failure is allowed or not, is logged as a warning.

Copy link
Contributor

With the new changes release I'm going to consider this one done. Unless we want to consider changing the intended behaviour of vaults share and the failing notification.

@CMCDragonkai
Copy link
Member Author

Ok, so I've made the error logging better here and improved the related tests in the CLI.

Besides the fact that why the notification failed wasn't clear. This was intended behaviour. The vaults share command itself doesn't care if the notification fails. The only reason we got any feedback that the notification failed is because background tasks, regardless if the failure is allowed or not, is logged as a warning.

I thought it was due to no trust/permission to share.

@CMCDragonkai
Copy link
Member Author

Example of the new log?

@CMCDragonkai
Copy link
Member Author

I still want to make sure we're structured in that though.

@tegefaulkes
Copy link
Contributor

All PolykeyErrors will include the full error chain when stringified now. So thw warning looks like this.

WARN:NotificationsManager Test:Could not send VaultShare notification to vmi821ubhbelid409ocdlf19hphrf1ltv8fhvmr12pu4g4l5afsdg: ErrorPolykeyRemote()>ErrorNotificationsPermissionsNotFound()
WARN:task v0pog95uas5o01d8t2l68bqblk0:Failed - Reason: ErrorPolykeyRemote()>ErrorNotificationsPermissionsNotFound()

The stringified error part is ErrorPolykeyRemote()>ErrorNotificationsPermissionsNotFound() If it includes a message then it would look like SomeError("some message")>CauseError() The formatting could use some feedback.

It was due to a lack of trust to accept notifications. Sharing was unaffected.

@CMCDragonkai
Copy link
Member Author

Hmmm, I would prefer to use json for structured messages, and a better serialisation of error messages. The () seems weird. Basically js-errors should be updated instead.

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Oct 16, 2024

We could use a JSON format. I structured it this way to keep it succinct and avoid bloating out the log size. Let me play around with some formats.

Once we've settled on that I'll move the formatting change into js-errors. I'll create a new issue to track that.

Copy link
Contributor

There are 3 things left to do for this.

  1. I made a branch or PR in the js-errors repo for changing the toString formatting of errors. Just rip that out, close it or what ever. its no longer needed.
  2. The change in Polykey needs to remove the toString method from the ErrorPolykey that's creating the new format.
  3. Where ever we do a e.toString() or String(e) on errors when logging out a warning or error in the logger. We need to use a custom utility that will convert errors into formatted strings for the logger message.

    There may be existing issues relating to this that need to be updated or closed.

Copy link
Contributor

I will be taking over this issue now.

@CMCDragonkai
Copy link
Member Author

There could be a default toString in the js-errors that is intended to be overridden by applications.

@tegefaulkes
Copy link
Contributor

So we want to keep that now? That's what I intended to do but after discussion we went with having whatever was serialising the error to a string would decide how to format it.

Copy link
Contributor

aryanjassal commented Nov 21, 2024

There could be a default toString in the js-errors that is intended to be overridden by applications.

So we want to keep that now? That's what I intended to do but after discussion we went with having whatever was serialising the error to a string would decide how to format it.

I need solid specifications on error rendering. Brian mentioned that the error message should contain the traceback to an extent, but be understandable by the user.

Q. Should the error message have only the last error, or a stack trace?
Q. How far back should the traceback go?
Q. How do we want to format the error messages?
Q. How do we want to implement formatting the messages?

To get more details in the traceback, the user might be able to optionally enable verbose mode. Otherwise the errors could be formatted in a simpler manner.

$ polykey secrets ls novault
ErrorVaultsVaultUndefined: Vault 'novault' does not exist

$ polykey secrets ls novault -v
ErrorPolykeyRemote: Remote error from RPC call
  localHost ::1
  localPort 53346
  remoteHost ::1
  remotePort 43507
  command vaultsSecretsList
  timestamp Thu Nov 21 2024 12:50:24 GMT+1100 (Australian Eastern Daylight Time)
  cause: ErrorVaultsVaultUndefined: Vault does not exist

Moreover, the user does not need to see the ErrorPolykeyRemote wrapping the error, as that is specific to the client and shouldn't affect the user in any way.

Currently, this is how the error rendering is handled in the secrets commands to better comply with Unix standards.

$ npm run polykey -- secrets mkdir vault:nodir/nested vault:nodir2/nested vault:nodir3/nested

> [email protected] polykey
> ts-node src/polykey.ts secrets mkdir vault:nodir/nested vault:nodir2/nested vault:nodir3/nested

mkdir: cannot create directory nodir/nested: No such file or directory
mkdir: cannot create directory nodir2/nested: No such file or directory
mkdir: cannot create directory nodir3/nested: No such file or directory
ErrorPolykeyCLIMakeDirectory: Failed to create one or more directories - Failed to create one or more directories

One change I would suggest is to either show the error description or error message, as showing both sometimes leads to weird results like the one in mkdir example above.

Another example would be ErrorVaultsVaultUndefined, where it always throws a static Vault does not exist message, but which vault failed isn't properly conveyed. If we add a message, then the error would render weirdly, making it hard to read for users.

ErrorVaultsVaultUndefined: Vault does not exist - Vault 'vaultname' does not exist

So, either we render the description if the message doesn't exist, or render the message if it does. It would make the errors much more readable and concise.

ErrorVaultsVaultUndefined: Vault 'vaultname' does not exist


Basically, we need to finalise the format of errors and how they should be rendered to close this issue. Currently, there seems to be a lot of confusion surrounding it, and it needs to be cleared up for progress to be made.

@CMCDragonkai @tegefaulkes

@CMCDragonkai
Copy link
Member Author

There could be a default toString in the js-errors that is intended to be overridden by applications.

So we want to keep that now? That's what I intended to do but after discussion we went with having whatever was serialising the error to a string would decide how to format it.

I need solid specifications on error rendering. Brian mentioned that the error message should contain the traceback to an extent, but be understandable by the user.

Q. Should the error message have only the last error, or a stack trace?
Q. How far back should the traceback go?
Q. How do we want to format the error messages?
Q. How do we want to implement formatting the messages?

To get more details in the traceback, the user might be able to optionally enable verbose mode. Otherwise the errors could be formatted in a simpler manner.

$ polykey secrets ls novault
ErrorVaultsVaultUndefined: Vault 'novault' does not exist

$ polykey secrets ls novault -v
ErrorPolykeyRemote: Remote error from RPC call
  localHost ::1
  localPort 53346
  remoteHost ::1
  remotePort 43507
  command vaultsSecretsList
  timestamp Thu Nov 21 2024 12:50:24 GMT+1100 (Australian Eastern Daylight Time)
  cause: ErrorVaultsVaultUndefined: Vault does not exist

Moreover, the user does not need to see the ErrorPolykeyRemote wrapping the error, as that is specific to the client and shouldn't affect the user in any way.

Currently, this is how the error rendering is handled in the secrets commands to better comply with Unix standards.

$ npm run polykey -- secrets mkdir vault:nodir/nested vault:nodir2/nested vault:nodir3/nested

> [email protected] polykey
> ts-node src/polykey.ts secrets mkdir vault:nodir/nested vault:nodir2/nested vault:nodir3/nested

mkdir: cannot create directory nodir/nested: No such file or directory
mkdir: cannot create directory nodir2/nested: No such file or directory
mkdir: cannot create directory nodir3/nested: No such file or directory
ErrorPolykeyCLIMakeDirectory: Failed to create one or more directories - Failed to create one or more directories

One change I would suggest is to either show the error description or error message, as showing both sometimes leads to weird results like the one in mkdir example above.

Another example would be ErrorVaultsVaultUndefined, where it always throws a static Vault does not exist message, but which vault failed isn't properly conveyed. If we add a message, then the error would render weirdly, making it hard to read for users.

ErrorVaultsVaultUndefined: Vault does not exist - Vault 'vaultname' does not exist

So, either we render the description if the message doesn't exist, or render the message if it does. It would make the errors much more readable and concise.

ErrorVaultsVaultUndefined: Vault 'vaultname' does not exist


Basically, we need to finalise the format of errors and how they should be rendered to close this issue. Currently, there seems to be a lot of confusion surrounding it, and it needs to be cleared up for progress to be made.

@CMCDragonkai @tegefaulkes

I would have kept the ExceptionName: Description - Message as the default way of reporting this. To make it easier for logging systems to deal with this - this all occurs in a single line.

Then for additional data that's what the data is for. And for example mkdir I would join together all directories that failed to be made into the message and add it to the data pojo too.

In JSON error formatting, this data is shown as json is a single line showing all the data when serialised.

This issue has nothing to do with error reporting in the original spec. So I'm not sure if this should be even addressed here.

@CMCDragonkai
Copy link
Member Author

No we don't want that much magic in our error reporting. We want to do something where we detect if it's an interactive tty - do something, if it is a regular pipe, do a different thing.

I would say default human reporting can do a smart switch between single line reporting for regular stdout/stderr pipe and more fancier reporting with indented multiline message for tty. But then provide specific formatting option overrides. Like tty, pipe, json.

Copy link
Contributor

Then for additional data that's what the data is for. And for example mkdir I would join together all directories that failed to be made into the message and add it to the data pojo too.

Unix operates on each path, and returns the status after operating on each path. If one path fails, mkdir will write that to stderr and continue with other paths. I did this to replicate that behaviour. I can still add that to the error data, but for the user, it would start looking very cluttered.

No we don't want that much magic in our error reporting. We want to do something where we detect if it's an interactive tty - do something, if it is a regular pipe, do a different thing.

I would say default human reporting can do a smart switch between single line reporting for regular stdout/stderr pipe and more fancier reporting with indented multiline message for tty. But then provide specific formatting option overrides. Like tty, pipe, json.

Yes, this looks like a good approach. The tty formatting can be concise, as that is meant to be read by the user. This is where we can show the error message or error description. (I am still of the opinion that the end user should see as little clutter or irrelevant information as possible). Then, pipe or json can include the more detailed stack trace as that would be for logging.

There could be a default toString in the js-errors that is intended to be overridden by applications.

So we want to keep that now? That's what I intended to do but after discussion we went with having whatever was serialising the error to a string would decide how to format it.

I still need an answer to this. Brian mentioned that previously, we decided to let each application define their own toString, but now we are deciding on providing a default, overridable toString. How should this be handled? Does js-errors have default ways to show errors on tty, pipe, and json, and we can override if needed? Or does this need to be implemented by whatever is rendering the error instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

No branches or pull requests

3 participants