What is the meaning of KEY FEATURE[ Anonymization ] in README? #541
Comments
Hey @jatinmehrotra. In a few analysers, such as Pod, we feed event messages to the AI backend. Their content is not known beforehand, so we are not masking them for the time being. Further research is needed to understand the patterns and be able to mask the sensitive parts of an event, like the pod name and namespace. The majority of the analysers produce custom errors that we have created ourselves and are able to mask. By masking I mean swapping sensitive strings (e.g. namespace and pod names) in the error messages with random hashes, which are then shared with the AI backend of your choice. In the analysis report we swap them back again and present the original pod and namespace strings to the user. Hope that makes more sense.
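To make the swap-and-restore idea described above concrete, here is a minimal sketch in Python. It is not k8sgpt's actual implementation (the project is written in Go); the function names `mask` and `unmask`, the `masked-` token prefix, and the sample pod and namespace names are all hypothetical and chosen only for illustration.

```python
import re
import secrets


def mask(text, sensitive):
    """Swap each known sensitive string for a random token.

    Returns the masked text and a token -> original mapping.
    Hypothetical helper for illustration, not k8sgpt's API.
    """
    # Longest names first, so "payments-api-7d9f" wins over "payments".
    names = sorted({s for s in sensitive if s}, key=len, reverse=True)
    if not names:
        return text, {}

    forward = {}  # original -> token

    def swap(match):
        original = match.group(0)
        if original not in forward:
            forward[original] = "masked-" + secrets.token_hex(8)
        return forward[original]

    # A single regex pass means freshly inserted tokens are never re-scanned.
    pattern = re.compile("|".join(re.escape(n) for n in names))
    masked = pattern.sub(swap, text)
    return masked, {token: original for original, token in forward.items()}


def unmask(text, mapping):
    """Swap tokens in the backend's answer back to the original strings."""
    if not mapping:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


if __name__ == "__main__":
    error = "Pod payments/payments-api-7d9f has CrashLoopBackOff"
    masked, mapping = mask(error, ["payments", "payments-api-7d9f"])
    print(masked)                   # what the AI backend would see
    print(unmask(masked, mapping))  # what the user sees in the report
```

This only works when the sensitive strings are known in advance, which is the case for error messages the analysers build themselves. A free-form event message can contain names that are not in the `sensitive` list, and those would pass through unchanged, which is why such events are not masked for now.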
Thank you @arbreezy for the explanation. It is really helpful and definitely makes sense 💯 Based on the above explanation, I want to confirm a few things, as I am planning to introduce k8sgpt in one of my projects.
Unrelated to the above explanation
Hi, thanks for your interest, and thanks to @arbreezy for answering some of the questions. I am one of the founders of the project, and I am thrilled to see involvement and discussion here.
Masking
We typically will not mask the below because we don't send any identifying information, just that one of these things has been detected to be incorrect.
No Masking
Fields:
It's for V2, which will be later this year, in Q4.
Please see https://docs.k8sgpt.ai/reference/guidelines/privacy/
I don't have an exact list of strings being sent; maybe I misunderstand.
The bottom line is that in critical production environments (like one of the banks I used to work at) I would recommend an entirely different backend: use a local model. Then you can rest easy that it's inside your DMZ and nothing is leaking. If you would like an example of how to use LocalAI (one of our providers), which lets you use your own models, we would be happy to share docs, blogs, and posts.
@AlexsJones Thank you so much for the explanation and your conclusion to use local AI.
If my understanding is correct out of the unmasked field |
I always err on the side of caution - so yes, it is quite possible the payload of the event might have something like "super-secret-project-pod-X crashed" which we don't currently redact. As an example - if you use
Thank you so much @AlexsJones for your explanation. Really helpful. I would like to send a PR to update the README for Anonymization based on our discussion, as I am sure others are wondering the same. I will push a PR by the end of the day.
Sounds great, I will close this issue for now but please feel free to reference/re-open if needed |
Checklist
Affected Components
K8sGPT Version
No response
Kubernetes Version
No response
Host OS and its Version
No response
Steps to reproduce
Searched the entire REPO for the meaning of events where anonymization would not apply.
Expected behaviour
I am trying to understand this particular line which is mentioned in README -> Key Features -> Anonymization
What kind of events are we talking about, and what details will be shared with the AI backend in the case of such events?
Actual behaviour
No response
Additional Information
No response