Little Understand #1
Comments
Hi there, thanks for your interest in our work. We will be uploading the full paper to arXiv soon; here is a preview in case you are interested: http://www.yi-zeng.com/wp-content/uploads/2022/04/Narcissus_Backdoor.pdf To answer your question: we are effectively training a model that becomes addicted to the Narcissus trigger associated with the target class. You can think of the Narcissus trigger as an optimized feature that does not originally come from the target class, but instead 'misrepresents' the target class so well that a model trained over the poisoned dataset preserves it as a robust indicator of the target class. As a result, samples from other classes are also sucked into this black hole. I hope the above explanation resolves your question. Follow-ups are always welcome.
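To make the clean-label mechanism described above concrete, here is a minimal PyTorch-style sketch of the poisoning step. It assumes an already-synthesized `trigger` tensor, and the function name and defaults are illustrative placeholders, not the official Narcissus code:

```python
import torch

def poison_target_class(images, labels, trigger, target_class,
                        poison_ratio=0.005, eps=16 / 255):
    """Clean-label poisoning sketch: add a bounded additive trigger only to a
    small fraction of *target-class* images; the labels are left untouched."""
    images = images.clone()
    target_idx = (labels == target_class).nonzero(as_tuple=True)[0]
    n_poison = int(poison_ratio * len(images))
    chosen = target_idx[torch.randperm(len(target_idx))[:n_poison]]
    # Keep the trigger within an l_inf budget, then keep pixels in [0, 1].
    bounded_trigger = trigger.clamp(-eps, eps)
    images[chosen] = (images[chosen] + bounded_trigger).clamp(0.0, 1.0)
    return images, labels  # labels unchanged: this is what makes it clean-label
```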
Hi, I wonder how the Narcissus attack performs when the poisoning rate is 0. If the ASR remains high when there is no explicit poisoning operation, then the Narcissus trigger is basically equivalent to a targeted universal adversarial perturbation. Thanks!
Hi there, thanks for your interest in our work. Your point is quite interesting: if you only look at inference time, where both attacks apply a universal pattern (a UAP or a backdoor trigger), the procedures are indeed quite similar. However, Narcissus has no effect when no poisoning is performed (i.e., when the poison ratio == 0). Additionally, we would like to highlight some further differences between our work and existing targeted UAP attacks:
Yi
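A simple way to check the point made above (that the trigger by itself is not a UAP and needs the poisoning step) is to sweep the poison ratio, including 0, and measure the attack success rate. The sketch below uses the hypothetical `poison_target_class` helper from earlier plus a placeholder `train_model`; none of these are the repository's actual APIs:

```python
import torch

def attack_success_rate(model, loader, trigger, target_class, device="cpu"):
    """ASR = fraction of non-target-class test inputs that the model classifies
    as the target class once the trigger is added."""
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            keep = y != target_class          # ASR is measured on non-target samples
            if keep.sum() == 0:
                continue
            x = (x[keep] + trigger).clamp(0.0, 1.0).to(device)
            preds = model(x).argmax(dim=1)
            hits += (preds == target_class).sum().item()
            total += keep.sum().item()
    return hits / max(total, 1)

# Sweeping the ratio down to 0 should show the ASR collapsing without poisoning,
# as stated in the reply above (train_model is a placeholder training routine):
# for ratio in [0.0, 0.0005, 0.005, 0.05]:
#     px, py = poison_target_class(train_x, train_y, trigger, target_class, poison_ratio=ratio)
#     model = train_model(px, py)
#     print(ratio, attack_success_rate(model, test_loader, trigger, target_class))
```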
Hi, Yi! Thanks for your explanation. Actually, what interests me most are the minimal poisoning rate and the clean-label setting of Narcissus; it does refresh my understanding of how backdoor attacks succeed. I have another question: the test-time trigger magnification seems to be a critical (but, in terms of stealthiness, controversial) design for the success of Narcissus. I wonder how the ASR changes when the magnification factor is varied. (Although trigger stealthiness is quite an ambiguous notion in this field...)
Thanks for noticing that interesting detail ;) The trigger stealthiness we normally talk about is focused more on the training set, because in the training stage it is reasonable for a user to evaluate or fully cleanse the data that will be used for model training. The scenario in the test phase (where we use trigger magnification) is different: deployed models are usually required to respond quickly. For example, a face recognition model needs to give its response (e.g., whether the person who wants to enter the room has authorization) without any delay. Such a requirement allows an attacker to reasonably assume that no detailed inspection will be enforced on the input data, so it is easy to magnify the trigger during the test phase (triggers are also magnified in existing clean-label backdoor attacks, e.g., the label-consistent attack, Sleeper Agent, or the hidden-trigger backdoor). As for the ASR without magnification, in our experiments the ASR of Narcissus drops by about 60% on PubFig (it is still higher than the others, though).
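For the magnification question above, a minimal sketch of how the test-time amplification could be swept to see how the ASR varies with the factor; `poisoned_model`, `test_loader`, the clamp range, and the factor values are illustrative assumptions, not the paper's exact settings, and the ASR routine is the one sketched earlier in this thread:

```python
# Test-time trigger magnification: scale the trigger before adding it to
# non-target test inputs, then reuse the ASR routine sketched above.
for magnification in [1, 2, 3, 5]:
    amplified = (trigger * magnification).clamp(-1.0, 1.0)
    asr = attack_success_rate(poisoned_model, test_loader, amplified, target_class)
    print(f"magnification={magnification}: ASR={asr:.3f}")
```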
Hi, the ASR of Narcissus on Tiny-ImageNet reaches 85%+, which is amazing. Is there a visualization of the poisoned samples on Tiny-ImageNet? As far as I know, an attack that achieves such results at the same poisoning ratio generally requires label flipping and very obvious triggers. Since I couldn't obtain similar results following the description, could you please provide the related code for Tiny-ImageNet or a more specific experimental setup? Thanks!
Thank you for your kind reply! According to README.md, the notebook file (Narcissus.ipynb) is only for CIFAR-10. I would also appreciate a new code release.
Yeah, my bad. Will be uploading it in the following week. Stay tuned! |
Thank you! |
@YiZeng623 Thank you for your work. Could you elaborate more on Figure 1 in the paper, please? From what I understand, the red color is for target-class examples, and the yellow one is for non-target-class examples. I still don't get the idea of a clean target model, poisoned target model, and surrogate model in this image. Thank you so much. |
Hello, could you please provide the related code for the other two datasets? Thank you!
After reading through the example, can I simply think of it this way: you are training a model to become addicted to one target label, so that when the poisoned model is given non-target samples with this noise added, it outputs the target label, which achieves the backdoor attack?