# This is inspired by Fraser Scott's Alexa implementation of Elevation of Privilege (EoP): https://github.com/zeroXten/alexa_threat_model_game/blob/master/cards.yaml
#
sources:
BIML_78:
title: Architectural Risk Analysis of Machine Learning Systems
identifier_format: <component label>:<risk number>:<descriptor>
url: https://berryvilleiml.com/results/ara.pdf
BIML_LLM_24:
title: Architectural Risk Analysis of Large Language Models
identifier_format: LLM:<component label>:<risk number>:<descriptor>
url: https://berryvilleiml.com/results/BIML-LLM24.pdf
OWASP_LLM:
title: OWASP Top 10 Risks for Large Language Model Applications
    identifier_format: OWASP LLM<risk number>
url: https://owasp.org/www-project-top-10-for-large-language-model-applications/
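# For example, these formats resolve to identifiers such as
# "eval:5:catastrophic forgetting" (BIML_78), "LLM:input:5:sponge input"
# (BIML_LLM_24) and "OWASP LLM09" (OWASP_LLM), all taken from the cards below.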
suit_order:
  - Model risk
  - Output risk
  - Input risk
  - Dataset risk
rank_order: [2,3,4,5,6,7,8,9,10,'J','Q','K','A']
ranks:
2: two
3: three
4: four
5: five
6: six
7: seven
8: eight
9: nine
10: ten
J: Jack
Q: Queen
K: King
A: Ace
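# Each of the four suits below defines one card per rank (13 ranks x 4 suits
# = 52 cards). The Ace of every suit is left blank on purpose: the player who
# draws it invents their own risk.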
suits:
Model risk:
2:
title: Catastrophic forgetting
source: BIML_78
identifier: eval:5:catastrophic forgetting
description: When a model is filled with too much overlapping information, collisions in the representation space may lead to the model “forgetting” information.
3:
title: Oscillation
source: BIML_78
identifier: alg:8:oscillation
description: An ML system may end up oscillating and not properly converging if using gradient descent in a space with a misleading gradient.
4:
title: Randomness
source: BIML_78
identifier: alg:4:randomness
description: Setting weights and thresholds with a bad RNG can damage system behavior and lead to subtle security issues.
5:
title: Online system manipulation
source: BIML_78
identifier: alg:1:online
      description: When an ML system keeps learning online during operations, clever attackers can nudge the model so that it drifts from its intended operational profile.
6:
title: Overfitting
source: BIML_78
identifier: eval:1:overfitting
description: The model learns its training dataset so well that it’s no longer able to generalize outside of the training set and will perform poorly.
7:
title: Hyperparameters
source: BIML_78
identifier: inference:3:hyperparameters
      description: An attacker who can control the hyperparameters can manipulate the future training of the machine learning model.
8:
title: Hosting
source: BIML_78
identifier: inference:4:hosting
description: The server where the model is hosted is insufficiently protected against unauthorized parties.
9:
title: Hyperparameter sensitivity
source: BIML_78
identifier: alg:10:hyperparameter sensitivity
      description: Sensitive hyperparameters that have been set experimentally may not be suitable for the intended problem space, and can lead to overfitting.
10:
title: Model theft
source: BIML_78
identifier: model:5:steal the box
description: Stealing ML system knowledge is possible through direct input/output observation, enabling attackers to reverse engineer the model.
J:
title: Training set reveal
source: BIML_78
identifier: model:4:training set reveal
      description: Most ML algorithms learn a great deal about their training data and store a representation of it internally. This data may be sensitive, and can potentially be extracted from the model.
Q:
title: Trojanized model
source: BIML_78
identifier: model:2:Trojan
description: Model transfer leads to the possibility that what is being reused may be a Trojaned (or otherwise damaged) version of the model.
K:
title: Improper re-use of model
source: BIML_78
identifier: model:1:improper re-use
description: ML models are re-used in transfer situations, where a pre-trained model is specialized toward a new use case. The model may be transferred into a problem space it’s not designed for.
A:
title:
source:
identifier:
description: You have invented your own risk associated with machine learning models.
Input risk:
2:
title: LLM feedback scores
source: BIML_LLM_24
identifier: LLM:inference:6:feedback scores
description: Some LLM chat systems allow user feedback as a parameter for tuning their system. This can be abused by attackers that give feedback in a coordinated fashion to nudge the ML system.
3:
title: Open to the public
source: BIML_LLM_24
identifier: LLM:input:3:open to the public
      description: An LLM is often open to the public, which makes it susceptible to attacks from users.
4:
title: Sponge input
source: BIML_LLM_24
identifier: LLM:input:5:sponge input
      description: A sponge attack provides an LLM system with input that is more costly to process than “normal” input. Like a DoS attack, it seeks to exhaust the processing budget.
5:
title: Input ambiguity
source: BIML_LLM_24
identifier: LLM:input:6:input ambiguity
      description: English, the main interface language for LLMs, is an inherently ambiguous interface. Natural language can be misleading, making LLMs susceptible to misinformation.
6:
title: Text encoding
source: BIML_78
identifier: raw:7:text encoding
description: An ML system engineered with one text encoding scheme in mind might yield surprising results if presented with a differently encoded text.
7:
title: Denial of service
source: BIML_78
identifier: system:10:denial of service
description: Denial of Service attacks can have a massive impact on a critical ML system. When an ML system breaks down, recovery may not be possible.
8:
title: User risk
source: BIML_78
identifier: inference:5:user risk
description: A user may expose their personal data and their interests to the owners of an ML system when they interact with the system.
9:
title: Dirty input
source: BIML_78
identifier: input:3:dirty input
description: Dirty inputs can be hard to process, and may be leveraged by an attacker adding noise in their prompts or in data sources for future training.
10:
title: Controlled input stream
source: BIML_78
identifier: input:2:controlled input stream
description: Outside sources of input may be manipulated by an attacker.
J:
title: Looped input
source: BIML_78
identifier: input:4:looped input
description: ML system output to the real world may feed back into training data or input, leading to a feedback loop, termed recursive pollution.
Q:
title: Prompt injection
      source: BIML_LLM_24
identifier: LLM:input:2:prompt injection
description: Input manipulation for LLMs. An attacker manipulates a large language model (LLM) through malicious inputs to override initial instructions given in system prompts.
K:
title: Malicious input
source: BIML_78
identifier: input:1:adversarial examples
      description: An attacker fools a machine learning system by providing malicious input that causes it to make a false prediction or categorization.
A:
title:
source:
identifier:
description: You have invented your own risk associated with machine learning input.
Output risk:
2:
title: Cry wolf
source: BIML_78
identifier: system:6:cry wolf
description: If an ML model is integrated into a security decision and raises too many alarms, its output may be ignored.
3:
title: Black box discrimination
source: BIML_78
identifier: system:1:black box discrimination
      description: ML systems that make high-impact decisions based on personal data carry the risk of illegal discrimination due to bias.
4:
title: LLM overreliance
source: OWASP_LLM
identifier: OWASP LLM09
description: Dependence on an LLM without oversight may lead to misinformation and legal concerns. It will also be hard to detect an attack against the LLM system.
5:
title: Inscrutability
source: BIML_78
identifier: output:4:inscrutability
description: In far too many cases with ML, nobody is really sure how the trained systems do what they do. This negatively affects trustworthiness.
6:
title: Miscategorization
source: BIML_78
identifier: output:3:miscategorization
description: Bad output due to internal bias, malicious input or other attacks may escape into the world.
7:
title: Transparency
source: BIML_78
identifier: output:5:transparency
description: It is easier to perform attacks undetected on a black-box system which is not transparent about how it works.
8:
title: Confidence scores
source: BIML_78
identifier: inference:3:confidence scores
description: An ML model’s confidence scores can help an attacker tweak inputs to make the system misbehave.
9:
title: Wrongness
source: BIML_LLM_24
identifier: LLM:output:2:wrongness
      description: LLMs are stochastic by nature, and can generate highly convincing misinformation in their attempt to predict the next tokens from a prompt.
10:
title: Excessive LLM agency
source: OWASP_LLM
identifier: OWASP LLM08
description: An LLM-based system may undertake actions leading to unintended consequences if granted excessive functionality, permissions, or autonomy.
J:
title: Overconfidence
source: BIML_78
identifier: system:2:overconfidence
      description: If an ML model is integrated into a system and its output is treated as high-confidence data, a range of unexpected issues may follow.
Q:
title: Error propagation
source: BIML_78
identifier: system:5:error propagation
description: When ML output is input to a larger decision process, errors in the ML subsystem may propagate in unforeseen ways.
K:
title: Output manipulation
source: BIML_78
identifier: output:1:direct
      description: An attacker gets between the ML system and its receiver and directly manipulates the output stream. This may be hard to detect because models are sometimes opaque.
A:
      title:
source:
identifier:
description: You have invented your own risk associated with machine learning output.
Dataset risk:
2:
title: Metadata
source: BIML_78
identifier: raw:10:metadata
      description: Metadata may accidentally degrade generalization, since a model may learn features of the metadata instead of the content itself.
3:
title: Data rights
source: BIML_LLM_24
identifier: LLM:raw:4:data rights
      description: Copyrighted, privacy-protected, or otherwise legally encumbered data are scraped from the internet to train ML models. This can lead to expensive legal entanglements.
4:
title: Partitioning
source: BIML_78
identifier: assembly:4:partitioning
description: Bad data partitions for training, validation and testing datasets may lead to a misbehaving ML system.
5:
title: Normalization
source: BIML_78
identifier: assembly:3:normalize
description: Normalization changes the nature of raw data, and may destroy the feature of interest by introducing too much bias.
6:
title: Annotation
source: BIML_78
identifier: assembly:2:annotation
description: The way data is annotated into features can be directly attacked, introducing attacker bias into a system.
7:
title: Encoding integrity
source: BIML_78
identifier: assembly:1:encoding integrity
      description: Pre-processing and encoding of the data can lead to encoding integrity issues if the data is biased or discriminatory in nature.
8:
title: Bad evaluation data
source: BIML_78
identifier: eval:2:bad eval data
      description: A bad evaluation dataset can give unrealistic projections of how the model will perform when it is shipped to production.
9:
title: Storage
source: BIML_78
identifier: data:4:storage
description: Data may be stored and managed insecurely. Who has access to the data, and why?
10:
title: Recursive pollution
source: BIML_LLM_24
identifier: LLM:raw:1:recursive pollution
      description: An ML model (LLM or other) generates incorrect content, and that content finds its way into future training data, which can damage the accuracy and reliability of the model.
J:
title: Data integrity
source: BIML_78
identifier: system:7:data integrity
description: If distributed datasets do not have proper integrity checks in place, data can be tampered with undetected as it passes between components.
Q:
title: Data confidentiality
source: BIML_78
identifier: raw:1:data confidentiality
description: Sensitive and confidential data that is used for ML training can be disclosed with extraction attacks.
K:
title: Data poisoning
source: BIML_78
identifier: data:1:poisoning
description: An attacker intentionally manipulates data to disrupt, introduce bias, control or otherwise influence ML training. On the internet, lots of data are already poisoned “by default”.
A:
title:
source:
identifier:
description: You have invented your own risk associated with machine learning datasets.
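# Below is a minimal, commented-out sketch (kept as comments so the YAML stays
# valid) of how a game engine might load and sanity-check this deck. It assumes
# Python with PyYAML; the file name cards.yaml comes from this repo, everything
# else is illustrative:
#
#   import yaml
#
#   with open("cards.yaml") as f:
#       deck = yaml.safe_load(f)
#
#   expected = {str(r) for r in deck["rank_order"]}
#   for suit, cards in deck["suits"].items():
#       # Every suit should hold exactly one card per rank in rank_order.
#       actual = {str(r) for r in cards}
#       assert actual == expected, f"{suit}: bad ranks {actual ^ expected}"
#       for rank, card in cards.items():
#           # Every sourced card should point at an entry under sources
#           # (the blank Aces have no source, which is fine).
#           src = card.get("source")
#           assert src is None or src in deck["sources"], f"{suit} {rank}: {src}"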