As much as possible, DJL Serving relies on automated tools to do security scanning. In particular, we support:
- Docker CVE patch scanning: Using AWS ECR
- Dependency Analysis: Using Dependabot
- Code Analysis: Using CodeQL

DJL Serving has two APIs: inference and management. More can be available through plugins.
- HTTP: `8080` for both the inference and management APIs

By default, both APIs are available on port `8080` through HTTP and accessible to `localhost`. The addresses can be configured by following the guide for global configuration. DJL Serving does not prevent users from configuring the address to be any value, including the wildcard address `0.0.0.0`. Please be aware of the security risks of configuring the address to be `0.0.0.0`, as this gives all addresses on the host (including publicly accessible addresses) access to the DJL Serving endpoints listening on the ports shown above. Be especially careful with the management API: avoid setting it to a publicly accessible address or forwarding its port to one. If exposed, it could allow an attacker to create models and execute arbitrary code on your machine.

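The binding check described above can be automated before a deployment. The following is a minimal sketch, not part of DJL Serving: it assumes the addresses live in a simple `key=value` properties file under the keys `inference_address` and `management_address` (check the global configuration guide for the exact names in your version), and the helper names are hypothetical.

```python
from urllib.parse import urlparse

# Hosts that make an endpoint reachable from every network interface.
WILDCARD_HOSTS = {"0.0.0.0", "::"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def wildcard_bindings(props, keys=("inference_address", "management_address")):
    """Return the configured keys (assumed names) bound to a wildcard host."""
    flagged = []
    for key in keys:
        value = props.get(key)
        if value is None:
            continue  # key not set, so the default binding applies
        if urlparse(value).hostname in WILDCARD_HOSTS:
            flagged.append(key)
    return flagged
```

A deployment script could refuse to start when `"management_address"` appears in the returned list.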
By default, the docker images are configured to expose the port `8080` to the host. The default DJL Serving configuration in the container, which is executed by the docker entrypoint, will expose both the inference and management APIs set to `http://0.0.0.0:8080`. This is designed for internal isolated services or development work. For other use cases, provide alternative configurations to avoid exposing the management API.

Be sure to validate the authenticity of all model files and model artifacts being used with DJL Serving.
- A model file being downloaded from the internet from an untrustworthy source may have malicious code. This can compromise the integrity of your application, slow your device, or extract data. Data that is on the device may include configurations, model files, inference requests, logs, and other standard important data such as security keys.
  - For the common case of Python models with our default handler and models from HuggingFace IDs, we require an additional option, `option.trust_remote_code`, to enable custom Python files.
  - Remember that an attacker may still use other kinds of models, so this precaution alone will not ensure security.
- DJL Serving executes the arbitrary Python code packaged in the model file. Make sure that you have audited the code you're using and confirmed it is safe, or that it comes from a source you trust.
- DJL Serving supports custom plugins. These should also only be used from trusted sources.
- Running DJL Serving inside a container environment and loading an untrusted model file does not guarantee isolation from a security perspective.
- It is possible for models to specify additional files to download. This includes options such as `option.model_id`, which can download from HuggingFace Hub or S3, and providing a `requirements.txt` file, which can download from PyPI and other URLs. When using these or other features that enable additional downloads from your model, you must also ensure the authenticity and security of the resources being downloaded by your model.
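
One common way to validate a downloaded artifact is to compare its SHA-256 digest against a value published by a source you trust. The sketch below is illustrative only and is not a DJL Serving feature; the function names are hypothetical, and it assumes you obtained the expected digest through a separate trusted channel.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large model archives do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True only if the file's digest matches the trusted value."""
    actual = sha256_of_file(path)
    # hexdigest() is lowercase, so normalize the expected value before comparing.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A matching digest shows only that the file is the one the publisher released. It says nothing about whether the publisher's code is safe, so the auditing advice above still applies.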

Enable SSL:

DJL Serving supports two ways to configure SSL:
- Using a keystore
- Using private-key/certificate files

You can find more details in the configuration guide.


Prepare your model against bad inputs and prompt injections. Some recommendations:
- Pre-analysis: check how the model performs by default when exposed to prompt injection (e.g. using fuzzing for prompt injection).
- Input Sanitization: Before feeding data to the model, sanitize inputs rigorously. This involves techniques such as:
  - Validation: Enforce strict rules on allowed characters and data types.
  - Filtering: Remove potentially malicious scripts or code fragments.
  - Encoding: Convert special characters into safe representations.
  - Verification: Run tooling that identifies potential script injections (e.g. models that detect prompt injection attempts).
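
The validation, filtering, and encoding steps can be chained in front of an inference call. This is a minimal sketch under stated assumptions: the length limit, the script-tag pattern, and all function names are hypothetical, and HTML escaping is just one possible encoding. It does not detect prompt injection by itself, so the verification step still needs dedicated tooling.

```python
import html
import re
import unicodedata

MAX_PROMPT_CHARS = 4000  # hypothetical limit; tune for your model
SCRIPT_TAG = re.compile(
    r"<\s*script\b.*?<\s*/\s*script\s*>", re.IGNORECASE | re.DOTALL
)


def validate(prompt):
    """Validation: enforce the expected type and size."""
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    if not prompt.strip():
        raise ValueError("prompt is empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt is too long")
    return prompt


def filter_text(prompt):
    """Filtering: drop script blocks and control characters except newline and tab."""
    prompt = SCRIPT_TAG.sub("", prompt)
    return "".join(
        ch for ch in prompt
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )


def encode(prompt):
    """Encoding: convert HTML special characters into safe representations."""
    return html.escape(prompt, quote=True)


def sanitize(prompt):
    return encode(filter_text(validate(prompt)))
```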

If you intend to run multiple models in parallel with shared memory, it is your responsibility to ensure the models do not interact or access each other's data. The primary areas of concern are tenant isolation, resource allocation, model sharing and hardware attacks.

There are various options and settings within DJL Serving that can be controlled through the use of environment variables. This includes configurations at all levels of the DJL Serving stack, so ensure that no malicious environment variables can be passed to DJL Serving.
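
One way to limit what reaches the server process is to start it with an explicit allowlist of environment variables rather than inheriting the caller's full environment. The sketch below assumes you launch DJL Serving from a Python wrapper; the variable names in the allowlist are hypothetical placeholders, not a list of settings that DJL Serving reads.

```python
# Hypothetical allowlist: replace with the variables your deployment needs.
ALLOWED_NAMES = frozenset({"PATH", "HOME", "JAVA_HOME"})


def filtered_env(env, allowed_names=ALLOWED_NAMES):
    """Return a copy of env containing only explicitly allowed variables."""
    return {
        name: value
        for name, value in env.items()
        if name in allowed_names
    }
```

The result of `filtered_env(os.environ)` can be passed as the `env` argument of `subprocess.run` when starting the server, so that anything not on the list is dropped.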

If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our vulnerability reporting page. Please do not create a public GitHub issue.