Support for adding unique ID into volume mount names #62
Comments
Is there any consensus on how the My initial thought when thinking about this a while back was to use the Another idea is to add an |
An alternative to generating the |
I would use a similar approach to how we name images which uses To also support the repository in volume case in Remote-Containers (and possibly other cases) we need to base the hash on the id labels we use to identify the dev container (the id label includes the workspace folder path in the local case). It would be nice to have a human-readable part, but that might not be easy to derive in all cases. So my proposal is to use a sha-256 hash of the id labels (there can be multiple such labels and they all need to match to identify the dev container). |
When I look at a dev container running locally, I have the following labels: "com.visualstudio.code.devcontainers.id": "python-3",
"com.visualstudio.code.devcontainers.release": "v0.231.6",
"com.visualstudio.code.devcontainers.source": "https://github.com/microsoft/vscode-dev-containers/",
"com.visualstudio.code.devcontainers.timestamp": "Wed, 13 Apr 2022 00:28:49 GMT",
"com.visualstudio.code.devcontainers.variant": "3.10-bullseye",
"desktop.docker.io/wsl-distro": "Ubuntu",
"devcontainer.local_folder": "\\\\wsl.localhost\\Ubuntu\\home\\stuart\\source\\my-project",
"version": "0.203.5" Which of these do you suggest are included in the hash? My goal is to have an ID that is scoped to an individual dev container and that doesn't change across dev container renames (i.e. changing |
In this case only A container created from a repository cloned to a volume has different id labels, e.g.: "vsch.local.repository": "https://github.com/chrmarti/vscode-regex.git/tree/main",
"vsch.local.repository.folder": "vscode-regex",
"vsch.local.repository.volume": "vscode-regex-main-29a8aa263ac65c28ae2f5568b6fd157f" In the code these labels are passed around as |
@chrmarti I assume this would also work if a volume was used for the source code in the Remote - Containers case, correct? |
@Chuxel Yes, the variations I'm aware of are: Local folder: "devcontainer.local_folder": "\\\\wsl.localhost\\Ubuntu\\home\\stuart\\source\\my-project" "Clone Repository in Container Volume": "vsch.local.repository": "https://github.com/chrmarti/vscode-regex.git/tree/main",
"vsch.local.repository.folder": "vscode-regex",
"vsch.local.repository.volume": "vscode-regex-main-29a8aa263ac65c28ae2f5568b6fd157f" When inspecting a volume: "vsch.local.volume": "myvolume" Codespaces (here the unique id might not be relevant at the moment): "Type": "codespaces" (Using a fixed id label is fine when that uniquely identifies the dev container among other containers on the same Docker host.) The Dev Container CLI only has the local folder case built-in, in all other cases we pass id labels to the CLI as command line options ( |
So we could use a hash of the id labels and then base64 encode that like: const crypto = require('crypto');
// Array of label=value.
const idLabels = ['"devcontainer.local_folder=C:\\Users\\chrmarti\\repos\\hello'];
const hash = crypto.createHash('sha256')
.update(JSON.stringify(idLabels.sort())) // sort to avoid order dependency
.digest('base64url'); // omits padding =
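// For a volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed.
// Plain base64 includes "[a-zA-Z0-9+/]"; 'base64url' already uses "-" and "_"
// instead of "+" and "/", so the replacements below only matter for plain base64.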
const uniqueId = 'a' + hash.replace(/\+/g, '_')
.replace(/\//g, '-');
console.log(uniqueId); // Example prints: adg5DShAK-VwSz7TkEPJqOyzEmKEAJc3EKAsKsiCfsRE Notes:
|
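To make the character-set concern concrete, here is a minimal sketch that checks a candidate ID against the volume-name pattern quoted in the comment above. The helper `isValidVolumeName` is hypothetical, and the trailing `+` quantifier on the pattern is an assumption of this sketch rather than something stated in the thread.

```javascript
const crypto = require('crypto');

// Pattern quoted in the thread; the "+" quantifier is an assumption of this sketch.
const VOLUME_NAME_PATTERN = /^[a-zA-Z0-9][a-zA-Z0-9_.-]+$/;

// Hypothetical helper, not part of the Dev Container CLI.
function isValidVolumeName(name) {
  return VOLUME_NAME_PATTERN.test(name);
}

// 'base64url' uses "-" and "_" in place of "+" and "/" and omits "=" padding.
const digest = crypto.createHash('sha256')
  .update('example input')
  .digest('base64url');

// The digest may start with "-" or "_", hence the fixed alphanumeric prefix.
console.log(isValidVolumeName('a' + digest));        // true
console.log(isValidVolumeName('-starts-with-dash')); // false
```

The fixed `'a'` prefix guarantees an alphanumeric first character regardless of what the digest starts with.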
What do you think about removing the dash from the set of available chars so that the unique ID could be easily selected whenever needed by double clicking on it? |
@tschaffter We could use dot instead of dash (to still match base64). |
Thanks for the prompt response! Dot would lead to the same issue, but it's maybe better than using the dash since VS Code uses the dash as separator in the name of Docker images. For example: |
Some time ago I also needed to encode a sha256 in a short string (a subdomain name, which has to stay under 63 chars). I went with encoding the number in base32, which requires at most 52 chars: function sha256AsBase32(bytes: ArrayBuffer): string {
const array = Array.from(new Uint8Array(bytes));
const hexArray = array.map(b => b.toString(16).padStart(2, '0')).join('');
// sha256 has 256 bits, so we need at most ceil(lg(2^256-1)/lg(32)) = 52 chars to represent it in base 32
return BigInt(`0x${hexArray}`).toString(32).padStart(52, '0');
} Here's a code pointer (uses |
@alexdima I like that idea. It uses |
Updated code using base32: const crypto = require('crypto');
function uniqueIdForLabels(idLabels) {
const stringInput = JSON.stringify(idLabels, Object.keys(idLabels).sort()); // sort properties
const bufferInput = Buffer.from(stringInput, 'utf-8');
const hash = crypto.createHash('sha256')
.update(bufferInput)
.digest();
const uniqueId = BigInt(`0x${hash.toString('hex')}`)
.toString(32)
.padStart(52, '0');
return uniqueId;
}
const examples = [
{
'devcontainer.local_folder': '\\\\wsl.localhost\\Ubuntu\\home\\stuart\\source\\my-project'
},
{
'vsch.local.repository.volume': 'vscode-regex-main-29a8aa263ac65c28ae2f5568b6fd157f',
'vsch.local.repository': 'https://github.com/chrmarti/vscode-regex.git/tree/main',
'vsch.local.repository.folder': 'vscode-regex'
},
{
'vsch.local.volume': 'myvolume'
},
{
'Type': 'codespaces'
}
];
const uniqueIds = examples.map(uniqueIdForLabels);
console.log(uniqueIds);
/* Prints:
[
'1rr0ifggvslov34du7rckm9v3v1vq76fgstqpg49k5ettifijujv',
'0ogk7ui0niiilqvq4n3cstdcmg5fa4fc2ldsetncf60o3jvu33j3',
'09h2k0dvao1m9l93kqpcs7kenvhoeik9jlgu3q6k7l4k3nold7ce',
'0hcbhh2c7vldoj773drm1bjldnb89u0rt96sl9nju22d9ou8d14n'
]
*/ |
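As a quick check of the "sort properties" step, the sketch below repeats the function from the comment above and confirms that two label objects with the same entries in a different insertion order give the same ID, and that the result is always 52 characters from the base32 alphabet `0-9a-v`. The label values are reused from the thread; nothing else is assumed.

```javascript
const crypto = require('crypto');

function uniqueIdForLabels(idLabels) {
  // The sorted key array acts as the replacer, so keys are serialized in sorted order.
  const stringInput = JSON.stringify(idLabels, Object.keys(idLabels).sort());
  const hash = crypto.createHash('sha256')
    .update(Buffer.from(stringInput, 'utf-8'))
    .digest();
  return BigInt(`0x${hash.toString('hex')}`).toString(32).padStart(52, '0');
}

// Same labels, different insertion order.
const labelsA = {
  'vsch.local.repository': 'https://github.com/chrmarti/vscode-regex.git/tree/main',
  'vsch.local.repository.folder': 'vscode-regex'
};
const labelsB = {
  'vsch.local.repository.folder': 'vscode-regex',
  'vsch.local.repository': 'https://github.com/chrmarti/vscode-regex.git/tree/main'
};

const idA = uniqueIdForLabels(labelsA);
const idB = uniqueIdForLabels(labelsB);

console.log(idA === idB);                 // true
console.log(/^[0-9a-v]{52}$/.test(idA));  // true
```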
Goal
Allow features to refer to an identifier that is unique to the dev container they are installed into. E.g., the
Proposal
The identifier will be referred to as The identifier is derived from a set of container labels that uniquely identify the dev container on a Docker host. It is up to the implementation to choose these labels. (E.g., a single label with the workspace folder as its value, or a set of labels identifying a Git repository and a volume to clone the Git repository into.) E.g., the
{
"id": "docker-in-docker",
"version": "1.0.4",
// ...
"mounts": [
{
"source": "dind-var-lib-docker-${devcontainerId}",
"target": "/var/lib/docker",
"type": "volume"
}
]
}
Computing the Identifier
JavaScript implementation taking an object with the labels as argument and returning a string as the result:
const crypto = require('crypto');
function uniqueIdForLabels(idLabels) {
const stringInput = JSON.stringify(idLabels, Object.keys(idLabels).sort()); // sort properties
const bufferInput = Buffer.from(stringInput, 'utf-8');
const hash = crypto.createHash('sha256')
.update(bufferInput)
.digest();
const uniqueId = BigInt(`0x${hash.toString('hex')}`)
.toString(32)
.padStart(52, '0');
return uniqueId;
} |
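One way the variable could be applied to a feature's configuration is sketched below. It walks a parsed configuration object and replaces every literal `${devcontainerId}` in string values. The helper name `substituteDevcontainerId` and the all-zero placeholder ID are assumptions for illustration only; the thread does not say how an implementation performs the substitution.

```javascript
const ID_VARIABLE = '${devcontainerId}'; // literal text; single quotes prevent interpolation

// Hypothetical helper: recursively replace the variable in all string values.
function substituteDevcontainerId(value, devcontainerId) {
  if (typeof value === 'string') {
    return value.split(ID_VARIABLE).join(devcontainerId);
  }
  if (Array.isArray(value)) {
    return value.map(item => substituteDevcontainerId(item, devcontainerId));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, substituteDevcontainerId(item, devcontainerId)])
    );
  }
  return value; // numbers, booleans, null stay unchanged
}

const feature = {
  id: 'docker-in-docker',
  version: '1.0.4',
  mounts: [
    { source: 'dind-var-lib-docker-${devcontainerId}', target: '/var/lib/docker', type: 'volume' }
  ]
};

// Placeholder only; a real ID would come from uniqueIdForLabels().
const placeholderId = '0'.repeat(52);
const resolved = substituteDevcontainerId(feature, placeholderId);

console.log(resolved.mounts[0].source);
```

The original `feature` object is left untouched, so the unexpanded form can still be stored or reused.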
Overall looks good. This will be a useful feature.
Can we formalize this to make it a requirement? For instance, we could only support that variable expansion in devcontainer properties that apply to container runtime and not image build time (
These two things seem to be in opposition to each other. Unless we define the labels that must be used, implementations will use different labels as the input and get different results, even if they follow the same hashing logic. I would suggest that
|
+1 on the idea of defining the set of labels explicitly. Besides that this would be a great addition. |
@jkeech @edgonmsft @chrmarti Why put a limit on it? Mounts is the primary use case and we do need this to work from image labels as well so that would apply to the image. Also, if this affects the contents of the image, then the ID would be isolated to the image, which would work great. The main place there's a challenge is runtime params. |
Sounds good.
The labels to identify a dev container seem to depend on how the dev container is created (in its context, e.g., a Codespace implies a single dev container, so the label can be constant, see #62 (comment)). Other implementations might come up with additional ways to create dev containers, so we probably don't want to limit the allowed labels. We could specify that implementations can use the |
@Chuxel, what @chrmarti and I are suggesting is that the devcontainer ID is tied to a single instance of the devcontainer. You can have multiple instances of the same image running on the same machine, so if the ID is evaluated in an image, it's no longer unique. Therefore, the ID can only be evaluated at the time when a devcontainer is spun up, and it only applies to properties at that point which are tied to the container rather than the image. That's why we should restrict the property in the spec to only be allowed in runtime properties like The fact that the ID is evaluated at runtime does not mean that the ID variable cannot be present in image metadata. It would just remain as |
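The point about per-instance evaluation can be illustrated with a small sketch: the same unexpanded mount source (as it might sit in image metadata) is resolved separately for two containers created from one image, each identified by its own id labels. The folder paths and the helper `volumeNameFor` are hypothetical; only the hashing function is taken from the thread.

```javascript
const crypto = require('crypto');

function uniqueIdForLabels(idLabels) {
  const stringInput = JSON.stringify(idLabels, Object.keys(idLabels).sort());
  const hash = crypto.createHash('sha256')
    .update(Buffer.from(stringInput, 'utf-8'))
    .digest();
  return BigInt(`0x${hash.toString('hex')}`).toString(32).padStart(52, '0');
}

// Unexpanded value, shared by every container created from the image.
const mountSourceTemplate = 'dind-var-lib-docker-${devcontainerId}';

// Hypothetical helper: expand the variable when a container is created.
function volumeNameFor(idLabels) {
  return mountSourceTemplate.split('${devcontainerId}').join(uniqueIdForLabels(idLabels));
}

// Two instances of the same image, opened from different (made-up) folders.
const instanceOne = { 'devcontainer.local_folder': '/home/user/project-one' };
const instanceTwo = { 'devcontainer.local_folder': '/home/user/project-two' };

console.log(volumeNameFor(instanceOne));
console.log(volumeNameFor(instanceTwo));
console.log(volumeNameFor(instanceOne) !== volumeNameFor(instanceTwo)); // true
```

Because expansion happens per container, each instance gets its own volume even though both start from identical image metadata.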
Posted the proposal with the discussed clarifications in PR #96 for review. |
//cc @bamurtaugh FYI - Another for the list of items to add to the reference over the next week or so. |
This proposal is an enhancement idea to the current dev container features proposal.
This spec proposal is a port of the proposed solution to microsoft/vscode-remote-release#5679 and relates to #2 (comment).
For both Dev Container Features (#61) and normal devcontainer.json scenarios, it is sometimes necessary to have a unique identifier generated that can be referenced in different places.
The most critical example of this is named volumes. When you have multiple dev containers running in the same environment (e.g. locally), the docker-in-docker feature needs to be able to create unique volume names for each to store docker data. Any feature that references a mount point will face the same challenge.
The proposed solution is to introduce a variable like ${devcontainerId} that is a unique identifier for the resulting container. The format of the identifier does not matter as much as the fact that it can be broadly used, so it should generally be alphanumeric. Features can then reference this identifier where appropriate (like in mounts).
@alexdima @chrmarti @jkeech @edgonmsft - This has been the most consistent ask from early adopters of dev container features to date, so I think we'll want to cover it in the spec.