
Set OSD pool size when creating ceph and cephfs storage pools #14044

Open · wants to merge 11 commits into main
Conversation

masnax (Contributor) commented Sep 4, 2024

Closes #14006

Adds the keys ceph.osd.pool_size and cephfs.osd_pool_size (in keeping with the other naming schemes).

By default, if no value is supplied, the pool size is taken from Ceph's global default pool size. It can be set to any value larger than 0.

On update, the OSD pool size is re-applied if it has changed.
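For illustration, creating a pool with the new key could look like this (the pool name and replica count are hypothetical; the cephfs driver uses the equivalent cephfs.osd_pool_size key):

# Create a Ceph RBD-backed storage pool with 2 OSD pool replicas instead of the
# cluster's global default (hypothetical pool name and value).
lxc storage create my-ceph-pool ceph ceph.osd.pool_size=2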

github-actions bot added the Documentation (Documentation needs updating) and API (Changes to the REST API) labels Sep 4, 2024

github-actions bot commented Sep 4, 2024

Heads up @mionaalex - the "Documentation" label was applied to this issue.

masnax (Contributor, Author) commented Sep 4, 2024

I can't seem to get the docs to pick up the new extensions; I just get:

WARNING: Could not find target storage-ceph-pool-conf:ceph.osd.pool_size in api-extensions

Got it, it was `make update-metadata`.
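For reference, the fix mentioned above is simply running the metadata update target:

make update-metadata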

masnax (Contributor, Author) commented Sep 4, 2024

@simondeziel Got any recommendations for setting up more than one OSD for MicroCeph? I'd like to add a test case for the new key, but it's tied to the number of available disks supplied to MicroCeph, which is 1 in the GitHub workflow tests.

masnax force-pushed the osd_pool_size branch 3 times, most recently from b44133e to 5a26cc4 (September 4, 2024 23:01)
masnax (Contributor, Author) commented Sep 4, 2024

Looks like one thing I overlooked in my testing is that size=1 is disabled by default unless the global config is changed.

The impetus for this PR was to avoid running ceph config set global osd_pool_default_size 1 to set the default pool size, and adding these keys does avoid that. However, a user who wants size=1 will still need to run ceph config set global mon_allow_pool_size_one true before the keys take effect.

It's still added flexibility, but if you'd prefer to avoid managing these keys in LXD, since they don't actually solve the underlying problem of needing to set global Ceph configuration, I can close the PR @tomponline.
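For clarity, these are the two global settings involved (as quoted above); only the first is still required before a pool size of 1 is accepted:

# Required for size=1: allow pools with a single replica.
ceph config set global mon_allow_pool_size_one true

# No longer needed just to get size=1 pools, now that the size can be set per pool:
# ceph config set global osd_pool_default_size 1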

masnax force-pushed the osd_pool_size branch 2 times, most recently from 850c106 to 3d07d9c (September 5, 2024 15:59)
simondeziel (Member) commented Sep 5, 2024

@simondeziel Got any recommendations for setting up more than one OSD for MicroCeph? I'd like to add a test case for the new key, but it's tied to the number of available disks supplied to MicroCeph, which is 1 in the GitHub workflow tests.

Could we maybe use the ephemeral disk, split it into 3 partitions, and export each one as a loop device backed by its respective partition?

# Pool names derived from the test directory.
pool1="lxdtest-$(basename "${LXD_DIR}")-pool1"
pool2="lxdtest-$(basename "${LXD_DIR}")-pool2"
# Create a Ceph pool with a single replica and check that the OSD pool size was applied.
lxc storage create "${pool1}" ceph volume.size=25MiB ceph.osd.pg_num=16 ceph.osd.pool_size=1
ceph --cluster "${LXD_CEPH_CLUSTER}" osd pool get "${pool1}" size --format json | jq '.size' | grep -q 1
Review comment (Member):

Using a shell comparison should help with debugging, as we will see what `.size` we got on the left-hand side of the comparison:

Suggested change
ceph --cluster "${LXD_CEPH_CLUSTER}" osd pool get "${pool1}" size --format json | jq '.size' | grep -q 1
[[ "$(ceph --cluster "${LXD_CEPH_CLUSTER}" osd pool get "${pool1}" size --format json | jq '.size')" = "1" ]]

Reply (Contributor, Author):

Fixed

@@ -112,6 +114,29 @@ func (d *ceph) FillConfig() error {
d.config["ceph.osd.pg_num"] = "32"
}

if d.config["ceph.osd.pool_size"] == "" {
size, err := shared.TryRunCommand("ceph",
"--name", fmt.Sprintf("client.%s", d.config["ceph.user.name"]),
Review comment (Member):

It's a tiny nit, but simple string concatenation is more efficient and needs fewer keystrokes ;)

Suggested change
"--name", fmt.Sprintf("client.%s", d.config["ceph.user.name"]),
"--name", "client." + d.config["ceph.user.name"],

Reply (Contributor, Author):

Someone was paying attention at the performance sessions in Madrid :)

Fixed

Review comment (Member):

Indeed!

Commit message:

Seems this always returns an error because the pool was removed with the LXD storage pool. However, because it was the last line, the error was not propagating.

To ensure there's no leftover state, we still run the command but do not expect it to pass.

Signed-off-by: Max Asnaashari <[email protected]>
simondeziel (Member) left a review comment:

LGTM in general with the caveat that the setup-microceph action is being used elsewhere and should probably not default to using 3 partitions. Could you maybe make that an action parameter?

@@ -2517,6 +2517,10 @@ Adds support for using a bridge network with a specified VLAN ID as an OVN uplin
Adds `logical_cpus` field to `GET /1.0/cluster/members/{name}/state` which
contains the total available logical CPUs available when LXD started.

<<<<<<< HEAD
Review comment (Member):

Merge/rebase leftover.

sudo snap install microceph --channel "${{ inputs.microceph-channel }}"
sudo microceph cluster bootstrap
sudo microceph.ceph config set global osd_pool_default_size 1
sudo microceph.ceph config set global mon_allow_pool_size_one true
Review comment (Member):

How about grouping the 2 mon_* settings before the OSD ones? I'm thinking this would avoid the mon briefly complaining about the OSD pool size being 1.
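For illustration, the suggested reordering of the two settings quoted above would look like this (any other mon_* settings in the full diff would move the same way):

# Allow single-replica pools before lowering the default pool size,
# so the monitors never see an unacceptable default, even briefly.
sudo microceph.ceph config set global mon_allow_pool_size_one true
sudo microceph.ceph config set global osd_pool_default_size 1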

# Partition the ephemeral disk into three roughly equal parts for MicroCeph OSDs.
sudo parted "${ephemeral_disk}" --script mklabel gpt
sudo parted "${ephemeral_disk}" --script mkpart primary 0% 33%
sudo parted "${ephemeral_disk}" --script mkpart primary 33% 66%
sudo parted "${ephemeral_disk}" --script mkpart primary 66% 100%
Review comment (Member):

I like it and didn't know it could take %.

disk2="$(losetup -f)"
sudo losetup "${disk2}" "${ephemeral_disk}2"
disk3="$(losetup -f)"
sudo losetup "${disk3}" "${ephemeral_disk}3"
Review comment (Member):

I take it that MicroCeph still cannot take partitions directly, right? How about adding a link to canonical/microceph#251 in a comment?
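A possible shape for that comment, assuming (as implied above) that MicroCeph still cannot consume partitions directly:

# MicroCeph cannot take partitions directly yet, so expose each one through a
# loop device instead. See https://github.com/canonical/microceph/issues/251
disk2="$(losetup -f)"
sudo losetup "${disk2}" "${ephemeral_disk}2"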

// type: string
// defaultdesc: `3`
// shortdesc: Number of RADOS object replicas. Set to 1 for no replication.
"ceph.osd.pool_size": validate.Optional(validate.IsInRange(1, math.MaxInt64)),
Review comment (Member):

While technically correct, maybe we could put an upper bound that's a little less... gigantic :) Hardcoding 255 feels future-proof enough I'd say.

Review comment (Member):

I just noticed that ceph.osd.pg_num has no such validation.

:shortdesc: "Number of RADOS object replicas. Set to 1 for no replication."
:type: "string"
This option specifies the number of OSD pool replicas to use
when creating an OSD pool.
Review comment (Member):

Why have a long description for cephfs and not for ceph? Also, is this really something that needs to be done at creation time only?

masnax (Contributor, Author) commented Oct 23, 2024

LGTM in general with the caveat that the setup-microceph action is being used elsewhere and should probably not default to using 3 partitions. Could you maybe make that an action parameter?

Hmm, does it matter since we set the replication factor to 1 anyway?

simondeziel (Member) replied:

LGTM in general with the caveat that the setup-microceph action is being used elsewhere and should probably not default to using 3 partitions. Could you maybe make that an action parameter?

Hmm, does it matter since we set the replication factor to 1 anyway?

That divides the available space by 3 (so ~25GiB instead of ~75GiB, IIRC), which might be a bit tight for some lxd-ci.git tests. It also adds the losetup overhead.

Linked issue: Allow setting ceph.osd.pool_size on pool creation (#14006)