Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore PFs without RDMA cap on BF2 according to its gid_tlb_len==0 #5965

Merged
merged 1 commit into from
Dec 13, 2020

Conversation

leibin2014
Copy link
Contributor

What

Ignore PFs without RDMA cap on BF2 according to its gid_tlb_len==0

Why ?

On BF2/ARM switchdev mode only SFs have RDMA cap, PFs don't have RDMA cap. We should filter out the SF devices with RDMA cap and ignore the PF devices without RDMA cap.

How ?

We can filter out the devices with RDMA cap according to gid_tlb_len > 0, and ignore the devices without RDMA cap according to gid_tlb_len==0.

@yosefe
Copy link
Contributor

yosefe commented Dec 1, 2020

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

return UCS_ERR_UNSUPPORTED;
}

num_ports = dev->num_ports < UCT_IB_DEV_MAX_PORTS ? dev->num_ports : UCT_IB_DEV_MAX_PORTS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ucs_min

Comment on lines 551 to 554
ucs_error("%s has %d ports, but only up to %d are supported",
ibv_get_device_name(ibv_device), dev->num_ports,
UCT_IB_DEV_MAX_PORTS);
return UCS_ERR_UNSUPPORTED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move this to line 532: if num ports is > MAX_PORTS, print warning, but continue using only 2 ports, and NOT fail.

Copy link
Contributor Author

@leibin2014 leibin2014 Dec 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe

  1. So for BF2/ARM switchdev mode, it will always have warnings. Is this good?
  2. This will change the original logic before my changes since printing error and return UNSUPPORTED was the original logic. But I also think it's fine to change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so let's make it ucs_debug (not ucs_warn)

Comment on lines 541 to 544
if (dev->port_attr[i].gid_tbl_len > 0) {
has_gid = 1;
break;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (dev->port_attr[i].gid_tbl_len == 0) {
    return UCS_ERR_UNSUPPORTED;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe One device has max 2 ports are allowed. If any one port with gid_tbl_len == 0 then we think the device is UNSUPPORTED even if the other port of the same device with gid_tbl_len > 0. Right? Maybe we won't have this case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can check num of gid in uct_ib_device_port_check() to enable each port separately

@yosefe
Copy link
Contributor

yosefe commented Dec 4, 2020

bot:pipe:retest

}

num_ports = ucs_min(dev->num_ports, UCT_IB_DEV_MAX_PORTS);
Copy link
Contributor

@yosefe yosefe Dec 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. remove num_ports
if (dev->num_ports > UCT_IB_DEV_MAX_PORTS) {
    ucs_debug(...)
    dev->num_ports = UCT_IB_DEV_MAX_PORTS;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe Fixed.

@yosefe
Copy link
Contributor

yosefe commented Dec 13, 2020

bot:pipe:retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants