From 353407d917b2d87cd8104a0453d012439c6ca4be Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Wed, 6 Oct 2021 13:46:42 +0300 Subject: [PATCH 1/6] ethtool: Add ability to control transceiver modules' power mode Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver modules parameters and retrieve their status. The first parameter to control is the power mode of the module. It is only relevant for paged memory modules, as flat memory modules always operate in low power mode. When a paged memory module is in low power mode, its power consumption is reduced to the minimum, the management interface towards the host is available and the data path is deactivated. User space can choose to put modules that are not currently in use in low power mode and transition them to high power mode before putting the associated ports administratively up. This is useful for user space that favors reduced power consumption and lower temperatures over reduced link up times. In QSFP-DD modules the transition from low power mode to high power mode can take a few seconds and this transition is only expected to get longer with future / more complex modules. User space can control the power mode of the module via the power mode policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible values: * high: Module is always in high power mode. * auto: Module is transitioned by the host to high power mode when the first port using it is put administratively up and to low power mode when the last port using it is put administratively down. The operational power mode of the module is available to user space via the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not reported to user space when a module is not plugged-in. The user API is designed to be generic enough so that it could be used for modules with different memory maps (e.g., SFF-8636, CMIS). The only implementation of the device driver API in this series is for a MAC driver (mlxsw) where the module is controlled by the device's firmware, but it is designed to be generic enough so that it could also be used by implementations where the module is controlled by the CPU. CMIS testing ============ # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off The module is not in low power mode, as it is not forced by hardware (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off). The power mode can be queried from the kernel. In case LowPwrAllowRequestHW was on, the kernel would need to take into account the state of the LowPwrRequestHW signal, which is not visible to user space. $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp11 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp11 up Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp11 down Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On SFF-8636 testing ================ # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7733 mW / -1.12 dBm Transmit avg optical power (Channel 2) : 0.7649 mW / -1.16 dBm Transmit avg optical power (Channel 3) : 0.7790 mW / -1.08 dBm Transmit avg optical power (Channel 4) : 0.7837 mW / -1.06 dBm Rcvr signal avg optical power(Channel 1) : 0.9302 mW / -0.31 dBm Rcvr signal avg optical power(Channel 2) : 0.9079 mW / -0.42 dBm Rcvr signal avg optical power(Channel 3) : 0.8993 mW / -0.46 dBm Rcvr signal avg optical power(Channel 4) : 0.8778 mW / -0.57 dBm The module is not in low power mode, as it is not forced by hardware (Power override is on) or by software (Power set is off). The power mode can be queried from the kernel. In case Power override was off, the kernel would need to take into account the state of the LPMode signal, which is not visible to user space. $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp13 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp13 up Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7934 mW / -1.01 dBm Transmit avg optical power (Channel 2) : 0.7859 mW / -1.05 dBm Transmit avg optical power (Channel 3) : 0.7885 mW / -1.03 dBm Transmit avg optical power (Channel 4) : 0.7985 mW / -0.98 dBm Rcvr signal avg optical power(Channel 1) : 0.9325 mW / -0.30 dBm Rcvr signal avg optical power(Channel 2) : 0.9034 mW / -0.44 dBm Rcvr signal avg optical power(Channel 3) : 0.9086 mW / -0.42 dBm Rcvr signal avg optical power(Channel 4) : 0.8885 mW / -0.51 dBm Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp13 down Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski --- Documentation/networking/ethtool-netlink.rst | 71 +++++++- include/linux/ethtool.h | 22 +++ include/uapi/linux/ethtool.h | 23 +++ include/uapi/linux/ethtool_netlink.h | 17 ++ net/ethtool/Makefile | 2 +- net/ethtool/module.c | 180 +++++++++++++++++++ net/ethtool/netlink.c | 19 ++ net/ethtool/netlink.h | 4 + 8 files changed, 335 insertions(+), 3 deletions(-) create mode 100644 net/ethtool/module.c diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index d9b55b7a1a4d3..d6fd4b2e243c9 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -41,6 +41,11 @@ In the message structure descriptions below, if an attribute name is suffixed with "+", parent nest can contain multiple attributes of the same type. This implements an array of entries. +Attributes that need to be filled-in by device drivers and that are dumped to +user space based on whether they are valid or not should not use zero as a +valid value. This avoids the need to explicitly signal the validity of the +attribute in the device driver API. + Request header ============== @@ -179,7 +184,7 @@ according to message purpose: Userspace to kernel: - ===================================== ================================ + ===================================== ================================= ``ETHTOOL_MSG_STRSET_GET`` get string set ``ETHTOOL_MSG_LINKINFO_GET`` get link settings ``ETHTOOL_MSG_LINKINFO_SET`` set link settings @@ -213,7 +218,9 @@ Userspace to kernel: ``ETHTOOL_MSG_MODULE_EEPROM_GET`` read SFP module EEPROM ``ETHTOOL_MSG_STATS_GET`` get standard statistics ``ETHTOOL_MSG_PHC_VCLOCKS_GET`` get PHC virtual clocks info - ===================================== ================================ + ``ETHTOOL_MSG_MODULE_SET`` set transceiver module parameters + ``ETHTOOL_MSG_MODULE_GET`` get transceiver module parameters + ===================================== ================================= Kernel to userspace: @@ -252,6 +259,7 @@ Kernel to userspace: ``ETHTOOL_MSG_MODULE_EEPROM_GET_REPLY`` read SFP module EEPROM ``ETHTOOL_MSG_STATS_GET_REPLY`` standard statistics ``ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY`` PHC virtual clocks info + ``ETHTOOL_MSG_MODULE_GET_REPLY`` transceiver module parameters ======================================== ================================= ``GET`` requests are sent by userspace applications to retrieve device @@ -1521,6 +1529,63 @@ Kernel response contents: ``ETHTOOL_A_PHC_VCLOCKS_INDEX`` s32 PHC index array ==================================== ====== ========================== +MODULE_GET +========== + +Gets transceiver module parameters. + +Request contents: + + ===================================== ====== ========================== + ``ETHTOOL_A_MODULE_HEADER`` nested request header + ===================================== ====== ========================== + +Kernel response contents: + + ====================================== ====== ========================== + ``ETHTOOL_A_MODULE_HEADER`` nested reply header + ``ETHTOOL_A_MODULE_POWER_MODE_POLICY`` u8 power mode policy + ``ETHTOOL_A_MODULE_POWER_MODE`` u8 operational power mode + ====================================== ====== ========================== + +The optional ``ETHTOOL_A_MODULE_POWER_MODE_POLICY`` attribute encodes the +transceiver module power mode policy enforced by the host. The default policy +is driver-dependent, but "auto" is the recommended default and it should be +implemented by new drivers and drivers where conformance to a legacy behavior +is not critical. + +The optional ``ETHTHOOL_A_MODULE_POWER_MODE`` attribute encodes the operational +power mode policy of the transceiver module. It is only reported when a module +is plugged-in. Possible values are: + +.. kernel-doc:: include/uapi/linux/ethtool.h + :identifiers: ethtool_module_power_mode + +MODULE_SET +========== + +Sets transceiver module parameters. + +Request contents: + + ====================================== ====== ========================== + ``ETHTOOL_A_MODULE_HEADER`` nested request header + ``ETHTOOL_A_MODULE_POWER_MODE_POLICY`` u8 power mode policy + ====================================== ====== ========================== + +When set, the optional ``ETHTOOL_A_MODULE_POWER_MODE_POLICY`` attribute is used +to set the transceiver module power policy enforced by the host. Possible +values are: + +.. kernel-doc:: include/uapi/linux/ethtool.h + :identifiers: ethtool_module_power_mode_policy + +For SFF-8636 modules, low power mode is forced by the host according to table +6-10 in revision 2.10a of the specification. + +For CMIS modules, low power mode is forced by the host according to table 6-12 +in revision 5.0 of the specification. + Request translation =================== @@ -1620,4 +1685,6 @@ are netlink only. n/a ``ETHTOOL_MSG_CABLE_TEST_TDR_ACT`` n/a ``ETHTOOL_MSG_TUNNEL_INFO_GET`` n/a ``ETHTOOL_MSG_PHC_VCLOCKS_GET`` + n/a ``ETHTOOL_MSG_MODULE_GET`` + n/a ``ETHTOOL_MSG_MODULE_SET`` =================================== ===================================== diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 849524b55d89a..9adf8d2c3144b 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -415,6 +415,17 @@ struct ethtool_module_eeprom { u8 *data; }; +/** + * struct ethtool_module_power_mode_params - module power mode parameters + * @policy: The power mode policy enforced by the host for the plug-in module. + * @mode: The operational power mode of the plug-in module. Should be filled by + * device drivers on get operations. + */ +struct ethtool_module_power_mode_params { + enum ethtool_module_power_mode_policy policy; + enum ethtool_module_power_mode mode; +}; + /** * struct ethtool_ops - optional netdev operations * @cap_link_lanes_supported: indicates if the driver supports lanes @@ -580,6 +591,11 @@ struct ethtool_module_eeprom { * @get_eth_ctrl_stats: Query some of the IEEE 802.3 MAC Ctrl statistics. * @get_rmon_stats: Query some of the RMON (RFC 2819) statistics. * Set %ranges to a pointer to zero-terminated array of byte ranges. + * @get_module_power_mode: Get the power mode policy for the plug-in module + * used by the network device and its operational power mode, if + * plugged-in. + * @set_module_power_mode: Set the power mode policy for the plug-in module + * used by the network device. * * All operations are optional (i.e. the function pointer may be set * to %NULL) and callers must take this into account. Callers must @@ -705,6 +721,12 @@ struct ethtool_ops { void (*get_rmon_stats)(struct net_device *dev, struct ethtool_rmon_stats *rmon_stats, const struct ethtool_rmon_hist_range **ranges); + int (*get_module_power_mode)(struct net_device *dev, + struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack); + int (*set_module_power_mode)(struct net_device *dev, + const struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack); }; int ethtool_check_ops(const struct ethtool_ops *ops); diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index b6db6590baf0a..6de61d53ca5d0 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -706,6 +706,29 @@ enum ethtool_stringset { ETH_SS_COUNT }; +/** + * enum ethtool_module_power_mode_policy - plug-in module power mode policy + * @ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH: Module is always in high power mode. + * @ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO: Module is transitioned by the host + * to high power mode when the first port using it is put administratively + * up and to low power mode when the last port using it is put + * administratively down. + */ +enum ethtool_module_power_mode_policy { + ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH = 1, + ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO, +}; + +/** + * enum ethtool_module_power_mode - plug-in module power mode + * @ETHTOOL_MODULE_POWER_MODE_LOW: Module is in low power mode. + * @ETHTOOL_MODULE_POWER_MODE_HIGH: Module is in high power mode. + */ +enum ethtool_module_power_mode { + ETHTOOL_MODULE_POWER_MODE_LOW = 1, + ETHTOOL_MODULE_POWER_MODE_HIGH, +}; + /** * struct ethtool_gstrings - string set for data tagging * @cmd: Command number = %ETHTOOL_GSTRINGS diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 5545f1ca9237c..ca5fbb59fa421 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -47,6 +47,8 @@ enum { ETHTOOL_MSG_MODULE_EEPROM_GET, ETHTOOL_MSG_STATS_GET, ETHTOOL_MSG_PHC_VCLOCKS_GET, + ETHTOOL_MSG_MODULE_GET, + ETHTOOL_MSG_MODULE_SET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -90,6 +92,8 @@ enum { ETHTOOL_MSG_MODULE_EEPROM_GET_REPLY, ETHTOOL_MSG_STATS_GET_REPLY, ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY, + ETHTOOL_MSG_MODULE_GET_REPLY, + ETHTOOL_MSG_MODULE_NTF, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -833,6 +837,19 @@ enum { ETHTOOL_A_STATS_RMON_MAX = (__ETHTOOL_A_STATS_RMON_CNT - 1) }; +/* MODULE */ + +enum { + ETHTOOL_A_MODULE_UNSPEC, + ETHTOOL_A_MODULE_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_MODULE_POWER_MODE_POLICY, /* u8 */ + ETHTOOL_A_MODULE_POWER_MODE, /* u8 */ + + /* add new constants above here */ + __ETHTOOL_A_MODULE_CNT, + ETHTOOL_A_MODULE_MAX = (__ETHTOOL_A_MODULE_CNT - 1) +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile index 0a19470efbfb9..b76432e70e6ba 100644 --- a/net/ethtool/Makefile +++ b/net/ethtool/Makefile @@ -7,4 +7,4 @@ obj-$(CONFIG_ETHTOOL_NETLINK) += ethtool_nl.o ethtool_nl-y := netlink.o bitset.o strset.o linkinfo.o linkmodes.o \ linkstate.o debug.o wol.o features.o privflags.o rings.o \ channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \ - tunnels.o fec.o eeprom.o stats.o phc_vclocks.o + tunnels.o fec.o eeprom.o stats.o phc_vclocks.o module.o diff --git a/net/ethtool/module.c b/net/ethtool/module.c new file mode 100644 index 0000000000000..bc2cef11bbdad --- /dev/null +++ b/net/ethtool/module.c @@ -0,0 +1,180 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include + +#include "netlink.h" +#include "common.h" +#include "bitset.h" + +struct module_req_info { + struct ethnl_req_info base; +}; + +struct module_reply_data { + struct ethnl_reply_data base; + struct ethtool_module_power_mode_params power; +}; + +#define MODULE_REPDATA(__reply_base) \ + container_of(__reply_base, struct module_reply_data, base) + +/* MODULE_GET */ + +const struct nla_policy ethnl_module_get_policy[ETHTOOL_A_MODULE_HEADER + 1] = { + [ETHTOOL_A_MODULE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy), +}; + +static int module_get_power_mode(struct net_device *dev, + struct module_reply_data *data, + struct netlink_ext_ack *extack) +{ + const struct ethtool_ops *ops = dev->ethtool_ops; + + if (!ops->get_module_power_mode) + return 0; + + return ops->get_module_power_mode(dev, &data->power, extack); +} + +static int module_prepare_data(const struct ethnl_req_info *req_base, + struct ethnl_reply_data *reply_base, + struct genl_info *info) +{ + struct module_reply_data *data = MODULE_REPDATA(reply_base); + struct netlink_ext_ack *extack = info ? info->extack : NULL; + struct net_device *dev = reply_base->dev; + int ret; + + ret = ethnl_ops_begin(dev); + if (ret < 0) + return ret; + + ret = module_get_power_mode(dev, data, extack); + if (ret < 0) + goto out_complete; + +out_complete: + ethnl_ops_complete(dev); + return ret; +} + +static int module_reply_size(const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + struct module_reply_data *data = MODULE_REPDATA(reply_base); + int len = 0; + + if (data->power.policy) + len += nla_total_size(sizeof(u8)); /* _MODULE_POWER_MODE_POLICY */ + + if (data->power.mode) + len += nla_total_size(sizeof(u8)); /* _MODULE_POWER_MODE */ + + return len; +} + +static int module_fill_reply(struct sk_buff *skb, + const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + const struct module_reply_data *data = MODULE_REPDATA(reply_base); + + if (data->power.policy && + nla_put_u8(skb, ETHTOOL_A_MODULE_POWER_MODE_POLICY, + data->power.policy)) + return -EMSGSIZE; + + if (data->power.mode && + nla_put_u8(skb, ETHTOOL_A_MODULE_POWER_MODE, data->power.mode)) + return -EMSGSIZE; + + return 0; +} + +const struct ethnl_request_ops ethnl_module_request_ops = { + .request_cmd = ETHTOOL_MSG_MODULE_GET, + .reply_cmd = ETHTOOL_MSG_MODULE_GET_REPLY, + .hdr_attr = ETHTOOL_A_MODULE_HEADER, + .req_info_size = sizeof(struct module_req_info), + .reply_data_size = sizeof(struct module_reply_data), + + .prepare_data = module_prepare_data, + .reply_size = module_reply_size, + .fill_reply = module_fill_reply, +}; + +/* MODULE_SET */ + +const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MODE_POLICY + 1] = { + [ETHTOOL_A_MODULE_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy), + [ETHTOOL_A_MODULE_POWER_MODE_POLICY] = + NLA_POLICY_RANGE(NLA_U8, ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH, + ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO), +}; + +static int module_set_power_mode(struct net_device *dev, struct nlattr **tb, + bool *p_mod, struct netlink_ext_ack *extack) +{ + struct ethtool_module_power_mode_params power = {}; + struct ethtool_module_power_mode_params power_new; + const struct ethtool_ops *ops = dev->ethtool_ops; + int ret; + + if (!tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY]) + return 0; + + if (!ops->get_module_power_mode || !ops->set_module_power_mode) { + NL_SET_ERR_MSG_ATTR(extack, + tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY], + "Setting power mode policy is not supported by this device"); + return -EOPNOTSUPP; + } + + power_new.policy = nla_get_u8(tb[ETHTOOL_A_MODULE_POWER_MODE_POLICY]); + ret = ops->get_module_power_mode(dev, &power, extack); + if (ret < 0) + return ret; + + if (power_new.policy == power.policy) + return 0; + *p_mod = true; + + return ops->set_module_power_mode(dev, &power_new, extack); +} + +int ethnl_set_module(struct sk_buff *skb, struct genl_info *info) +{ + struct ethnl_req_info req_info = {}; + struct nlattr **tb = info->attrs; + struct net_device *dev; + bool mod = false; + int ret; + + ret = ethnl_parse_header_dev_get(&req_info, tb[ETHTOOL_A_MODULE_HEADER], + genl_info_net(info), info->extack, + true); + if (ret < 0) + return ret; + dev = req_info.dev; + + rtnl_lock(); + ret = ethnl_ops_begin(dev); + if (ret < 0) + goto out_rtnl; + + ret = module_set_power_mode(dev, tb, &mod, info->extack); + if (ret < 0) + goto out_ops; + + if (!mod) + goto out_ops; + + ethtool_notify(dev, ETHTOOL_MSG_MODULE_NTF, NULL); + +out_ops: + ethnl_ops_complete(dev); +out_rtnl: + rtnl_unlock(); + dev_put(dev); + return ret; +} diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c index 1797a0a900195..38b44c0291b11 100644 --- a/net/ethtool/netlink.c +++ b/net/ethtool/netlink.c @@ -282,6 +282,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = { [ETHTOOL_MSG_MODULE_EEPROM_GET] = ðnl_module_eeprom_request_ops, [ETHTOOL_MSG_STATS_GET] = ðnl_stats_request_ops, [ETHTOOL_MSG_PHC_VCLOCKS_GET] = ðnl_phc_vclocks_request_ops, + [ETHTOOL_MSG_MODULE_GET] = ðnl_module_request_ops, }; static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb) @@ -593,6 +594,7 @@ ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = { [ETHTOOL_MSG_PAUSE_NTF] = ðnl_pause_request_ops, [ETHTOOL_MSG_EEE_NTF] = ðnl_eee_request_ops, [ETHTOOL_MSG_FEC_NTF] = ðnl_fec_request_ops, + [ETHTOOL_MSG_MODULE_NTF] = ðnl_module_request_ops, }; /* default notification handler */ @@ -686,6 +688,7 @@ static const ethnl_notify_handler_t ethnl_notify_handlers[] = { [ETHTOOL_MSG_PAUSE_NTF] = ethnl_default_notify, [ETHTOOL_MSG_EEE_NTF] = ethnl_default_notify, [ETHTOOL_MSG_FEC_NTF] = ethnl_default_notify, + [ETHTOOL_MSG_MODULE_NTF] = ethnl_default_notify, }; void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data) @@ -999,6 +1002,22 @@ static const struct genl_ops ethtool_genl_ops[] = { .policy = ethnl_phc_vclocks_get_policy, .maxattr = ARRAY_SIZE(ethnl_phc_vclocks_get_policy) - 1, }, + { + .cmd = ETHTOOL_MSG_MODULE_GET, + .doit = ethnl_default_doit, + .start = ethnl_default_start, + .dumpit = ethnl_default_dumpit, + .done = ethnl_default_done, + .policy = ethnl_module_get_policy, + .maxattr = ARRAY_SIZE(ethnl_module_get_policy) - 1, + }, + { + .cmd = ETHTOOL_MSG_MODULE_SET, + .flags = GENL_UNS_ADMIN_PERM, + .doit = ethnl_set_module, + .policy = ethnl_module_set_policy, + .maxattr = ARRAY_SIZE(ethnl_module_set_policy) - 1, + }, }; static const struct genl_multicast_group ethtool_nl_mcgrps[] = { diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h index e8987e28036fe..836ee71578487 100644 --- a/net/ethtool/netlink.h +++ b/net/ethtool/netlink.h @@ -337,6 +337,7 @@ extern const struct ethnl_request_ops ethnl_fec_request_ops; extern const struct ethnl_request_ops ethnl_module_eeprom_request_ops; extern const struct ethnl_request_ops ethnl_stats_request_ops; extern const struct ethnl_request_ops ethnl_phc_vclocks_request_ops; +extern const struct ethnl_request_ops ethnl_module_request_ops; extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1]; extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1]; @@ -373,6 +374,8 @@ extern const struct nla_policy ethnl_fec_set_policy[ETHTOOL_A_FEC_AUTO + 1]; extern const struct nla_policy ethnl_module_eeprom_get_policy[ETHTOOL_A_MODULE_EEPROM_I2C_ADDRESS + 1]; extern const struct nla_policy ethnl_stats_get_policy[ETHTOOL_A_STATS_GROUPS + 1]; extern const struct nla_policy ethnl_phc_vclocks_get_policy[ETHTOOL_A_PHC_VCLOCKS_HEADER + 1]; +extern const struct nla_policy ethnl_module_get_policy[ETHTOOL_A_MODULE_HEADER + 1]; +extern const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MODE_POLICY + 1]; int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info); int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info); @@ -391,6 +394,7 @@ int ethnl_tunnel_info_doit(struct sk_buff *skb, struct genl_info *info); int ethnl_tunnel_info_start(struct netlink_callback *cb); int ethnl_tunnel_info_dumpit(struct sk_buff *skb, struct netlink_callback *cb); int ethnl_set_fec(struct sk_buff *skb, struct genl_info *info); +int ethnl_set_module(struct sk_buff *skb, struct genl_info *info); extern const char stats_std_names[__ETHTOOL_STATS_CNT][ETH_GSTRING_LEN]; extern const char stats_eth_phy_names[__ETHTOOL_A_STATS_ETH_PHY_CNT][ETH_GSTRING_LEN]; From f10ba086f7e30904ea7cafe7ba922bafc65ad9ad Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Wed, 6 Oct 2021 13:46:43 +0300 Subject: [PATCH 2/6] mlxsw: reg: Add Port Module Memory Map Properties register Add the Port Module Memory Map Properties register. It will be used to set the power mode of a module in subsequent patches. Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlxsw/reg.h | 50 +++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h index c5fad3c94fac2..bff05a0a2f7ab 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/reg.h +++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h @@ -5946,6 +5946,55 @@ static inline void mlxsw_reg_pddr_pack(char *payload, u8 local_port, mlxsw_reg_pddr_page_select_set(payload, page_select); } +/* PMMP - Port Module Memory Map Properties Register + * ------------------------------------------------- + * The PMMP register allows to override the module memory map advertisement. + * The register can only be set when the module is disabled by PMAOS register. + */ +#define MLXSW_REG_PMMP_ID 0x5044 +#define MLXSW_REG_PMMP_LEN 0x2C + +MLXSW_REG_DEFINE(pmmp, MLXSW_REG_PMMP_ID, MLXSW_REG_PMMP_LEN); + +/* reg_pmmp_module + * Module number. + * Access: Index + */ +MLXSW_ITEM32(reg, pmmp, module, 0x00, 16, 8); + +/* reg_pmmp_sticky + * When set, will keep eeprom_override values after plug-out event. + * Access: OP + */ +MLXSW_ITEM32(reg, pmmp, sticky, 0x00, 0, 1); + +/* reg_pmmp_eeprom_override_mask + * Write mask bit (negative polarity). + * 0 - Allow write + * 1 - Ignore write + * On write, indicates which of the bits from eeprom_override field are + * updated. + * Access: WO + */ +MLXSW_ITEM32(reg, pmmp, eeprom_override_mask, 0x04, 16, 16); + +enum { + /* Set module to low power mode */ + MLXSW_REG_PMMP_EEPROM_OVERRIDE_LOW_POWER_MASK = BIT(8), +}; + +/* reg_pmmp_eeprom_override + * Override / ignore EEPROM advertisement properties bitmask + * Access: RW + */ +MLXSW_ITEM32(reg, pmmp, eeprom_override, 0x04, 0, 16); + +static inline void mlxsw_reg_pmmp_pack(char *payload, u8 module) +{ + MLXSW_REG_ZERO(pmmp, payload); + mlxsw_reg_pmmp_module_set(payload, module); +} + /* PLLP - Port Local port to Label Port mapping Register * ----------------------------------------------------- * The PLLP register returns the mapping from Local Port into Label Port. @@ -12348,6 +12397,7 @@ static const struct mlxsw_reg_info *mlxsw_reg_infos[] = { MLXSW_REG(pmtdb), MLXSW_REG(pmpe), MLXSW_REG(pddr), + MLXSW_REG(pmmp), MLXSW_REG(pllp), MLXSW_REG(htgt), MLXSW_REG(hpkt), From fc53f5fb8037d3d29a90c3779d961982ce99e380 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Wed, 6 Oct 2021 13:46:44 +0300 Subject: [PATCH 3/6] mlxsw: reg: Add Management Cable IO and Notifications register Add the Management Cable IO and Notifications register. It will be used to retrieve the power mode status of a module in subsequent patches and whether a module is present in a cage or not. Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlxsw/reg.h | 34 +++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h index bff05a0a2f7ab..ed6c3356e4ebe 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/reg.h +++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h @@ -10402,6 +10402,39 @@ static inline void mlxsw_reg_mlcr_pack(char *payload, u8 local_port, MLXSW_REG_MLCR_DURATION_MAX : 0); } +/* MCION - Management Cable IO and Notifications Register + * ------------------------------------------------------ + * The MCION register is used to query transceiver modules' IO pins and other + * notifications. + */ +#define MLXSW_REG_MCION_ID 0x9052 +#define MLXSW_REG_MCION_LEN 0x18 + +MLXSW_REG_DEFINE(mcion, MLXSW_REG_MCION_ID, MLXSW_REG_MCION_LEN); + +/* reg_mcion_module + * Module number. + * Access: Index + */ +MLXSW_ITEM32(reg, mcion, module, 0x00, 16, 8); + +enum { + MLXSW_REG_MCION_MODULE_STATUS_BITS_PRESENT_MASK = BIT(0), + MLXSW_REG_MCION_MODULE_STATUS_BITS_LOW_POWER_MASK = BIT(8), +}; + +/* reg_mcion_module_status_bits + * Module IO status as defined by SFF. + * Access: RO + */ +MLXSW_ITEM32(reg, mcion, module_status_bits, 0x04, 0, 16); + +static inline void mlxsw_reg_mcion_pack(char *payload, u8 module) +{ + MLXSW_REG_ZERO(mcion, payload); + mlxsw_reg_mcion_module_set(payload, module); +} + /* MTPPS - Management Pulse Per Second Register * -------------------------------------------- * This register provides the device PPS capabilities, configure the PPS in and @@ -12446,6 +12479,7 @@ static const struct mlxsw_reg_info *mlxsw_reg_infos[] = { MLXSW_REG(mgir), MLXSW_REG(mrsr), MLXSW_REG(mlcr), + MLXSW_REG(mcion), MLXSW_REG(mtpps), MLXSW_REG(mtutc), MLXSW_REG(mpsc), From 0455dc50bccab9286662f847f560cde6a648802d Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Wed, 6 Oct 2021 13:46:45 +0300 Subject: [PATCH 4/6] mlxsw: Add ability to control transceiver modules' power mode Implement support for ethtool_ops::.get_module_power_mode and ethtool_ops::set_module_power_mode. The get operation is implemented using the Management Cable IO and Notifications (MCION) register that reports the operational power mode of the module and its presence. In case a module is not present, its operational power mode is not reported to ethtool and user space. If not set before, the power mode policy is reported as "high", which is the default on Mellanox systems. The set operation is implemented using the Port Module Memory Map Properties (PMMP) register. The register instructs the device's firmware to transition a plugged-in module to / out of low power mode by writing to its memory map. When the power mode policy is set to 'auto', a module will not transition to low power mode as long as any ports using it are administratively up. Example: # devlink port split swp11 count 4 # ethtool --set-module swp11s0 power-mode-policy auto $ ethtool --show-module swp11s0 Module parameters for swp11s0: power-mode-policy auto power-mode low # ip link set dev swp11s0 up # ip link set dev swp11s1 up $ ethtool --show-module swp11s0 Module parameters for swp11s0: power-mode-policy auto power-mode high # ip link set dev swp11s1 down $ ethtool --show-module swp11s0 Module parameters for swp11s0: power-mode-policy auto power-mode high # ip link set dev swp11s0 down $ ethtool --show-module swp11s0 Module parameters for swp11s0: power-mode-policy auto power-mode low Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski --- .../net/ethernet/mellanox/mlxsw/core_env.c | 193 +++++++++++++++++- .../net/ethernet/mellanox/mlxsw/core_env.h | 10 + drivers/net/ethernet/mellanox/mlxsw/minimal.c | 26 +++ .../mellanox/mlxsw/spectrum_ethtool.c | 28 +++ 4 files changed, 254 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_env.c b/drivers/net/ethernet/mellanox/mlxsw/core_env.c index 9e367174743d9..6dd4ae2f45f41 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/core_env.c +++ b/drivers/net/ethernet/mellanox/mlxsw/core_env.c @@ -17,6 +17,7 @@ struct mlxsw_env_module_info { bool is_overheat; int num_ports_mapped; int num_ports_up; + enum ethtool_module_power_mode_policy power_mode_policy; }; struct mlxsw_env { @@ -445,6 +446,152 @@ int mlxsw_env_reset_module(struct net_device *netdev, } EXPORT_SYMBOL(mlxsw_env_reset_module); +int +mlxsw_env_get_module_power_mode(struct mlxsw_core *mlxsw_core, u8 module, + struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack) +{ + struct mlxsw_env *mlxsw_env = mlxsw_core_env(mlxsw_core); + char mcion_pl[MLXSW_REG_MCION_LEN]; + u32 status_bits; + int err; + + if (WARN_ON_ONCE(module >= mlxsw_env->module_count)) + return -EINVAL; + + mutex_lock(&mlxsw_env->module_info_lock); + + params->policy = mlxsw_env->module_info[module].power_mode_policy; + + mlxsw_reg_mcion_pack(mcion_pl, module); + err = mlxsw_reg_query(mlxsw_core, MLXSW_REG(mcion), mcion_pl); + if (err) { + NL_SET_ERR_MSG_MOD(extack, "Failed to retrieve module's power mode"); + goto out; + } + + status_bits = mlxsw_reg_mcion_module_status_bits_get(mcion_pl); + if (!(status_bits & MLXSW_REG_MCION_MODULE_STATUS_BITS_PRESENT_MASK)) + goto out; + + if (status_bits & MLXSW_REG_MCION_MODULE_STATUS_BITS_LOW_POWER_MASK) + params->mode = ETHTOOL_MODULE_POWER_MODE_LOW; + else + params->mode = ETHTOOL_MODULE_POWER_MODE_HIGH; + +out: + mutex_unlock(&mlxsw_env->module_info_lock); + return err; +} +EXPORT_SYMBOL(mlxsw_env_get_module_power_mode); + +static int mlxsw_env_module_enable_set(struct mlxsw_core *mlxsw_core, + u8 module, bool enable) +{ + enum mlxsw_reg_pmaos_admin_status admin_status; + char pmaos_pl[MLXSW_REG_PMAOS_LEN]; + + mlxsw_reg_pmaos_pack(pmaos_pl, module); + admin_status = enable ? MLXSW_REG_PMAOS_ADMIN_STATUS_ENABLED : + MLXSW_REG_PMAOS_ADMIN_STATUS_DISABLED; + mlxsw_reg_pmaos_admin_status_set(pmaos_pl, admin_status); + mlxsw_reg_pmaos_ase_set(pmaos_pl, true); + + return mlxsw_reg_write(mlxsw_core, MLXSW_REG(pmaos), pmaos_pl); +} + +static int mlxsw_env_module_low_power_set(struct mlxsw_core *mlxsw_core, + u8 module, bool low_power) +{ + u16 eeprom_override_mask, eeprom_override; + char pmmp_pl[MLXSW_REG_PMMP_LEN]; + + mlxsw_reg_pmmp_pack(pmmp_pl, module); + mlxsw_reg_pmmp_sticky_set(pmmp_pl, true); + /* Mask all the bits except low power mode. */ + eeprom_override_mask = ~MLXSW_REG_PMMP_EEPROM_OVERRIDE_LOW_POWER_MASK; + mlxsw_reg_pmmp_eeprom_override_mask_set(pmmp_pl, eeprom_override_mask); + eeprom_override = low_power ? MLXSW_REG_PMMP_EEPROM_OVERRIDE_LOW_POWER_MASK : + 0; + mlxsw_reg_pmmp_eeprom_override_set(pmmp_pl, eeprom_override); + + return mlxsw_reg_write(mlxsw_core, MLXSW_REG(pmmp), pmmp_pl); +} + +static int __mlxsw_env_set_module_power_mode(struct mlxsw_core *mlxsw_core, + u8 module, bool low_power, + struct netlink_ext_ack *extack) +{ + int err; + + err = mlxsw_env_module_enable_set(mlxsw_core, module, false); + if (err) { + NL_SET_ERR_MSG_MOD(extack, "Failed to disable module"); + return err; + } + + err = mlxsw_env_module_low_power_set(mlxsw_core, module, low_power); + if (err) { + NL_SET_ERR_MSG_MOD(extack, "Failed to set module's power mode"); + goto err_module_low_power_set; + } + + err = mlxsw_env_module_enable_set(mlxsw_core, module, true); + if (err) { + NL_SET_ERR_MSG_MOD(extack, "Failed to enable module"); + goto err_module_enable_set; + } + + return 0; + +err_module_enable_set: + mlxsw_env_module_low_power_set(mlxsw_core, module, !low_power); +err_module_low_power_set: + mlxsw_env_module_enable_set(mlxsw_core, module, true); + return err; +} + +int +mlxsw_env_set_module_power_mode(struct mlxsw_core *mlxsw_core, u8 module, + enum ethtool_module_power_mode_policy policy, + struct netlink_ext_ack *extack) +{ + struct mlxsw_env *mlxsw_env = mlxsw_core_env(mlxsw_core); + bool low_power; + int err = 0; + + if (WARN_ON_ONCE(module >= mlxsw_env->module_count)) + return -EINVAL; + + if (policy != ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH && + policy != ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO) { + NL_SET_ERR_MSG_MOD(extack, "Unsupported power mode policy"); + return -EOPNOTSUPP; + } + + mutex_lock(&mlxsw_env->module_info_lock); + + if (mlxsw_env->module_info[module].power_mode_policy == policy) + goto out; + + /* If any ports are up, we are already in high power mode. */ + if (mlxsw_env->module_info[module].num_ports_up) + goto out_set_policy; + + low_power = policy == ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO; + err = __mlxsw_env_set_module_power_mode(mlxsw_core, module, low_power, + extack); + if (err) + goto out; + +out_set_policy: + mlxsw_env->module_info[module].power_mode_policy = policy; +out: + mutex_unlock(&mlxsw_env->module_info_lock); + return err; +} +EXPORT_SYMBOL(mlxsw_env_set_module_power_mode); + static int mlxsw_env_module_has_temp_sensor(struct mlxsw_core *mlxsw_core, u8 module, bool *p_has_temp_sensor) @@ -794,15 +941,33 @@ EXPORT_SYMBOL(mlxsw_env_module_port_unmap); int mlxsw_env_module_port_up(struct mlxsw_core *mlxsw_core, u8 module) { struct mlxsw_env *mlxsw_env = mlxsw_core_env(mlxsw_core); + int err = 0; if (WARN_ON_ONCE(module >= mlxsw_env->module_count)) return -EINVAL; mutex_lock(&mlxsw_env->module_info_lock); + + if (mlxsw_env->module_info[module].power_mode_policy != + ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO) + goto out_inc; + + if (mlxsw_env->module_info[module].num_ports_up != 0) + goto out_inc; + + /* Transition to high power mode following first port using the module + * being put administratively up. + */ + err = __mlxsw_env_set_module_power_mode(mlxsw_core, module, false, + NULL); + if (err) + goto out_unlock; + +out_inc: mlxsw_env->module_info[module].num_ports_up++; +out_unlock: mutex_unlock(&mlxsw_env->module_info_lock); - - return 0; + return err; } EXPORT_SYMBOL(mlxsw_env_module_port_up); @@ -814,7 +979,22 @@ void mlxsw_env_module_port_down(struct mlxsw_core *mlxsw_core, u8 module) return; mutex_lock(&mlxsw_env->module_info_lock); + mlxsw_env->module_info[module].num_ports_up--; + + if (mlxsw_env->module_info[module].power_mode_policy != + ETHTOOL_MODULE_POWER_MODE_POLICY_AUTO) + goto out_unlock; + + if (mlxsw_env->module_info[module].num_ports_up != 0) + goto out_unlock; + + /* Transition to low power mode following last port using the module + * being put administratively down. + */ + __mlxsw_env_set_module_power_mode(mlxsw_core, module, true, NULL); + +out_unlock: mutex_unlock(&mlxsw_env->module_info_lock); } EXPORT_SYMBOL(mlxsw_env_module_port_down); @@ -824,7 +1004,7 @@ int mlxsw_env_init(struct mlxsw_core *mlxsw_core, struct mlxsw_env **p_env) char mgpir_pl[MLXSW_REG_MGPIR_LEN]; struct mlxsw_env *env; u8 module_count; - int err; + int i, err; mlxsw_reg_mgpir_pack(mgpir_pl); err = mlxsw_reg_query(mlxsw_core, MLXSW_REG(mgpir), mgpir_pl); @@ -837,6 +1017,13 @@ int mlxsw_env_init(struct mlxsw_core *mlxsw_core, struct mlxsw_env **p_env) if (!env) return -ENOMEM; + /* Firmware defaults to high power mode policy where modules are + * transitioned to high power mode following plug-in. + */ + for (i = 0; i < module_count; i++) + env->module_info[i].power_mode_policy = + ETHTOOL_MODULE_POWER_MODE_POLICY_HIGH; + mutex_init(&env->module_info_lock); env->core = mlxsw_core; env->module_count = module_count; diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_env.h b/drivers/net/ethernet/mellanox/mlxsw/core_env.h index c486397f5dfec..da121b1a84b4b 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/core_env.h +++ b/drivers/net/ethernet/mellanox/mlxsw/core_env.h @@ -28,6 +28,16 @@ int mlxsw_env_reset_module(struct net_device *netdev, struct mlxsw_core *mlxsw_core, u8 module, u32 *flags); +int +mlxsw_env_get_module_power_mode(struct mlxsw_core *mlxsw_core, u8 module, + struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack); + +int +mlxsw_env_set_module_power_mode(struct mlxsw_core *mlxsw_core, u8 module, + enum ethtool_module_power_mode_policy policy, + struct netlink_ext_ack *extack); + int mlxsw_env_module_overheat_counter_get(struct mlxsw_core *mlxsw_core, u8 module, u64 *p_counter); diff --git a/drivers/net/ethernet/mellanox/mlxsw/minimal.c b/drivers/net/ethernet/mellanox/mlxsw/minimal.c index 9644e9c486b80..e0892f259adf4 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/minimal.c +++ b/drivers/net/ethernet/mellanox/mlxsw/minimal.c @@ -145,12 +145,38 @@ static int mlxsw_m_reset(struct net_device *netdev, u32 *flags) flags); } +static int +mlxsw_m_get_module_power_mode(struct net_device *netdev, + struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack) +{ + struct mlxsw_m_port *mlxsw_m_port = netdev_priv(netdev); + struct mlxsw_core *core = mlxsw_m_port->mlxsw_m->core; + + return mlxsw_env_get_module_power_mode(core, mlxsw_m_port->module, + params, extack); +} + +static int +mlxsw_m_set_module_power_mode(struct net_device *netdev, + const struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack) +{ + struct mlxsw_m_port *mlxsw_m_port = netdev_priv(netdev); + struct mlxsw_core *core = mlxsw_m_port->mlxsw_m->core; + + return mlxsw_env_set_module_power_mode(core, mlxsw_m_port->module, + params->policy, extack); +} + static const struct ethtool_ops mlxsw_m_port_ethtool_ops = { .get_drvinfo = mlxsw_m_module_get_drvinfo, .get_module_info = mlxsw_m_get_module_info, .get_module_eeprom = mlxsw_m_get_module_eeprom, .get_module_eeprom_by_page = mlxsw_m_get_module_eeprom_by_page, .reset = mlxsw_m_reset, + .get_module_power_mode = mlxsw_m_get_module_power_mode, + .set_module_power_mode = mlxsw_m_set_module_power_mode, }; static int diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c index 06f1645561c69..7329bc84a8eee 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c @@ -1206,6 +1206,32 @@ static int mlxsw_sp_reset(struct net_device *dev, u32 *flags) return mlxsw_env_reset_module(dev, mlxsw_sp->core, module, flags); } +static int +mlxsw_sp_get_module_power_mode(struct net_device *dev, + struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack) +{ + struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev); + struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp; + u8 module = mlxsw_sp_port->mapping.module; + + return mlxsw_env_get_module_power_mode(mlxsw_sp->core, module, params, + extack); +} + +static int +mlxsw_sp_set_module_power_mode(struct net_device *dev, + const struct ethtool_module_power_mode_params *params, + struct netlink_ext_ack *extack) +{ + struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev); + struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp; + u8 module = mlxsw_sp_port->mapping.module; + + return mlxsw_env_set_module_power_mode(mlxsw_sp->core, module, + params->policy, extack); +} + const struct ethtool_ops mlxsw_sp_port_ethtool_ops = { .cap_link_lanes_supported = true, .get_drvinfo = mlxsw_sp_port_get_drvinfo, @@ -1228,6 +1254,8 @@ const struct ethtool_ops mlxsw_sp_port_ethtool_ops = { .get_eth_ctrl_stats = mlxsw_sp_get_eth_ctrl_stats, .get_rmon_stats = mlxsw_sp_get_rmon_stats, .reset = mlxsw_sp_reset, + .get_module_power_mode = mlxsw_sp_get_module_power_mode, + .set_module_power_mode = mlxsw_sp_set_module_power_mode, }; struct mlxsw_sp1_port_link_mode { From 3dfb51126064b594470b9c0b278188fbc9194709 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Wed, 6 Oct 2021 13:46:46 +0300 Subject: [PATCH 5/6] ethtool: Add transceiver module extended state Add an extended state and sub-state to describe link issues related to transceiver modules. The 'ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY' extended sub-state tells user space that port is unable to gain a carrier because the CMIS Module State Machine did not reach the ModuleReady (Fully Operational) state. For example, if the module is stuck at ModuleLowPwr or ModuleFault state. In case of the latter, user space can read the fault reason from the module's EEPROM and potentially reset it. Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski --- Documentation/networking/ethtool-netlink.rst | 10 ++++++++++ include/linux/ethtool.h | 1 + include/uapi/linux/ethtool.h | 6 ++++++ 3 files changed, 17 insertions(+) diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index d6fd4b2e243c9..7b598c7e39129 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -528,6 +528,8 @@ Link extended states: power required from cable or module ``ETHTOOL_LINK_EXT_STATE_OVERHEAT`` The module is overheated + + ``ETHTOOL_LINK_EXT_STATE_MODULE`` Transceiver module issue ================================================ ============================================ Link extended substates: @@ -621,6 +623,14 @@ Link extended substates: ``ETHTOOL_LINK_EXT_SUBSTATE_CI_CABLE_TEST_FAILURE`` Cable test failure =================================================== ============================================ + Transceiver module issue substates: + + =================================================== ============================================ + ``ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY`` The CMIS Module State Machine did not reach + the ModuleReady state. For example, if the + module is stuck at ModuleFault state + =================================================== ============================================ + DEBUG_GET ========= diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 9adf8d2c3144b..845a0ffc16ee8 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -94,6 +94,7 @@ struct ethtool_link_ext_state_info { enum ethtool_link_ext_substate_link_logical_mismatch link_logical_mismatch; enum ethtool_link_ext_substate_bad_signal_integrity bad_signal_integrity; enum ethtool_link_ext_substate_cable_issue cable_issue; + enum ethtool_link_ext_substate_module module; u8 __link_ext_substate; }; }; diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index 6de61d53ca5d0..a2223b6854519 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -603,6 +603,7 @@ enum ethtool_link_ext_state { ETHTOOL_LINK_EXT_STATE_CALIBRATION_FAILURE, ETHTOOL_LINK_EXT_STATE_POWER_BUDGET_EXCEEDED, ETHTOOL_LINK_EXT_STATE_OVERHEAT, + ETHTOOL_LINK_EXT_STATE_MODULE, }; /* More information in addition to ETHTOOL_LINK_EXT_STATE_AUTONEG. */ @@ -649,6 +650,11 @@ enum ethtool_link_ext_substate_cable_issue { ETHTOOL_LINK_EXT_SUBSTATE_CI_CABLE_TEST_FAILURE, }; +/* More information in addition to ETHTOOL_LINK_EXT_STATE_MODULE. */ +enum ethtool_link_ext_substate_module { + ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY = 1, +}; + #define ETH_GSTRING_LEN 32 /** From 235dbbec7d7233d5e405d993669616e600a93725 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Wed, 6 Oct 2021 13:46:47 +0300 Subject: [PATCH 6/6] mlxsw: Add support for transceiver module extended state Add support for the transceiver module extended state and sub-state added in previous patch. The extended state is meant to describe link issues related to transceiver modules. Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski --- drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c index 7329bc84a8eee..84d4460f3dcde 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c @@ -96,6 +96,9 @@ mlxsw_sp_link_ext_state_opcode_map[] = { {1032, ETHTOOL_LINK_EXT_STATE_POWER_BUDGET_EXCEEDED, 0}, {1030, ETHTOOL_LINK_EXT_STATE_OVERHEAT, 0}, + + {1042, ETHTOOL_LINK_EXT_STATE_MODULE, + ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY}, }; static void @@ -124,6 +127,10 @@ mlxsw_sp_port_set_link_ext_state(struct mlxsw_sp_ethtool_link_ext_state_opcode_m link_ext_state_info->cable_issue = link_ext_state_mapping.link_ext_substate; break; + case ETHTOOL_LINK_EXT_STATE_MODULE: + link_ext_state_info->module = + link_ext_state_mapping.link_ext_substate; + break; default: break; }