[Help] Physical machine (without IPMI) cannot boot via PXE #21611

Closed
happinesslijian opened this issue Nov 15, 2024 · 9 comments
Labels
question Further information is requested

Comments

@happinesslijian

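Background:
10.10.22.70 is the server (deployment method).
The bare-metal management service is already enabled.
The network that the server 10.10.22.70 sits on has DHCP.
Problem description:
The client physical machine (no IPMI) is on the 10.10.23.x network and was pre-registered with the following network configuration:
微信截图_20241115175728

There is no DHCP in the 10.10.23.x segment, and the client currently fails to PXE boot. The logs are as follows: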
kubectl logs -n onecloud -f default-baremetal-agent-84d9fb7c7b-lm46p
lQLPJwvQ0_BuoxnNAWLNB2Kw2YuYmTfZK7QHHW1mUz87AA_1890_354

@happinesslijian happinesslijian added the question Further information is requested label Nov 15, 2024
@zexi
Member

zexi commented Nov 18, 2024

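@happinesslijian Could you capture packets with tcpdump to check whether the baremetal-agent service replies to the DHCP request? Judging from the logs it did reply, so the client may not have received the reply.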

@happinesslijian
Author

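The capture taken with tcpdump -i br0 port 67 or port 68 -vv is in the attachment tcpdump_dhcp.txt.
The container logs obtained with kubectl logs -f default-baremetal-agent-84d9fb7c7b-h2mff -n onecloud are in the attachment podlogs.txt, and the same content keeps being printed endlessly. The client sends the requests, but it looks like baremetal-agent never replies, so the client cannot PXE boot. @zexi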

tcpdump_dhcp.txt
podlogs.txt

@zexi
Member

zexi commented Nov 20, 2024

@happinesslijian
From the logs, the DHCP requests are relayed from 10.10.22.70, which is not in the same subnet as the PXE subnet 10.10.23.x, so the DHCP relay attribute of the PXE network needs to be set to 10.10.22.70.

image
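
The subnet reasoning in this reply can be sketched as a quick check. This is only an illustration, not Cloudpods code: the function name `needs_dhcp_relay` is hypothetical, and the /24 prefix is an assumption, since the thread only gives the network as 10.10.23.x.

```python
import ipaddress


def needs_dhcp_relay(server_ip: str, pxe_network: str) -> bool:
    """Return True when the DHCP server address lies outside the PXE subnet.

    A client broadcast does not cross subnet boundaries, so in that case
    something on the PXE subnet has to relay the request to the server.
    """
    server = ipaddress.ip_address(server_ip)
    network = ipaddress.ip_network(pxe_network, strict=False)
    return server not in network


# The /24 prefix is an assumption; the thread only says "10.10.23.x".
print(needs_dhcp_relay("10.10.22.70", "10.10.23.0/24"))
# 10.10.23.5 is a made-up address inside the PXE subnet, for contrast.
print(needs_dhcp_relay("10.10.23.5", "10.10.23.0/24"))
```

With these inputs the first call reports that a relay is needed and the second that it is not.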

@happinesslijian
Author

happinesslijian commented Nov 20, 2024

@zexi
image
The dhcp_relay of the PXE network is already 10.10.22.70, but the client still cannot PXE boot, and the baremetal-agent container frequently logs the following:

[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef
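
When the log repeats like this, it can help to summarize how many relayed requests arrive per relay address and MAC. The sketch below assumes the line format shown above; the regular expression and the function name `count_relayed_requests` are illustrative and not part of baremetal-agent.

```python
import re
from collections import Counter

# Matches the "from relay <ip> packet, mac: <mac>" part of the log lines above.
RELAY_LINE = re.compile(
    r"DHCP: from relay (?P<relay>\d+\.\d+\.\d+\.\d+) packet, "
    r"mac: (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def count_relayed_requests(lines):
    """Count relayed DHCP log lines per (relay address, client MAC)."""
    counts = Counter()
    for line in lines:
        match = RELAY_LINE.search(line)
        if match:
            counts[(match["relay"], match["mac"].lower())] += 1
    return counts


# Sample input: the log line from this comment, repeated eight times.
sample = [
    "[info 2024-11-20 05:47:19 pxe.(*DHCPHandler).ServeDHCP(dhcp.go:84)]"
    "DHCP: from relay 10.10.22.70 packet, mac: 74:86:e2:1c:4e:ef"
] * 8

for (relay, mac), n in count_relayed_requests(sample).items():
    print(f"{n} packets from relay {relay} for {mac}")
```

To run it against a saved log, pass an open file instead of `sample`, for example `count_relayed_requests(open("podlogs.txt"))`.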

@zexi
Member

zexi commented Nov 20, 2024

Is 74:86:e2:1c:4e:ef the MAC of the physical machine?

@happinesslijian
Author

Is 74:86:e2:1c:4e:ef the MAC of the physical machine?

Yes, it is the physical machine's MAC address. This photo was taken in the BIOS. @zexi
lQDPKeBSFP778g3NBQDNAtCwjgl6nShNxfIHI8S5IcOgAA_720_1280

@zexi
Member

zexi commented Nov 20, 2024

@happinesslijian Then PXE boot this server again, and at the same time capture packets with tcpdump on the baremetal-agent side.

@happinesslijian
Author

@zexi
I ran kubectl logs -f default-baremetal-agent-84d9fb7c7b-wg9g2 -n onecloud > baremetal-agent.log to save the logs.
I ran tcpdump -i br0 port 67 or port 68 -vv > tcpdump.txt to capture packets.
Both commands were run on the server 10.10.22.70.
The physical machine was added as follows:
0F266D54-88AD-4a2a-9541-92637DFA4053
The tcpdump capture and the baremetal-agent logs are attached:
baremetal-agent.log
tcpdump.txt
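
When reading such a capture, the fields that matter for a relayed request are in the fixed BOOTP header defined by RFC 2131: `hops` at byte 3, `giaddr` (the relay agent address) at bytes 24 to 27, and `chaddr` (the client hardware address) from byte 28. A minimal sketch of extracting them from a raw DHCP payload follows. The function name `parse_bootp_header` is hypothetical, the sample packet is synthetic rather than taken from the attachments, and whether the "from relay" log line is derived from `giaddr` is an assumption.

```python
import ipaddress
import struct


def parse_bootp_header(payload: bytes) -> dict:
    """Extract relay-related fields from the fixed BOOTP header (RFC 2131)."""
    if len(payload) < 44:
        raise ValueError("payload too short for a BOOTP header")
    # op, htype, hlen, hops are the first four single-byte fields.
    op, _htype, hlen, hops = struct.unpack_from("!BBBB", payload, 0)
    giaddr = ipaddress.IPv4Address(payload[24:28])  # relay agent address
    chaddr = payload[28:28 + hlen]                  # client hardware address
    return {
        "op": op,
        "hops": hops,
        "giaddr": str(giaddr),
        "mac": ":".join(f"{b:02x}" for b in chaddr),
    }


# Synthetic BOOTREQUEST for illustration; the hops value is made up.
packet = bytearray(236)
packet[0] = 1  # op: BOOTREQUEST
packet[1] = 1  # htype: Ethernet
packet[2] = 6  # hlen: 6-byte MAC
packet[3] = 1  # hops: incremented by a relay
packet[24:28] = ipaddress.IPv4Address("10.10.22.70").packed
packet[28:34] = bytes.fromhex("7486e21c4eef")

print(parse_bootp_header(bytes(packet)))
```

A `giaddr` of 0.0.0.0 means the request was not relayed. A non-zero value tells the server which subnet the client is on and where to send the reply.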

@happinesslijian
Author

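Enable DHCP relay on the switch.
Do not enable the DHCP Relay of the host service.
The network configuration is shown below:
76c08564e52078dc2699e3d783b9524c
With this setup, the pre-registered machine PXE boots correctly.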
