Several CLI commands failed to connect to lotus-miner #7072

Closed
7 tasks done
William8Work opened this issue Aug 14, 2021 · 15 comments
Assignees
Labels
kind/bug Kind: Bug P1 P1: Must be resolved team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs
Milestone

Comments

William8Work commented Aug 14, 2021

Checklist

  • This is not a security-related bug/issue. If it is, please follow the security policy.
  • This is not a question or a support request. If you have any lotus related questions, please ask in the lotus forum.
  • This is not a new feature request. If it is, please file a feature request instead.
  • This is not an enhancement request. If it is, please file an improvement suggestion instead.
  • I have searched on the issue tracker and the lotus forum, and there is no existing related issue or discussion.
  • I am running the latest release, or the most recent RC (release candidate) for the upcoming release or the dev branch (master), or have an issue updating to any of these.
  • I did not make any code changes to lotus.

Lotus component

lotus miner - mining and block production

Lotus Version

Daemon:  1.11.1-rc2+mainnet+git.40449f1cc+api1.2.0
Local: lotus-miner version 1.11.1-rc2+mainnet+git.40449f1cc

Describe the Bug

After upgrading to v1.11.1-rc2, I tried to run the following CLI commands on worker nodes:

lotus-miner info
lotus-miner storage-deals list
lotus-miner sealing jobs
lotus-miner sealing workers
lotus-miner sectors list --fast

These commands run successfully on the miner node. However, on the worker nodes (separate machines) I encountered the following:

These commands failed:

lotus-miner info
lotus-miner storage-deals list

but these commands work:

lotus-miner sealing jobs
lotus-miner sealing workers
lotus-miner sectors list --fast

The worker machine has the proper MINER_API_INFO env var set up, so lotus-miner sealing jobs and the other sealing and sectors commands succeed. However, lotus-miner info and lotus-miner storage-deals list fail.

Logging Information

$ lotus-miner info
ERROR: could not get API info: repo directory does not exist. Make sure your configuration is correct

Repro Steps

  1. Run the lotus-miner info command on the lotus miner machine as well as on a worker machine.
  2. The command succeeds on the miner machine but fails on the worker machine.

@Angelo-gh3990

I can confirm the same issue on:

Daemon: 1.11.1-rc3+mainnet+git.56c35ff1e+api1.2.0
Local: lotus-miner version 1.11.1-rc3+mainnet+git.56c35ff1e

I am running the lotus daemon on a separate machine.

Commands run fine on rc1.

6enno commented Aug 15, 2021

I can confirm a similar issue on m1.3.5:

$ lm info
ERROR: malformed HTTP response "\x13/multistream/1.0.0"
$ lm sealing workers
Worker 7b055dae-02c8-40e2-83ef-6cee421802d2, host hectorb
	CPU:  [                                                                ] 0/16 core(s) in use
	RAM:  [                                                                ] 1% 5.533 GiB/377.6 GiB
	VMEM: [                                                                ] 0% 5.533 GiB/632.6 GiB
	GPU: GeForce RTX 3090, not used
Worker 84960bea-960d-48c8-b799-1bbee101f4a3, host HectorA
	CPU:  [                                                                ] 0/16 core(s) in use
	RAM:  [||||||                                                          ] 10% 12.73 GiB/125.8 GiB
	VMEM: [||                                                              ] 3% 12.73 GiB/381.8 GiB
	GPU: GeForce RTX 2080 Ti, not used
$ lm version
Daemon:  1.11.1-m1.3.5+mainnet+git.3ff8e256b+api1.2.0
Local: lotus-miner version 1.11.1-m1.3.5+mainnet+git.3ff8e256b

@dayou5168

Guys, I replied in your Slack thread. Maybe you can try the solution I suggested.

@jennijuju jennijuju added P1 P1: Must be resolved and removed need/triage labels Aug 15, 2021
@jennijuju
Member

@William8Work could you please run lotus-miner -vv info and send us any error or warning logs that appear?

@jennijuju
Member

I can confirm the same issue on:

Daemon: 1.11.1-rc3+mainnet+git.56c35ff1e+api1.2.0
Local: lotus-miner version 1.11.1-rc3+mainnet+git.56c35ff1e

I am running the lotus daemon on a separate machine.

Commands run fine on rc1.

Is it failing on the miner node or on a worker node?

@jennijuju
Member

From @William8Work

confirmed @jennijuju - PLFD. It works on miner nodes but fails on worker nodes. Also, while lotus-miner info and lotus-miner storage-deals list fail, lotus-miner sealing workers and lotus-miner sectors list succeed on the worker nodes.

@William8Work
Author

William8Work commented Aug 15, 2021

Running this on worker nodes:

$ lotus-miner -vv info

using raw API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
using miner API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
ERROR: could not get API info: repo directory does not exist. Make sure your configuration is correct

Running the same command in miner node:
$ lotus-miner --vv info

using raw API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
using miner API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
using raw API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
using markets API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
using raw API v0 endpoint: ws://10.1.18.166:1234/rpc/v0
using full node API v0 endpoint: ws://10.1.18.166:1234/rpc/v0
Enabled subsystems (from miner API): [Mining Sealing SectorStorage Markets]
Enabled subsystems (from markets API): [Mining Sealing SectorStorage Markets]
Chain: [sync ok] [basefee 136.919 pFIL]
using raw API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
using miner API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
Miner: f08399 (32 GiB sectors)
Power: 270 Ti / 9.14 Ei (0.0028%)
        Raw: 252.7 TiB / 9.134 EiB (0.0026%)
        Committed: 253 TiB
        Proving: 252.7 TiB
Projected average block win rate: 2.84/week (every 59h9m5s)
Projected block win with 99.9% probability every 408h34m31s
(projections DO NOT account for future network and miner growth)

Miner Balance:    3132.572 FIL
      PreCommit:  214.716 mFIL
      Pledge:     2333.248 FIL
      Vesting:    732.939 FIL
      Available:  66.171 FIL
Market Balance:   5.77 FIL
       Locked:    2.617 FIL
       Available: 3.153 FIL
Worker Balance:   322.543 FIL
       Control:   276.033 FIL
Total Spendable:  667.9 FIL

Sectors:
        Total: 8376
        Proving: 8110
        WaitSeed: 1
        Committing: 8
        Removed: 257

Storage Deals: 352, 6.624 TiB
      Active:  351  6.593 TiB (Verified: 130 2.251 TiB)
      Sealing: 1    32 GiB    (Verified: 1   32 GiB)

Retrieval Deals (complete): 9, 170 GiB
$ lotus-miner version
Daemon:  1.11.1-rc2+mainnet+git.40449f1cc+api1.2.0
Local: lotus-miner version 1.11.1-rc2+mainnet+git.40449f1cc

@Angelo-gh3990

On my miner node:

miner:~# lotus-miner -vv info
using raw API v0 endpoint: ws://10.10.10.140:2345/rpc/v0
using miner API v0 endpoint: ws://10.10.10.140:2345/rpc/v0
ERROR: could not get API info: could not get api endpoint: API not running (no endpoint)

@Angelo-gh3990

netstat -an shows that a process is listening on that port:
tcp 0 0 0.0.0.0:2345 0.0.0.0:* LISTEN

@Angelo-gh3990

other command:

miner:~# lotus-miner -vv sectors list --fast
using raw API v0 endpoint: ws://10.10.10.140:2345/rpc/v0
using miner API v0 endpoint: ws://10.10.10.140:2345/rpc/v0
using raw API v0 endpoint: ws://10.10.10.101:42002/rpc/v0
using full node API v0 endpoint: ws://10.10.10.101:42002/rpc/v0
ID State OnChain Active Deals
0 Proving YES YES CC

seems to connect just fine on that port

@Angelo-gh3990

I had LOTUS_MARKETS_PATH set, left over from a test a while back.
After removing it (unset LOTUS_MARKETS_PATH), it works on my miner.

@jennijuju jennijuju added this to the v1.11.2 milestone Aug 16, 2021
@jennijuju jennijuju added the team/ignite Issues and PRs being tracked by Team Ignite at Protocol Labs label Aug 16, 2021
@nonsense
Member

nonsense commented Aug 16, 2021

@William8Work

1. Could you explain what you mean by worker nodes? It seems like you are running lotus-miner commands, and not lotus-worker info for example.

Running this on worker nodes:

$ lotus-miner -vv info

using raw API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
using miner API v0 endpoint: ws://10.1.18.180:2345/rpc/v0
ERROR: could not get API info: repo directory does not exist. Make sure your configuration is correct

In the error message we see repo directory does not exist, so I guess LOTUS_MINER_PATH or LOTUS_MARKETS_PATH is pointing at a location that does not exist, or they are not set.
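One way to test that guess before digging further is to look at the two repo-path variables directly. The following is only a sketch: check_repo_paths is a hypothetical helper, not a lotus command, and it flags only variables that are set but point at a directory that does not exist (the stale LOTUS_MARKETS_PATH case reported earlier in this thread).

```shell
#!/bin/sh
# Hypothetical helper (not part of lotus): report repo-path variables that
# are set but point at a directory that does not exist.
check_repo_paths() {
    status=0
    for name in LOTUS_MINER_PATH LOTUS_MARKETS_PATH; do
        # Indirect lookup of the variable whose name is in $name (POSIX sh).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name: not set"
        elif [ ! -d "$value" ]; then
            echo "$name: set to '$value', but that directory does not exist"
            status=1
        else
            echo "$name: ok ($value)"
        fi
    done
    return "$status"
}

check_repo_paths || echo "Fix or unset the variables flagged above."
```

An unset variable is reported but not treated as a failure here, since whether a repo path is needed at all depends on whether the API_INFO variables are being used instead.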

Overall I am a bit confused as it is not clear what your setup is - are you running MRA (miner node + markets node) and then individual worker nodes for various sealing operations?

Having read the Slack thread, I now understand that you are running lotus-miner CLI commands on your lotus-worker nodes, without running MRA in split mode (i.e. lotus-miner is handling all subsystems - mining, sealing, proving, markets).

@William8Work could you confirm that all 3 API_INFO env vars are set up correctly? In order for lotus-miner info to work, the command needs access to the markets subsystem, to the proving/storage subsystems, and to a full node, so you need 3 env vars, for example:

MARKETS_API_INFO=token:/ip4/127.0.0.1/tcp/2345/http
MINER_API_INFO=token:/ip4/127.0.0.1/tcp/8787/http
FULLNODE_API_INFO=token:/ip4/127.0.0.1/tcp/1234/http
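A quick pre-flight check on the machine where the CLI runs can show which of the three variables is missing. This is a minimal sketch: missing_api_info is a hypothetical helper, not part of lotus, and it only tests that each variable is non-empty, not that the token or address is valid.

```shell
#!/bin/sh
# Hypothetical helper (not part of lotus): print the names of the API_INFO
# variables that are unset or empty in the current environment.
missing_api_info() {
    missing=""
    for name in MARKETS_API_INFO MINER_API_INFO FULLNODE_API_INFO; do
        # Indirect lookup of the variable whose name is in $name (POSIX sh).
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    # Drop the leading space; prints an empty line if nothing is missing.
    echo "${missing# }"
}

result=$(missing_api_info)
if [ -n "$result" ]; then
    echo "not set: $result"
else
    echo "all three API_INFO variables are set"
fi
```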

Having debugged this, we should further improve the error messages, because ERROR: could not get API info: repo directory does not exist. Make sure your configuration is correct is rather confusing in this case.

@William8Work
Author

@nonsense OK, since my worker nodes already had MINER_API_INFO and FULLNODE_API_INFO, I added the MARKETS_API_INFO env var, and now the lotus-miner info and lotus-miner storage-deals commands work on the worker nodes!!

The only small change to your advice above is that I set the markets API to exactly the same value as the miner API (same IP address, same port):

MARKETS_API_INFO=token:/ip4/127.0.0.1/tcp/2345/http
MINER_API_INFO=token:/ip4/127.0.0.1/tcp/2345/http
FULLNODE_API_INFO=token:/ip4/127.0.0.1/tcp/1234/http
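For a setup like this one, where a single lotus-miner process handles all subsystems, the markets variable can be derived from the miner variable instead of being maintained separately. The sketch below assumes a POSIX shell; use_miner_for_markets is a hypothetical helper name, and the token and addresses are the placeholders used above, not real values.

```shell
#!/bin/sh
# Hypothetical helper (not part of lotus): keep an explicitly set
# MARKETS_API_INFO, otherwise reuse the miner endpoint for the markets API.
use_miner_for_markets() {
    MARKETS_API_INFO="${MARKETS_API_INFO:-${MINER_API_INFO:-}}"
    export MARKETS_API_INFO
}

# Placeholder values, used only if the variables are not already set.
MINER_API_INFO="${MINER_API_INFO:-token:/ip4/127.0.0.1/tcp/2345/http}"
FULLNODE_API_INFO="${FULLNODE_API_INFO:-token:/ip4/127.0.0.1/tcp/1234/http}"
export MINER_API_INFO FULLNODE_API_INFO

use_miner_for_markets
echo "MARKETS_API_INFO=$MARKETS_API_INFO"
```

With the three variables exported this way, lotus-miner info can be run from the same shell session. In a split setup with a separate markets node, MARKETS_API_INFO would be set explicitly and the helper leaves it untouched.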

@nonsense
Member

Now that filecoin-project/filecoin-docs#1012 and #7088 are merged, I think we can close this.

For now, miners have to specify all environment variables in order to connect to a remote miner and for lotus-miner info and other CLI commands to work as expected.

Lotus CLI configuration will get a revamp in the near future, when we plan on simplifying it and unifying it in one place.
