Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: run node status and node ls commands through SQL #25094

Merged
merged 1 commit into from
May 12, 2018

Conversation

2nishantg
Copy link

@2nishantg 2nishantg commented Apr 26, 2018

Earlier node status command used to hit statusServer api endpoint and node ls
internally called node status.

After this commit, node status and node ls commands will internally query various tables in
crdb_internal and format the result.

Closes #20713.

Release note: None

@2nishantg 2nishantg requested a review from a team as a code owner April 26, 2018 09:51
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@2nishantg
Copy link
Author

2nishantg commented Apr 26, 2018

Review status: 0 of 2 files reviewed at latest revision, all discussions resolved, some commit checks failed.


a discussion (no related file):
An example after this patch(first one is after the patch):

~/g/s/g/c/cockroach λ ./cockroach node status --all
+----+-----------------+-----------------------+----------------------------------+----------------------------------+---------+------------------+-----------------------+--------+--------------------+------------------------+------------+-----------+-------------+--------------+--------------+-------------------+--------------------+-------------+
| id |     address     |         build         |            updated_at            |            started_at            | is_live | replicas_leaders | replicas_leaseholders | ranges | ranges_unavailable | ranges_underreplicated | live_bytes | key_bytes | value_bytes | intent_bytes | system_bytes | gossiped_replicas | is_decommissioning | is_draining |
+----+-----------------+-----------------------+----------------------------------+----------------------------------+---------+------------------+-----------------------+--------+--------------------+------------------------+------------+-----------+-------------+--------------+--------------+-------------------+--------------------+-------------+
|  1 | alchemist:26257 | v2.1.0-alpha.20180416 | 2018-04-24 06:50:55.292539+00:00 | 2018-04-25 17:32:26.952795+00:00 |    true |               14 |                    13 |     38 |                  0 |                      0 |  444148584 |   2912844 |   441281991 |            0 |       575955 |                38 |              false |    false    |
|  2 | alchemist:26260 | v2.1.0-alpha.20180416 | 2018-04-24 06:25:23.831566+00:00 | 2018-04-25 17:32:25.492247+00:00 |    true |               12 |                    12 |     38 |                  0 |                      0 |  552248424 |   2341208 |   549953432 |            0 |       574683 |                38 |              false |    false    |
|  3 | alchemist:26259 | v2.1.0-alpha.20180416 | 2018-04-25 17:29:19.93002+00:00  | 2018-04-25 17:29:19.93441+00:00  |   false |                0 |                     0 |     13 |                  0 |                      0 |  238149449 |   3037039 |   235112552 |            0 |       227450 |                13 |              false |    false    |
|  4 | alchemist:26258 | v2.1.0-alpha.20180416 | 2018-04-25 17:18:45.135884+00:00 | 2018-04-25 17:32:25.179784+00:00 |    true |               12 |                    12 |     38 |                  0 |                      0 |  323669391 |   3527563 |   320188044 |            0 |       578439 |                38 |              false |    false    |
+----+-----------------+-----------------------+----------------------------------+----------------------------------+---------+------------------+-----------------------+--------+--------------------+------------------------+------------+-----------+-------------+--------------+--------------+-------------------+--------------------+-------------+
(4 rows)
~/g/s/g/c/cockroach λ cockroach node status --all
+----+-----------------+-----------------------+---------------------+---------------------+---------+------------------+-----------------------+--------+--------------------+------------------------+------------+-----------+-------------+--------------+--------------+-------------------+--------------------+-------------+
| id |     address     |         build         |     updated_at      |     started_at      | is_live | replicas_leaders | replicas_leaseholders | ranges | ranges_unavailable | ranges_underreplicated | live_bytes | key_bytes | value_bytes | intent_bytes | system_bytes | gossiped_replicas | is_decommissioning | is_draining |
+----+-----------------+-----------------------+---------------------+---------------------+---------+------------------+-----------------------+--------+--------------------+------------------------+------------+-----------+-------------+--------------+--------------+-------------------+--------------------+-------------+
|  1 | alchemist:26257 | v2.1.0-alpha.20180416 | 2018-04-25 17:32:26 | 2018-04-24 06:50:55 |    true |               14 |                    13 |     38 |                  0 |                      0 |  444148584 |   2912844 |   441281991 |            0 |       575955 |                38 |              false |    false    |
|  2 | alchemist:26260 | v2.1.0-alpha.20180416 | 2018-04-25 17:32:25 | 2018-04-24 06:25:23 |    true |               12 |                    12 |     38 |                  0 |                      0 |  552248424 |   2341208 |   549953432 |            0 |       574683 |                38 |              false |    false    |
|  3 | alchemist:26259 | v2.1.0-alpha.20180416 | 2018-04-25 17:29:19 | 2018-04-25 17:29:19 |   false |                0 |                     0 |     13 |                  0 |                      0 |  238149449 |   3037039 |   235112552 |            0 |       227450 |                13 |              false |    false    |
|  4 | alchemist:26258 | v2.1.0-alpha.20180416 | 2018-04-25 17:32:25 | 2018-04-25 17:18:45 |    true |               12 |                    12 |     38 |                  0 |                      0 |  323669391 |   3527563 |   320188044 |            0 |       578439 |                38 |              false |    false    |
+----+-----------------+-----------------------+---------------------+---------------------+---------+------------------+-----------------------+--------+--------------------+------------------------+------------+-----------+-------------+--------------+--------------+-------------------+--------------------+-------------+
(4 rows)

Since I directly join the queries and print the result, updated_at and started_at fields after the patch have microseconds and timezone info also. Removing this will require parsing rows and re-formatting these columns(I couldn't find a way to specify format for timestamp columns in SQL query).
Let me know if this needs to be changed to remove microseconds and timezone.
PS: I have not fixed the tests to accomodate for this change right now, will do that once this is resolved.


pkg/cli/node.go, line 196 at r1 (raw file):

Quoted 4 lines of code…
	if cliCtx.cmdTimeout != 0 {
		if err := conn.Exec(fmt.Sprintf("SET statement_timeout=%d", cliCtx.cmdTimeout), nil); err != nil {
			return nil, nil, err
		}

Earlier timeout were handled by passing a context with deadline since statusServer respects context timeouts.
sqlConn doesn't implement QueryContext, so that cannot be used. I was stuck in trying to implement it as I don't know how to cleanup resources on server and stop query once I recieve signal on ctx.Done().
This is the hack I have implemented to go around it. Please tell me if this is not acceptable or if you know of a better way.
PS: I have not fixed the tests to accomodate for the changed error message right now, waiting on resolution for this first.


Comments from Reviewable

@2nishantg 2nishantg changed the title cli: run node status and node ls commands through SQL [WIP] cli: run node status and node ls commands through SQL Apr 26, 2018
@a-robinson a-robinson requested a review from tbg April 30, 2018 16:06
@tbg
Copy link
Member

tbg commented May 5, 2018

Sorry for letting this sit, @Nishant9! Will take a look soon.

@tbg
Copy link
Member

tbg commented May 8, 2018

Reviewed 2 of 2 files at r1.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


a discussion (no related file):
This looks really good!

Let me know if this needs to be changed to remove microseconds and timezone.

No, I think this is fine.

Could you ping me when I should do the final review, after tests have been fixed up?


pkg/cli/node.go, line 126 at r1 (raw file):

func runStatusNodeInner(args []string) ([]string, [][]string, error) {

	joinUsingID := func(queries []string) (query string) {

You could take a queries ...string here and then you could call it as joinUsingID(query1, query2).


pkg/cli/node.go, line 196 at r1 (raw file):

Previously, Nishant9 (Nishant Gupta) wrote…
	if cliCtx.cmdTimeout != 0 {
		if err := conn.Exec(fmt.Sprintf("SET statement_timeout=%d", cliCtx.cmdTimeout), nil); err != nil {
			return nil, nil, err
		}

Earlier timeout were handled by passing a context with deadline since statusServer respects context timeouts.
sqlConn doesn't implement QueryContext, so that cannot be used. I was stuck in trying to implement it as I don't know how to cleanup resources on server and stop query once I recieve signal on ctx.Done().
This is the hack I have implemented to go around it. Please tell me if this is not acceptable or if you know of a better way.
PS: I have not fixed the tests to accomodate for the changed error message right now, waiting on approval for this first.

If this works this is good for a first cut, we can file an issue with the cli team for follow-up work. I think we just need to address the TODOs here?

//lint:ignore SA1019 TODO(mjibson): clean this up to use go1.8 APIs
driver.Execer
//lint:ignore SA1019 TODO(mjibson): clean this up to use go1.8 APIs
driver.Queryer


Comments from Reviewable

@codecov-io
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (master@4eb0f4d). Click here to learn what that means.
The diff coverage is 7.4%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #25094   +/-   ##
=========================================
  Coverage          ?   76.83%           
=========================================
  Files             ?      875           
  Lines             ?   123657           
  Branches          ?        0           
=========================================
  Hits              ?    95006           
  Misses            ?    21711           
  Partials          ?     6940
Impacted Files Coverage Δ
pkg/sql/opt/memo/logical_props.go 100% <ø> (ø)
pkg/sql/control_jobs.go 77.55% <ø> (ø)
pkg/sql/alter_table.go 76.18% <ø> (ø)
pkg/sql/conn_fsm.go 93.04% <ø> (ø)
pkg/sql/conn_io.go 70.65% <ø> (ø)
pkg/sql/opt/idxconstraint/index_constraints.go 91.43% <ø> (ø)
pkg/server/admin.go 56.88% <ø> (ø)
pkg/sql/cancel_queries.go 84.61% <ø> (ø)
pkg/server/testserver.go 43.37% <ø> (ø)
pkg/sql/opt/memo/expr_view.go 95.32% <ø> (ø)
... and 58 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4eb0f4d...c22b864. Read the comment docs.

@2nishantg
Copy link
Author

Review status: 1 of 3 files reviewed at latest revision, 3 unresolved discussions.


pkg/cli/node.go, line 126 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

You could take a queries ...string here and then you could call it as joinUsingID(query1, query2).

I have to collect all the queries to be joined in a slice anyway. So, I thought there was no use unpacking and packing a slice.


Comments from Reviewable

@tbg
Copy link
Member

tbg commented May 8, 2018

Reviewed 2 of 2 files at r2.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


pkg/cli/node.go, line 126 at r1 (raw file):

Previously, Nishant9 (Nishant Gupta) wrote…

I have to collect all the queries to be joined in a slice anyway. So, I thought there was no use unpacking and packing a slice.

This would make the population of baseQuery nicer (and the other callsite becomes joinUsingID(queriesToJoin...). But I'm fine leaving as is.


Comments from Reviewable

@tbg
Copy link
Member

tbg commented May 8, 2018

CI currently fails here. I hope this is just due to a change in format of the output there.


Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


Comments from Reviewable

@2nishantg 2nishantg force-pushed the node_sql branch 2 times, most recently from c1037d7 to d28b792 Compare May 9, 2018 08:00
@2nishantg
Copy link
Author

@tschottdorf This is ready for final review, I have fixed all the tests.

@tbg
Copy link
Member

tbg commented May 10, 2018

Reviewed 2 of 2 files at r3.
Review status: all files reviewed at latest revision, 3 unresolved discussions.


pkg/cli/node.go, line 145 at r3 (raw file):

	maybeAddActiveNodesFilter := func(query string) string {
		activeNodesFilter := "decommissioning = false OR expiration > concat((now()::decimal(30,9))::text, ',0')"

Can't you just use is_live here that you define below?


pkg/cli/node.go, line 157 at r3 (raw file):

			maybeAddActiveNodesFilter(
				`SELECT node_id AS id,
                CASE WHEN expiration > concat((now()::decimal(30,9))::text, ',0')

This looks weird in that you're comparing these lexicographically. This probably works since the length of timestamps rarely changes, but consider the two timestamps 10 and 9. Wouldn't you come back saying that 9 is larger?

I think instead you can just strip the logical component and compare the rest as numbers (and yeah, this is all a bit awkward because we don't have a type for hybrid timestamps).


Comments from Reviewable

@2nishantg
Copy link
Author

Review status: 2 of 3 files reviewed at latest revision, 5 unresolved discussions.


pkg/cli/node.go, line 145 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Can't you just use is_live here that you define below?

is_live is different from activeNodes, even if a node is not live it is considered active until it is decommissioned.
If you meant simplifying this filter to decommissioning = false or is_live = true, I don't know how we can give a filter on a custom SELECT clause in SQL.


Comments from Reviewable

@2nishantg
Copy link
Author

pkg/cli/node.go, line 157 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

This looks weird in that you're comparing these lexicographically. This probably works since the length of timestamps rarely changes, but consider the two timestamps 10 and 9. Wouldn't you come back saying that 9 is larger?

I think instead you can just strip the logical component and compare the rest as numbers (and yeah, this is all a bit awkward because we don't have a type for hybrid timestamps).

Done.


Comments from Reviewable

Earlier `node status` command used to hit statusServer api endpoint and `node ls`
internally called `node status`.

Now, `node status` and `node ls` commands internally query various tables in
crdb_internal and format the result.

Closes cockroachdb#20713.

Release note: None
@2nishantg 2nishantg changed the title [WIP] cli: run node status and node ls commands through SQL cli: run node status and node ls commands through SQL May 11, 2018
@tbg
Copy link
Member

tbg commented May 11, 2018

:lgtm:

Thanks a bunch, @Nishant9!

bors r+


Reviewed 1 of 1 files at r4.
Review status: all files reviewed at latest revision, 4 unresolved discussions.


pkg/cli/node.go, line 145 at r3 (raw file):

Previously, Nishant9 (Nishant Gupta) wrote…

is_live is different from activeNodes, even if a node is not live it is considered active until it is decommissioned.
If you meant simplifying this filter to decommissioning = false or is_live = true, I don't know how we can give a filter on a custom SELECT clause in SQL.

You're right, sorry about the noise!


Comments from Reviewable

@tbg
Copy link
Member

tbg commented May 11, 2018

Filed #25435 about having context cancellation available for cli usage of SQL.

@craig
Copy link
Contributor

craig bot commented May 11, 2018

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request May 11, 2018
25094: cli: run `node status` and `node ls` commands through SQL r=tschottdorf a=Nishant9

Earlier `node status` command used to hit statusServer api endpoint and `node ls`
internally called `node status`.

After this commit, `node status` and `node ls` commands will internally query various tables in
crdb_internal and format the result.

Closes #20713.

Release note: None

25314: scripts: add a winworker script to create Windows VMs on GCE r=tschottdorf a=benesch

It's quite useful to be able to quickly spin up a Windows VM to test our
Windows binaries.

Release note: None

Co-authored-by: Nishant Gupta <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Tobias Schottdorf <[email protected]>
@craig
Copy link
Contributor

craig bot commented May 12, 2018

Build succeeded

@craig craig bot merged commit 61428e3 into cockroachdb:master May 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants