Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: report drop and truncate in jobs table #19004

Closed
mberhault opened this issue Oct 3, 2017 · 14 comments
Closed

sql: report drop and truncate in jobs table #19004

mberhault opened this issue Oct 3, 2017 · 14 comments
Assignees
Labels
A-schema-changes A-telemetry C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Milestone

Comments

@mberhault
Copy link
Contributor

Both operations can take a long long long time. For added visibility, we should include those in the jobs table.
eg: I dropped the only user table in a 1.7TB (total data used) cluster. Hours later, I can't see much change in usage (it's there, just subtle) and there's absolutely no visibility into the progress or even whether it's doing anything (other than a single goroutine on the node executing the drop).

@mberhault mberhault added this to the 1.2 milestone Oct 3, 2017
@mberhault
Copy link
Contributor Author

tentatively adding 1.2 milestone.

@dianasaur323
Copy link
Contributor

odd... I guess I always assumed those were considered schema changes... off the top of my head, I don't know what we would take off to do this, so I think a "tentative" add sounds good. I'll follow up once I figure out what capacity looks like. Thanks for filing.

@cuongdo
Copy link
Contributor

cuongdo commented Oct 5, 2017

+1 on this. From a logical point of view, all SQL statements belong in either SHOW QUERIES or SHOW JOBS. Since the other schema changes show up in SHOW JOBS, that seems like a reasonable place to put DROP and TRUNCATE.

@vivekmenezes
Copy link
Contributor

Seems okay. The only difference from other schema changes is that the DROP and TRUNCATE are complete as far as the user is concerned (The table is gone/or the table is empty). In other words the schema change is complete. It's just the deleted data being GC-ed in the background after the schema change and the expectation that the disk usages goes down significantly after the operation that is surprising.

@tbg
Copy link
Member

tbg commented Nov 15, 2017

In our production testing, where clusters are bound to run out of disk space eventually, we've found that the currently functionality is insufficient. It turns out that DROP TABLE is essentially never fast enough to thwart impending doom, even with aggressive GC TTLs. This could be due to the actual background deletion or slowness of the GC queue in purging the deletions, but either way it's not acceptable that there is no indication of the progress even though the background task is still ongoing.

There is likely performance that we need to optimize, but knowing how far the actual deletion has processed is a baseline that doesn't seem too difficult to achieve.

See also #19329 (comment) where we discuss using DROP for that purpose.

@dianasaur323
Copy link
Contributor

This begs to question why we count certain commands as schema changes while other are not. We currently define them based on whether or not the operation requires backfilling, but I've been getting feedback (and I'm personally confused) since it matches the technical implementation, not how users would expect things to be classified.

@vivekmenezes
Copy link
Contributor

@dianasaur323 commands that change the schema are called schema changes. Some of these commands need to backfill data and thus take time to execute. In the case of DROP once SQL stops referencing the data SQL responds back to the user that it has completed the schema change, which it indeed has. The only problem is that users also expect the command to delete all the data associated with the table, which we do asynchronously. We do the same with TRUNCATE, where we don't delete the all the underlying data immediately, just provide the illusion via SQL that it's gone.

@dianasaur323
Copy link
Contributor

@vivekmenezes On that point, the reason why I'm asking is because in the jobs table, we only show progress of schema changes that require backfilling, hence the confusion. Both the things you mentioned (DROP and TRUNCATE) have confused some users out in the field. I'm wondering if we should do something about that?

@vivekmenezes
Copy link
Contributor

The reason why we did it this way is because users were complaining about DROP/TRUNCATE being too slow. The correct solution is to drop this data quickly which we can't accomplish in the near term. There is also the added concern that users need to query the old data using AS OF SYSTEM TIME which prevents us from just blowing away the ranges.

I agree we need to do something

@tbg
Copy link
Member

tbg commented Nov 16, 2017

I think we want to rely on AS OF SYSTEM TIME as a recovery mechanism for fat-fingering a drop table. I don't think that's principally in the way of removing the data more quickly if desired, though (a bit of care has been taken that the quick-delete can't be fat-fingered easily).

I'd really just like to track the status of the deletion (which is a kind of backfill) in the jobs tab. I think it's OK that DROP returns and the job is still running. After all, the DB is gone.

tbg added a commit to tbg/cockroach that referenced this issue Dec 5, 2017
Motivated by the fact that previously, `DROP TABLE` did not allow any
kind of introspection. Now, at least there's something you can look
at in `/debug/requests` to see activity.

Touches cockroachdb#19004

Release note: None
tbg added a commit to tbg/cockroach that referenced this issue Dec 6, 2017
Motivated by the fact that previously, `DROP TABLE` did not allow any
kind of introspection. Now, at least there's something you can look
at in `/debug/requests` to see activity.

Touches cockroachdb#19004

Release note: None
tbg added a commit to tbg/cockroach that referenced this issue Dec 8, 2017
Motivated by the fact that previously, `DROP TABLE` did not allow any
kind of introspection. Now, at least there's something you can look
at in `/debug/requests` to see activity.

Touches cockroachdb#19004

Release note: None
@dianasaur323 dianasaur323 modified the milestones: 1.2, 1.3 Dec 10, 2017
@dianasaur323
Copy link
Contributor

I'm tentatively marking this as 1.3 right now. I don't think we have capacity to fix this now, and I think it might make sense to wrap it into some jobs work that is going to tentatively get scheduled for 1.3...

@knz knz added the A-telemetry label May 9, 2018
@knz knz added A-bulkio-schema-changes C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels May 9, 2018
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 10, 2018
Creates a job for truncate and drop table statements, which is completed
when the GC TTL expires and the table data and ID is deleted.

Fixes cockroachdb#19004

Release note: None
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 11, 2018
Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Fixes cockroachdb#19004

Release note: None
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 12, 2018
Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Fixes cockroachdb#19004

Release note: None
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 17, 2018
Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Fixes cockroachdb#19004

Release note: None
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 18, 2018
Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Detailed running statuses are added to provide visibility to the
progress of the dropping or truncating of tables. This is surfaced by
adding an additional status field to the payload proto of jobs, and
concatenated to the running status when populating the interal jobs
table.

For dropping or truncating jobs, the detailed running status is
determined by the status of the table at the earliest stage of the
schema change.

Fixes cockroachdb#19004

Release note: None
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 19, 2018
Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Detailed running statuses are added to provide visibility to the
progress of the dropping or truncating of tables. This is surfaced by
adding an additional status field to the payload proto of jobs, and
concatenated to the running status when populating the interal jobs
table.

For dropping or truncating jobs, the detailed running status is
determined by the status of the table at the earliest stage of the
schema change.

Fixes cockroachdb#19004

Release note: None
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 19, 2018
Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Detailed running statuses are added to provide visibility to the
progress of the dropping or truncating of tables. This is surfaced by
adding an additional status field to the payload proto of jobs, and
concatenated to the running status when populating the interal jobs
table.

For dropping or truncating jobs, the detailed running status is
determined by the status of the table at the earliest stage of the
schema change.

Fixes cockroachdb#19004

Release note: None
eriktrinh pushed a commit to eriktrinh/cockroach that referenced this issue Sep 19, 2018
Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Detailed running statuses are added to provide visibility to the
progress of the dropping or truncating of tables. This is surfaced by
adding an additional status field to the payload proto of jobs, and
concatenated to the running status when populating the interal jobs
table.

For dropping or truncating jobs, the detailed running status is
determined by the status of the table at the earliest stage of the
schema change.

Fixes cockroachdb#19004

Release note: None
craig bot pushed a commit that referenced this issue Sep 20, 2018
29993: sql: create jobs for truncated and dropped tables r=eriktrinh a=eriktrinh

Creates a job for statements involving dropping or truncated tables, including
DROP DATABASE. The job is completed when the GC TTL expires and both table
data and ID is deleted for each of the tables involved.

Detailed running statuses are added to provide visibility to the
progress of the dropping or truncating of tables. This is surfaced by
adding an additional status field to the payload proto of jobs, and
concatenated to the running status when populating the interal jobs
table.

For dropping or truncating jobs, the detailed running status is
determined by the status of the table at the earliest stage of the
schema change.

Fixes #19004

Co-authored-by: Erik Trinh <[email protected]>
@craig craig bot closed this as completed in #29993 Sep 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-changes A-telemetry C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

No branches or pull requests

8 participants