sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation #88913

Open
stevendanna opened this issue Sep 28, 2022 · 7 comments
Labels
A-disaster-recovery
A-schema-catalog: Related to the schema descriptors collection and the catalog API in general.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
E-easy: Easy issue to tackle, requires little or no CockroachDB experience.
T-sql-foundations: SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

Collaborator

stevendanna commented Sep 28, 2022

Describe the problem

When attempting to write a datadriven test that runs a query after a cancelled cluster restore, I see the following under stress:

Details
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602  encountered internal error:
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +referenced descriptor ID 123: descriptor not found                                                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +(1) assertion failure                                                                                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +Wraps: (2) attached stack trace                                                                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  -- stack trace:                                                                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.wrapError                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:128                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.processDescriptorResultRow                                     
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:111                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.query                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:73                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogReader.GetDescriptorEntries                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_reader.go:215                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.(*StoredCatalog).EnsureFromStorageByIDs                                     
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/stored_catalog.go:283                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorsByID                                                     
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:137                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorByName                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:280                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getSchemaByName                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:73                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).GetImmutableSchemaByName                                                                                                                                                                                                                                                                                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:61                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).LookupSchema                                                                                                                                                                                                                                                                                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:136                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction.func1                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:408                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/sessiondata.(*SearchPath).IterateSearchPath                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/sessiondata/search_path.go:301                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:407                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.(*ResolvableFunctionReference).Resolve                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/sem/tree/function_name.go:110                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).VisitPre                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:1056                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.WalkExpr                                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/sem/tree/walk.go:824                                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).walkExprTree                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:428                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).resolveType                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:467                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeSelectList                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:160                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeProjectionList                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:94                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectClause                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:1059                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectStmtWithoutParens                                             
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:996                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect.func1                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:965                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).processWiths                                                             
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/with.go:116                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:964                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmt                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:305                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmtAtRoot                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:252                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).Build                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:226                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*optPlanningCtx).buildExecMemo                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:560                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*planner).makeOptimizerPlan                                                                       
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:231                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).makeExecPlan                                                                       
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1432                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +Wraps: (3) referenced descriptor ID 123                                                                                                         
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +Wraps: (4) secondary error attachment                                                                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | failed to find descriptor [123]                                                                                                             
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | (1) attached stack trace                                                                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   -- stack trace:                                                                                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.requiredError                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:176                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.build                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:146                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.processDescriptorResultRow                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:109                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.query                                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:73                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogReader.GetDescriptorEntries                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_reader.go:215                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.(*StoredCatalog).EnsureFromStorageByIDs                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/stored_catalog.go:283                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorsByID                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:137                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorByName                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:280                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getSchemaByName                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:73                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).GetImmutableSchemaByName                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:61                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).LookupSchema                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:136                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction.func1                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:408                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/sessiondata.(*SearchPath).IterateSearchPath                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/sessiondata/search_path.go:301                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:407                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.(*ResolvableFunctionReference).Resolve                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/sem/tree/function_name.go:110                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).VisitPre                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:1056                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.WalkExpr                                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/sem/tree/walk.go:824                                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).walkExprTree                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:428                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).resolveType                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:467                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeSelectList                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:160                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeProjectionList                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:94                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectClause                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:1059                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectStmtWithoutParens                                         
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:996                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect.func1                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:965                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).processWiths                                                         
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/with.go:116                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:964                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmt                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:305                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmtAtRoot                                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:252                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).Build                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:226                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*optPlanningCtx).buildExecMemo                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:560                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*planner).makeOptimizerPlan                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:231                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).makeExecPlan                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1432                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | Wraps: (2) failed to find descriptor [123]                  

It's unclear to me if this is a bug on the schema side or if we are holding it wrong (I suppose sometime soon both sides of this will be the schema side).

During OnFailOrCancel, we go through all the databases and delete them, and along the way we try to make sure to add them to uncommitted descriptors:

db.SetDropped()
db.MaybeIncrementVersion()
if err := descsCol.AddUncommittedDescriptor(ctx, db); err != nil {
	return err
}
descKey := catalogkeys.MakeDescMetadataKey(codec, db.GetID())
b.Del(descKey)
// We have explicitly to delete the system.namespace entry for the public schema
// if the database does not have a public schema backed by a descriptor.
if !db.(catalog.DatabaseDescriptor).HasPublicSchemaWithDescriptor() {
	b.Del(catalogkeys.MakeSchemaNameKey(codec, db.GetID(), tree.PublicSchema))
}
nameKey := catalogkeys.MakeDatabaseNameKey(codec, db.GetName())
b.Del(nameKey)
descsCol.NotifyOfDeletedDescriptor(db.GetID())

But for 2 of these databases, we are going to recreate them later:

if details.DescriptorCoverage == tree.AllDescriptors {
	// We've dropped defaultdb and postgres in the planning phase, we must
	// recreate them now if the full cluster restore failed.
	_, err := ie.Exec(ctx, "recreate-defaultdb", txn, "CREATE DATABASE IF NOT EXISTS defaultdb")
	if err != nil {
		return err
	}
	_, err = ie.Exec(ctx, "recreate-postgres", txn, "CREATE DATABASE IF NOT EXISTS postgres")
	if err != nil {
		return err
	}
}

The descriptor collection tracks uncommitted descriptors in a nstree.NameMap. When you upsert into this structure, items get replaced based on both name and ID. I've confirmed with some injected logging that the CREATE DATABASE statements for defaultdb and postgres end up removing the deleted descriptors with the same names from uncommitted descriptors. This results in us not calling WaitForNoVersion on the deleted descriptors.
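
To make the eviction concrete, here is a minimal sketch of the behavior described above. It does not use the real nstree.NameMap: `toyNameMap`, `desc`, and the descriptor IDs are hypothetical stand-ins, and the only property modeled is that an upsert replaces any entry sharing either the ID or the name.

```go
package main

import (
	"fmt"
	"sort"
)

// desc is a hypothetical stand-in for a catalog descriptor.
type desc struct {
	ID      int
	Name    string
	Dropped bool
}

// toyNameMap is NOT the real nstree.NameMap. It only mimics the behavior
// described in the issue: an upsert evicts any entry that shares the new
// entry's ID or its name.
type toyNameMap struct {
	byID   map[int]desc
	byName map[string]int
}

func newToyNameMap() *toyNameMap {
	return &toyNameMap{byID: map[int]desc{}, byName: map[string]int{}}
}

func (m *toyNameMap) upsert(d desc) {
	// Evict an existing entry with the same ID.
	if old, ok := m.byID[d.ID]; ok {
		delete(m.byName, old.Name)
	}
	// Evict an existing entry with the same name, even if its ID differs.
	if oldID, ok := m.byName[d.Name]; ok {
		delete(m.byID, oldID)
	}
	m.byID[d.ID] = d
	m.byName[d.Name] = d.ID
}

// droppedIDs returns the sorted IDs of dropped descriptors that are still
// tracked, i.e. the ones a later "wait for no version" step could see.
func (m *toyNameMap) droppedIDs() []int {
	var ids []int
	for id, d := range m.byID {
		if d.Dropped {
			ids = append(ids, id)
		}
	}
	sort.Ints(ids)
	return ids
}

func main() {
	m := newToyNameMap()
	// OnFailOrCancel marks the restored databases as dropped.
	m.upsert(desc{ID: 100, Name: "defaultdb", Dropped: true})
	m.upsert(desc{ID: 101, Name: "restored_db", Dropped: true})
	fmt.Println(m.droppedIDs()) // [100 101]

	// Recreating defaultdb upserts a new descriptor with the same name.
	m.upsert(desc{ID: 200, Name: "defaultdb"})
	fmt.Println(m.droppedIDs()) // [101]
}
```

In this toy model the dropped descriptor 100 disappears from the tracked set as soon as a new `defaultdb` is upserted, so nothing downstream would know to wait on it.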

As a result, when the query I am trying to run after the restore is cancelled runs, the test occasionally ends up doing function resolution with an erroneously still-leased database descriptor, which results in us trying to look up a non-existent schema.

Jira issue: CRDB-20034

@stevendanna stevendanna added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-schema-deprecated Use T-sql-foundations instead labels Sep 28, 2022
@blathers-crl

blathers-crl bot commented Sep 28, 2022

cc @cockroachdb/disaster-recovery

@stevendanna stevendanna changed the title from "sql,backupccl: failed to WaitForNoVersion on deleted description during restore cancellation" to "sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation" Sep 28, 2022
@ajwerner
Contributor

ajwerner commented Oct 4, 2022

We could solve this in the lease manager without too much pain.

@postamar postamar added the E-easy Easy issue to tackle, requires little or no CockroachDB experience label Oct 4, 2022
@ajwerner
Contributor

ajwerner commented Oct 4, 2022

Something like this:

diff --git a/pkg/sql/catalog/lease/lease.go b/pkg/sql/catalog/lease/lease.go
index 13ebcc0a1f8..e63b3d71162 100644
--- a/pkg/sql/catalog/lease/lease.go
+++ b/pkg/sql/catalog/lease/lease.go
@@ -1072,10 +1072,19 @@ func (m *Manager) findDescriptorState(id descpb.ID, create bool) *descriptorStat
 // RangefeedLeases is not active.
 func (m *Manager) RefreshLeases(ctx context.Context, s *stop.Stopper, db *kv.DB) {
 	descUpdateCh := make(chan *descpb.Descriptor)
+	descDeleteCh := make(chan descpb.ID)
-	m.watchForUpdates(ctx, descUpdateCh)
+	m.watchForUpdates(ctx, descUpdateCh, descDeleteCh)
 	_ = s.RunAsyncTask(ctx, "refresh-leases", func(ctx context.Context) {
 		for {
 			select {
+			case id := <-descDeleteCh:
+				// Purge any leases still held on the deleted descriptor.
+				log.VEventf(ctx, 2, "purging old versions of deleted descriptor %d",
+					id)
+				if err := purgeOldVersions(ctx, db, id, true /* dropped */, 0, m); err != nil {
+					log.Warningf(ctx, "error purging leases for descriptor %d: %s",
+						id, err)
+				}
 			case desc := <-descUpdateCh:
 				// NB: We allow nil descriptors to be sent to synchronize the updating of
 				// descriptors.
@@ -1117,7 +1126,9 @@ func (m *Manager) RefreshLeases(ctx context.Context, s *stop.Stopper, db *kv.DB)
 
 // watchForUpdates will watch a rangefeed on the system.descriptor table for
 // updates.
-func (m *Manager) watchForUpdates(ctx context.Context, descUpdateCh chan<- *descpb.Descriptor) {
+func (m *Manager) watchForUpdates(
+	ctx context.Context, descUpdateCh chan<- *descpb.Descriptor, deletedDescCh chan<- descpb.ID,
+) {
 	if log.V(1) {
 		log.Infof(ctx, "using rangefeeds for lease manager updates")
 	}
@@ -1130,6 +1141,11 @@ func (m *Manager) watchForUpdates(ctx context.Context, descUpdateCh chan<- *desc
 		ctx context.Context, ev *roachpb.RangeFeedValue,
 	) {
 		if len(ev.Value.RawBytes) == 0 {
+			id := descpb.ID(0) // TODO: decode the descriptor ID from ev.Key
+			select {
+			case <-ctx.Done():
+			case deletedDescCh <- id:
+			}
 			return
 		}
 		b, err := descbuilder.FromSerializedValue(&ev.Value)

@stevendanna
Collaborator Author

stevendanna commented Oct 4, 2022

I think that solution would probably also resolve at least the main symptom of #89079, but there it still seems wrong to not put the descriptors offline first.

@ajwerner
Contributor

ajwerner commented Oct 4, 2022

Was the descriptor ever online? I think part of what's going on here is that we used to not lease descriptors in the ADDING state or the OFFLINE state, but that ran into trouble (#61798). Now that we are leasing them, we run into this trouble. I'm increasingly thinking it ought to be fine to just delete an OFFLINE descriptor.

@stevendanna
Collaborator Author

Was the descriptor ever online?

Unfortunately, I think technically it can be. Because this happens in OnFailOrCancel, we might be responding to a failure that happened after we published the descriptors.

@ajwerner
Contributor

ajwerner commented Oct 4, 2022

In that case, I'd prefer we go to dropped if we can muster the will to pull that off

@postamar postamar added the A-schema-catalog Related to the schema descriptors collection and the catalog API in general. label Nov 10, 2022
@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-disaster-recovery T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023
stevendanna added a commit to stevendanna/cockroach that referenced this issue Feb 17, 2024
Full cluster restore drops the default DB. The test driver caches
connections that may have originally connected to a database that is
now dropped. This causes problems for queries issued after the full
cluster restore.

Here, (1) I change the query we use to get job IDs to one that doesn't
depend on doing any search path lookups and (2) reset all of our
connections after the first restore we do.

See also cockroachdb#88913

Fixes cockroachdb#119079

Release note: None
craig bot pushed a commit that referenced this issue Mar 11, 2024
119342: backupccl: deflake TestDataDriven_restore_on_fail_or_cancel_retry r=msbutler a=stevendanna

Full cluster restore drops the default DB. The test driver caches connections that may have originally connected to a database that is now dropped. This causes problems for queries issued after the full cluster restore.

Here, (1) I change the query we use to get job IDs to one that doesn't depend on doing any search path lookups and (2) reset all of our connections after the first restore we do.

See also #88913

Fixes #119079

Release note: None

Co-authored-by: Steven Danna <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Mar 11, 2024