sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation #88913

stevendanna · 2022-09-28T13:13:48Z

Describe the problem

When attempting to write a datadriven test that runs a query after a cancelled cluster restore, I see the following under stress:

Details

E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602  encountered internal error:                                                                                                                                                                                                                                                                                                                                                                   [98/1347]
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +referenced descriptor ID 123: descriptor not found                                                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +(1) assertion failure                                                                                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +Wraps: (2) attached stack trace                                                                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  -- stack trace:                                                                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.wrapError                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:128                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.processDescriptorResultRow                                     
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:111                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.query                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:73                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogReader.GetDescriptorEntries                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_reader.go:215                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.(*StoredCatalog).EnsureFromStorageByIDs                                     
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/stored_catalog.go:283                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorsByID                                                     
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:137                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorByName                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:280                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getSchemaByName                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:73                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).GetImmutableSchemaByName                                                                                                                                                                                                                                                                                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:61                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).LookupSchema                                                                                                                                                                                                                                                                                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:136                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction.func1                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:408                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/sessiondata.(*SearchPath).IterateSearchPath                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/sessiondata/search_path.go:301                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:407                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.(*ResolvableFunctionReference).Resolve                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/sem/tree/function_name.go:110                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).VisitPre                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:1056                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.WalkExpr                                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/sem/tree/walk.go:824                                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).walkExprTree                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:428                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).resolveType                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:467                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeSelectList                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:160                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeProjectionList                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:94                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectClause                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:1059                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectStmtWithoutParens                                             
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:996                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect.func1                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:965                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).processWiths                                                             
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/with.go:116                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:964                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmt                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:305                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmtAtRoot                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:252                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).Build                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:226                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*optPlanningCtx).buildExecMemo                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:560                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*planner).makeOptimizerPlan                                                                       
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:231                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).makeExecPlan                                                                       
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |      github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1432                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +Wraps: (3) referenced descriptor ID 123                                                                                                         
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +Wraps: (4) secondary error attachment                                                                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | failed to find descriptor [123]                                                                                                             
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | (1) attached stack trace                                                                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   -- stack trace:                                                                                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.requiredError                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:176                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.build                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:146                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.processDescriptorResultRow                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:109                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogQuery.query                                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_query.go:73                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.catalogReader.GetDescriptorEntries                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/catalog_reader.go:215                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv.(*StoredCatalog).EnsureFromStorageByIDs                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/internal/catkv/stored_catalog.go:283                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorsByID                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:137                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getDescriptorByName                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/descriptor.go:280                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).getSchemaByName                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:73                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/catalog/descs.(*Collection).GetImmutableSchemaByName                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/schema.go:61                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).LookupSchema                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:136                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction.func1                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:408                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/sessiondata.(*SearchPath).IterateSearchPath                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/sessiondata/search_path.go:301                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*schemaResolver).ResolveFunction                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/schema_resolver.go:407                                                                        
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.(*ResolvableFunctionReference).Resolve                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/sem/tree/function_name.go:110                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).VisitPre                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:1056                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/sem/tree.WalkExpr                                                                              
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/sem/tree/walk.go:824                                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).walkExprTree                                                           
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:428                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*scope).resolveType                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/scope.go:467                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeSelectList                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:160                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).analyzeProjectionList                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/project.go:94                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectClause                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:1059                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelectStmtWithoutParens                                         
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:996                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect.func1                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:965                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).processWiths                                                         
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/with.go:116                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildSelect                                                          
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/select.go:964                                                                  
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmt                                                            
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:305                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).buildStmtAtRoot                                                      
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:252                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder.(*Builder).Build                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/opt/optbuilder/builder.go:226                                                                 
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*optPlanningCtx).buildExecMemo                                                                
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:560                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*planner).makeOptimizerPlan                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/plan_opt.go:231                                                                               
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   | github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).makeExecPlan                                                                   
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  |   |  github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1432                                                                    
E220928 13:08:54.442318 3083 sql/sqltelemetry/report.go:57  [n1,client=127.0.0.1:64308,user=root] 602 +  | Wraps: (2) failed to find descriptor [123]

It's unclear to me if this is a bug on the schema side or if we are holding it wrong (I suppose sometime soon both side of this will be the schema side).

During OnFailOrCancel, we go through all the databases and delete them, and along the way we try to make sure to add them to uncommitted descriptors:

cockroach/pkg/ccl/backupccl/restore_job.go

Lines 2525 to 2542 in 85b29ae

    
           db.SetDropped() 
        
           db.MaybeIncrementVersion() 
        
           if err := descsCol.AddUncommittedDescriptor(ctx, db); err != nil { 
        
           	return err 
        
           } 
        
           descKey := catalogkeys.MakeDescMetadataKey(codec, db.GetID()) 
        
           b.Del(descKey) 
        
           // We have explicitly to delete the system.namespace entry for the public schema 
        
           // if the database does not have a public schema backed by a descriptor. 
        
           if !db.(catalog.DatabaseDescriptor).HasPublicSchemaWithDescriptor() { 
        
           	b.Del(catalogkeys.MakeSchemaNameKey(codec, db.GetID(), tree.PublicSchema)) 
        
           } 
        
           nameKey := catalogkeys.MakeDatabaseNameKey(codec, db.GetName()) 
        
           b.Del(nameKey) 
        
           descsCol.NotifyOfDeletedDescriptor(db.GetID())

But for 2 of these databases, we are going to recreate them later:

cockroach//pkg/ccl/backupccl/restore_job.go

Lines 2232 to 2244 in 85b29ae

    
           if details.DescriptorCoverage == tree.AllDescriptors { 
        
           	// We've dropped defaultdb and postgres in the planning phase, we must 
        
           	// recreate them now if the full cluster restore failed. 
        
           	_, err := ie.Exec(ctx, "recreate-defaultdb", txn, "CREATE DATABASE IF NOT EXISTS defaultdb") 
        
           	if err != nil { 
        
           		return err 
        
           	} 
        
           	_, err = ie.Exec(ctx, "recreate-postgres", txn, "CREATE DATABASE IF NOT EXISTS postgres") 
        
           	if err != nil { 
        
           		return err 
        
           	} 
        
           }

The descriptor collection tracks uncommitted descriptors in a nstree.NameMap. When you upsert into this structure, items get replaced based on both name and ID. I've confirmed with some injected logging confirms that the create statement for defaultdb and postgres end up removing the deleted descriptor with the same name from uncommitted descriptors. This results in us not calling WaitForNoVersion on the deleted descriptors.

As a result, when a query I am trying to run after the backup is cancelled runs, the test occasionally end up trying to do function resolution using an erroneously still leased database descriptor which ends up in us trying to look up a non-existent schema.

Jira issue: CRDB-20034

The text was updated successfully, but these errors were encountered:

blathers-crl · 2022-09-28T14:51:37Z

cc @cockroachdb/disaster-recovery

ajwerner · 2022-10-04T14:09:22Z

We could solve this in the lease manager without too much pain.

ajwerner · 2022-10-04T14:52:55Z

Something like this:

diff --git a/pkg/sql/catalog/lease/lease.go b/pkg/sql/catalog/lease/lease.go
index 13ebcc0a1f8..e63b3d71162 100644
--- a/pkg/sql/catalog/lease/lease.go
+++ b/pkg/sql/catalog/lease/lease.go
@@ -1072,10 +1072,19 @@ func (m *Manager) findDescriptorState(id descpb.ID, create bool) *descriptorStat
 // RangefeedLeases is not active.
 func (m *Manager) RefreshLeases(ctx context.Context, s *stop.Stopper, db *kv.DB) {
 	descUpdateCh := make(chan *descpb.Descriptor)
+	descDeleteCh := make(chan descpb.ID)
 	m.watchForUpdates(ctx, descUpdateCh)
 	_ = s.RunAsyncTask(ctx, "refresh-leases", func(ctx context.Context) {
 		for {
 			select {
+			case id := <-descDeleteCh:
+				// Try to refresh the lease to one >= this version.
+				log.VEventf(ctx, 2, "purging old version of delete descriptor %d",
+					id)
+				if err := purgeOldVersions(ctx, db, id, true /* dropped */, 0, m); err != nil {
+					log.Warningf(ctx, "error purging leases for descriptor %d: %s",
+						id, err)
+				}
 			case desc := <-descUpdateCh:
 				// NB: We allow nil descriptors to be sent to synchronize the updating of
 				// descriptors.
@@ -1117,7 +1126,9 @@ func (m *Manager) RefreshLeases(ctx context.Context, s *stop.Stopper, db *kv.DB)
 
 // watchForUpdates will watch a rangefeed on the system.descriptor table for
 // updates.
-func (m *Manager) watchForUpdates(ctx context.Context, descUpdateCh chan<- *descpb.Descriptor) {
+func (m *Manager) watchForUpdates(
+	ctx context.Context, descUpdateCh chan<- *descpb.Descriptor, deletedDescCh chan<- descpb.ID,
+) {
 	if log.V(1) {
 		log.Infof(ctx, "using rangefeeds for lease manager updates")
 	}
@@ -1130,6 +1141,11 @@ func (m *Manager) watchForUpdates(ctx context.Context, descUpdateCh chan<- *desc
 		ctx context.Context, ev *roachpb.RangeFeedValue,
 	) {
 		if len(ev.Value.RawBytes) == 0 {
+			id := 0 // decode ID from key
+			select {
+			case <-ctx.Done():
+			case deletedDescCh <- id:
+			}
 			return
 		}
 		b, err := descbuilder.FromSerializedValue(&ev.Value)

stevendanna · 2022-10-04T20:50:11Z

I think that solution would probably also resolve at least the main symptom of #89079, but there it still seems wrong to not put the descriptors offline first.

ajwerner · 2022-10-04T21:29:24Z

Was the descriptor ever online? I think part of what's going on here is that we used to not lease descriptors in the ADDING state or the OFFLINE state, but that ran into trouble (#61798). Now that we are leasing them, we run into this trouble. I'm increasingly thinking it ought to be fine to just delete an OFFLINE descriptor.

stevendanna · 2022-10-04T22:32:23Z

Was the descriptor ever online?

Unfortunately, I think technically it can be. Because this happens in OnFailOrCancel, we might be responding to a failure that happened after we published the descriptors.

ajwerner · 2022-10-04T22:38:29Z

In that case, I'd prefer we go to dropped if we can muster the will to pull that off

Full cluster restore drops the default DB. The test driver cache connections that may have originally connected to a database that is now dropped. This causes problems for queries issued after the full cluster restore. Here, (1) I change the query we use to get job IDs to one that doesn't depend on doing any search path lookups and (2) reset all of our connections after the first restore we do. See also cockroachdb#88913 Fixes cockroachdb#119079 Release note: None

119342: backupccl: deflake TestDataDriven_restore_on_fail_or_cancel_retry r=msbutler a=stevendanna Full cluster restore drops the default DB. The test driver cache connections that may have originally connected to a database that is now dropped. This causes problems for queries issued after the full cluster restore. Here, (1) I change the query we use to get job IDs to one that doesn't depend on doing any search path lookups and (2) reset all of our connections after the first restore we do. See also #88913 Fixes #119079 Release note: None Co-authored-by: Steven Danna <[email protected]>

Full cluster restore drops the default DB. The test driver cache connections that may have originally connected to a database that is now dropped. This causes problems for queries issued after the full cluster restore. Here, (1) I change the query we use to get job IDs to one that doesn't depend on doing any search path lookups and (2) reset all of our connections after the first restore we do. See also #88913 Fixes #119079 Release note: None

stevendanna added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-schema-deprecated Use T-sql-foundations instead labels Sep 28, 2022

stevendanna added the T-disaster-recovery label Sep 28, 2022

blathers-crl bot added the A-disaster-recovery label Sep 28, 2022

stevendanna changed the title ~~sql,backupccl: failed to WaitForNoVersion on deleted description during restore cancellation~~ sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation Sep 28, 2022

postamar added the E-easy Easy issue to tackle, requires little or no CockroachDB experience label Oct 4, 2022

postamar added the A-schema-catalog Related to the schema descriptors collection and the catalog API in general. label Nov 10, 2022

exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-disaster-recovery T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023

stevendanna mentioned this issue Feb 17, 2024

backupccl: deflake TestDataDriven_restore_on_fail_or_cancel_retry #119342

Merged

blathers-crl bot mentioned this issue Mar 11, 2024

release-23.2: backupccl: deflake TestDataDriven_restore_on_fail_or_cancel_retry #120202

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation #88913

sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation #88913

stevendanna commented Sep 28, 2022 •

edited by cockroach-jira-scripts

Loading

blathers-crl bot commented Sep 28, 2022

ajwerner commented Oct 4, 2022

ajwerner commented Oct 4, 2022

stevendanna commented Oct 4, 2022 •

edited

Loading

ajwerner commented Oct 4, 2022

stevendanna commented Oct 4, 2022

ajwerner commented Oct 4, 2022

sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation #88913

sql,backupccl: failed to WaitForNoVersion on deleted descriptor during restore cancellation #88913

Comments

stevendanna commented Sep 28, 2022 • edited by cockroach-jira-scripts Loading

blathers-crl bot commented Sep 28, 2022

ajwerner commented Oct 4, 2022

ajwerner commented Oct 4, 2022

stevendanna commented Oct 4, 2022 • edited Loading

ajwerner commented Oct 4, 2022

stevendanna commented Oct 4, 2022

ajwerner commented Oct 4, 2022

stevendanna commented Sep 28, 2022 •

edited by cockroach-jira-scripts

Loading

stevendanna commented Oct 4, 2022 •

edited

Loading