Handle runtime table gaps on code block deletion #10605

mmisol · 2020-04-27T13:37:47Z

Purpose

When the runtime table are built there is an implicit assumption that
the code block ids are consecutive. However, that is not always the
case, as the deletion of a procedure causes the deletion of its child
code blocks, which may generate gaps in the id numbering.

In order to make the code resiliant to these gaps, the runtime tables
are sized based on the largest code block id, rather than in the amount
of code blocks.

Declarations

Check these if you believe they are true

The codebase is in a better state after this PR
Is documented according to the standards
The level of testing this PR includes is appropriate
User facing strings, if any, are extracted into *.resx files
All tests pass using the self-service CI.
Snapshot of UI changes, if any.
Changes to the API follow Semantic Versioning and are documented in the API Changes document.

Reviewers

@aparajit-pratap @mjkkirschner

FYIs

@QilongTang

When the runtime table are built there is an implicit assumption that the code block ids are consecutive. However, that is not always the case, as the deletion of a procedure causes the deletion of its child code blocks, which may generate gaps in the id numbering. In order to make the code resiliant to these gaps, the runtime tables are sized based on the largest code block id, rather than in the amount of code blocks.

mmisol · 2020-04-27T13:38:54Z

Making this a WIP since I still need to add a test, but please take a look.

aparajit-pratap · 2020-04-27T14:01:18Z

@mmisol so is the reason for the crash the deletion of a DS function? Could you add some details to explain the reason for the crash?

mjkkirschner · 2020-04-27T16:28:25Z

src/Engine/ProtoCore/Core.cs

+        /// </summary>
+        private int GetRuntimeTableSize()
+        {
+            // Due to the way this list is constructed, the largest id is the one of the las block.


src/Engine/ProtoCore/Core.cs

QilongTang · 2020-04-27T16:56:10Z

Is this PR still WIP? I guess for unit tests?

mmisol · 2020-04-27T16:56:36Z

@aparajit-pratap Yes, the relevant related code is in SetFunctionInactive after the comment that says // Remove codeblock defined in procNode from CodeBlockList and CompleteCodeBlockList. That's where the gap is created.

aparajit-pratap · 2020-04-27T17:33:28Z

What about line

Dynamo/src/Engine/ProtoCore/Core.cs

Line 383 in c00c80b

RuntimeTableIndex = CodeBlockIndex;

Is RuntimeTableIndex being set correctly here?

mmisol · 2020-04-27T17:40:54Z

@aparajit-pratap I would say so. As you probably know, we have exactly 1 top-level scope, so in practice this is setting the RuntimeTableIndex to 1.

mmisol · 2020-04-27T19:54:44Z

@QilongTang I added the test and removed the WIP tag just now. Please take a look

aparajit-pratap

@mmisol I think this may need more investigation. I tried your fix locally and although it didn't crash with the steps outlined in the bug, I could reproduce the crash dialogs (all stacked up as seen by Josh) with undo-redo. I think the solution we have to mark functions inactive and add new functions is brittle. The RunTimeTableIndex is just one aspect of it.

aparajit-pratap · 2020-04-28T13:51:46Z

src/Engine/ProtoCore/Core.cs

+            {
+                lastBlock = lastBlock.children.Last();
+            }
+            return lastBlock.codeBlockId + 1;


If the assertion that the last codeblock always has the largest id is true, then this function needs to be recursive.

I have wondered about that too, but if you take a look at the part of the code I mention here #10605 (comment), you'll see that the code doesn't do recursion, it also looks only at the first level of children. Maybe these code blocks can contain only one level of children in practice?

The BfsBuildSequenceTable method is recursive.

@aparajit-pratap Yes, but it's using CodeBlockList which is actually a tree. Both this method and the one I pointed to in my previous comment use CompleteCodeBlockList which contains the same nodes but is flattened, to one level of children it would seem.

There should be a codeblock for each new block (language block such as [Imperative {...}], if-else, while, for blocks etc., and function definition) so there can be any number of nesting or children of child codeblocks and so on.

Then how can it be that the children can have the largest id? The largest should always be at the top level, i.e. CompleteCodeBlockList?

I don't know if I understood what you meant. What I was saying is that the List.Add method adds to the end of the list, and since we always get a bigger id each time, by construction the biggest id should be at the end of the flattened list. Does that make sense?

Yes, I understood that. If we're adding to the end of the CompleteCodeBlockList, we simply need to check the size of that list or the codeBlockId of the last item in CompleteCodeBlockList. Why are you checking for children of the last item in the CompleteCodeBlockList?

we simply need to check the size of that list or the codeBlockId of the last item in CompleteCodeBlockList

If we go back to the purpose of this PR, those two do not always match, so it's not the same to go with one or the other. Just wanted to make that clear.

Why are you checking for children of the last item in the CompleteCodeBlockList?

By way of undo/redo in Josh's graph I was able to see a crash where the last code block contained children with larger ids than the id of its parent or the amount of elements in the list. It does seem strange that the children themselves were not part of the list, so it could be a byproduct of the undo/redo bug.

I'm not so sure I understand the changes, to be honest. If you feel confident you are welcome to get this approved by someone else on the team. I'll do some investigation on my own to try to convince myself of your changes :)

mmisol · 2020-04-28T13:53:55Z

@aparajit-pratap Thanks for the find! I was indeed able to reproduce it, but only using Josh's design. The problem arises when the last code block contains children. In that case the last block id should come from the last child. I pushed a fix for this and it no longer crashes. Please give it a try.

By the way, I do agree that the problem deserves more investigation, but I also think that if we can prevent the crash with this, it provides short term value.

mmisol · 2020-04-28T16:22:22Z

After further testing I was able to make it fail again, also using Josh's design by way of undo/redo several times.

I think there is a deeper problem going on with undo/redo. Taking a look at the CodeBlockList I see there are two code blocks with the same id, located in different parts of the tree. I'd guess these ids are meant to be unique. Here is a screen capture showing this:

mmisol · 2020-04-28T18:26:45Z

My last push fixed that issue. I still have one more crash to tackle:

An item with the same key has already been added.

   at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
   at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at ProtoScript.Runners.ChangeSetComputer.GetDeltaAstListAdded(IEnumerable`1 addedSubTrees) in C:\Users\t_misom\dev\github\Dynamo\src\Engine\ProtoScript\Runners\LiveRunner.cs:line 493
   at ProtoScript.Runners.ChangeSetComputer.GetDeltaASTList(GraphSyncData syncData) in C:\Users\t_misom\dev\github\Dynamo\src\Engine\ProtoScript\Runners\LiveRunner.cs:line 753
   at ProtoScript.Runners.LiveRunner.SynchronizeInternal(GraphSyncData syncData) in C:\Users\t_misom\dev\github\Dynamo\src\Engine\ProtoScript\Runners\LiveRunner.cs:line 1470
   at ProtoScript.Runners.LiveRunner.UpdateGraph(GraphSyncData syncData) in C:\Users\t_misom\dev\github\Dynamo\src\Engine\ProtoScript\Runners\LiveRunner.cs:line 1240
   at Dynamo.Engine.LiveRunnerServices.UpdateGraph(GraphSyncData graphData, Boolean verboseLogging) in C:\Users\t_misom\dev\github\Dynamo\src\DynamoCore\Engine\LiveRunnerServices.cs:line 97
   at Dynamo.Engine.EngineController.UpdateGraphImmediate(GraphSyncData graphSyncData) in C:\Users\t_misom\dev\github\Dynamo\src\DynamoCore\Engine\EngineController.cs:line 462
   at Dynamo.Scheduler.UpdateGraphAsyncTask.HandleTaskExecutionCore() in C:\Users\t_misom\dev\github\Dynamo\src\DynamoCore\Scheduler\UpdateGraphAsyncTask.cs:line 106
   at Dynamo.Scheduler.AsyncTask.Execute() in C:\Users\t_misom\dev\github\Dynamo\src\DynamoCore\Scheduler\AsyncTask.cs:line 189

mmisol · 2020-04-28T19:07:32Z

I managed to see what caused that last issue, but I don't think it's related to the imperative block per se, but more of a general problem with undo/redo.

The source is a discrepancy between the incoming syncData and the currentSubTreeList in the changeSetComputer. An AST node that already exists is trying to be added, which results in the above error. The node is a DSCore.List.UniqueItems@var[]..[]. The code that is coming in syncData is actually the same that is already in the changeSetComputer:

{var_291a044825354a55a49807d44bdb1494 = 
  __CreateFunctionObject(DSCore.List.UniqueItems, 1, [], [null], true);}

Do you think we should tackle this a separate issue?

mmisol · 2020-04-28T20:22:40Z

Adding to my previous comment, I was able to reproduce that issue with a totally unrelated graph.
Do you know if we have seen that before @mjkkirschner ?

src/Engine/ProtoCore/Core.cs

aparajit-pratap · 2020-04-28T20:49:35Z

@mmisol I think we can file the last issue you saw with the delta computer separately but could you describe what steps you followed to produce the last crash?

mmisol · 2020-04-28T20:52:39Z

@aparajit-pratap The last crash was the changeSetComputer one. The previous to that was also reproduced by undoing and redoing with Josh's graph about three times, but without going too fast or you hit the changeSetComputer one.

QilongTang · 2020-04-29T19:33:42Z

src/Engine/ProtoCore/Core.cs

+            // If the last block has children, then its last child has the largest id.
+            if (lastBlock.children.Count > 0)
+            {
+                lastBlock = lastBlock.children.Last();


I guess I had similar question here why we are checking children of last block but other than this, everything else looks fine to me..

I logged a ticket for the undo/redo bug. Maybe we can revisit this after we have fixed that? I feel they may be related.

I could also place a comment here mentioning that.

Thanks, would be good if you do that

* add namespace conflict tests for short name replacer and node to code (#10611) * Register custom node before package load reset (#10591) This fixes a problem where existing custom nodes in the home workspace became unresolved after a package that contained binaries was loaded. The cause of the problem was that the compiled function was not available at the time of the execution after a VM reset. Now the data is registered on package load, by queueing it in pendingCustomNodeSyncData. This results in a CompileCustomNodeAsyncTask being scheduled before the update of the home workspace graph takes place. * Increase coverage of the Core folder (#10609) * Add a test for the CrashPromptArgs class * Update DynamoCoreTests.csproj * Added NotificationObject tests * Updated the test, NotificationObject coverage at 100% * Changed some access modifiers * Added coverage for Updates/BinaryVersion * Added comments and fixed names * Revert "Added comments and fixed names" This reverts commit 42cd024. * Revert "Added coverage for Updates/BinaryVersion" This reverts commit 679deca. * Increased coverage in CrashPromptArgs * Added some Core coverage * DYN-2560 - Increase the code coverage: DynamoCore/Models Folder First Part (#10612) * DYN-2560 - Increase the code coverage: DynamoCore/Models Folder I started adding just one test method TestOnRequestDispatcherBeginInvoke() for testing the DynamoModelEvents class. * DYN-2560 - Adding test cases for DynamoModelEvents Adding test cases for DynamoModelEvents * DYN-2560 - Adding test cases for DynamoModelEvents I added all the test cases for the all events in the DynamoModelEvent class, i just need to fix the last 6 of them. * DYN-2560 - Adding test cases for DynamoModelEventsArgs I added several test cases for the classes inside the DynamoModelEventsArgs.cs file. ZoomEventArgs TaskDialogEventArgs EvaluationCompletedEventArgs DynamoModelUpdateArgs FunctionNamePromptEventArgs PresetsNamePromptEventArgs ViewOperationEventArgs PointEventArgs WorkspaceEventArgs ModelEventArgs * DYN-2560 - Code Review Comments Based in the comment done by Aaron in the GitHub pull request, I added more description comments for the method TestTaskDialogEventArgs() and also I added comments for a local function * DYN-2560 - Code Review Comments 2 There was a spelling error in two methods for the word "Internally", then I fixed this error in the two places. * Python Engine Enum (#10618) * Cherrypick * Comments * Add unit test * Comments * Handle runtime table gaps on code block deletion (#10605) When the runtime table are built there is an implicit assumption that the code block ids are consecutive. However, that is not always the case, as the deletion of a procedure causes the deletion of its child code blocks, which may generate gaps in the id numbering. In order to make the code resiliant to these gaps, the runtime tables are sized based on the largest code block id, rather than in the amount of code blocks. * Validate ASM installations before loading (#10621) The check for the specific assemblies tbb.dll and tbbmalloc.dll is generalized to a full file list validation of detected ASM locations. This way, Dynamo is guarded against any incomplete/unusual ASM binary folders that other applications might include. The lists of files for each version were taken from LibG. They cannot be reused from LibG without involving major changes in the preloader, so the lists should be kept in sync as new major release of ASM occur. * SQ bug fix (#10622) * (1) Null reference bug fix from SQ dashboard * Add support for debug modes (#10603) * Add support for debug modes * Increase coverage of the Updates folder (#10628) * Add a test for the CrashPromptArgs class * Update DynamoCoreTests.csproj * Added NotificationObject tests * Updated the test, NotificationObject coverage at 100% * Changed some access modifiers * Added coverage for Updates/BinaryVersion * Added comments and fixed names * Added asserts for happy path, changed namespace and deleted a console function * Removed using * Revert "Removed using" This reverts commit 022823d. * Removed a change that should not be there * Added coverage for Updates folder * Tests for CPython3 Engine as well as for IronPython. * [WIP][FEEDBACK] Reduce test time by substantially reducing number of serialization tests. (#10624) * reduce number of serialization tests by factor 3~ * reduce wpf json serialization tests * Handle missing instance calling method statically (#10630) A code block node calling an instance method in its static form would make Dynamo crash if the instance was not provided. An example of this would be calling 'Curve.Patch();'. The cause of the issue was that the default argument was ultimately tried to be interpreted as a pointer. By avoiding that wrong conversion the engine is now able to surface the real problem as a human-readable warning. * Python3 Selection Under Debug Modes (#10629) * Cherrypick Python3 changes * Cleanup * Use Debug Modes * CleanUp * Rename Function * Clean Up * Do not use anouymous function as handler * Revert newer language change * Add adp analytics to Dynamo (#10576) * add ADPTracker register * Fix non-array item search for dot operation (#10633) When an array is passed to the dot operation, it needs to get an item to determine the actual class being processed. This was done in ArrayUtils.GetFirstNonArrayStackValue, but the function only checked the first item of the array and its descendants. This caused the dot operation to failed when passed an array that contained an empty array as its first item. In order to fix the problem, the function was changed to check for non-array items in the entire array. The function should stop as soon as the first non-array item is found. * Addressing some comments * some more comments * Update AssemblyInfo.cs * Update AssemblyInfo.cs * Marking the python node as modified when its engine property is modified Also updated some tests. * Remove unwanted check. * changes to test * Update AssemblySharedInfo.cs * Using python engine's AcquireLock to avoid deadlock. The deadlock was happening only when multiple PythonEngine's are initialized on a thread from 2 different test fixtures in the same run. The main difference is calling this function PythonEngine.BeginAllowThreads(). Also we do not want to initialze the python engine if it is already initialized. Co-authored-by: aparajit-pratap <[email protected]> Co-authored-by: Martin Misol Monzo <[email protected]> Co-authored-by: Bruno Yorda <[email protected]> Co-authored-by: Roberto T <[email protected]> Co-authored-by: Aaron (Qilong) <[email protected]> Co-authored-by: Ashish Aggarwal <[email protected]> Co-authored-by: Tibi <[email protected]> Co-authored-by: Michael Kirschner <[email protected]>

mmisol added 3 commits April 27, 2020 09:32

Remove commented out code

85f7232

Remove commented out code

c3bd236

mmisol added the WIP label Apr 27, 2020

mmisol requested review from aparajit-pratap and mjkkirschner April 27, 2020 13:38

mjkkirschner reviewed Apr 27, 2020

View reviewed changes

src/Engine/ProtoCore/Core.cs Outdated Show resolved Hide resolved

Fix typo and improve comment

87b0284

Add test

7a45ab7

mmisol removed the WIP label Apr 27, 2020

aparajit-pratap reviewed Apr 28, 2020

View reviewed changes

Last code block may have children

f6343bd

aparajit-pratap reviewed Apr 28, 2020

View reviewed changes

Fix code block id generation on create

e331114

aparajit-pratap reviewed Apr 28, 2020

View reviewed changes

src/Engine/ProtoCore/Core.cs Show resolved Hide resolved

Add comment for empty list case

84b701b

mmisol requested a review from a team April 29, 2020 16:56

QilongTang reviewed Apr 29, 2020

View reviewed changes

QilongTang approved these changes May 1, 2020

View reviewed changes

mmisol merged commit 81df8a9 into DynamoDS:master May 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle runtime table gaps on code block deletion #10605

Handle runtime table gaps on code block deletion #10605

mmisol commented Apr 27, 2020 •

edited

Loading

mmisol commented Apr 27, 2020

aparajit-pratap commented Apr 27, 2020

mjkkirschner Apr 27, 2020

QilongTang commented Apr 27, 2020

mmisol commented Apr 27, 2020

aparajit-pratap commented Apr 27, 2020

mmisol commented Apr 27, 2020 •

edited

Loading

mmisol commented Apr 27, 2020

aparajit-pratap left a comment

aparajit-pratap Apr 28, 2020

mmisol Apr 28, 2020 •

edited

Loading

aparajit-pratap Apr 28, 2020 •

edited

Loading

mmisol Apr 28, 2020

aparajit-pratap Apr 28, 2020 •

edited

Loading

aparajit-pratap Apr 28, 2020

mmisol Apr 28, 2020

aparajit-pratap Apr 29, 2020 •

edited

Loading

mmisol Apr 29, 2020

aparajit-pratap Apr 29, 2020

mmisol commented Apr 28, 2020

mmisol commented Apr 28, 2020

mmisol commented Apr 28, 2020

mmisol commented Apr 28, 2020 •

edited

Loading

mmisol commented Apr 28, 2020

aparajit-pratap commented Apr 28, 2020

mmisol commented Apr 28, 2020

QilongTang Apr 29, 2020

mmisol Apr 29, 2020

QilongTang Apr 29, 2020

Handle runtime table gaps on code block deletion #10605

Handle runtime table gaps on code block deletion #10605

Conversation

mmisol commented Apr 27, 2020 • edited Loading

Purpose

Declarations

Reviewers

FYIs

mmisol commented Apr 27, 2020

aparajit-pratap commented Apr 27, 2020

Choose a reason for hiding this comment

QilongTang commented Apr 27, 2020

mmisol commented Apr 27, 2020

aparajit-pratap commented Apr 27, 2020

mmisol commented Apr 27, 2020 • edited Loading

mmisol commented Apr 27, 2020

aparajit-pratap left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mmisol Apr 28, 2020 • edited Loading

Choose a reason for hiding this comment

aparajit-pratap Apr 28, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aparajit-pratap Apr 28, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aparajit-pratap Apr 29, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mmisol commented Apr 28, 2020

mmisol commented Apr 28, 2020

mmisol commented Apr 28, 2020

mmisol commented Apr 28, 2020 • edited Loading

mmisol commented Apr 28, 2020

aparajit-pratap commented Apr 28, 2020

mmisol commented Apr 28, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mmisol commented Apr 27, 2020 •

edited

Loading

mmisol commented Apr 27, 2020 •

edited

Loading

mmisol Apr 28, 2020 •

edited

Loading

aparajit-pratap Apr 28, 2020 •

edited

Loading

aparajit-pratap Apr 28, 2020 •

edited

Loading

aparajit-pratap Apr 29, 2020 •

edited

Loading

mmisol commented Apr 28, 2020 •

edited

Loading