Fix and Test case for #27924 #1059

Merged
merged 5 commits into dotnet:master from Fix27924
Jan 10, 2020

Conversation

CarolEidt
Contributor

@CarolEidt CarolEidt commented Dec 19, 2019

@jkotas jkotas added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 19, 2019
@@ -954,7 +954,7 @@ void CodeGen::genCodeForBinary(GenTreeOp* treeNode)
         // reg3 = reg3 op reg2
         else
         {
-            inst_RV_RV(ins_Copy(targetType), targetReg, op1reg, targetType);
+            inst_RV_RV(ins_Copy(targetType), targetReg, op1reg, op1->TypeGet());
Contributor

Hmm, should gcMarkRegPtrVal below also use op1->TypeGet()?

Contributor Author

Interesting - it probably should. It gets set to a byref during emit, which fixes the GC info. I confess that I'm not sure how regSet.m_rsGCInfo.gcRegByrefSetCur is used during the codegen phase, but I presume that, since it's maintained, it must be depended on. I'll go ahead and change that and dig a little deeper in the meantime.

Contributor Author

AFAICT, the codegen-time gc info is used at labels and calls, and hence, since this is a transitory byref, it wouldn't show up as being wrong in this case. That said, I believe it should be maintained correctly.
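
The point of this thread can be illustrated with a toy model. This is a minimal sketch, not the JIT's real data structures: `ToyType`, `ToyGcTracker`, `copyReg`, and `copyKeepsByrefTracked` are hypothetical names, and the only behavior modeled is that a register copy is recorded with whatever type the caller passes. It assumes a subtract whose source operand is a byref and whose result is a plain integer, so recording the copy with the result type leaves the destination register untracked while it still holds the byref.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for JIT concepts; not the real coreclr types.
enum class ToyType { Long, Ref, Byref };

struct ToyGcTracker {
    std::uint32_t gcrefRegs = 0;  // bit i set: register i holds an object reference
    std::uint32_t byrefRegs = 0;  // bit i set: register i holds an interior pointer

    // Record what kind of value a register now holds.
    void mark(unsigned reg, ToyType type) {
        assert(reg < 32);
        const std::uint32_t bit = std::uint32_t{1} << reg;
        gcrefRegs &= ~bit;
        byrefRegs &= ~bit;
        if (type == ToyType::Ref)   gcrefRegs |= bit;
        if (type == ToyType::Byref) byrefRegs |= bit;
    }

    // True if the register is currently reported as holding a GC pointer.
    bool isTracked(unsigned reg) const {
        assert(reg < 32);
        const std::uint32_t bit = std::uint32_t{1} << reg;
        return ((gcrefRegs | byrefRegs) & bit) != 0;
    }
};

// Models "dst = src" where the copy is recorded with the supplied type.
inline void copyReg(ToyGcTracker& gc, unsigned dst, ToyType recordedType) {
    gc.mark(dst, recordedType);
}

// Returns true if the copied byref is still visible to the toy tracker.
inline bool copyKeepsByrefTracked(bool useSourceType) {
    ToyGcTracker gc;
    const unsigned op1Reg = 1, targetReg = 3;
    const ToyType op1Type = ToyType::Byref;    // source operand: interior pointer
    const ToyType targetType = ToyType::Long;  // assumed result type: plain integer
    gc.mark(op1Reg, op1Type);
    copyReg(gc, targetReg, useSourceType ? op1Type : targetType);
    return gc.isTracked(targetReg);
}
```

In this model `copyKeepsByrefTracked(true)` mirrors the fixed line (copy recorded with the source type) and `copyKeepsByrefTracked(false)` mirrors the original line (copy recorded with the target type).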

@CarolEidt
Contributor Author

@dotnet/dnceng - I am getting package failures even on retry:

root/runtime/src/installer/pkg/projects/netcoreapp/sfx/Microsoft.NETCore.App.SharedFx.sfxproj(0,0): error NU1102: Unable to find package Microsoft.NETCore.Platforms with version (>= 5.0.0-ci.19620.1)

  • Found 1 version(s) in /root/runtime/artifacts/packages/Release/Shipping/ [ Nearest version: 5.0.0-ci.19619.1 ]
  • Found 0 version(s) in /root/runtime/artifacts/packages/Release/NonShipping/

@CarolEidt CarolEidt changed the title Test case for #27924 Fix and Test case for #27924 Dec 20, 2019
@CarolEidt
Contributor Author

PTAL @dotnet/jit-contrib
cc @jkotas
The test failed in the second commit. The 3rd and 4th commits have the fix.
Once I get a clean CI run I'll change the priority of the test to 1. It has GC stress set so it takes a few seconds to run.

@mmitche
Member

mmitche commented Dec 20, 2019

> @dotnet/dnceng - I am getting package failures even on retry:
>
> root/runtime/src/installer/pkg/projects/netcoreapp/sfx/Microsoft.NETCore.App.SharedFx.sfxproj(0,0): error NU1102: Unable to find package Microsoft.NETCore.Platforms with version (>= 5.0.0-ci.19620.1)
>
>   • Found 1 version(s) in /root/runtime/artifacts/packages/Release/Shipping/ [ Nearest version: 5.0.0-ci.19619.1 ]
>   • Found 0 version(s) in /root/runtime/artifacts/packages/Release/NonShipping/

That looks like a build issue. @ViktorHofer @dagood I thought the live-live build was working now? That looks like it's referencing packages from yesterday?


static void Work()
{
for (uint i = 0; i < 1000000; i++) {
Member

Nit: { placement is inconsistent

Member

@jkotas jkotas left a comment

Thanks!

@dagood
Member

dagood commented Dec 20, 2019

> That looks like a build issue. @ViktorHofer @dagood I thought the live-live build was working now? That looks like it's referencing packages from yesterday?

#839 documents that you need to do a fresh build whenever you cross a day boundary. Looks like it applies to CI as well.

I believe this is a combination of:

@dagood
Member

dagood commented Dec 20, 2019

It looks like the original error was also caused by UTC day tickover. Here's the timestamp on the first line of a failing Installer build step in attempt 1:

2019-12-20T00:03:04.6739571Z

Core-Setup stopped having this problem once it moved off BuildTools onto Arcade. Now we've got it again because of global dotnet/runtime settings.

Right now, any build where Libraries => Installer spans a UTC day should fail this way. Adding it to the CI problem tracking issue #702.
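
The version mismatch in the error (19619 vs. 19620) is consistent with a date-derived build stamp that changes at UTC midnight. The following is a minimal sketch under an assumption that is not stated in this thread: that the stamp is encoded as `yy * 1000 + mm * 50 + dd`. The function names `shortDateStamp` and `stagesAgree` are hypothetical.

```cpp
#include <cassert>

// Assumed encoding (not stated in the thread): yy * 1000 + mm * 50 + dd.
inline int shortDateStamp(int year, int month, int day) {
    assert(month >= 1 && month <= 12);
    assert(day >= 1 && day <= 31);
    return (year % 100) * 1000 + month * 50 + day;
}

// True when two build stages computed the same date-based stamp, i.e. the
// later stage would look for packages with the version the earlier one produced.
inline bool stagesAgree(int producerStamp, int consumerStamp) {
    return producerStamp == consumerStamp;
}
```

Under that assumption, a Libraries step that ran on 2019-12-19 UTC produces stamp 19619, while an Installer step starting at 2019-12-20T00:03:04Z computes 19620 and so asks for a package version that was never built.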

Contributor

@jashook jashook left a comment

Approved but meant to comment.

Having an innerloop test that sets GCStress 0xC seems incorrect. We know that GC stress has a certain level of unreliability that does not fit our innerloop test bar.

@CarolEidt
Contributor Author

@dotnet/dnceng - I re-pushed (with a change to address PR feedback), but I'm still getting some inscrutable failures - e.g. https://helix.dot.net/api/2019-06-17/jobs/fda888e8-b6d4-40fc-ba3f-7fb296c47338/workitems/JIT.jit64.mcc/console simply ends with:

2019-12-20T15:52:00.450Z ERROR xunit-reporter.py xunit-reporter(80) main Unable to report xunit results: no test results xml file found.

Any suggestions for how to move this forward?

@dagood
Member

dagood commented Dec 20, 2019

I think that's #1097; @trylek is going to look at that. The log line to focus on seems to be set _commandExitCode=-1073741701 (0xC000007B, STATUS_INVALID_IMAGE_FORMAT). It's happening on other PRs too.
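
The negative exit code in that log line is the signed 32-bit view of the status value. A small sketch of the conversion, with hypothetical helper names `asNtStatus` and `toHex`:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Reinterpret a signed 32-bit process exit code as an unsigned status value.
// Conversion to an unsigned type is modular, so this is well defined.
inline std::uint32_t asNtStatus(std::int32_t exitCode) {
    return static_cast<std::uint32_t>(exitCode);
}

// Format a 32-bit value as 0x followed by eight uppercase hex digits.
inline std::string toHex(std::uint32_t value) {
    char buf[11];  // "0x" + 8 digits + terminating NUL
    const int written =
        std::snprintf(buf, sizeof buf, "0x%08X", static_cast<unsigned>(value));
    assert(written == 10);
    (void)written;  // avoid an unused-variable warning when NDEBUG is defined
    return std::string(buf);
}
```

For example, `toHex(asNtStatus(-1073741701))` yields `0xC000007B`, the value quoted in the comment.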

@CarolEidt CarolEidt closed this Dec 26, 2019
@CarolEidt CarolEidt reopened this Dec 26, 2019
@BruceForstall
Member

@CarolEidt The remaining CI failures are known Windows arm failures, so is this ready to merge?

@CarolEidt
Contributor Author

> is this ready to merge?

Since this adds a new test, I was hoping to get arm testing on this, but perhaps at this point I should just build and test on arm myself.

@CarolEidt CarolEidt merged commit 685406a into dotnet:master Jan 10, 2020
@CarolEidt CarolEidt deleted the Fix27924 branch January 10, 2020 17:33
CarolEidt added a commit to CarolEidt/coreclr that referenced this pull request Jan 10, 2020
This is the fix for #27924, a GC hole bug that was found externally (#27590).
The cause: when the JIT needed to copy the source operand of a subtract, it used
the subtract's target type, but the copy must use the source operand's type.

## Customer Impact
Corruption of state that is non-deterministic and hard to track down.

## Regression?
Not a recent regression, but exposed by Unsafe.ByteOffset.

## Testing
The fix has been verified in the runtime repo.

## Risk
Low: The fix is straightforward and only impacts 3 lines of code.
Anipik pushed a commit to dotnet/coreclr that referenced this pull request Feb 13, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020