Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pin_user_pages_fast instead of get_user_pages_fast in create_existing_sysmem(drivers/hv/dxgkrnl/dxgvmbus.c) #11095

Open
1 of 2 tasks
langyuxf opened this issue Jan 30, 2024 · 16 comments
Assignees
Labels
GPU kernel WSL kernel

Comments

@langyuxf
Copy link

langyuxf commented Jan 30, 2024

Windows Version

10.0.22631.3085

WSL Version

2.0.9.0

Are you using WSL 1 or WSL 2?

  • WSL 2
  • WSL 1

Kernel Version

5.15.133.1-1

Distro Version

Ubuntu 22.04

Other Software

No response

Repro Steps

1, use mmap to allocate a PRIVATE + ANONYMOUS CPU memory
2, use dxgkio_create_allocation and the CPU memory to create a GPU allocation
3, fork a child process, then write to this memory from CPU side in parent process

Expected Behavior

The backing store physical pages of CPU VA in parent process should be still mapped in GPU page table.

Actual Behavior

Due to fork and write from parent process, copy-on-write is triggered, new physical pages are allocated for parent process
which are not mapped in GPU page table.

Diagnostic Logs

No response

@ghost ghost assigned iourit Jan 30, 2024
@ghost ghost added kernel WSL kernel and removed kernel WSL kernel labels Jan 30, 2024
@ghost ghost unassigned iourit Jan 30, 2024
@ghost
Copy link

ghost commented Jan 30, 2024

Word from the source is that this is unsupported by design. (Forking a process using a vGPU that is)

@ghost ghost closed this as completed Jan 30, 2024
@ghost ghost added the feature label Jan 30, 2024
@langyuxf
Copy link
Author

Can you elaborate on this?
What is unsupported?

@langyuxf
Copy link
Author

langyuxf commented Feb 1, 2024

Hi @pmartincic ,

I'm not forking a process(child process) using a vGPU, I mean the parent process should still be able to use the vGPU.
But now the parent process can't use vGPU when using get_user_pages_*.

Regards,
Lang

@ghost
Copy link

ghost commented Feb 1, 2024

Ahhh, I'm sorry I misread what you wrote. That's my fault. I'll start the conversation internally again.

@ghost ghost reopened this Feb 1, 2024
@ghost ghost removed the feature label Feb 1, 2024
@langyuxf
Copy link
Author

langyuxf commented Feb 3, 2024

Hi @pmartincic,

Any updates?

Thanks,
Lang

@ghost
Copy link

ghost commented Feb 6, 2024

Do you have a sample/repro/minimal app/code sample you can post? I don't work with/on these features normally, is a little out of my wheelhouse and I want to perform a sanity check before I pass it along.

(Edit: they agree the parent process should still work)

@ghost ghost self-assigned this Feb 6, 2024
@langyuxf
Copy link
Author

langyuxf commented Feb 6, 2024

It's hard to put all sample codes here to reproduce the issue.

What do you want to check?

@langyuxf
Copy link
Author

langyuxf commented Feb 8, 2024

Hi @pmartincic,

Have you guys seen these docs?

https://github.com/torvalds/linux/blob/master/Documentation/core-api/pin_user_pages.rst#page-maybe-dma-pinned-the-whole-point-of-pinning
https://github.com/torvalds/linux/blob/master/Documentation/core-api/pin_user_pages.rst#another-way-of-thinking-about-foll-get-foll-pin-and-foll-longterm
https://github.com/torvalds/linux/blob/master/include/linux/mm.h#L1967

Especially, the following

/*
 * This should most likely only be called during fork() to see whether we
 * should break the cow immediately for an anon page on the src mm.
 *
 * The caller has to hold the PT lock and the vma->vm_mm->->write_protect_seq.
 */

static inline bool folio_needs_cow_for_dma(struct vm_area_struct *vma,
					  struct folio *folio)
{
	VM_BUG_ON(!(raw_read_seqcount(&vma->vm_mm->write_protect_seq) & 1));

	if (!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags))
		return false;

	return folio_maybe_dma_pinned(folio);
}

Regards,
Lang

@ghost
Copy link

ghost commented Feb 8, 2024

Trust but verify.

I don't work on this area of the product. My plate is pretty full. I don't have time to write a repro. Somebody who does know this area is willing to look at it. I have a limited amount of political capital. This person helping owes me nothing and it's not their responsibility to do what I ask. I will pass this bug along to them if you give a repro that I can sanity check.

@langyuxf
Copy link
Author

langyuxf commented Feb 8, 2024

I'm working on some closed source work on AMD hardware and It's hard for me to expose the codes.

The bug is actually clear.

@ghost
Copy link

ghost commented Feb 8, 2024

You're missing the point. I'm probably not going to dig into this. The person I'm talking to who has the capacity to, would like a sample repro.

For their sake please make one.

@langyuxf
Copy link
Author

langyuxf commented Feb 8, 2024

Thanks,I got you point.

But it's inconvenient for me to expose a sample to reproduce which includes some closed source parts.

Can I talk with that person?If they have some questions.

@langyuxf
Copy link
Author

langyuxf commented Feb 9, 2024

I create a repro demo and hide hardware driver specific details. You guys must use AMD gfx9 GPU and above.

https://github.com/langyuxf/dxgkrnl_linux.git

Thanks
Lang

@ghost ghost added GPU kernel WSL kernel labels Feb 12, 2024
@ghost ghost assigned iourit Feb 13, 2024
@langyuxf
Copy link
Author

langyuxf commented Mar 1, 2024

Hi @pmartincic @iourit,

We observed similar issues under memory pressure. Pages acquired via get_user_pages_fast() are reclaimed by linux memory management system while GPU page tables are not updated accordingly. Then the GPU still operates on the original pages.

Have you guys reproduced the issue? Thanks.

Regards,
Lang

@benhillis
Copy link
Member

This should be fixed with https://github.com/microsoft/WSL/releases/tag/2.2.3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GPU kernel WSL kernel
Projects
None yet
Development

No branches or pull requests

3 participants