Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZES_ENABLE_SYSMAN vs multiple layers using L0 #36

Open
bgoglin opened this issue Aug 21, 2020 · 35 comments
Open

ZES_ENABLE_SYSMAN vs multiple layers using L0 #36

bgoglin opened this issue Aug 21, 2020 · 35 comments

Comments

@bgoglin
Copy link
Contributor

bgoglin commented Aug 21, 2020

Hello

I am porting hwloc to L0 for exposing the locality of GPU devices. Things worked fine with API 0.91 but porting to 1.0 doesn't look good.

I am listing devices with zeDriverGet() and zeDeviceGet() and then calling zesDevicePciGetProperties() and zesDeviceGetProperties() after casting into zes_device_handle_t. As documented in 1.0 doc, this only works if ZES_ENABLE_SYSMAN=1 in the environment. This looks like a terrible idea.

  1. Requiring a third-party library to always modify the environment of a program isn't nice. It's ok when enabling some debugging in a library from time to time, but not when enabling a mandatory feature. I . Passing a flag to zeInit() would be better.

  2. This won't work if multiple layers use L0 at the same time: an application using L0 (most likely a runtime system managing tasks) will call zeInit() without ZES_ENABLE_SYSMAN (because it doesn't need it). It will then call hwloc to get the locality of that some ze devices (to select the local one, etc). hwloc will fail because sysman wasn't enabled in L0 (it will actually segfault in the current implementation). hwloc could set the env var and call its own zeInit() too but it's likely too late anyway. Given that all drivers/devices are shared between all these layers (which is good), the behavior of L0 will basically change depending on which layer calls zeInit() first, that's unpredictable and bad.

0.95 had zetSysmanGet() for converting between ze and zet devices. 1.0 should just do the same: have an explicit function to enable sysman and "cast" a ze device handle to zes. The env var of init flags won't work just like multiple MPI_Init() don't work in MPI.

@malikm
Copy link

malikm commented Sep 21, 2020

I fully agree with your comment and experience a similar issue.
Moreover, on Windows using the most recent drivers calling any zes function without having ZES_ENABLE_SYSMAN=1 set will result in a crash in the API due to a null-pointer use.

@rscohn2
Copy link
Member

rscohn2 commented Sep 21, 2020

@bgoglin I would like to try out your hwloc with L0 support when you are ready to share.

@bgoglin
Copy link
Contributor Author

bgoglin commented Sep 21, 2020

@rscohn2 The code is in the levelzero branch of the main repo https://github.com/open-mpi/hwloc/tree/levelzero
Tarballs are available at https://ci.inria.fr/hwloc/job/basic/job/levelzero/
It works on my laptop with a single integrated GPU (it creates a zeX object inside my GPU PCI device). Next step is to add information about links between GPUs.
Set C_INCLUDE_PATH=/usr/local/include/levelzero and LD_LIBRARY_PATH=/usr/local/lib/ and configure with --enable-levelzero --disable-netloc

@bmyates
Copy link
Contributor

bmyates commented Sep 21, 2020

@malikm I have a fix that may resolve the windows issue you described. Currently the loader is loading all drivers at startup. My fix moves the driver load into the zeInit path. This will allow a windows application to do a putenv before calling zeInit and have sysman load correctly. I will post my fix soon.

@rhc54
Copy link

rhc54 commented Dec 7, 2020

@bgoglin This has sparked considerable internal discussion over here. Please close this issue - I will discuss this with you offline.

@Kerilk
Copy link

Kerilk commented Dec 7, 2020

Before the issue is closed, I would like to state that I am following it with attention...

@rhc54
Copy link

rhc54 commented Dec 8, 2020

My apologies - I didn't mean to sound like there was some mysterious or sinister reason behind closing it. The bottom line is that this isn't going to be fixed, but a workaround has been identified. In brief, the architectural intent behind L0 assumed that application processes would not need L0/Sysman support. Sysman pulls in a lot of dependencies and the team wanted L0 to be lightweight.

Thus, the L0 design calls for apps to only use the bare minimum "core" L0 functions, and that only system-level entities would use the broader capabilities that require Sysman. This is not the case for HPC applications that want to see the device capabilities, but it is "baked in" to the L0 architecture.

The workaround is to have the local daemon's HWLOC instance initialize the full L0/Sysman support, and then include the device information in HWLOC's structures that it shares with the local procs (either via HWLOC's shmem support or by passing it down as XML). This gives the application full access to the GPU's capabilities without the app initializing Sysman.

It leaves open the question of an app wanting to access the dynamic information via L0 that also requires Sysman - e.g., current energy usage. For those interfaces, we will create PMIx definitions that will allow the app to obtain the info via the PMIx_Query interface. This interface relays the request to the local daemon, which can then use its Sysman-enabled L0 library to obtain the info and return it to the requestor. We currently do this with a variety of things, so this is a simple extension.

We will be working with @bgoglin and the PMIx community (which I lead anyway) to get these services into place. Meantime, the advice is for applications and libraries that access L0 to not initialize L0 with Sysman. It is only the first call to zeinit that matters, and I know that creates a bit of a race condition as to who makes that call - thus, it is probably good practice for someone to do it early (before initializing other libraries) to ensure it gets done correctly.

This also means that libraries should test to see if Sysman was loaded prior to assuming it is present - I would suggest a "dlsym" check to ensure the Sysman functions are available so you avoid the segfault if not.

@Kerilk
Copy link

Kerilk commented Dec 8, 2020

Thanks for the detailed answer.

The problem is not limited to the Sysman layer. Any tools in the table here:
https://spec.oneapi.com/level-zero/latest/tools/PROG.html
would be affected as well. How would those be dealt with?

This also means that libraries should test to see if Sysman was loaded prior to assuming it is present - I would suggest a "dlsym" check to ensure the Sysman functions are available so you avoid the segfault if not.

Given that the loader always export the symbols, I am pretty sure this will not work...

@rhc54
Copy link

rhc54 commented Dec 8, 2020

I should think the app would be safe to access those as they don't involve Sysman, but with the same problem regarding who init's the library first. If an app wants to use those features, I would strongly suggest they immediately call zeinit with the appropriate envar setting.

I'm not sure you are correct about the loader, but this is subject to investigation at this point. I was only trying to suggest other possible avenues for avoiding the segfault as it is unlikely to be fixed.

@bgoglin
Copy link
Contributor Author

bgoglin commented Dec 8, 2020

@Kerilk dlsym will indeed not work, both L0 and Sysman symbols are in the same libze_loader.so. A simple program calling zeDeviceGet() and passing it to zesDeviceGetProperties() will segfault unless ZES_ENABLE_SYSMAN=1. This function should return an error instead of segfaulting. That's likely 2 lines of code, I'll see if I can submit a PR. There is no way to know ZES_ENABLE_SYSMAN=1 was set during the first zeInit().

bgoglin added a commit to bgoglin/compute-runtime that referenced this issue Dec 8, 2020
bgoglin added a commit to bgoglin/compute-runtime that referenced this issue Dec 8, 2020
@bgoglin
Copy link
Contributor Author

bgoglin commented Dec 8, 2020

Patch above is totally untested since I don't know how to recompile L0 but it seems simple enough to me...

@eero-t
Copy link

eero-t commented Mar 10, 2021

Why this control is environment variable instead of e.g. an extra flag one can give for zeInit()?

(Extra flag should be backwards compatible change assuming L0 ignores unknown flag bits.)

@kevin-harms
Copy link

Complete agree with either zeInit() having a flag for initializing SYSMAN or zesInit() to initialize. Environment variables are intended to change the behavior of code at runtime. In this case, using SYSMAN code without setting ZE_ENABLE_SYSMAN results in code that doesn't work...

@eero-t
Copy link

eero-t commented Jun 16, 2021

Something like SDL does would be nice: https://wiki.libsdl.org/SDL_Init

@bgoglin
Copy link
Contributor Author

bgoglin commented Jan 29, 2022

Is there any progress on this? We've been using ugly hacks to define ZES_ENABLE_SYSMAN=1 as early as possible in our library. However every couple months, a new issue is discovered on our side because, as explained above, modifying environment variables within a library is a bad idea. We really need a correct solution on your side, either with zeInit() flags as explained above or anything without environment variable. Thanks.

@rhc54
Copy link

rhc54 commented Jan 31, 2022

Given that it has been nearly two years without Intel fixing this problem, I am inclined to think they never will no matter how many customers and 3rd-party developers complain about it. It therefore seems we have two options:

  1. just refuse to support Intel GPUs. Fortunately, they aren't that pervasive in the world, so this wouldn't enormously impact the user community. Seems a shame to do it, but if we cannot reliably support them, then this might be our best choice.
  2. fork the level-zero repository and fix it ourselves. Maybe call it "level-zero-plus"? Would require that we periodically sync with this repo, but that isn't an overwhelming burden. Negative would be causing some confusion out there, but if we explain why it is necessary, I suspect the packagers would accept the alternative releases.

Obviously, the best solution would be for Intel to fix the bloody library - but it seems that just isn't possible. Of the above options, I'd vote for the first one and simply generate a big warning stating why we aren't supporting Intel GPUs, pointing the user to this ticket. However, if people want to pursue the second, I'm happy to do the fork and occasionally sync it.

@eero-t
Copy link

eero-t commented Feb 1, 2022

Maybe easier to start with a pull request for fixing the issue?

@rhc54
Copy link

rhc54 commented Feb 1, 2022

We tried that - didn't go anywhere.

@eero-t
Copy link

eero-t commented Feb 1, 2022

We tried that - didn't go anywhere.

@rhc54 Which PR that is? I do not see it either in open or closed PRs.

@jandres742
Copy link

hi everyone. So first, thanks for your interest and input here. A few things:

https://spec.oneapi.io/level-zero/0.91/tools/api.html#zetinit

Actually, we had the earlier versions of the API with a specific interface to initialize sysman. see here: https://spec.oneapi.io/level-zero/0.91/tools/api.html#zetinit

Idea was that users would call zetInit to initialize sysman and tools API. This was removed in v1.0, if I remember correctly, because there were suggestions that having an environment variable instead of two initialization functions zeInit and zetInit was better. From reading this post, it seems that was not completely the case :)

Now, we are working on moving some more Sysman queries to Core, so that way users dont need to call Sysman (see here for instance https://spec.oneapi.io/level-zero/latest/core/EXT_PCIProperties.html?highlight=pci#pci-properties-extension). However, that doesn't mean that every Sysman interface would be moved to Core.

So main question here: are we proposing eliminating the SYSMAN env variable because we want to access some specific sysman interfaces? on which case we could just consider maybe moving them to Core to avoid the use of the env var.

Or is the proposal because we are thinking more in the long term for anything that Sysman might provide?

@bgoglin
Copy link
Contributor Author

bgoglin commented Feb 1, 2022

@jandres742 If all non-sensitive queries were available in Core, and Sysman became some sort of "admin" interface for reading sensitive information (not sure if that exists) and configuring hardware/drivers, it would be fine. Right now, on the query side, some things are in Core, some are in Sysman, it's not clear why.

@eero-t
Copy link

eero-t commented Feb 1, 2022

So main question here: are we proposing eliminating the SYSMAN env variable because we want to access some specific sysman interfaces? on which case we could just consider maybe moving them to Core to avoid the use of the env var.

@jandres742 Different users need different Sysman interfaces, and you cannot know all the use-cases (library vs. app, what data is needed, and with what privileges things get run with) beforehand. Using environment variable to enabled them for anything else than for debugging purposes, is just a bad idea.

What's the problem in supporting init flag? It should(?) be fairly small, backwards compatible API improvement, and the environment can still also be used to enable Sysman.

@jandres742
Copy link

@bgoglin

Right now, on the query side, some things are in Core, some are in Sysman, it's not clear why.

Right. That's because we have moved to Core only those for which we have found the need. Not sure whether every query on Sysman need to be on Core to be used by Compute applications.

@eero-t

What's the problem in supporting init flag? It should(?) be fairly small, backwards compatible API improvement, and the environment can still also be used to enable Sysman.

it's not a problem, but more of understanding the requirements for all users. As mentioned above, we started to use the environment variable only in 1.0, whereas in previous releases we had a specific interface for that, which aligns with the idea of the flag. So the main issue here I would say is that we are trying to define something that it is applicable for most cases, and that is stable. As you say, using a flag is a way of doing that.

W are discussing internally how we could incorporate the suggestions from this post and provide an update then.

@rhc54
Copy link

rhc54 commented Feb 1, 2022

My only comment would somewhat echo that from @eero-t. It feels like you would have to support all the sysman queries as there really is no way to anticipate what people will need - seems like every new programming model needs something more than the last one. Might find yourselves chasing the goal, and eventually wind up simply replicating all of sysman other than the controls. Allowing the user to decide if and when to pull sysman into memory feels like the most cost-effective and customer-friendly solution.

@AstroVPK
Copy link

AstroVPK commented Feb 2, 2022

Hi guys - the viewpoint that we are converging on is that L0 sysman is to be used for managing assets (i.e. fan speed, power envelopes, ECC status etcc....) whereas L0 core is to be used for querying system properties typically required for compute as well as actually launching work etc.. So, for example, a query to get the B/D/F address of a device - used for determining which HCA/Ethernet adaptor to use to move a buffer from the device to some remote - is justifiably a compute related query and should belong in L0 core. On the other hand, querying how many fans a device has (even though it is a pure query and not a request to modify some property - is probably not useful to someone doing compute and should probably remain in L0 sysman.
Obviously, we won't catch every property on day 1, but what we can do is move the ones that are clearly compute-related (e.g. fabric topology queries etc...) to core and let users file tickets to move the ones that we miss from sysman -> core.
How does that sound?

@bgoglin
Copy link
Contributor Author

bgoglin commented Feb 2, 2022

@AstroVPK This looks good to me.

FYI, what hwloc currently uses in L0 sysman and would like to have in L0 core is:

  • zesDeviceGetProperties for vendor name, model name, etc
  • zesDevicePciGetProperties for BDF and PCI link total bandwidth
  • zesMemoryGetProperties for physical size, subdevice location and type (type is more precise than in L0 core), no need about individual module info if there are multiple modules with same kind and location
  • zesMemoryGetState because the doc says sometimes the memory size is valid there but 0 in zesMemoryGetProperties
  • zesDeviceEnumFabricPorts+zesFabricPortGetProperties+zesFabricPortGetState to find links between tiles and their bandwidth

@Kerilk
Copy link

Kerilk commented Feb 2, 2022

@AstroVPK To be perfectly honest, I don't like the direction this is going. A flag at zeInit should be the way to access sysman functions, even if the zeInit call is not the first one. From the user perspective, moving things from sysman to core is just a name change, and is therefore irrelevant unless in deciding which flags to use during zeInit. If you need to use a function prefixed by zes, you need to pass the flag to to enable sysman to zeInit.

I doubt categorizing what "applications" will need will converge: let's say somebody at some point would like to make a wrapper library to better manage fans when running intensive calculations?

Moving things to core might be a solution to avoid initializing sysman in the general case (and avoid paying the overhead if that is what the concern is here), but the option to initialize it at runtime and at will is still important.

@kevin-harms
Copy link

The use case is the need to initialize Sysman. An environment variable didn't work, doesn't work and will never work. The split of interfaces between Core and Sysman is irrelevant. Choosing function names based on which library to initialize also doesn't seem to be a good design metric.

I would encourage the L0 spec source be put on a public GitHub so that these type of changes can be discussed before mistakes are made.

@AstroVPK
Copy link

We're in agreement to get rid of the environment variable and adding zesInit() to initialize Sysman. We're working on the details and will share the proposal when ready.

@nahdia99
Copy link

yes, i think after updating bios from another device we also running again sysman support other wise had disable and stop on services.msc. again and again returnback on environment variable. T try to open registry and open P9Rdr service then modify dword "start" on 2 , delete and take own services mode

@bd4
Copy link

bd4 commented Jun 29, 2022

Adding zesInit() API sounds like a good solution to me. We use the sysman API in a normal science code (GENE fusion code https://genecode.org/), to get detailed device info for ensuring MPI -> device mappings are correct and for zesMemoryGetState to track how memory is being used throughout a run. This may seem like debug info, but it's useful to have in production runs to make sure things are working as expected.

@bgoglin
Copy link
Contributor Author

bgoglin commented Jun 12, 2023

Hello
I was told that zesInit() is the proper way to solve this issue, good. I was planning to use a compile-time check for zesInit() before deciding between setting ZES_ENABLE_SYSMAN=1 and calling zesInit(). Unfortunately, the current release of the compute-runtime provides a zesInit() that returns an error (unsupported). That's rather annoying. Runtime-checks are difficult because of cross-compiling cases, compile-time-checks are much better. Can you please release a working version soon so that we just tell users to use either older releases (without zesInit) or recent releases (with a working zesInit())?

@eero-t
Copy link

eero-t commented Jun 12, 2023

This is the L0 frontend project. Backend related issues should be filed to L0 backend project (and linked here when relevant): https://github.com/intel/compute-runtime/issues

In general you should handle errors for all the API calls. If backend loaded at runtime happens to be older than the frontend, there will be errors for the new APIs, as backend does not have entry points for them.

@bgoglin
Copy link
Contributor Author

bgoglin commented Jun 12, 2023

Thanks @eero-t but I opened this issue almost 3 years ago, it has caused many headaches to many people in the meantime, hence I figured it's useful to notify people that things are improving but not yet finished because of this additional issue.

@rhc54
Copy link

rhc54 commented Jun 12, 2023

Please wait for the fix

Hopefully this will happen in a more timely manner than Intel's response to this entire matter. Meantime, all we can do is continue to advise people that (at least for my software package) we cannot support Intel GPUs due to a problem in the vendor-supplied software, and suggest that they either contact the vendor or seek alternative products.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests