ROS Gazebo black screen #3368

Closed
IceSentry opened this issue Jul 9, 2018 · 37 comments

@IceSentry

Following this tutorial I managed to run ROS on WSL with Ubuntu 16.04 and Microsoft Windows [Version 10.0.17134.112]. However, when launching Gazebo with VcXsrv, I get a black screen where the 3D model of the robot should be; the rest of the UI is there and there is no error in the console.

I know WSL is supposed to be for command line tools, but there have been a few issues related to using gui apps through x server that have been resolved here.

I created this issue after reading #1450 where it was suggested to create a new issue for this.

@therealkenc
Collaborator

where it was suggested to create a new issue for this.

Hmmm, Sunil did say that, didn't he. "This tutorial" is a pretty big ask, given that it demonstrably does work out of the box and this isn't a WSL bug or even feature request. You are probably way better off asking your question wherever it is that ROS Gazebo people hang out.

As a first step before I bite, see if you are able to run glxgears, or whether you just get blackness there too:

$ sudo apt install mesa-utils
$ export DISPLAY=localhost:0
$ glxgears

In VcXsrv choose multi-window and "Native opengl" unchecked (sic). If glxgears is okay and gazebo still doesn't light up I'll try to take a closer look if/when I get a moment.

@MVoz

MVoz commented Jul 9, 2018

@IceSentry
vcxsrv.exe :0 -ac -terminate -lesspointer -multiwindow -clipboard ...

windows\ vcxsrv.exe :0 <<-->> display :0 \linux

@IceSentry
Author

@therealkenc it was asked on gazebo answer http://answers.gazebosim.org/question/16888/gazebo-on-wsl-bash-for-windows/ which is actually how I came to the original issue on github.

I did manage to see some gears with glxgears, unfortunately gazebo still shows a black screen. Although on my version of vcxsrv I don't see a native opengl checkbox. I use vcxsrv 1.20.0.0 from sourceforge

@voskrese I have no idea what you want me to do

@shoffmeister

glxgears gives me about 650 fps on vcxsrv 1.20 in native OpenGL mode on a comparatively slow Intel HD 4000 laptop GPU, about 1500 fps on an Intel HD 620 (two slow generations later)

You can enable OpenGL mode on the vcxsrv side by running XLaunch; this will take you through a wizard which on its last page refers to the LIBGL_ALWAYS_INDIRECT environment variable (export LIBGL_ALWAYS_INDIRECT=1)

Generally speaking, it would be nice to understand whether ROS gazebo actually does run with a native Linux and vcxsrv as the remote X server.

@IceSentry
Author

With export LIBGL_ALWAYS_INDIRECT=1 gazebo simply doesn't launch.

Also I made a mistake in my other comment, I was using an old version of xming. I tried it again with vcxsrv with native opengl and I had the same result.

@MVoz

MVoz commented Jul 9, 2018

@IceSentry You wrote that you see a black screen instead of anything, so I assumed that you installed Xming/VcXsrv but did not set it to the desired display: the default is -1 and you need 0, provided of course that you passed the same setting in the UNIX environment.

In general, you will get your own \Google translation ))

@therealkenc
Collaborator

this will take you through a wizard which on its last page refers to the LIBGL_ALWAYS_INDIRECT environment variable (export LIBGL_ALWAYS_INDIRECT=1)

Do not do that.

Generally speaking, it would be nice to understand whether ROS gazebo actually does run with a native Linux and vcxsrv as the remote X server.

This is generally the prerequisite for posting here. Even then, monsters like ROS make for terrible use-cases for tracking down WSL bugs or feature gaps; since those bugs won't have to do with drawing pixels. An exception is being made since it was suggested to open an issue.

Although on my version of vcxsrv I don't see a native opengl checkbox. I use vcxsrv 1.20.0.0 from sourceforge

Start VcXsrv with XLaunch. You will see the checkbox on the third page of the wizard.

In any case, against my better judgement I installed ROS (a short part of my life I'll never get back). Worked here out of the box with Ubuntu 16.04 and VcXsrv 1.20. Which is pretty much what I feared.

n.b. I don't want to imply unchecking native gl is your magic bullet. I only know it lit up without incident for me. I am not sure the thing is actually usable, mind. I suspect Gazebo falls much into the same category as Blender on that front.

image

@therealkenc
Collaborator

therealkenc commented Jul 9, 2018

I only just clicked through the gazebosim forum question. It is useful as an illustration (and future link target):

it was asked on gazebo answer http://answers.gazebosim.org/question/16888/gazebo-on-wsl-bash-for-windows/

"It" was the wrong question, and went sideways in the first sentence. Like shoffmeister suggests, the procedure would be to run Gazebo in a Real Linux VM talking to a "remote" VcXsrv first, and if it doesn't work, that is the question you'd be asking on the Gazebo list (or any other list).

"I'm running Gazebo in the cloud using VcXsrv as my remote X Server, and I get a black screen. Please help."

Instead what the poster said was "I am surprised to how RPi is able render the visualization window, whereas WSL is unable to!" Which is going to get basically zero help from anyone unless you are lucky enough to find someone exceedingly patient. Because WSL, just like any headless box, doesn't render anything.

@IceSentry
Author

I'm sorry if I wasted your time, I'm very new to all of this and since it was suggested in the other thread and I had a similar issue I made this thread.

I know WSL isn't supposed to be used for GUI apps, but I've seen a few issues here talking about running gui apps through an xserver so I assumed it was in a kind of grey area.

I will try to do a fresh WSL and Gazebo install and see if it works. I'll also look into running Gazebo on a Linux VM with an X server. To be clear, I did find the native opengl option and I disabled it as suggested, and it still didn't work for me.

But if I understand correctly you don't believe this is an issue with WSL and it is most likely with gazebo and/or the x server?

@therealkenc
Collaborator

I'm sorry if I wasted your time

Don't be. This isn't on you. It wasn't a waste. Yes it's a grey area.

But if I understand correctly you don't believe this is an issue with WSL and it is most likely with gazebo and/or the x server?

Dunno, precisely, or I'd tell ya. I was kind of hoping glxgears wouldn't work for you, in which case we could attack that head on. Or, that gazebo wouldn't work for me, in which case I could deep dive for the curiosity value. We've got neither of those situations, unfortunately.

We do know "WSL" (namely the kernel emulation layer that makes this whole thing go) is fine. Because it works, demonstrably. WSL doesn't even know you are talking to VcXsrv. All WSL knows is you are writing some bytes down a socket on localhost port 6000.
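
To make the "bytes down a socket on localhost port 6000" point concrete, here is a minimal sketch, assuming the usual X11 convention that display N listens on TCP port 6000 + N. The function names x11_tcp_port and port_is_open are made up for illustration, and port_is_open relies on bash's /dev/tcp redirection, so it needs bash rather than plain sh.

```shell
# Hypothetical helper: map a DISPLAY value to the TCP port an X client would use.
# Assumes the conventional "6000 + display number" layout.
x11_tcp_port() {
    local display="$1"
    local number="${display##*:}"   # text after the last colon, e.g. "0" or "0.0"
    number="${number%%.*}"          # drop an optional ".screen" suffix
    echo $((6000 + number))
}

# Hypothetical helper: succeed if something accepts TCP connections on host:port.
# Uses bash's /dev/tcp pseudo-device inside a subshell so the descriptor is
# closed again as soon as the check finishes.
port_is_open() {
    local host="$1" port="$2"
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Usage:
#   port_is_open localhost "$(x11_tcp_port localhost:0)" && echo "X server reachable"
```

If the port check fails while VcXsrv is running, that points at the X server or firewall side rather than at anything Gazebo is doing.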

It might be firewall related. I got quite a few popups to open ports. Try disabling your Windows firewall entirely and see if that helps. But we're grasping here.

You might try XMing just to switch things up. That at least perturbs a variable (the X Server).

You can also try running it remotely from inside a VM as an experiment, if you are feeling motivated or are looking for a time sink. But trying a given scenario (like Gazebo) in a VM is more valuable in the case where you know it doesn't work in WSL, and thus can start tracking down a diverge from the Real Thing. Here we know it does, so there is no diverge to discover. The possible outcomes of the experiment for you aren't pretty. Either it doesn't work in the VM, which leaves you exactly where you are right now (wondering why). Or it does work, which puts you in the unenviable position of trying to find out precisely what bits of your Real Linux userspace (Ubuntu + everything) differ from your WSL userspace (Ubuntu + everything). Which is "not easy", to put it mildly.

One "last" shot: what does glxinfo | grep OpenGL return for you? It should look like:

ken@DESKTOP-4UTIQSF:~/Devel/ros$ glxinfo | grep OpenGL
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 6.0, 256 bits)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 18.0.5
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 18.0.5
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 18.0.5
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

If it's the same just say 'same' no need to paste the output. I'm just trying to guess variables.
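
For comparing outputs quickly, a small sketch that pulls just the renderer line out of glxinfo output and flags Mesa's llvmpipe software rasterizer, which is what the listing above shows. The names gl_renderer and is_software_renderer are invented for this example, and matching on "llvmpipe" is only a rough heuristic.

```shell
# Hypothetical helper: read `glxinfo` output on stdin and print the renderer string.
gl_renderer() {
    sed -n 's/^OpenGL renderer string: //p' | head -n 1
}

# Hypothetical helper: succeed if the renderer string names llvmpipe,
# Mesa's CPU-based software rasterizer.
is_software_renderer() {
    case "$1" in
        *llvmpipe*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Usage (needs mesa-utils and a reachable X server):
#   renderer="$(glxinfo | gl_renderer)"
#   is_software_renderer "$renderer" && echo "software rendering: $renderer"
```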

@shoffmeister

On the subject of using export LIBGL_ALWAYS_INDIRECT=1, and also regarding "native windows opengl" in vcxsrv, I finally understand some of the implications there.

Fundamentally, using native OpenGL forces all OpenGL commands to be rendered on the remote server (i.e. the "box" which eventually shows the bits and bytes). One of the side-effects of that is that the OpenGL command stream forces use of the GLX protocol - and that only supports OpenGL 1.4 (see http://x.cygwin.com/docs/ug/using-glx.html)

So, some experiments after sudo apt install gazebo9:

  • launch vcxsrv with "Native opengl" checked
    -- export LIBGL_ALWAYS_INDIRECT=1 && gazebo -> exactly nothing happens, silent crash (gazebo diagnostics probably broken)
    -- export LIBGL_ALWAYS_INDIRECT=0 && gazebo ->
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
  • launch vcxsrv without "Native opengl" checked
    -- export LIBGL_ALWAYS_INDIRECT=1 && gazebo -> exactly nothing happens, silent crash (gazebo diagnostics probably broken)
    -- export LIBGL_ALWAYS_INDIRECT=0 && gazebo -> life is good

At no time am I able to see a black screen; something must be seriously broken elsewhere for that to happen (in case "black screen" is exactly what is visible)

@shoffmeister

But if I understand correctly you don't believe this is an issue with WSL and it is most likely with gazebo and/or the x server?

The probability of this being a WSL problem is very, very low, given the above.

@IceSentry
Author

image

This is what I mean by black screen; I should have posted this at the beginning. The issue is that the 3D viewer is black, not the entire window.

I'll do a clean install of wsl, gazebo and vcxsrv tomorrow and try to investigate more.

@yxy1996

yxy1996 commented Jul 25, 2018

I have the same black screen problem, with a "preparing world" window in front. I run it in xfce4, and the original Ubuntu command line outputs this:

(xfwm4:6656): xfwm4-WARNING **: Unmanaged net_wm_state (window 0xd00004, atom "_NET_WM_STATE_STAYS_ON_TOP")

I also tried the "last shot" glxinfo | grep OpenGL, and it returns the same.

@therealkenc
Collaborator

That warning is a window manager thing (xfce4) and won't be related. [Also, running a window manager serves no purpose, but I digress.]

I think this one has about run its course. There are reports of the same behaviour on Real Linux. Try disabling the Windows firewall (there was no response to that suggestion), but that is mostly pissing on a spark plug. If glxgears was also drawing blackness we could explore general remote opengl problems, but that isn't the case here. The way forward here would be for someone motivated to deep dive Gazebo and find out exactly what it is up to. But there's no WSL actionable, and no one here purports to be a Gazebo expert (I had never heard of the thing before this issue was submitted). Bonne chance.

@jbohren-hbr

@therealkenc thanks for spending time on this. I encountered the same issue that @IceSentry is seeing, and there's an easy solution:

export GAZEBO_IP=127.0.0.1

I tried running this out of the box, and also saw a black render window, so I started with your suggestion:

Try disabling the Windows firewall (there was no response to that suggestion),

With the firewall completely disabled, I still saw a black render window when following the normal gazebo bringup. This did appear to be related to the network configuration, though, as the gazebo server emits warnings about outgoing message queues filling up (nobody is receiving them). I also had the same issue when running as admin.

Below is the repro that shows and then fixes the bug:


From vanilla Windows 10 Pro x64 with Intel graphics:

  1. Install WSL ref
win$ Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
  2. Install Ubuntu16.04 from the MS Store link
  3. Install VcXsrv 64.1.20.1.0 link
  4. Launch VcXsrv through XLaunch with the following commands:
    image
    image
    image
  5. Launch Ubuntu16.04 in WSL (and initialize with new user creds)
  6. Install xubuntu-desktop to suppress window manager warnings
linux$ sudo apt install xubuntu-desktop
  7. Install mesa-utils to verify 3d acceleration
linux$ sudo apt install mesa-utils
  8. Set DISPLAY
linux$ export DISPLAY=:0
  9. Run glxgears to verify 3d acceleration:
linux$ glxgears
  10. Install gazebo 9 with bootstrap script:
linux$ curl -sSL http://get.gazebosim.org | sh
  11. Run gazebo:
linux$ gazebo --verbose

Result:
image

Gazebo multi-robot simulator, version 9.3.1
Copyright (C) 2012 Open Source Robotics Foundation.
Released under the Apache 2 License.
http://gazebosim.org

[Msg] Waiting for master.
[Msg] Connected to gazebo master @ http://127.0.0.1:11345
[Msg] Publicized address: 169.254.241.102
Gazebo multi-robot simulator, version 9.3.1
Copyright (C) 2012 Open Source Robotics Foundation.
Released under the Apache 2 License.
http://gazebosim.org

[Msg] Waiting for master.
[Msg] Connected to gazebo master @ http://127.0.0.1:11345
[Msg] Publicized address: 169.254.241.102

What stands out here is the [Msg] Publicized address: 169.254.241.102 since that is not the ip address for any valid interface according to ifconfig.

So, killing gazebo and exporting GAZEBO_IP ref fixes this:

linux$ export GAZEBO_IP=127.0.0.1
linux$ gazebo --verbose

working

Gazebo multi-robot simulator, version 9.3.1
Copyright (C) 2012 Open Source Robotics Foundation.
Released under the Apache 2 License.
http://gazebosim.org

[Msg] Waiting for master.
Gazebo multi-robot simulator, version 9.3.1
Copyright (C) 2012 Open Source Robotics Foundation.
Released under the Apache 2 License.
http://gazebosim.org

[Msg] Connected to gazebo master @ http://127.0.0.1:11345
[Msg] Publicized address: 127.0.0.1
[Msg] Waiting for master.
[Msg] Connected to gazebo master @ http://127.0.0.1:11345
[Msg] Publicized address: 127.0.0.1
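
The check on the publicized address can be scripted. This is a rough sketch, assuming the log lines look like the ones pasted in this comment; publicized_address and is_link_local are hypothetical names, and gazebo.log is a hypothetical file holding saved `gazebo --verbose` output.

```shell
# Hypothetical helper: read saved `gazebo --verbose` output on stdin and print
# the first "Publicized address" value. Assumes the plain-text line format
# shown in this thread.
publicized_address() {
    sed -n 's/.*Publicized address: //p' | head -n 1
}

# Hypothetical helper: succeed if the IPv4 address is in the link-local
# range 169.254.0.0/16.
is_link_local() {
    case "$1" in
        169.254.*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Usage:
#   addr="$(publicized_address < gazebo.log)"
#   is_link_local "$addr" && export GAZEBO_IP=127.0.0.1
```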

I still see a lot of message errors, but this is unlikely a WSL-specific issue beyond networking config:

Exception sending a message
...

@therealkenc, as you said, it runs, but usability is still yet to be seen.

Shoutout @nkoenig @scpeters @tfoote

@yxy1996

yxy1996 commented Aug 21, 2018 via email

@therealkenc
Collaborator

Glad you got it working. 👍

  1. Set DISPLAY
    linux$ export DISPLAY=:0

Minor observation, setting aside $GAZEBO_IP for a moment. Your $DISPLAY variable wants to be 127.0.0.1:0 or localhost:0 as well (per earlier). It might not (and probably doesn't) have any effect on Gazebo, but ":0" means X11 over AF_UNIX on /tmp/.X11-unix/X0, not AF_INET over localhost:6000. But this can (and does) cause problems in some scenarios.
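
The distinction is easy to check mechanically. A minimal sketch, assuming only the rule stated above (empty host part means the AF_UNIX socket, anything else means TCP); display_transport is a made-up name, and a given client library may apply extra fallbacks that this ignores.

```shell
# Hypothetical helper: report which transport a DISPLAY value implies.
display_transport() {
    local display="$1"
    local host="${display%:*}"   # text before the last colon; empty for ":0"
    case "$host" in
        "") echo "unix" ;;       # ":0" -> /tmp/.X11-unix/X0
        *)  echo "tcp"  ;;       # "localhost:0", "127.0.0.1:0" -> TCP port 6000
    esac
}

# Usage:
#   [ "$(display_transport "$DISPLAY")" = "unix" ] && export DISPLAY=localhost:0
```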

Mentioning it mostly because I didn't personally set $GAZEBO_IP despite it working anyway. That difference remains somewhat unexplained. I did however blindly install all of ROS (because the OP said do that), which pulls in the kitchen sink. So who knows what variables got set. Either way, nice follow up.

@jbohren-hbr

jbohren-hbr commented Aug 22, 2018

Your $DISPLAY variable wants to be 127.0.0.1:0 or localhost:0 as well (per earlier). It might not (and probably doesn't) have any effect on Gazebo, but ":0" means X11 over AF_UNIX on /tmp/.X11-unix/X0, not AF_INET over localhost:6000. But this can (and does) cause problems in some scenarios.

Ah, that's good clarification.

Mentioning it mostly because I didn't personally set $GAZEBO_IP despite it working anyway. That difference remains somewhat unexplained.

Gazebo looks like it's using getifaddrs (location) (manpage) to determine the IP of a given host. WSL networking is also something I lack knowledge about, but since that's in the linux kernel, this might actually be a WSL bug. I see that there are some previous issues with getifaddrs: #308

I did however blindly install all of ROS (because the OP said do that), which pulls in the kitchen sink. So who knows what variables got set.

Yeah, I already had a ROS installation in my WSL (but not the machine I used for vanilla repro/debug), and it still failed in the same way. I don't think any package in gazebo_ros_pkgs exports GAZEBO_IP, anyway.

@jbohren-hbr

jbohren-hbr commented Aug 22, 2018

It looks like instead of coming back as AF_INET like it does on linux, on WSL, sa_family for every interface is AF_PACKET (manpage). (forget this, it was just gazebo picking the first non-loopback interface).

The other, correct, IP addresses are listed in getifaddrs on WSL, so the Gazebo host IP heuristic is definitely to blame here. It's still not clear why getifaddrs is returning this additional 169.254.x.x IP address on WSL and not native Linux, though.

@therealkenc

@therealkenc
Collaborator

It's still not clear why getifaddrs is returning this additional 169.254.x.x IP address on WSL and not native Linux, though.

That's excellent information to add, thanks. Since it did "work" for me out of the box (possibly due to differences in my network rig) and, more practically, because I've long since wiped ROS/Gazebo, I probably won't deep dive Gazebo as the repro of choice for a getifaddrs diverge. But absolutely, consider your observation well noted (for when, not if, something else comes up that smells like a getifaddrs diverge). If you can do a small tight CLI test case you'd definitely get a shiny gold star, but the challenge is going to be doing one concise enough to actually reproduce for the folks in Redmond (read: "returns this additional 169.254.x.x IP").

@jbohren-hbr

If you can do a small tight CLI test case you'd definitely get a shiny gold star, but the challenge is going to be doing one concise enough to actually reproduce for the folks in Redmond (read: "returns this additional 169.254.x.x IP").

Here you go (no shiny stars needed):
https://github.com/jbohren-hbr/ifaddrs_test

This works on my native xubuntu 16.04 machine, fails on WSL with extra phantom interfaces that don't show up in ifconfig and are all on the 169.254.x.x subnet, which is not the right subnet for any of the interfaces I actually have:

wsl$ ./test.bash
not matching.
1,2d0
< eth0
< eth1
7,8d4
< eth6
< eth7
11,12d6
< wifi1
< wifi2
native$ ./test.bash
matching.

@therealkenc
Collaborator

Okay. Unfortunately (?) your test doesn't repro for me here. I get "matching." on WSL 18219. Which is, of course, why Gazebo worked out of the box for me (I would have been on some earlier Insiders in July).

I can't think of a dupe off the cuff that would explain a fix since 17134 (usually I can if one exists, but that doesn't mean it doesn't exist). I don't have a 17134 handy to try. First step would be to try Insiders (fast ring not skip-ahead will do) and see if it reproduces there.

Note also your ./test.bash script is sensitive to the ifconfig output syntax, which appears to have changed between Ubuntu 16.04 and 18.04 (ifconfig | grep -B1 "inet addr" won't get a hit on Real Linux Ubuntu 18.04). Not a problem, but something to be aware of since we've got Windows version number and Ubuntu version number variables at play here.
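
One way to sidestep the syntax difference is a pattern that accepts both spellings. This is a sketch, assuming the older output reads "inet addr:<ip>" and the newer output reads "inet <ip>"; ifconfig_ipv4_addrs is a hypothetical name and is not part of the test repo.

```shell
# Hypothetical helper: read `ifconfig` output on stdin and print one IPv4
# address per line. Accepts both "inet addr:192.0.2.1" and "inet 192.0.2.1";
# "inet6" lines do not match because the pattern requires "inet " exactly.
ifconfig_ipv4_addrs() {
    sed -n -E 's/^[[:space:]]*inet (addr:)?([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\2/p'
}

# Usage:
#   ifconfig | ifconfig_ipv4_addrs
```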

@jbohren-hbr

OK. I don't think this is an issue with WSL. The issue seems to be that getifaddrs returns interfaces that are both up and down. Gazebo should be checking for IFF_UP in the flags, surprise.

See here: https://github.com/jbohren-hbr/ifaddrs_test/blob/master/print_ifaddrs.cpp#L92

Test now passing, this can be put to rest. Need to create a PR on gazebo though...
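
For anyone who wants to eyeball the same thing from a shell, here is a rough approximation of the IFF_UP check. It is not the code from the test repo or the Gazebo patch: it assumes Linux's IFF_UP value of 0x1 and reads the per-interface flags from /sys/class/net, which may not be populated on every WSL build. Both function names are made up.

```shell
# Hypothetical helper: succeed if the given flags value (e.g. "0x1003")
# has the IFF_UP bit (0x1) set.
flags_have_iff_up() {
    [ $(( $1 & 0x1 )) -ne 0 ]
}

# Hypothetical helper: print the names of interfaces whose IFF_UP flag is set,
# based on /sys/class/net/<iface>/flags. Interfaces without a readable flags
# file are skipped.
list_up_interfaces() {
    local dir name
    for dir in /sys/class/net/*; do
        [ -r "$dir/flags" ] || continue
        name="${dir##*/}"
        flags_have_iff_up "$(cat "$dir/flags")" && echo "$name"
    done
    return 0
}

# Usage:
#   list_up_interfaces
```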

@jbohren-hbr

For completeness, Gazebo PR with fix is here: https://bitbucket.org/osrf/gazebo/pull-requests/3009/adding-check-to-make-sure-automatic/diff

@therealkenc
Collaborator

🌟

@maxymczech

I have encountered the same issue on Ubuntu Bionic, ROS Melodic and Gazebo 9 (running on a virtual machine). I have 'solved' it by killing a previously launched instance of gzserver.

@karray

karray commented Nov 20, 2018

I still see a lot of message errors, but this is unlikely a WSL-specific issue beyond networking config:

Exception sending a message
...

The same here. I've installed ROS-Melodic (full desktop) with Gazebo 9 on 3 different machines: 2 of them are with Ubuntu 18.04 and the third one with WSL Ubuntu 18.04 (Windows 10). After a fresh install I get Exception sending a message, which is printed endlessly after starting gazebo (gzserver, roslaunch). But it seems that everything works.

I've also noticed that it happens only when I open VPN connection

@traversaro

It seems that a similar issue related to 169.254.*.* "phantom" network interfaces also affects the Windows build of Gazebo, see ms-iot/ROSOnWindows#47 (comment). This may indicate that the problem/feature of phantom interfaces is not related to WSL at all, but just to the Windows network stack.

@traversaro

If I understand correctly (see ms-iot/ROSOnWindows#47 (comment)) the 169.254.*.* interfaces belong to adapters that are disconnected, and so the issue is in Gazebo, which should ignore them.

@therealkenc
Collaborator

so the issue is in Gazebo

You are now caught up to above.

@traversaro

traversaro commented Dec 31, 2018

You are now caught up to above.

Thanks! Sorry for noise due to skipping over the last comments.

@achimismaili

For me, setting DISPLAY to 0, 0.0, localhost:0 or 127.0.0.1 on Ubuntu under the Windows Subsystem for Linux (WSL2) did not work. I had to use the public IP, and the following command helped me:

export DISPLAY=$(cat /etc/resolv.conf | grep nameserver | awk '{print $2}'):0
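
A slightly more defensive variant of the same idea, as a sketch: it takes only the first nameserver entry, so a resolv.conf with several entries does not produce a multi-line DISPLAY. The function name first_nameserver is made up, and the assumption that the nameserver entry is the Windows host comes from the command above, not from anything verified here.

```shell
# Hypothetical helper: print the first nameserver listed in a resolv.conf-style
# file (defaults to /etc/resolv.conf when no path is given).
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Usage on WSL2:
#   export DISPLAY="$(first_nameserver):0"
```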

@luyor

luyor commented Feb 17, 2021

I use ROS Gazebo in WSL docker. For me, the network settings are correct, but I still see the blank screen. The reason is, Gazebo is downloading data silently in background, and I just need to wait for it to finish.

I used this tutorial to set up ROS Gazebo in WSL docker.
I run roscore in one terminal, then run rosrun gazebo_ros gazebo in a second terminal. I get a blank screen like @IceSentry, and a log line in the terminal: waitForService: Service [/gazebo/set_physics_properties] has not been advertised, waiting...

Gazebo is downloading data silently in background, see discussion in the forum.
Close the window, and wait for the log to print in terminal:

[ INFO] [1613558941.011174200]: waitForService: Service [/gazebo/set_physics_properties] is now available.
[ INFO] [1613558941.208707300]: Physics dynamic reconfigure ready.

Rerun rosrun gazebo_ros gazebo, then it works.

@besch

besch commented Jun 29, 2022

For me the 'black screen in gazebo' fix was to install the latest Intel driver.
One issue is left to solve, though: after a Windows restart the Gazebo scene resets to a black screen.

@TrinhNC

TrinhNC commented Oct 18, 2022

For me, following the solution of @jbohren-hbr works only when I uninstall Xming and install VcXsrv.

@lidianzhong

lidianzhong commented Jul 2, 2024

I finally solved the problem by deleting the contents of .wslconfig. For me, the problem seems to be related to the IP network configuration.
image
