Gluster distributed volume #193
Comments
I am running three C1 servers from Scaleway (also ARM architecture) and am encountering exactly the same problem. My firewall is correctly configured, and I have no MAC system (SELinux, AppArmor) enabled. It works if I configure it in |
@superboum Could you attach the output of `gluster volume info` and `gluster volume status` for this? Cc'ing other developers who work on disperse volumes: @xhernandez @aspandey @sunilheggodu |
Create and start the volume:
Gluster Volume Info:
Gluster Volume Status:
Error:
Some info about my system:
No log in `/var/log/glusterfs/bricks/var-lib-erasure.log`:
`/var/log/glusterfs/glusterd.log`:
Let me know if you want me to try something or additional logs :) |
Could you attach the following log, /var/log/glusterfs/mnt-test.log, from when this error happened? |
Logs for `/var/log/glusterfs/mnt-test.log`:
|
@superboum It seems to be crashing :-(. Could you check the corefile and attach the backtrace of the core? |
Sorry, I am not very familiar with debugging C programs, but I do indeed have a backtrace:
The full core file can be found here: |
Thanks for the core. I am not able to download it without an email/password; is there any other location where I can download this file from? |
Sorry, you can find it here: |
I can't download the core dump (connection timed out), but I think the problem is caused by conflicting sizes of data types on ARM. If I write a patch, can you compile it and test it? |
OK, I should have started by uploading it to GitHub... I didn't know this feature existed: I can indeed compile it and test it. |
I am having the same issue on Ubuntu.
Number of Bricks: 1 x (3 + 1) = 4
ls: cannot open directory '.': Transport endpoint is not connected. |
I've uploaded a patch that should fix the problem. I've fixed all the warnings that appear when compiling on 32-bit x86 (some of them were dangerous). If you get any warning related to variable sizes when you compile on ARM, please let me know. Note that the patch is completely untested. |
I tried to compile GlusterFS with your patch. Make output:
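As a rough illustration of why size-related warnings can be dangerous on a 32-bit build: a 64-bit quantity stored in a 32-bit type keeps only its low 32 bits. The sketch below prints the width of a C `long` on the current host and emulates that truncation with shell arithmetic. The value 5000000000 is an arbitrary example, not something taken from the patch, and the thread does not say which variables were actually affected.

```shell
#!/bin/sh
# Report the machine architecture and the width of a C `long` on this host
# (32 on 32-bit ARM and 32-bit x86, 64 on typical 64-bit Linux).
arch=$(uname -m)
long_bits=$(getconf LONG_BIT)
echo "arch=${arch} long_bits=${long_bits}"

# Arbitrary example value that needs more than 32 bits (assumption, not from the patch).
offset=5000000000

# Emulate storing it in a 32-bit unsigned type: only the low 32 bits survive.
truncated=$(( offset & 0xFFFFFFFF ))
echo "offset=${offset} truncated_to_32_bits=${truncated}"
```

If `long_bits` is 32, any code that assumes `long` or `size_t` can hold a 64-bit offset is a candidate for this kind of silent truncation.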
But I encounter a segfault when I try to run glusterd:
In the log, I have the following lines:
When I try to display a backtrace from GDB, I get that:
You can find the core file here: core.zip |
Can you try to install debug symbols for libcrypto.so.1.1 and retry the backtrace? |
Sure:
For your information:
It might be an error independent of your patch. If you think so, I plan to try to compile/install GlusterFS from the master branch and look at Debian-specific patches, but I don't know when I will have time to test that. |
This looks to me like an issue with the crypto library. It seems to be using an illegal instruction when it tries to detect processor capabilities. Does it work with the same configuration but without the patch? |
OK, so it appears that this is normal behavior: "When debugging I observe SIGILL during OpenSSL initialization: why?" Edit: the real error:
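When a library probes processor capabilities by executing an instruction that may not be supported and catching the resulting SIGILL itself, gdb stops on that signal by default and the stop looks like a crash. One possible way to get past it, sketched below, is a gdb command file that passes SIGILL through to the program without stopping, so that only the real fault produces a backtrace. The file name `sigill.gdb` is an arbitrary choice for this example; the `glusterd` path and `--debug` flag are the ones used later in this thread.

```shell
#!/bin/sh
# Write a gdb command file (the name sigill.gdb is arbitrary).
cat > sigill.gdb <<'EOF'
# Do not stop or print on SIGILL; deliver it to the program's own handler.
handle SIGILL nostop noprint pass
# Start the program; gdb will still stop on other fatal signals such as SIGSEGV.
run
# Print the backtrace of wherever the program stopped.
bt
EOF

echo "Run with: gdb -x sigill.gdb --args /usr/local/sbin/glusterd --debug"
```

If the program exits normally, the final `bt` simply reports that there is no stack.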
It seems I have this error (SIGSEGV in gf_add_cmdline_options) with and without your patch, so that is another thing to investigate on my side. I will update this post when I know why.

Edit 2: I changed strategy. I tried building GlusterFS in a Docker container to prevent any interference from past installations, and I don't have the error anymore. I still need to test erasure coding. I will post all the logs once I have run all the tests.

Dockerfile:
FROM arm32v7/debian:buster
RUN apt-get update && \
apt-get install -y \
autotools-dev \
libfuse-dev \
libibverbs-dev \
libdb-dev \
librdmacm-dev \
libaio-dev \
libacl1-dev \
libsqlite3-dev \
liburcu-dev \
uuid-dev \
liblvm2-dev \
attr \
flex \
bison \
libreadline-dev \
libncurses5-dev \
libglib2.0-dev \
libssl-dev \
libxml2-dev \
pkg-config \
dh-python \
python-all-dev \
build-essential \
git \
wget \
autoconf \
libtool \
gdb
WORKDIR /opt
RUN wget https://review.gluster.org/changes/21276/revisions/6baeb147c19c1f9f29552eebf98b33e4442e8a31/archive?format=tgz -O glusterfs.tgz
RUN tar xzvf glusterfs.tgz
RUN ./autogen.sh
RUN ./configure --enable-debug
RUN make -j5
RUN make install
RUN echo "/usr/local/lib" > /etc/ld.so.conf.d/local.conf && ldconfig |
Your patch seems to fix the erasure coding bug I encountered. Here is my test protocol. I created a Dockerfile that I built with

Dockerfile:
FROM arm32v7/debian:buster
RUN apt-get update && \
apt-get install -y \
autotools-dev \
libfuse-dev \
libibverbs-dev \
libdb-dev \
librdmacm-dev \
libaio-dev \
libacl1-dev \
libsqlite3-dev \
liburcu-dev \
uuid-dev \
liblvm2-dev \
attr \
flex \
bison \
libreadline-dev \
libncurses5-dev \
libglib2.0-dev \
libssl-dev \
libxml2-dev \
pkg-config \
dh-python \
python-all-dev \
build-essential \
git \
wget \
autoconf \
libtool \
gdb
WORKDIR /opt
RUN wget https://review.gluster.org/changes/21276/revisions/6baeb147c19c1f9f29552eebf98b33e4442e8a31/archive?format=tgz -O glusterfs.tgz
RUN tar xzvf glusterfs.tgz
RUN ./autogen.sh
RUN ./configure --enable-debug
RUN make -j5
RUN make install
RUN echo "/usr/local/lib" > /etc/ld.so.conf.d/local.conf && ldconfig

After that, I started a gluster daemon:

docker run --privileged=true -ti superboum/glusterbuild
/usr/local/sbin/glusterd --debug

And in a second terminal, I got a shell in the container to run the following tests, which worked:

docker exec -t -i b4697a3b04de bash
mkdir /srv/g{1,2,3}
gluster volume create test-erasure disperse 3 redundancy 1 transport tcp 172.17.0.2:/srv/g1 172.17.0.2:/srv/g2 172.17.0.2:/srv/g3 force
gluster volume start test-erasure
mkdir /mnt/glerasure
mount -t glusterfs 127.0.0.1:/test-erasure /mnt/glerasure/
gluster volume info test-erasure
# Output:
# Volume Name: test-erasure
# Type: Disperse
# Volume ID: b7bab530-b537-48e0-bf81-ab7d361fed00
# Status: Started
# Snapshot Count: 0
# Number of Bricks: 1 x (2 + 1) = 3
# Transport-type: tcp
# Bricks:
# Brick1: 172.17.0.2:/srv/g1
# Brick2: 172.17.0.2:/srv/g2
# Brick3: 172.17.0.2:/srv/g3
# Options Reconfigured:
# transport.address-family: inet
# nfs.disable: on
#
cd /mnt/glerasure
echo world > hello # it worked
cat hello # it worked

You can find the whole log of the compilation here (including some warnings that might interest you): screen.log |
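For reference, the `Number of Bricks: 1 x (2 + 1) = 3` line in the volume info above follows directly from the `disperse 3 redundancy 1` arguments of the create command: the number of data bricks is the disperse count minus the redundancy count. A small sketch of that arithmetic follows; `disperse_layout` is a hypothetical helper written for this example, not a gluster command.

```shell
#!/bin/sh
# Hypothetical helper: print the brick layout of a single disperse set
# in the same "1 x (data + redundancy) = total" form that volume info uses.
disperse_layout() {
    total=$1        # value given after "disperse"
    redundancy=$2   # value given after "redundancy"
    data=$(( total - redundancy ))
    echo "1 x (${data} + ${redundancy}) = ${total}"
}

# The test volume from this comment: disperse 3 redundancy 1
disperse_layout 3 1

# The layout reported earlier in the thread: 4 bricks, 1 of them redundancy
disperse_layout 4 1
```

The first call prints `1 x (2 + 1) = 3` and the second prints `1 x (3 + 1) = 4`, matching the two layouts reported in this thread.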
That's great :) It seems that there are still some warnings in the compilation, though they don't seem dangerous. I'll update the patch to also remove them. |
Thank you for your contributions. |
Closing this issue, as there has been no update since my last update on the issue. If this issue is still valid, feel free to reopen it. |