Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems quering from other nodes #295

Closed
anfho93 opened this issue Mar 17, 2017 · 31 comments
Closed

Problems quering from other nodes #295

anfho93 opened this issue Mar 17, 2017 · 31 comments
Milestone

Comments

@anfho93
Copy link

anfho93 commented Mar 17, 2017

Hi. I'm having problems with the cluster connection, im following the steps in the documentation, and when I connect directly to the leader through rqlite cmd and execute queries everything is ok, but when i do the same thing from other nodes, the answer to all the queries is:
ERR! unexpected end of JSON input

I thinks this is related to the recent changes made. because i use curl to make a query in a non-leader node and this is what i get:

curl -G 'localhost:4003/db/query?pretty&timings' --data-urlencode 'q=.tables'
Moved Permanently.

btw, this is my configuration :

rqlited ./node.1
rqlited -http localhost:4003 -raft localhost:4004 -join http://localhost:4001 ./node.2
rqlited -http localhost:4005 -raft localhost:4006 -join http://localhost:4001 ./node.3

This is my first step on the leader node

$ rqlite
127.0.0.1:4001> CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)
0 row affected (0.000668 sec)
127.0.0.1:4001> .schema
+-----------------------------------------------------------------------------+
| sql |
+-----------------------------------------------------------------------------+
| CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT) |
+-----------------------------------------------------------------------------+
127.0.0.1:4001> INSERT INTO foo(name) VALUES("fiona")
1 row affected (0.000080 sec)
127.0.0.1:4001> SELECT * FROM foo
+----+-------+
| id | name |
+----+-------+
| 1 | fiona |
+----+-------+

When i try to do the same on a non-leader node, this is the answer

$ rqlite -H localhost -p 4003
localhost:4003> .schema
ERR! unexpected end of JSON input

I tried to make a curl request and this is what i get
$ curl -G 'localhost:4003/db/query?pretty&timings' --data-urlencode 'q=select * from foo'
Moved Permanently.

This behavior was not present on the previous releases. I hope you can help me with this.

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

Thanks for your report.

curl -G 'localhost:4003/db/query?pretty&timings' --data-urlencode 'q=.tables'
Moved Permanently.

Sending dot-commands like .tables directly to the HTTP endpoint is not supported. Dot-commands are actually translated to a different query by the CLI. Check the source of the CLI for details. And otherwise what you are getting back from the node is correct -- you get a 301. Pass -v to curl to see more.

As for your other issue, I cannot reproduce it. I built the latest source on MacOX and repeated your steps of connecting to the leader, and then connecting to a different node. I also tested the latest release. It all works fine.

$ ./rqlite 
127.0.0.1:4001> .schema
+---------------------------------------------------------------+
| sql                                                           |
+---------------------------------------------------------------+
| CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT) |
+---------------------------------------------------------------+
127.0.0.1:4001> 
EOF (CTRL+D)
$ ./rqlite -H localhost -p 4003
localhost:4003> .schema
+---------------------------------------------------------------+
| sql                                                           |
+---------------------------------------------------------------+
| CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT) |
+---------------------------------------------------------------+
localhost:4003> 

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

Be sure you are running rqlite and rqlited from the same release.

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

Execute curl localhost:4001/status?pretty and paste your output here. You can run this command against post 4003 and 4005 to see what each node thinks about the cluster.

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

If I had to guess, I'd say you're not running the version of rqlited you think you're running. It almost looks like you're still running older software. The status commands will show you the version each node is running.

@anfho93
Copy link
Author

anfho93 commented Mar 17, 2017

Hi, the process i follow to run the software is just clone the master branch from github, the run get -d ./... , and after that go install ./...,
Also i have all the nodes in the same PC, running from the same rqlite binary file, but on different ports

And this is the output from all the running nodes.

host in port 4001
https://drive.google.com/open?id=0B9lwPlBmlRuAZm4tbV9GaDdZc3M
host in port 4003
https://drive.google.com/open?id=0B9lwPlBmlRuAYnlBOFM2QzkwX0k
host in port 4005
https://drive.google.com/open?id=0B9lwPlBmlRuATE5hb2RvM2JQSm8

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

Nothing looks wrong with the output. I see you are running Linux, so perhaps you can rule out something wrong with your build by running a pre-built binary. You can download v3.12.1 at https://github.com/rqlite/rqlite/releases

Follow the instructions for Linux, and show me exactly what you do. Right now I cannot repro this issue. https://gist.github.com/otoolep/80dd3ac5a53bc9ecc5acd91d45a67aa6 shows me running two v3.12.1 nodes, the nodes forming a cluster, and connecting to each node via the CLI and executing .schema without any trouble.

@anfho93
Copy link
Author

anfho93 commented Mar 17, 2017

Looks like i have troubles with my environment and building the rqlite from source code. So i went back to the 3.12.0 version and built it again but same thing happens.

I tried to download the latest release from https://github.com/rqlite/rqlite/releases and unfortunately the same thing happens.

I will try to run this on other computer, and i will let you know about it. Thanks for your quick answer.

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

You will need to show me as detailed a sequence of events demonstrating the problem, as I showed on the gist link above (which shows everything to be fine). As far as I can see there is nothing wrong with the software.

@anfho93
Copy link
Author

anfho93 commented Mar 17, 2017

Ok i tried with a new docker and this is the full process

https://gist.github.com/anfho93/fa6cbed9f56a61b9201e79c1d3dfec97

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017 via email

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017 via email

@anfho93
Copy link
Author

anfho93 commented Mar 17, 2017

This is the full process downloading the prebuilt binaries

https://gist.github.com/anfho93/1960026cb8550283b6729c7a98d68fcf

And it works, the funny fact is that a couple of days ago it was working with go1.8.

@anfho93
Copy link
Author

anfho93 commented Mar 17, 2017

I will try to re build it with go 1.7.1 and i will let you know

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

I will see if I can repro after building with 1.8.

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

There were changes in HTTP redirect handling between 1.7 and 1.8. I had to make one change to account for it: #289

Perhaps there is another place that needs changing.

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

OK, I can repro this when I build the system with 1.8. I'm looking into a fix.

@anfho93
Copy link
Author

anfho93 commented Mar 17, 2017

Thanks a lot.

@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

@anfho93 -- I believe this is now fixed on master. Do you want to pull down the latest code on master, build with Go 1.8, and see if it works now? If so, we can close this issue.

@anfho93
Copy link
Author

anfho93 commented Mar 17, 2017

Works perfectly now, thank you so much @otoolep

@anfho93 anfho93 closed this as completed Mar 17, 2017
@otoolep
Copy link
Member

otoolep commented Mar 17, 2017

Thank you @anfho93 for the report -- there was a real bug here. Glad it's fixed.

@wil222
Copy link

wil222 commented Jul 11, 2018

Hi,

I ran into the exact same issue. I build it from the sources using go1.9 on my Raspberry Pi 3 (ARM). I used the source code at commit 2eb02dc on branch master (last commit up to now).

At the end is the result of curl localhost:4001/status?pretty. Everything seems ok on the other nodes : http.redirect = "localhost:4001", and store.raft.state = "Follower".

I will try it with go1.8 and report later. I don't know a single thing about go, but if you have bugs everytime you try to upgrade your version without a single warning... Well I don't want to know more about it.

{
    "build": {
        "branch": "",
        "build_time": "",
        "commit": "",
        "version": ""
    },
    "http": {
        "addr": "127.0.0.1:4001",
        "auth": "disabled",
        "conn_idle_timeout": "1m0s",
        "conn_tx_timeout": "10s",
        "redirect": "localhost:4001"
    },
    "node": {
        "start_time": "2018-07-11T13:10:47.431260462+02:00",
        "uptime": "38m49.599303526s"
    },
    "runtime": {
        "GOARCH": "arm",
        "GOMAXPROCS": 4,
        "GOOS": "linux",
        "num_cpu": 4,
        "num_goroutine": 21,
        "version": "go1.9"
    },
    "store": {
        "addr": "127.0.0.1:4002",
        "apply_timeout": "10s",
        "conn_poll_period": "10s",
        "db_conf": {
            "DSN": "",
            "Memory": true
        },
        "dir": "/home/pi/node.1",
        "heartbeat_timeout": "1s",
        "leader": {
            "addr": "127.0.0.1:4002",
            "node_id": "localhost:4002"
        },
        "metadata": {
            "localhost:4002": {
                "api_addr": "localhost:4001"
            },
            "localhost:4004": {
                "api_addr": "localhost:4003"
            },
            "localhost:4006": {
                "api_addr": "localhost:4005"
            }
        },
        "node_id": "localhost:4002",
        "nodes": [
            {
                "id": "localhost:4002",
                "addr": "127.0.0.1:4002"
            },
            {
                "id": "localhost:4004",
                "addr": "127.0.0.1:4004"
            },
            {
                "id": "localhost:4006",
                "addr": "127.0.0.1:4006"
            }
        ],
        "raft": {
            "applied_index": "10",
            "commit_index": "10",
            "fsm_pending": "0",
            "last_contact": "0",
            "last_log_index": "10",
            "last_log_term": "12",
            "last_snapshot_index": "0",
            "last_snapshot_term": "0",
            "latest_configuration": "[{Suffrage:Voter ID:localhost:4002 Address:127.0.0.1:4002} {Suffrage:Voter ID:localhost:4004 Address:127.0.0.1:4004} {Suffrage:Voter ID:localhost:4006 Address:127.0.0.1:4006}]",
            "latest_configuration_index": "6",
            "num_peers": "2",
            "protocol_version": "3",
            "protocol_version_max": "3",
            "protocol_version_min": "0",
            "snapshot_version_max": "1",
            "snapshot_version_min": "0",
            "state": "Leader",
            "term": "12"
        },
        "snapshot_threshold": 8192,
        "sqlite3": {
            "dsn": "",
            "fk_constraints": "disabled",
            "path": ":memory:",
            "version": "3.24.0"
        }
    }
}

@wil222
Copy link

wil222 commented Jul 11, 2018

Ok, I'm a bit confused here. I tried the following combinations and none worked :

  • go1.7 on ARM
  • go1.8 on ARM
  • go1.9 on ARM
  • go1.8 on amd64
  • go1.10 on amd64

With commit 2eb02dc

All give the same result :

ThinkPad-E560 ~/rqlite $ ./rqlite
127.0.0.1:4001> CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)
0 row affected (0.000000 sec)
127.0.0.1:4001>  INSERT INTO foo(name) VALUES("fiona")
1 row affected (0.000000 sec)
127.0.0.1:4001> select * from foo
+----+-------+
| id | name  |
+----+-------+
| 1  | fiona |
+----+-------+
127.0.0.1:4001> 
EOF (CTRL+D)
ThinkPad-E560 ~/rqlite $ ./rqlite -p 4003
127.0.0.1:4003> select * from foo
ERR! unexpected end of JSON input

I am guessing I am not compiling it the way I should, even if I am following the steps of package.sh with very few modifications because the releases indeed work like they should :

REPO_URL="https://github.com/rqlite/rqlite"

VERSION=$1
RELEASE_ID=$2
API_TOKEN=$3

tmp_build=`mktemp -d`
tmp_pkg=`mktemp -d`

kernel=`uname -s`
machine=`uname -m`
if [ "$machine" == "x86_64" ]; then
    machine="amd64"
fi

mkdir -p $tmp_build/src/github.com/rqlite
export GOPATH=$tmp_build
cd $tmp_build/src/github.com/rqlite
git clone $REPO_URL
cd rqlite
go get -d ./...
branch=`git rev-parse --abbrev-ref HEAD`
commit=`git rev-parse HEAD`
kernel=`uname -s`
buildtime=`date +%Y-%m-%dT%T%z`
go install -ldflags="-X main.version=$VERSION -X main.branch=$branch -X main.commit=$commit -X main.buildtime=$buildtime" ./...

I am interested in knowing what I am doing wrong because I would like to compile it on ARM, even though I can't figure out what is wrong even for AMD.

@wil222
Copy link

wil222 commented Jul 11, 2018

Don't mind me. I got it working with v4.3.0 77e345b
However, it means there is a bug since this version.

@otoolep
Copy link
Member

otoolep commented Jul 11, 2018

OK, thanks @wil222 -- I'll take a look at top of tree, which is in active development.

@otoolep otoolep added this to the v5.0 milestone Jul 23, 2018
@otoolep otoolep reopened this Jul 23, 2018
@otoolep
Copy link
Member

otoolep commented Jul 23, 2018

Re-opening so I take a look for 5.0.

@forchid
Copy link

forchid commented Jan 26, 2019

This issue also occurs in rqlite-v5.0 on windows 8 x86_64. The bugs are at github.com/rqlite/rqlite/cmd/rqlite/query.go makeQueryRequest() and github.com/rqlite/rqlite/http/service.go handleQuery():

// github.com/rqlite/rqlite/cmd/rqlite/query.go makeQueryRequest() 
func makeQueryRequest(line string) func(string) (*http.Request, error) {
	requestData := strings.NewReader(makeJSONBody(line))
	return func(urlStr string) (*http.Request, error) {
		req, err := http.NewRequest("POST", urlStr, requestData)
		if err != nil {
			return nil, err
		}
		return req, nil
	}
}

// github.com/rqlite/rqlite/http/service.go handleQuery()
	var resp Response
	results, err := queryer.Query(&store.QueryRequest{queries, timings, isAtomic, lvl})
	if err != nil {
		if err == store.ErrNotLeader {
			leader := s.leaderAPIAddr()
			if leader == "" {
				http.Error(w, err.Error(), http.StatusServiceUnavailable)
				return
			}

			// - little-pan Bug-fix: should restore the shifted path.
			r.URL.Path = "/db/query"
			// - little-pan
			redirect := s.FormRedirect(r, leader)
			// - little-pan
			s.logger.Printf("query: redirect to %s", redirect)
			http.Redirect(w, r, redirect, http.StatusMovedPermanently)
			return

The requestData is a Reader, shouldn't be read many times in the returned func! And only jsonBody string can be reused.A fix method may be:

func makeQueryRequest(line string) func(string) (*http.Request, error) {
	//Bug-fix: requestData is a Reader, shouldn't be read many times!
	//And only jsonBody string can be reused.
	// @since 2019-01-26 little-pan
	//requestData := strings.NewReader(makeJSONBody(line))
	jsonBody := makeJSONBody(line)
	return func(urlStr string) (*http.Request, error) {
		requestData := strings.NewReader(jsonBody)
		req, err := http.NewRequest("POST", urlStr, requestData)
		if err != nil {
			return nil, err
		}
		return req, nil
	}
}

@otoolep
Copy link
Member

otoolep commented Jan 29, 2019

@Little-Pan -- I don't the problem with the code you point at. Exactly where is the reader called multiple times?

@forchid
Copy link

forchid commented Jan 29, 2019

@otoolep There are many bugs here. The first, the result func of makeQueryRequest() is called multiple times when the rqlited follower node returns http 302(eg. this node not the leader). And another bug(such issues also in other http api implementations) exists in following function:

// github.com/rqlite/rqlite/http/service.go handleQuery()
var resp Response
results, err := queryer.Query(&store.QueryRequest{queries, timings, isAtomic, lvl})
if err != nil {
	if err == store.ErrNotLeader {
		leader := s.leaderAPIAddr()
		if leader == "" {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}

		// - little-pan Bug-fix: should restore the shifted path here.
		r.URL.Path = "/db/query"
		
		redirect := s.FormRedirect(r, leader)

		// - little-pan
		s.logger.Printf("query: redirect to %s", redirect)

		http.Redirect(w, r, redirect, http.StatusMovedPermanently)
		return
// ...

When s.FormRedirect(r, leader) called in the handleQuery() func, the r.URL.Path(has been shifted in the front handler steps) doesn't be restored to original request URL path, then set as Location header of http redirection. This way can lead to "ERR! unexpected end of JSON input"(status code http 400 actually), because of the Location header of http redirection has been truncated and incorrect.

@forchid
Copy link

forchid commented Jan 29, 2019

And no such issues are in rqlite v4.4.0(http api implementation relatively simple and stable), but the redirection is a bit slow in windows 8 x86_64(the performation issue not in centos 7 x86_64).

@otoolep
Copy link
Member

otoolep commented Jan 29, 2019

OK, I'll take look when I return to the code -- thanks.

@otoolep
Copy link
Member

otoolep commented Apr 12, 2020

No known issues in 5.x.

@otoolep otoolep closed this as completed Apr 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants