This repository has been archived by the owner on Jun 4, 2024. It is now read-only.

ElasticSearch v 5 #316

Closed
shannonlal opened this issue Dec 13, 2017 · 21 comments

@shannonlal
Contributor

I recently started using Falcon SQL and we are interested in using it (with ChartBuilder) to query our ElasticSearch cluster. We are using version 5.4 of ElasticSearch and I am running into a problem trying to connect to it with the Falcon SQL Client. When I attempt to connect to the sample ElasticSearch database (https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io) I am able to connect and view the indexes; however, I am not able to do so with our production system. I noticed that the test system (URL provided above) is currently using ElasticSearch 2.4. Does anyone know if the Falcon SQL Client can connect to ElasticSearch 5? As a side note, we have been able to connect Grafana to our ElasticSearch cluster without any issues.

I have experience writing Node code against ElasticSearch (primarily with https://www.npmjs.com/package/elasticsearch) and would be willing to create a new database connector. I have read through the steps to define a new connector in issue #261 and would be willing to give it a try if people feel that this would be useful.

Any thoughts or comments?

Thanks

@n-riesco
Contributor

Thank you for the offer!

This must be a bug. I've just tested a local installation of [email protected] with the sample dataset here and Falcon is working.

A mistake I keep making with the elasticsearch connector is to forget to set the port number. Falcon should be smarter about this.

Please let me know if setting the port number fixes the issue for you. If it doesn't, could you provide us with a sample dataset to reproduce the issue?

@shannonlal
Contributor Author

Nicolas,
Let me look into this. I think the issue is that I have an NGINX server in front of this, so adding the port number may be causing an issue. I have the code and will run some tests tonight. A lot of people put an NGINX server in front of ElasticSearch and Kibana so that they can do basic authentication or use a JWT token. The only auth mechanism for Elastic Search is X-Pack, which starts around $13,000/year, so a lot of people use NGINX as a workaround on a small cluster. I will update this once I have results.

Thanks for the response

@shannonlal
Contributor Author

I did some testing tonight and identified one of the problems. Because I have NGINX in front of my Elastic Search cluster, I removed the port from the URL in the following file:
Elasticsearch.js

//Original
const url = `${host}:${port}/${relativeUrl}?format=json${queryStringParams}`;
//Updated
const url = `${host}/${relativeUrl}?format=json${queryStringParams}`;

With this change I am able to connect to my ElasticSearch cluster (behind NGINX) and view the list of indexes.

However, now that I can see the indexes, I am getting a 400 error. I looked at the sample data used by Falcon (https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io), and when I select the index Test-Types and the type Elastic-2.4-Types I also get a 400 error. The error message says the following:
ERROR {"status":400,"content":{"error":{"message":"Cannot read property 'properties' of undefined"}}}

When I switch to the following index (Plotly_datasets) and type (Consumer_complaints) it seems to work fine

Anyone able to reproduce the 400 error with ElasticSearch sample data?

I am going to keep stepping through the code to see where the 400 error is coming from; however, any information would be greatly appreciated.

@n-riesco
Contributor

> I am going to keep stepping through the code to see where the 400 error is coming from; however, any information would be greatly appreciated.

If I were you, I'd use the command line instead to determine what the correct URL is. Falcon only uses 3 entrypoints in an elasticsearch server:

  • _cat/indices
$ curl -XGET 'https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io:9243/_cat/indices'

yellow open test-types      1 1      3 0     8kb     8kb 
yellow open plotly_datasets 1 1  28187 0   5.8mb   5.8mb 
yellow open test-scroll     1 1 200000 0  42.5mb  42.5mb 
yellow open live-data       1 1   1000 0 574.3kb 574.3kb 
yellow open sample-data     1 1 200111 0  42.8mb  42.8mb 
  • _all/_mappings
$ curl -XGET 'https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io:9243/_all/_mappings' | python -m json.tool

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2701  100  2701    0     0   4099      0 --:--:-- --:--:-- --:--:--  4098
{
    "live-data": {
        "mappings": {
            "test-type": {
                "properties": {
[...]
                }
            }
        }
    }
}
  • ${index}/${type}/_search
$ curl -XGET 'https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io:9243/live-data/test-type/_search' | python -m json.tool

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5085  100  5085    0     0   7386      0 --:--:-- --:--:-- --:--:--  7380
{
    "_shards": {
        "failed": 0,
        "successful": 1,
        "total": 1
    },
    "hits": {
        "hits": [
...
        ],
        "max_score": 1.0,
        "total": 1000
    },
    "timed_out": false,
    "took": 1
}
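
The `_all/_mappings` output above nests each type's fields as index, then `mappings`, then type, then `properties`. As a rough illustration of how an error such as "Cannot read property 'properties' of undefined" can arise when an index or type key is missing from a response of this shape, here is a minimal sketch of a guarded lookup. The function name `getProperties` and the sample object (including its field names) are hypothetical and are not taken from Falcon's code.

```javascript
// Hypothetical helper (not Falcon's code): look up the field definitions for
// one index/type pair in a response shaped like the `_all/_mappings` output.
function getProperties(mappingsResponse, index, type) {
    const indexEntry = mappingsResponse[index];
    if (!indexEntry || !indexEntry.mappings) {
        return null; // unknown index
    }
    const typeEntry = indexEntry.mappings[type];
    if (!typeEntry) {
        // Without this guard, `typeEntry.properties` would throw a TypeError.
        return null;
    }
    return typeEntry.properties || null;
}

// Illustrative sample with made-up field names.
const sample = {
    'live-data': {
        mappings: {
            'test-type': {properties: {timestamp: {type: 'date'}}}
        }
    }
};

console.log(getProperties(sample, 'live-data', 'test-type')); // the field definitions
console.log(getProperties(sample, 'live-data', 'Test-Type')); // null: keys are case-sensitive
```

Returning `null` instead of throwing lets the caller decide how to report a missing index or type, rather than surfacing a generic TypeError.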

@shannonlal
Contributor Author

Yup that is exactly what I am doing. The issue that I am running into is that I am getting a response from curl and I am getting a response in the connect function (which does not seem to be a 400 error) within ElasticSearch.js; however, on the front end it is returning a 400 error. I think the error might be in the parse.js file and my next step is going to be to step through this file to see where the error is.

If you have any ideas let me know; otherwise, I will keep you posted with my progress

@shannonlal
Contributor Author

I have been testing the Elastic Search sample data and I have been getting some 400 errors. These errors are similar to the errors I am getting when I connect to my DB. I wanted to see if anyone else is able to produce the same errors:

Sample Elastic Search DB

URL: https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io
Port: 9243

When I try the following indexes and types I get a 400 error

Index: Test-Types
Type: Elastic-2.4-Types

Index: Live-Data
Type: Test-Type

Index: Sample-Data
Type: Test-Type, Test-Ranges, Test-Scroll

Index: Test-Scroll
Type: 200k

The following indexes work without any issue

Index: Plotly_datasets
Type: Ebola_2014, Consumer_complaints

Is anyone else able to reproduce the same errors?

Thanks

@n-riesco
Contributor

> Is anyone else able to reproduce the same errors?

I can't reproduce any of these errors.

When you get the 400 error, could you open the dev console (View > Toggle Developer Tools) and see if there are any relevant error messages?

@shannonlal
Contributor Author

I just installed a new Falcon Client (2.3.3-beta) on another laptop and I was able to successfully connect to the Sample Elastic Search client without any issues. I tested all the indexes and types and I was able to view the data. When I tried with my endpoint I was getting an error (see below). The first cause of the error is due to the port issue #317. Once this issue has been fixed it will allow me (and others) to connect to ES behind an Nginx server.

On my dev laptop I had put in a fix which allowed me to connect to my Elastic Search cluster (if the port is not defined, it is not included in the URL). I was able to get the list of indexes and their types; however, I was still getting the weird 400 error. I was stepping through the code trying to trace the error and I had a question for you:

  1. I was able to step through the code from Settings.React.js all the way to sessions.js (getElasticsearchMappings). I see that this calls apiThunk, which dispatches to redux, but I am a little bit lost on how it gets to the backend (backend/routes.js) and calls the route:

//line 566
server.post('/connections/:connectionId/elasticsearch-mappings',
function elasticsearchMappingsHandler(req, res, next) {

If you can shed any light on this it would be greatly appreciated.

Error from Dev Console on Falcon Client:

Uncaught TypeError: Cannot read property 'statusCode' of undefined
    at OptionsDropdown.renderElasticsearchDocs (OptionsDropdown.react.js:113)
    at OptionsDropdown.render (OptionsDropdown.react.js:139)

@n-riesco
Contributor

n-riesco commented Feb 1, 2018

@shannonlal Would you be able to check if setting the port to 80 works with your Nginx setup?

@shannonlal
Contributor Author

I just checked and port 80 does not solve the problem. I am getting a 404 when I try to go to the following:

https://company-nginx-server-elaticsearch-url:80/_all/_mappings?format=json

I added the following into
backend/persistent/datastore/Elasticsearch.js in the request function

function request(relativeUrl, connection, {body, method, queryStringParams = ''}) {
    const {host, port, username, password} = connection;
    let url;
    if( typeof port !== 'undefined' && port !== ''){
        url = `${host}:${port}/${relativeUrl}?format=json${queryStringParams}`;
    }else{
        url = `${host}/${relativeUrl}?format=json${queryStringParams}`;
    }

When I leave port empty I am able to hit my Elastic Search server and get the list of indexes back.

Your thoughts? I can merge this into a PR if you want
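
The change above could also be factored into a small helper. This is a minimal sketch, assuming the same `host`, `port`, `relativeUrl` and `queryStringParams` values as in the snippet; the name `buildUrl` and the example host are hypothetical, and this is not necessarily the code that ended up in the PR.

```javascript
// Hypothetical helper: include the port in the URL only when one was provided.
function buildUrl(host, port, relativeUrl, queryStringParams = '') {
    const hasPort = port !== undefined && port !== null && String(port).trim() !== '';
    const base = hasPort ? `${host}:${port}` : host;
    return `${base}/${relativeUrl}?format=json${queryStringParams}`;
}

// Port left empty (e.g. a cluster behind NGINX on the scheme's default port).
console.log(buildUrl('https://es.example.com', '', '_cat/indices'));
// → https://es.example.com/_cat/indices?format=json

// Port set explicitly.
console.log(buildUrl('https://es.example.com', 9243, '_cat/indices'));
// → https://es.example.com:9243/_cat/indices?format=json
```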

@shannonlal
Contributor Author

@n-riesco, when I add the above code snippet I am able to connect to the Elastic Search server behind NGINX. I am able to see the indexes and documents. I am noticing some weird 500 and timeout errors that I am currently investigating. I am getting 400 or 500 errors even though the debugger (through Electron) shows that no network calls have been made. I am also looking into putting together an updated query view so users can enter their own Elastic Search query (i.e. beyond just selecting everything for an index and document type). I should have something over the weekend. Let me know if you need my help with something else.

@n-riesco
Contributor

n-riesco commented Feb 10, 2018

@shannonlal Since the url is https, could you try whether setting the port to 8080 fixes the issue?

I'm trying to determine whether we can use 80 (for http) and 443 (for https) as default ports.

One more thing: would you create a PR for this?
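
For reference, the default ports under discussion (80 for http, 443 for https) could be derived from the URL scheme. The following is a minimal sketch using Node's built-in WHATWG `URL` class; the function name `effectivePort` and the `DEFAULT_PORTS` table are hypothetical, and nothing here is part of Falcon.

```javascript
// Hypothetical helper: return the port a URL would actually be requested on.
// `new URL(...).port` is an empty string when no port is given (or when the
// given port is the scheme's default), so fall back on the protocol.
const DEFAULT_PORTS = {'http:': 80, 'https:': 443};

function effectivePort(urlString) {
    const url = new URL(urlString);
    if (url.port !== '') {
        return Number(url.port);
    }
    return DEFAULT_PORTS[url.protocol]; // undefined for other schemes
}

console.log(effectivePort('https://es.example.com'));      // 443
console.log(effectivePort('http://es.example.com'));       // 80
console.log(effectivePort('https://es.example.com:9243')); // 9243
```

A lookup like this only describes what the scheme implies; it does not change the request itself, so it is compatible with simply omitting the port from the URL when the user leaves it unset.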

@shannonlal
Contributor Author

No problem on the PR. Just a question: would you prefer to have a proxy flag instead of ignoring the port? I.e., have a proxy option (similar to SSL) on the Connection page; if it is enabled, the port is ignored. Or we can just not include the port if it is not provided. Just let me know and I will put it together this weekend.

@n-riesco
Contributor

> Just a question: would you prefer to have a proxy flag instead of ignoring the port? I.e., have a proxy option (similar to SSL) on the Connection page.

I don't have a strong opinion about this. I feel ignoring the port has more pros:

  • it fixes the way we build the URL (at the moment it's broken when the port is unset)

and fewer cons:

  • no need to add logic to check that the user sets both the port and the proxy flag.

One question: would this PR fix all the issues you've seen with elastic search v5?

@shannonlal
Contributor Author

This would fix one of the issues. There are two other issues I have seen with ElasticSearch:

  1. If the query response fails it is difficult to recover from (I am not sure if this is only about Elastic Search)
  2. There is no way to add a query string to Elastic Search. Essentially, our Elastic Search query is doing a select * from all documents within the specific index. There is no way to pass in a query to get, for example, all documents whose timestamp is within the last 5 minutes. I have been looking into this one and have been experimenting with a separate Preview Screen.

@shannonlal
Contributor Author

@n-riesco One last note. For this ES port issue. Which branch should I merge this into ?

@n-riesco
Contributor

@shannonlal

> If the query response fails it is difficult to recover from (I am not sure if this is only about Elastic Search)

This is a UI issue affecting all connectors (we have a few of those, e.g. #355).

PR #343 is getting too big (I'm having second thoughts and I don't really want to touch the UI in this PR). I'm tempted to merge it as it is; or perhaps, after adding the dockerfile and the test specs for ES v6.

> There is no way to add in query string to Elastic Search.

I think this can be done touching only the UI side. I'd rather have this implemented in a separate PR.

> For this ES port issue. Which branch should I merge this into ?

Please, use master (at the moment, #343 doesn't touch Elasticsearch.js; and if it eventually does, I'll handle the conflicts on #343).

@shannonlal
Contributor Author

@n-riesco I just submitted a small PR for this issue. I agree that we should keep this PR small. I have addressed the issue of it working behind NGINX. I think you are working on ElasticSearch v6. I can look at validation of the connection

> do not use _cat/indices to validate the connection (it doesn't fail when the user forgot to provide a port number)

Did you have some initial ideas on how you would like to fix this issue? I can take a stab at this.

@n-riesco
Contributor

n-riesco commented Feb 10, 2018

@shannonlal

> do not use _cat/indices to validate the connection (it doesn't fail when the user forgot to provide a port number)

This was a misunderstanding on my side. The ES server we're using for testing works, even when the URL doesn't set the port number (i.e. the server is listening on both ports 443 and 9243).

Maybe, in the future, when we deal with backwards-incompatible changes in ES v6+, we can request / instead and store the ES version. E.g.:

$ curl -XGET 'https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io/?pretty'
{
  "name" : "instance-0000000004",
  "cluster_name" : "67a7441549120daa2dbeef8ac4f5bb2e",
  "cluster_uuid" : "HfmIoztqSeGoQdJ74_hVyA",
  "version" : {
    "number" : "2.4.1",
    "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
    "build_timestamp" : "2016-09-27T18:57:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}

@shannonlal
Contributor Author

@n-riesco I ran a couple of tests and wanted to run a couple of things by you to see if this makes sense.

Our main objective inside the connect method (backend/persistent/datastores/ElasticSearch.js) is to ensure that the connection parameters work. This means inside the connect method, instead of

/**
 * Currently it gets the indices. Example:
 * https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io:9243/_cat/indices
 */
export function connect(connection) {
    console.log( 'Start Elastic Search Connect');
    return request('_cat/indices/', connection, {method: 'GET'});
}

We want something like

/**
 * https://67a7441549120daa2dbeef8ac4f5bb2e.us-east-1.aws.found.io:9243/?pretty
 */
export function connect(connection) {
    console.log( 'Start Elastic Search Connect');
    return request('?pretty', connection, {method: 'GET'});
}

This should go off to the ES server and make sure that it is able to get the "You Know, for Search" tagline. This will also make sure that the port and URL are configured correctly. I have tested this on the plot.ly ES server and on my ES server behind an NGINX server.

Is this the direction that you were thinking? If it is I could put in a small fix and test this and then send in a PR for this.

Let me know your thoughts

@n-riesco
Contributor

> Is this the direction that you were thinking?

Yes, it is (the reason also being that the entry point / returns the server version in version.number, something that we will need in the future and that we could store in connection.version).
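
As a rough sketch of that idea, the version could be read from the body returned by `/` and kept alongside the connection. The function name `extractVersion` and the example host are hypothetical, `connection.version` is only the field name suggested above, and the sample body is abridged from the response shown earlier in this thread.

```javascript
// Hypothetical helper: pull the version out of the JSON body returned by `/`.
function extractVersion(rootResponse) {
    const number = rootResponse && rootResponse.version && rootResponse.version.number;
    if (typeof number !== 'string') {
        throw new Error('Response does not look like an Elasticsearch root response');
    }
    // Major version as an integer, e.g. to branch on differences between releases.
    const major = parseInt(number.split('.')[0], 10);
    return {number, major};
}

// Abridged sample body.
const body = {version: {number: '2.4.1'}, tagline: 'You Know, for Search'};

const connection = {host: 'https://es.example.com'};
connection.version = extractVersion(body).number;
console.log(connection.version); // 2.4.1
```

Throwing when `version.number` is missing also doubles as a connection check: a proxy error page or an unrelated server would not pass it.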
