Add some settings to specify a remote Openshift cluster URL #111

barkbay · 2017-11-02T16:35:11Z

Hi,

Our ElasticSearch cluster is hosted on a remote K8S cluster. Therefore we have to be able to set the URL of the Openshift cluster when the plugin wants to check the authorizations of a user.
This PR allows you to specify a remote Openshift cluster URL while allowing the Kubernetes plugin to discover the topology of the ElasticSearch cluster the usual way.
I would love to hear your thoughts.

Thanks

richm · 2017-11-02T17:00:17Z

src/main/java/io/fabric8/elasticsearch/plugin/OpenshiftRequestContextFactory.java

@@ -64,6 +67,9 @@ public OpenshiftRequestContextFactory(final Settings settings, final RequestUtil
                ConfigurationSettings.DEFAULT_OPENSHIFT_OPS_PROJECTS);
        this.kibanaPrefix = settings.get(ConfigurationSettings.KIBANA_CONFIG_INDEX_NAME, ConfigurationSettings.DEFAULT_USER_PROFILE_PREFIX);
        this.kibanaIndexMode = settings.get(ConfigurationSettings.OPENSHIFT_KIBANA_INDEX_MODE, UNIQUE);
+        this.openshiftMasterUrl = settings.get(ConfigurationSettings.OPENSHIFT_MASTER, ConfigurationSettings.DEFAULT_MASTER);
+        this.openshiftCaPath = settings.get(ConfigurationSettings.OPENSHIFT_CA_PATH, null);


What is the default CA path?

There is no default CA path. If this parameter is not set by the user then the CA detected or set by the K8S plugin is not overwritten.

richm · 2017-11-02T17:02:43Z

src/main/java/io/fabric8/elasticsearch/plugin/OpenshiftRequestContextFactory.java

+            if (openshiftCaPath != null) {
+                builder.withCaCertFile(openshiftCaPath);
+            }
+


Are there additional types of Exceptions which can be thrown and need to be handled? e.g. if the user specifies an incorrect openshiftCaPath will the api throw a FileNotFound or PermissionDenied or some other type of exception? I want to make sure e.g. if the user made a typo they can easily identify the problem, or if the user did not set the right file permission, etc.

If the CA path is incorrect the following exception is thrown :

io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:57)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:137)
    at io.fabric8.kubernetes.client.BaseClient.<init>(BaseClient.java:41)
    at io.fabric8.openshift.client.DefaultOpenShiftClient.<init>(DefaultOpenShiftClient.java:174)
    at io.fabric8.openshift.client.DefaultOpenShiftClient.<init>(DefaultOpenShiftClient.java:170)
    at io.fabric8.elasticsearch.plugin.OpenshiftClientFactory.create(OpenshiftClientFactory.java:50)
    [....]
Caused by: java.io.FileNotFoundException: /tmp/junit2122595593726896985/ca.crt.does_not_exist (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at io.fabric8.kubernetes.client.internal.CertUtils.getInputStreamFromDataOrFile(CertUtils.java:55)
    at io.fabric8.kubernetes.client.internal.CertUtils.createTrustStore(CertUtils.java:61)
    at io.fabric8.kubernetes.client.internal.SSLUtils.trustManagers(SSLUtils.java:113)
    at io.fabric8.kubernetes.client.internal.SSLUtils.trustManagers(SSLUtils.java:107)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:68)
    ... 29 more

In case of bad permissions :

io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:57)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:137)
    at io.fabric8.kubernetes.client.BaseClient.<init>(BaseClient.java:41)
    at io.fabric8.openshift.client.DefaultOpenShiftClient.<init>(DefaultOpenShiftClient.java:174)
    at io.fabric8.openshift.client.DefaultOpenShiftClient.<init>(DefaultOpenShiftClient.java:170)
    at io.fabric8.elasticsearch.plugin.OpenshiftClientFactory.create(OpenshiftClientFactory.java:50)
    [...]
Caused by: java.io.FileNotFoundException: /tmp/you_cant_read_it.crt (Permission denied)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at io.fabric8.kubernetes.client.internal.CertUtils.getInputStreamFromDataOrFile(CertUtils.java:55)
    at io.fabric8.kubernetes.client.internal.CertUtils.createTrustStore(CertUtils.java:61)
    at io.fabric8.kubernetes.client.internal.SSLUtils.trustManagers(SSLUtils.java:113)
    at io.fabric8.kubernetes.client.internal.SSLUtils.trustManagers(SSLUtils.java:107)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:68)
    ... 29 more
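
In both traces the root cause (a missing or unreadable file) only appears in the nested "Caused by" section. As a possible way to make a typo or a permission problem easier to spot, here is a minimal sketch of an up-front check that uses only the JDK. The class CaPathCheck and the method describeProblem are hypothetical names for illustration; they are not part of the plugin or of the fabric8 client.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the plugin: explains why a CA file is unusable.
public class CaPathCheck {

    // Returns null when the path looks usable, otherwise a human-readable reason.
    public static String describeProblem(String caPath) {
        if (caPath == null) {
            return null; // nothing configured: the client keeps the CA it detected
        }
        Path path = Paths.get(caPath);
        if (!Files.exists(path)) {
            return "CA file does not exist: " + caPath;
        }
        if (!Files.isRegularFile(path)) {
            return "CA path is not a regular file: " + caPath;
        }
        if (!Files.isReadable(path)) {
            return "CA file is not readable (check permissions): " + caPath;
        }
        return null;
    }

    public static void main(String[] args) {
        String problem = describeProblem(args.length > 0 ? args[0] : null);
        System.out.println(problem == null ? "CA path OK" : problem);
    }
}
```

Such a check could run before the client is created, so that the log states the reason directly instead of relying on the nested cause.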

richm · 2017-11-02T17:03:20Z

Please also add some unit tests for this new feature

jcantrill · 2017-11-03T18:08:39Z

I'm not certain this change is required at all, since you can alter all of these values using env variables [1]. Going further, if we are going to accept these changes, I would prefer using if checks to build up the config instead of providing defaults and setting them. I think we should prefer the client defaults, as we do now, instead of redefining them.

[1] https://github.com/fabric8io/kubernetes-client
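
The "if checks" approach could look roughly like the sketch below. It is only an illustration under stated assumptions: ClientConfig is a hypothetical stand-in for the Kubernetes client's own configuration object, its default values are made up to mimic what a client might detect inside a cluster, the setting keys are assumed, and a plain Map replaces the Elasticsearch Settings object so that the example runs with the JDK alone.

```java
import java.util.HashMap;
import java.util.Map;

public class RemoteClusterConfig {

    // Hypothetical stand-in for the client's config; the initial values mimic
    // what a client might have auto-detected from inside the local cluster.
    public static final class ClientConfig {
        public String masterUrl = "https://kubernetes.default.svc";
        public String caCertFile = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt";
    }

    // Override a value only when the user set it; otherwise keep the client default.
    public static ClientConfig build(Map<String, String> settings) {
        ClientConfig config = new ClientConfig();
        String masterUrl = settings.get("openshift.master.url"); // assumed key
        if (masterUrl != null) {
            config.masterUrl = masterUrl;
        }
        String caPath = settings.get("openshift.ca.path"); // assumed key
        if (caPath != null) {
            config.caCertFile = caPath;
        }
        return config;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("openshift.master.url", "https://openshift.example.com:8443");
        ClientConfig config = build(settings);
        System.out.println(config.masterUrl);  // overridden by the setting
        System.out.println(config.caCertFile); // untouched client default
    }
}
```

With this shape no plugin-side default is needed: an absent setting simply leaves whatever the client resolved on its own.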

barkbay · 2017-11-03T18:57:57Z

Not sure we can use env variables since the K8S client is already used inside a local Kubernetes cluster.
In other words we have to manage two different K8S clusters within the same JVM.
How would we handle this use case with env variables?

barkbay · 2017-11-04T09:35:35Z

I'm working on a new PR that will add some unit tests and preserve the client defaults instead of redefining them : https://github.com/fabric8io/openshift-elasticsearch-plugin/compare/master...barkbay:external_openshift.diff

I will update this PR once I have had time to run some e2e integration tests against our clusters.

jcantrill · 2017-11-10T13:42:35Z

@barkbay Please clarify the use case you have:

Elasticsearch running on a Kubernetes Cluster
Logs from an Openshift cluster are written to ES on Kubernetes Cluster

This LGTM other than what appears to be an unexpected deployment topology. Please also rebase and squash the commits.

barkbay · 2017-11-12T12:56:27Z

A picture is worth a thousand words ;)

Today our ElasticSearch cluster is deployed with Ansible but we are working on a new deployment using recent Kubernetes features like statefulset and local volumes.
I will rebase and squash the commits.

jcantrill · 2017-11-12T21:25:16Z

@barkbay FYI, we have been slow to adopt statefulsets due to internal discussions regarding their ability to correctly support the ES use case.
I will put together a PR against openshift/origin-aggregated-logging to run these changes through our CI tests.

barkbay · 2017-11-13T12:22:37Z

@jcantrill Could you tell me a little bit more about the drawbacks of deploying ES with statefulsets ?
Thank you for the PR.

portante · 2017-11-13T21:19:59Z

@barkbay, from https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/, the section on "Deployment and Scaling Guarantees":

For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.
When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.
Before a scaling operation is applied to a Pod, all of its predecessors must be Running and Ready.
Before a Pod is terminated, all of its successors must be completely shutdown.

The StatefulSet should not specify a pod.Spec.TerminationGracePeriodSeconds of 0. This practice is unsafe and strongly discouraged. For further explanation, please refer to force deleting StatefulSet Pods.

When the nginx example above is created, three Pods will be deployed in the order web-0, web-1, web-2. web-1 will not be deployed before web-0 is Running and Ready, and web-2 will not be deployed until web-1 is Running and Ready. If web-0 should fail, after web-1 is Running and Ready, but before web-2 is launched, web-2 will not be launched until web-0 is successfully relaunched and becomes Running and Ready.

If a user were to scale the deployed example by patching the StatefulSet such that replicas=1, web-2 would be terminated first. web-1 would not be terminated until web-2 is fully shutdown and deleted. If web-0 were to fail after web-2 has been terminated and is completely shutdown, but prior to web-1’s termination, web-1 would not be terminated until web-0 is Running and Ready.

Pods are created and terminated in a specific order, which means that if a given pod fails to come up, or fails to terminate, then we have a problem.

How do statefulsets handle the case where we have to apply maintenance to one member out of order? Disk failure, etc.?

Do statefulsets allow dynamic scaling? For example oc scale dc es --replicas=3? If we consider the case where the replica set had 5 to start with, isn't it very easy to drop state quickly this way?

For Elasticsearch, how do statefulsets know when it is okay to remove those two pods? When data is properly replicated? How have we told Elasticsearch that the expected cluster size is now 3 instead of 5?

@smarterclayton

barkbay · 2017-11-14T08:48:44Z

Pods are created and terminated in a specific order because of the default OrderedReady pod management policy.
With Parallel pod management the second pod does not wait for the first to be ready:

Parallel Pod Management
Parallel pod management tells the StatefulSet controller to launch or terminate all Pods in parallel, and to not wait for Pods to become Running and Ready or completely terminated prior to launching or terminating another Pod.

With two replicas and the OrderedReady pod management policy, the second pod is not created:

{17-11-14 8:11}124-100:~ root# kubectl get pods -n elastic1 -o wide                                                                                                                                                                                 
NAME                         READY     STATUS    RESTARTS   AGE       IP               NODE 
[...]
es-data-0                    0/1       Pending   0          2m        <none>           <none>

With Parallel pod management policy :

{17-11-14 8:35}124-100:~ root# kubectl get pods -n elastic1 -o wide                                                                                                                                                                                
NAME                         READY     STATUS    RESTARTS   AGE       IP               NODE                              
[...]                        
es-data-0                    0/1       Pending   0          14m       <none>           <none>                            
es-data-1                    1/1       Running   0          14m       10.233.109.236   124-103

124-100:~ root# kubectl uncordon 124-101
node "124-101" uncordoned

124-100:~ root# kubectl get pods -n elastic1 -o wide
NAME                         READY     STATUS            RESTARTS   AGE       IP               NODE                      
[...]                 
es-data-0                    0/1       PodInitializing   0          18m       10.233.98.152    124-101                   
es-data-1                    1/1       Running           0          18m       10.233.109.236   124-103
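
The difference between the two policies can be modelled with a small, self-contained simulation. This is only a toy model of the behaviour shown in the output above, not the StatefulSet controller's actual logic; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PodManagementDemo {

    // OrderedReady: pod i is created only once every predecessor is Running and Ready.
    public static List<String> orderedReady(boolean[] canBecomeReady) {
        List<String> created = new ArrayList<>();
        for (int i = 0; i < canBecomeReady.length; i++) {
            created.add("es-data-" + i);
            if (!canBecomeReady[i]) {
                break; // this pod stays Pending, so its successors are never created
            }
        }
        return created;
    }

    // Parallel: every pod is created at once, regardless of its neighbours' state.
    public static List<String> parallel(boolean[] canBecomeReady) {
        List<String> created = new ArrayList<>();
        for (int i = 0; i < canBecomeReady.length; i++) {
            created.add("es-data-" + i);
        }
        return created;
    }

    public static void main(String[] args) {
        // es-data-0 cannot be scheduled yet (its node is cordoned); es-data-1 can.
        boolean[] canBecomeReady = {false, true};
        System.out.println("OrderedReady: " + orderedReady(canBecomeReady));
        System.out.println("Parallel:     " + parallel(canBecomeReady));
    }
}
```

In this model OrderedReady yields only es-data-0 (stuck in Pending), while Parallel yields both es-data-0 and es-data-1, matching the two kubectl listings.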

For Elasticsearch, how do statefulsets know when it is okay to remove those two pods?

Nothing specific to ES or K8S here. IMHO dropping more than one node at the same time is a very bad idea with any distributed storage system (ES, Cassandra, Zookeeper, Ceph).

barkbay · 2018-10-16T06:14:39Z

Hi,

Could you please have another look at this PR? We have to maintain our own build pipeline for this very small patch, and we would be glad to have it merged into the plugin source code.

Thanks.

jcantrill · 2018-10-16T13:03:22Z

/ok-to-test

fusesource-ci · 2018-10-16T13:03:42Z

License check failed: run mvn -N license:format to update all licenses, commit, squash & force push please.

jcantrill

minor nits

jcantrill · 2018-10-16T13:05:04Z

src/main/java/io/fabric8/elasticsearch/plugin/OpenshiftAPIService.java

@@ -25,6 +25,7 @@
 import org.apache.logging.log4j.Logger;
 import org.elasticsearch.ElasticsearchException;
 import org.elasticsearch.ElasticsearchSecurityException;
+import org.elasticsearch.common.inject.Inject;


Remove. DI is done manually in lieu of using a library to wire dependencies.

jcantrill · 2018-10-16T13:06:24Z

src/main/java/io/fabric8/elasticsearch/plugin/ConfigurationSettings.java

@@ -89,6 +89,10 @@
    static final String OPENSHIFT_ACL_ROLE_STRATEGY = "openshift.acl.role_strategy";
    static final String DEFAULT_ACL_ROLE_STRATEGY = "user";

+    static final String OPENSHIFT_MASTER = "openshift.master";


Please set to 'openshift.master.url' to be consistent with the config setting

fusesource-ci · 2018-12-07T13:27:40Z

License check failed: run mvn -N license:format to update all licenses, commit, squash & force push please.

fusesource-ci · 2018-12-07T13:30:07Z

Tests failed.

fusesource-ci · 2018-12-07T17:39:46Z

Tests failed.

richm · 2018-12-07T21:28:51Z

src/main/java/io/fabric8/elasticsearch/plugin/PluginSettings.java

@@ -64,6 +67,15 @@ public PluginSettings(final Settings settings) {
        this.opsIndexPatterns = new HashSet<String>(Arrays.asList(settings.getAsArray(OPENSHIFT_KIBANA_OPS_INDEX_PATTERNS, DEFAULT_KIBANA_OPS_INDEX_PATTERNS)));
        this.expireInMillis = settings.getAsLong(OPENSHIFT_ACL_EXPIRE_IN_MILLIS, new Long(1000 * 60));

+        this.masterUrl = settings.get(OPENSHIFT_MASTER);


settings.get returns null if there is no such setting?

yes, this is why the reference is tested at https://github.com/fabric8io/openshift-elasticsearch-plugin/pull/111/files#diff-064e72178bca90921d549aa255acd4b6R197

barkbay · 2018-12-12T09:35:13Z

Do you think that this PR can be merged for the next release ?

richm

lgtm - @jcantrill what say you?

jcantrill · 2018-12-12T20:56:19Z

[merge]

fusesource-ci · 2018-12-12T20:57:15Z

Merge failed.

jcantrill · 2018-12-12T21:08:25Z

@barkbay please look at the merge test failures. I'm not certain why this would not be caught in the test job

jcantrill · 2022-05-12T17:12:21Z

closed as stale

richm reviewed Nov 2, 2017

barkbay force-pushed the external_openshift_rc1 branch from e135ef0 to 0156dec Compare November 12, 2017 13:06

barkbay force-pushed the external_openshift_rc1 branch 3 times, most recently from f8fe852 to c6b98f5 Compare July 12, 2018 09:17

barkbay force-pushed the external_openshift_rc1 branch from c6b98f5 to fc96d27 Compare October 9, 2018 14:11

jcantrill reviewed Oct 16, 2018

jcantrill added enhancement release/5.6.10 labels Oct 16, 2018

barkbay force-pushed the external_openshift_rc1 branch from fc96d27 to 78fc68f Compare October 17, 2018 03:54

barkbay added a commit to barkbay/origin-aggregated-logging that referenced this pull request Oct 17, 2018

Update openshift-elasticsearch-plugin according to fabric8io/openshif…

b4cea41

…t-elasticsearch-plugin#111

barkbay force-pushed the external_openshift_rc1 branch from 78fc68f to 90a6109 Compare December 7, 2018 13:27

barkbay force-pushed the external_openshift_rc1 branch from 90a6109 to 271132c Compare December 7, 2018 13:28

barkbay force-pushed the external_openshift_rc1 branch from 271132c to cdb07a7 Compare December 7, 2018 13:34

barkbay force-pushed the external_openshift_rc1 branch from cdb07a7 to ab1178a Compare December 7, 2018 17:39

barkbay force-pushed the external_openshift_rc1 branch from ab1178a to 529c4c1 Compare December 7, 2018 17:42

richm reviewed Dec 7, 2018

barkbay force-pushed the external_openshift_rc1 branch from 529c4c1 to 627d18c Compare December 10, 2018 10:35

Add some settings to specify a remote Openshift cluster URL

03168f5

barkbay force-pushed the external_openshift_rc1 branch from 627d18c to 03168f5 Compare December 10, 2018 10:48

richm approved these changes Dec 12, 2018

barkbay mentioned this pull request Dec 13, 2018

Dead code with side effect in test OpenshiftAPIServiceTest #167

Open

jcantrill closed this May 12, 2022
