[BUG] Registering MinIO (S3) snapshot repository fails with "Connect timed out" #16305

Open
pjuri opened this issue Oct 14, 2024 · 7 comments
Labels
bug Something isn't working Plugins

Comments

pjuri commented Oct 14, 2024

Describe the bug

I’m running OpenSearch as part of a Graylog Helm installation under Kubernetes. I’m trying to register a snapshot repository backed by MinIO. I’m following this document: https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore/

When I try to register the repository with curl (using the REST API), I get a "Connect timed out" error. Using tcpdump, I can see that no connection to the provided IP address is attempted. When I manually test the connection to MinIO with curl, it works, so it’s not a network issue.

If I remove the s3.client.default.endpoint setting, I can see OpenSearch connecting to Amazon servers, which is not what I want.

I suspect this might be just a misconfiguration, but no matter what I try, I get the same results.

Related component

Plugins

To Reproduce

[opensearch@opensearch-cluster-master-0 ~]$ opensearch-keystore create
An opensearch keystore already exists. Overwrite? [y/N]y
Created opensearch keystore in /usr/share/opensearch/config/opensearch.keystore
[opensearch@opensearch-cluster-master-0 ~]$ opensearch-keystore add s3.client.default.access_key
Enter value for s3.client.default.access_key:
[opensearch@opensearch-cluster-master-0 ~]$ opensearch-keystore add s3.client.default.secret_key
Enter value for s3.client.default.secret_key:
[opensearch@opensearch-cluster-master-0 ~]$ grep s3.client.default config/opensearch.yml
s3.client.default.protocol: "http"
s3.client.default.endpoint: "http://1.2.3.4:9000/"
s3.client.default.path_style_access: "true"

I did the steps above on all 3 cluster members.
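Before looking further, it can help to rule out simple typos in the client settings. The following is a minimal sketch, not part of OpenSearch: `check_s3_client_settings` is a hypothetical helper, and it assumes the S3 client falls back to HTTPS when no protocol is configured. It only checks that the endpoint is present, that its scheme agrees with the configured protocol, and that a port is given.

```python
from urllib.parse import urlsplit


def check_s3_client_settings(settings):
    """Return a list of warnings for a dict of s3.client.default.* settings."""
    warnings = []
    endpoint = settings.get("s3.client.default.endpoint", "")
    # Assumption: the client uses HTTPS when no protocol is configured.
    protocol = settings.get("s3.client.default.protocol", "https")
    if not endpoint:
        warnings.append("no endpoint set")
        return warnings
    # urlsplit only recognises the host if a scheme or "//" prefix is present.
    parts = urlsplit(endpoint if "://" in endpoint else "//" + endpoint)
    if parts.scheme and parts.scheme != protocol:
        warnings.append(
            f"endpoint scheme {parts.scheme!r} differs from protocol {protocol!r}"
        )
    if parts.port is None:
        warnings.append("endpoint has no explicit port")
    return warnings


# The settings reported in this issue.
reported = {
    "s3.client.default.protocol": "http",
    "s3.client.default.endpoint": "http://1.2.3.4:9000/",
    "s3.client.default.path_style_access": "true",
}
print(check_s3_client_settings(reported))
```

The settings reported here produce no warnings, so a check like this would only exclude the most basic misconfigurations.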

[opensearch@opensearch-cluster-master-0 ~]$ curl -X POST "http://localhost:9200/_nodes/reload_secure_settings"
{"_nodes":{"total":3,"successful":3,"failed":0},"cluster_name":"opensearch-cluster","nodes":{"Ug2a4ZiqS_6sNDvKlFRNbg":{"name":"opensearch-cluster-master-2"},"zi7xQcAsT0WyPEXLozMEJQ":{"name":"opensearch-cluster-master-0"},"R6I3MgjqRrS85OjyIWHCaw":{"name":"opensearch-cluster-master-1"}}}
[opensearch@opensearch-cluster-master-0 ~]$ curl -X PUT "http://localhost:9200/_snapshot/minio-repo?pretty" -H 'Content-Type: application/json' -d '
{
  "type": "s3",
  "settings": {
    "bucket": "opensearch",
    "base_path": "opensearch/snapshot/"
  }
}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "repository_verification_exception",
        "reason" : "[minio-repo] path [opensearch/snapshot/] is not accessible on cluster-manager node"
      }
    ],
    "type" : "repository_verification_exception",
    "reason" : "[minio-repo] path [opensearch/snapshot/] is not accessible on cluster-manager node",
    "caused_by" : {
      "type" : "i_o_exception",
      "reason" : "Unable to upload object [opensearch/snapshot//tests-nZNGJ5szRh-Pd5gX3q44dA/master.dat] using a single upload",
      "caused_by" : {
        "type" : "sdk_client_exception",
        "reason" : "sdk_client_exception: Failed to connect to service endpoint: ",
        "caused_by" : {
          "type" : "i_o_exception",
          "reason" : "Connect timed out"
        }
      }
    }
  },
  "status" : 500
}
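The informative part of this response is the innermost `caused_by` entry, not the `root_cause`. As a rough sketch, the nested causes can be flattened into a list so that the deepest one is easy to spot. `cause_chain` is a hypothetical helper name, and the sample below is an abridged copy of the response above.

```python
import json


def cause_chain(error_response):
    """Return (type, reason) pairs from the top-level error down to the innermost cause."""
    chain = []
    node = error_response.get("error")
    while isinstance(node, dict):
        chain.append((node.get("type"), node.get("reason")))
        node = node.get("caused_by")  # None once the innermost cause is reached
    return chain


# Abridged copy of the error response shown above.
sample = json.loads("""
{
  "error": {
    "type": "repository_verification_exception",
    "reason": "[minio-repo] path [opensearch/snapshot/] is not accessible on cluster-manager node",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Unable to upload object using a single upload",
      "caused_by": {
        "type": "sdk_client_exception",
        "reason": "sdk_client_exception: Failed to connect to service endpoint: ",
        "caused_by": {"type": "i_o_exception", "reason": "Connect timed out"}
      }
    }
  },
  "status": 500
}
""")

for exc_type, reason in cause_chain(sample):
    print(f"{exc_type}: {reason}")
```

Read this way, the failure is a connection timeout raised inside the AWS SDK client ("Failed to connect to service endpoint") rather than a problem with the repository path itself.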

tcpdump shows no traffic to MinIO

Test whether the MinIO endpoint is reachable:

[opensearch@opensearch-cluster-master-0 ~]$ curl http://1.2.3.4:9000/

AccessDeniedAccess Denied./minio17FE44A7FEAD5E72dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8

tcpdump shows that a connection to MinIO was established.
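The same reachability test can be done without curl. The sketch below only checks that a TCP connection can be opened; it says nothing about credentials or bucket access. `can_connect` is a hypothetical helper, and the 3-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the endpoint in this issue, the call would be `can_connect("1.2.3.4", 9000)`, run from inside the OpenSearch pod so that it follows the same network path as the plugin.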

Expected behavior

The snapshot repository should be registered successfully, allowing me to take and restore snapshots.

Additional Details

Plugins
plugins:
  enabled: true
  installList:
    - repository-s3

Host/Environment (please complete the following information):

  • OS: Ubuntu Server
  • Version: 22.04, kernel 5.15.0-102-generic

Additional context
Kubernetes: v1.28.14
Containerd: 1.7.2-0ubuntu1~22.04.1
Docker image: opensearchproject/opensearch:2.4.0
Helm chart: graylog-2.3.10 - uses https://artifacthub.io/packages/helm/opensearch-project-helm-charts/opensearch

@pjuri pjuri added bug Something isn't working untriaged labels Oct 14, 2024
Author

pjuri commented Oct 14, 2024

Here is the list of installed plugins from _cat/plugins (first node only):
opensearch-cluster-master-0 opensearch-alerting 2.4.0.0
opensearch-cluster-master-0 opensearch-anomaly-detection 2.4.0.0
opensearch-cluster-master-0 opensearch-asynchronous-search 2.4.0.0
opensearch-cluster-master-0 opensearch-cross-cluster-replication 2.4.0.0
opensearch-cluster-master-0 opensearch-geospatial 2.4.0.0
opensearch-cluster-master-0 opensearch-index-management 2.4.0.0
opensearch-cluster-master-0 opensearch-job-scheduler 2.4.0.0
opensearch-cluster-master-0 opensearch-knn 2.4.0.0
opensearch-cluster-master-0 opensearch-ml 2.4.0.0
opensearch-cluster-master-0 opensearch-neural-search 2.4.0.0
opensearch-cluster-master-0 opensearch-notifications 2.4.0.0
opensearch-cluster-master-0 opensearch-notifications-core 2.4.0.0
opensearch-cluster-master-0 opensearch-observability 2.4.0.0
opensearch-cluster-master-0 opensearch-performance-analyzer 2.4.0.0
opensearch-cluster-master-0 opensearch-reports-scheduler 2.4.0.0
opensearch-cluster-master-0 opensearch-security 2.4.0.0
opensearch-cluster-master-0 opensearch-security-analytics 2.4.0.0
opensearch-cluster-master-0 opensearch-sql 2.4.0.0
opensearch-cluster-master-0 repository-s3 2.4.0


jwitko commented Oct 21, 2024

You have to use an environment variable to disable the AWS EC2 metadata lookup. OpenSearch is trying to reach the AWS "magic IP" (the EC2 instance metadata address, 169.254.169.254) from your server and, of course, failing. We just hit this same issue.
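To confirm whether the variable is actually visible to a process, a check along these lines can be run inside the container. This is a sketch under assumptions: `ec2_metadata_disabled` is a hypothetical helper, and treating the value case-insensitively is an assumption about how the AWS SDK reads `AWS_EC2_METADATA_DISABLED`, not something confirmed in this thread.

```python
import os


def ec2_metadata_disabled(env=None):
    """Return True if AWS_EC2_METADATA_DISABLED is set to "true" in the given environment."""
    env = os.environ if env is None else env
    # Assumption: the value is compared case-insensitively.
    value = env.get("AWS_EC2_METADATA_DISABLED", "")
    return value.strip().lower() == "true"


print(ec2_metadata_disabled({"AWS_EC2_METADATA_DISABLED": "true"}))
print(ec2_metadata_disabled({}))
```

Called without arguments, the function inspects the environment of the current process, which is only meaningful when it runs in the same container as OpenSearch.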

Author

pjuri commented Oct 22, 2024

Thanks, @jwitko, setting AWS_EC2_METADATA_DISABLED helped!

It would be good if this was mentioned in the documentation.

Here's what I put in my Helm values:

opensearch:
  extraEnvs:
    - name: AWS_EC2_METADATA_DISABLED
      value: "true"
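Once the pods have restarted with the variable set and the repository is registered, it can be re-checked through the repository verification API (`POST /_snapshot/<repository>/_verify`). The sketch below only builds the request; `build_verify_request` is a hypothetical helper, and the base URL and repository name are the ones used earlier in this issue.

```python
import urllib.request


def build_verify_request(base_url, repository):
    """Build (but do not send) a POST request to the repository verification API."""
    url = f"{base_url.rstrip('/')}/_snapshot/{repository}/_verify"
    return urllib.request.Request(url, method="POST")


req = build_verify_request("http://localhost:9200", "minio-repo")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. That step is left out here because it needs a running cluster.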

@dblock dblock removed the untriaged label Nov 4, 2024
Member

dblock commented Nov 4, 2024

[Catch All Triage - 1, 2]

@pjuri Looks like we got to the bottom of this, care to contribute to the documentation?

Author

pjuri commented Nov 5, 2024

@dblock sure, if you tell me where to put it.

pjuri added a commit to pjuri/opensearch-documentation-website that referenced this issue Nov 12, 2024
Author

pjuri commented Nov 12, 2024

@dblock done. Here's the pull request: opensearch-project/documentation-website#8734

kolchfa-aws added a commit to opensearch-project/documentation-website that referenced this issue Nov 13, 2024
* Update snapshot-restore.md

Adds info from: opensearch-project/OpenSearch#16305 

Signed-off-by: pjuri <[email protected]>

* Update _tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore.md

Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: pjuri <[email protected]>

* Update _tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore.md

Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: pjuri <[email protected]>

* Update _tuning-your-cluster/availability-and-recovery/snapshots/snapshot-restore.md

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: pjuri <[email protected]>

---------

Signed-off-by: pjuri <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
opensearch-trigger-bot bot pushed a commit to opensearch-project/documentation-website that referenced this issue Nov 13, 2024
(cherry picked from commit 87a36f7)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
epugh pushed a commit to o19s/documentation-website that referenced this issue Nov 23, 2024
Signed-off-by: Eric Pugh <[email protected]>