
Elasticsearch accepts requests to write indices with bad characters that cannot be written to disk by java #6589

Closed
geekpete opened this issue Jun 23, 2014 · 4 comments

Comments

@geekpete
Member

Elasticsearch 1.1.1 appears to accept requests to create an index whose name contains invalid characters that cannot be written to disk as files or directories by Java.

It should instead reject the request with an "invalid characters detected in index name" error or similar. Alternatively, it could save the on-disk files/directories with escaped characters and translate them back when needed.
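The rejection suggested above could look something like the sketch below. This is a hypothetical illustration, not Elasticsearch's actual validation code; the class name, method name, and exact character set being rejected are all assumptions.

```java
// Hypothetical sketch of index-name validation: reject names containing
// characters that cannot safely appear in a filesystem path.
// NOT Elasticsearch's actual implementation.
public final class IndexNameValidator {

    public static void validate(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Control characters (including NUL, rendered as ^@ in logs)
            // and path separators can never be written as file names.
            if (c < 0x20 || c == 0x7f || c == '/' || c == '\\') {
                throw new IllegalArgumentException(
                        "invalid character detected in index name at position " + i);
            }
        }
    }

    public static void main(String[] args) {
        validate("logs-2014.06.21"); // accepted, no exception
        try {
            validate("0e5f2bd5e517b93056cd3ef7d51223c5\u0000\u0000");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this performed at request time, the bad name would be refused before it ever entered the cluster state.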

The index then floats around the cluster unpersisted, unable to be written to cluster state on the master nodes or to data files on the data nodes.

Logs show the error in two ways.

This example is an index name with ^@^@ characters on the end (or so the logs tell me).

Master nodes complain with:

[2014-06-21 17:28:02,185][WARN ][gateway.local.state.meta ] [masternode1.whateverdomain] [0e5f2bd5e517b93056cd3ef7d51223c5^@^@]: failed to state
java.io.FileNotFoundException: Invalid file path
    at java.io.FileOutputStream.<init>(FileOutputStream.java:215)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
    at org.elasticsearch.gateway.local.state.meta.LocalGatewayMetaState.writeIndex(LocalGatewayMetaState.java:359)
    at org.elasticsearch.gateway.local.state.meta.LocalGatewayMetaState.clusterChanged(LocalGatewayMetaState.java:217)
    at org.elasticsearch.gateway.local.LocalGateway.clusterChanged(LocalGateway.java:207)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:431)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Data nodes complain with:

[2014-06-21 18:41:02,758][WARN ][indices.cluster          ] [datanode1.whateverdomain] [0e5f2bd5e517b93056cd3ef7d51223c5^@^@][6] failed to create shard
org.elasticsearch.index.shard.IndexShardCreationException: [0e5f2bd5e517b93056cd3ef7d51223c5^@^@][6] failed to create shard
    at org.elasticsearch.index.service.InternalIndexService.createShard(InternalIndexService.java:342)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:628)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:546)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:178)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:425)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Invalid file path
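The "Invalid file path" in both traces comes from the JVM itself: on a reasonably recent Java 7+ runtime, `java.io.File` treats any path containing a NUL byte as invalid, and the `FileOutputStream` constructor throws `FileNotFoundException` before any system call is made. A minimal standalone demonstration (the target directory is arbitrary; the index name is copied from the logs above):

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

public class InvalidFilePathDemo {
    public static void main(String[] args) {
        // An index name ending in two NUL bytes, as rendered by ^@^@ in the logs.
        String indexName = "0e5f2bd5e517b93056cd3ef7d51223c5\u0000\u0000";
        File target = new File(System.getProperty("java.io.tmpdir"), indexName);
        try (FileOutputStream out = new FileOutputStream(target)) {
            System.out.println("unexpectedly succeeded");
        } catch (FileNotFoundException e) {
            // The JVM rejects NUL bytes in paths before touching the filesystem,
            // which is why no state or data files ever appear on disk.
            System.out.println(e.getMessage());
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This matches the `FileOutputStream.<init>` frame at the top of the master-node trace, and explains why the index exists in memory on the nodes but never reaches disk.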

Shards continually bounce around the available data nodes, attempting to initialise, failing, and then trying another node. This floods the logs on every node and eats a chunk of CPU on all nodes involved.

The cluster goes red because it cannot allocate the primary shards, even though the rest of the indices are fine.

You will be unable to delete this rogue index: since it has never been persisted to disk as either state or data files, there is nothing on disk to reference for deletion; the index doesn't yet exist there.

The only way to bring your cluster back into green state is to perform a full cluster shutdown and restart.

Is there a way to reset the cluster state live, as a full cluster shutdown would, without actually restarting all the nodes? For example, an API command that tells the cluster to rebuild its state from scratch, as it would on restart?

@geekpete
Member Author

Some more info:

Linux some-node-hostname 3.2.0-4-amd64 #1 SMP Debian 3.2.57-3 x86_64 GNU/Linux

Filesystem is Ext4.

@areek areek assigned rmuir and unassigned rmuir Jun 23, 2014
@spinscale
Contributor

Hey,

Looks like some weird control characters (we need to know which ones, though). Can you reproduce this using curl, a shell script, or Sense? Also, could you check under data.path to see what the directory actually looks like on the filesystem?

@geekpete
Member Author

I've done a quick test on my Mac; OS X seems to write the filename/directory out just fine, but I'll need to confirm the Java version. A filesystem difference probably allows it.

I'll try to replicate the test on the same Linux/Java versions if I can and let you know how that goes.

@clintongormley
Contributor

Closed in favour of #6736.
