Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress hdfsFixture if there are spaces in the path #30302

Merged

Conversation

DaveCTurner
Copy link
Contributor

Tests such as https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=ubuntu%20&&%20virtual/1015/console are failing because their path contains spaces, which causes issues with hdfsFixture. This change skips these tests in this situation.

05:21:07     2018-05-01 05:20:52,697 WARN  [DataNode: [[[DISK]file:/var/lib/jenkins/workspace/elastic+elasticsearch+6.x+multijob-unix-compatibility/os/ubuntu%2520&&%2520virtual/plugins/repository-hdfs/build/fixtures/hdfsFixture/hdfs-data/data/data1/, [DISK]file:/var/lib/jenkins/workspace/elastic+elasticsearch+6.x+multijob-unix-compatibility/os/ubuntu%2520&&%2520virtual/plugins/repository-hdfs/build/fixtures/hdfsFixture/hdfs-data/data/data2/]]  heartbeating to localhost/127.0.0.1:9999] datanode.DataNode (BPServiceActor.java:run(779)) - Unexpected exception in block pool Block pool <registering> (Datanode Uuid 34160522-c1dc-4082-94c1-5771444285ea) service to localhost/127.0.0.1:9999
05:21:07     java.util.UnknownFormatConversionException: Conversion = '2'
05:21:07     	at java.util.Formatter.checkText(Formatter.java:2579)
05:21:07     	at java.util.Formatter.parse(Formatter.java:2555)
05:21:07     	at java.util.Formatter.format(Formatter.java:2501)
05:21:07     	at java.util.Formatter.format(Formatter.java:2455)
05:21:07     	at java.lang.String.format(String.java:2940)
05:21:07     	at com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:71)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:153)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.<init>(FsVolumeImpl.java:135)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addVolume(FsDatasetImpl.java:436)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:333)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1579)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1527)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:327)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:266)
05:21:07     	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:746)
05:21:07     	at java.lang.Thread.run(Thread.java:748)

@DaveCTurner DaveCTurner added >test Issues or PRs that are addressing/adding tests :Delivery/Build Build or test infrastructure v7.0.0 v5.6.10 v6.4.0 v6.3.1 labels May 1, 2018
@DaveCTurner DaveCTurner requested a review from jbaiera May 1, 2018 13:16
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

Copy link
Member

@jbaiera jbaiera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. On second thinking, is that path already HTTP encoded when gradle is executing, or is it encoded later than this check?

@@ -230,6 +230,11 @@ if (Os.isFamily(Os.FAMILY_WINDOWS)) {
fixtureSupported = true
}

boolean legalPath = rootProject.rootDir.toString().contains(" ") == false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure a regular space in the path is what was causing the original issue. The issue was because a space character was HTTP encoded to %20 (and then encoded again to %2520). The % sign with numbers trips up string formatting in Java, which is what caused the exception.

Perhaps it might make sense to try formatting the string the same way HDFS does in a try block, catch the exception and mark the path illegal in that case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The offending code in HDFS does:

ThreadFactory workerFactory = new ThreadFactoryBuilder()
        .setDaemon(true)
        .setNameFormat("FsVolumeImplWorker-" + parent.toString() + "-%d") // <----
        .build();

Which checks the name format eagerly to make sure it works with:

public ThreadFactoryBuilder setNameFormat(String nameFormat) {
    String.format(nameFormat, 0); // fail fast if the format is bad or null
    this.nameFormat = nameFormat;
    return this;
  }

So perhaps we could try formatting the string the exact same way HDFS does to ensure max compatibility in this regard.

try {
    // See o.a.h.h.s.d.f.i.FsVolumeImpl#initializeCacheExecutor:150
    String.format(rootProject.rootDir.toString() + '-%d', 0) 
} catch (IllegalFormatException ife) {
    legalPath = false;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the URL encoding hasn't happened by this point, so looking for a literal space finds the issue here:

$ pwd 
/var/lib/jenkins/workspace/elastic+elasticsearch+6.x+multijob-unix-compatibility/os/ubuntu && virtual
$ pwd | xxd
00000000: 2f76 6172 2f6c 6962 2f6a 656e 6b69 6e73  /var/lib/jenkins
00000010: 2f77 6f72 6b73 7061 6365 2f65 6c61 7374  /workspace/elast
00000020: 6963 2b65 6c61 7374 6963 7365 6172 6368  ic+elasticsearch
00000030: 2b36 2e78 2b6d 756c 7469 6a6f 622d 756e  +6.x+multijob-un
00000040: 6978 2d63 6f6d 7061 7469 6269 6c69 7479  ix-compatibility
00000050: 2f6f 732f 7562 756e 7475 2026 2620 7669  /os/ubuntu && vi
00000060: 7274 7561 6c0a                           rtual.

However, from the code you've shown it looks like any path that contains anything that needs URL-encoding is going to cause issues here - % for instance - so I agree that trying to catch an IllegalFormatException seems like it'll be more faithful. Can you dig out how the URL-encoding is happening (e.g. does it use java.net.URLEncoder (twice?!) or is it more complicated)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. I'll take a look but I wouldn't be surprised if its something deep in Hadoop or far away from where it's breaking. If it's not encoded at this point the space check should be fine enough then. Thanks!

Copy link
Member

@jbaiera jbaiera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path is not encoded at the time of the build script running. This LGTM.

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@DaveCTurner DaveCTurner merged commit 15df911 into elastic:master May 11, 2018
DaveCTurner added a commit that referenced this pull request May 11, 2018
HDFS sets its thread-name format based partly on a URL-encoded version of the
path, but the URL-encoding of spaces as `%20` is interpreted as a field in the
formatted string of type `2`, which is nonsensical. This change simply skips 
these tests in this case.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request May 11, 2018
HDFS sets its thread-name format based partly on a URL-encoded version of the
path, but the URL-encoding of spaces as `%20` is interpreted as a field in the
formatted string of type `2`, which is nonsensical. This change simply skips 
these tests in this case.
@DaveCTurner
Copy link
Contributor Author

Marking this as backport pending to 6.3 as a naive backport fails for me and it'll take some time to dig into why.

DaveCTurner added a commit that referenced this pull request May 11, 2018
HDFS sets its thread-name format based partly on a URL-encoded version of the
path, but the URL-encoding of spaces as `%20` is interpreted as a field in the
formatted string of type `2`, which is nonsensical. This change simply skips 
these tests in this case.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 12, 2018
* master: (41 commits)
  Bump Gradle heap to 2 GB (elastic#30535)
  SQL: Use request flavored methods in tests (elastic#30345)
  Suppress hdfsFixture if there are spaces in the path (elastic#30302)
  Delete temporary blobs before creating index file (elastic#30528)
  Watcher: Remove TriggerEngine.getJobCount() (elastic#30395)
  [ML] Fix wire BWC for JobUpdate (elastic#30512)
  Use simpler write-once semantics for FS repository (elastic#30435)
  Derive max composite buffers from max content len
  Use simpler write-once semantics for HDFS repository (elastic#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (elastic#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (elastic#30485)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (elastic#30348)
  Security: fix TokenMetaData equals and hashcode (elastic#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  SQL: Improve compatibility with MS query (elastic#30516)
  SQL: Fix parsing of dates with milliseconds (elastic#30419)
  ...
dnhatn added a commit that referenced this pull request May 14, 2018
* master:
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  Delete temporary blobs before creating index file (#30528)
  Watcher: Remove TriggerEngine.getJobCount() (#30395)
  [ML] Fix wire BWC for JobUpdate (#30512)
  Use simpler write-once semantics for FS repository (#30435)
  Derive max composite buffers from max content len
  Use simpler write-once semantics for HDFS repository (#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (#30485)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (#30348)
  Security: fix TokenMetaData equals and hashcode (#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  SQL: Improve compatibility with MS query (#30516)
  SQL: Fix parsing of dates with milliseconds (#30419)
dnhatn added a commit that referenced this pull request May 14, 2018
* 6.x:
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Moved tokenizers to analysis common module (#30538)
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  [ML] Hide internal Job update options from the REST API (#30537)
  Deprecate not copy settings and explicitly disallow (#30404)
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  Bump Gradle heap to 1792m (#30484)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  Delete temporary blobs before creating index file (#30528)
  Watcher: Remove TriggerEngine.getJobCount() (#30395)
  Use simpler write-once semantics for FS repository (#30435)
  Use simpler write-once semantics for HDFS repository (#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (#30485)
  Security: Simplify security index listeners (#30466)
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  Add proper longitude validation in geo_polygon_query (#30497)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (#30348)
  Security: fix TokenMetaData equals and hashcode (#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Fix incorrect merged entry in changelog
  SQL: Improve compatibility with MS query (#30516)
  SQL: Fix parsing of dates with milliseconds (#30419)
martijnvg added a commit that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  ...
martijnvg added a commit that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  ...
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (elastic#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (elastic#30151)
  Docs: Update HighLevelRestClient migration docs (elastic#30544)
  Clients: Switch to new performRequest (elastic#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (elastic#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (elastic#30531)
  Moved tokenizers to analysis common module (elastic#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (elastic#30530)
  Deprecate not copy settings and explicitly disallow (elastic#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (elastic#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (elastic#30535)
  SQL: Use request flavored methods in tests (elastic#30345)
  Suppress hdfsFixture if there are spaces in the path (elastic#30302)
  ...
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 15, 2018
* es/ccr: (37 commits)
  Default to one shard (elastic#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (elastic#30151)
  Docs: Update HighLevelRestClient migration docs (elastic#30544)
  Clients: Switch to new performRequest (elastic#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (elastic#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (elastic#30531)
  Moved tokenizers to analysis common module (elastic#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (elastic#30530)
  Deprecate not copy settings and explicitly disallow (elastic#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (elastic#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (elastic#30535)
  SQL: Use request flavored methods in tests (elastic#30345)
  Suppress hdfsFixture if there are spaces in the path (elastic#30302)
  ...
@bleskes bleskes added the v6.3.0 label May 16, 2018
@bleskes bleskes removed the v6.3.1 label May 16, 2018
DaveCTurner added a commit that referenced this pull request May 16, 2018
HDFS sets its thread-name format based partly on a URL-encoded version of the
path, but the URL-encoding of spaces as `%20` is interpreted as a field in the
formatted string of type `2`, which is nonsensical. This change simply skips 
these tests in this case.

Backport of #30302
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
@mark-vieira mark-vieira added the Team:Delivery Meta label for Delivery team label Nov 11, 2020
@DaveCTurner DaveCTurner deleted the 2018-05-01-hdfs-tests-require-no-spaces branch July 23, 2022 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Build Build or test infrastructure Team:Delivery Meta label for Delivery team >test Issues or PRs that are addressing/adding tests v6.3.0 v6.4.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants