[IntelliJ] Failed to run Spark application with filled-in default Azure Blob storage account credential #5002

Closed
t-rufang opened this issue Mar 9, 2021 · 2 comments


t-rufang commented Mar 9, 2021

Repro steps:

  1. Sign in to your Azure account.

  2. In the Run/Configuration dialog, create a new HDInsight configuration. Select a Spark cluster whose default storage account is an Azure Blob storage account.

  3. [Key step] In the Job Upload Storage section, select Use Azure Blob to upload. Fill in the storage account credentials with the cluster's default storage account credentials. Save the configuration.

  4. Submit the Spark application. It fails with the following errors:

2021-04-02 05:55:46,248 INFO org.apache.livy.utils.LineBufferedStream: stdout: 21/04/02 05:55:46 INFO MetricsSystemImpl [main]: azure-file-system metrics system started
2021-04-02 05:55:46,273 INFO org.apache.livy.utils.LineBufferedStream: stdout: Exception in thread "main" org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0680A8:asn1 encoding routines:ASN1_CHECK_TLEN:wrong tag:tasn_dec.c:1236:
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D07803A:asn1 encoding routines:ASN1_ITEM_EX_D2I:nested asn1 error:tasn_dec.c:405:Type=CMS_ContentInfo
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1097)
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:547)
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1359)
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
2021-04-02 05:55:46,274 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:227)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.yarn.Client$$anonfun$9.apply(Client.scala:139)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.yarn.Client$$anonfun$9.apply(Client.scala:139)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at scala.Option.getOrElse(Option.scala:121)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.yarn.Client.<init>(Client.scala:139)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1634)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:858)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
2021-04-02 05:55:46,275 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:933)
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:942)
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: Caused by: org.apache.hadoop.fs.azure.KeyProviderException: ExitCodeException exitCode=2: Error reading S/MIME message
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0680A8:asn1 encoding routines:ASN1_CHECK_TLEN:wrong tag:tasn_dec.c:1236:
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D07803A:asn1 encoding routines:ASN1_ITEM_EX_D2I:nested asn1 error:tasn_dec.c:405:Type=CMS_ContentInfo
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider.getStorageAccountKey(ShellDecryptionKeyProvider.java:56)
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.getAccountKeyFromConfiguration(AzureNativeFileSystemStore.java:992)
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1081)
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	... 20 more
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: Caused by: ExitCodeException exitCode=2: Error reading S/MIME message
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0680A8:asn1 encoding routines:ASN1_CHECK_TLEN:wrong tag:tasn_dec.c:1236:
2021-04-02 05:55:46,276 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D07803A:asn1 encoding routines:ASN1_ITEM_EX_D2I:nested asn1 error:tasn_dec.c:405:Type=CMS_ContentInfo
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0D106E:asn1 encoding routines:B64_READ_ASN1:decode error:asn_mime.c:192:
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 140390793635480:error:0D0D40CB:asn1 encoding routines:SMIME_read_ASN1:asn1 parse error:asn_mime.c:517:
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.util.Shell.runCommand(Shell.java:1008)
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.util.Shell.run(Shell.java:901)
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.util.Shell.execCommand(Shell.java:1307)
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.util.Shell.execCommand(Shell.java:1289)
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	at org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider.getStorageAccountKey(ShellDecryptionKeyProvider.java:54)
2021-04-02 05:55:46,277 INFO org.apache.livy.utils.LineBufferedStream: stdout: 	... 22 more

Workaround: In the Job Upload Storage section, you can either

  • select Use cluster default storage account to upload, or
  • select Use Azure Blob to upload but fill in the credentials of an Azure Blob storage account that is not the cluster's default.

t-rufang commented Apr 2, 2021

When the HDInsight cluster accesses the Blob storage account, it needs the account's access key. By default the access key is encrypted, and the default KeyProvider is ShellDecryptionKeyProvider. That means the following configuration is enabled by default.

<property>
  <name>fs.azure.account.keyprovider.youraccount</name>
  <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value>
</property>

<property>
  <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
  <value>YOUR ENCRYPTED ACCESS KEY</value>
</property>
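
As a quick sanity check, here is a minimal sketch (assuming the placeholder account name youraccount from the snippets above) that only reads the effective Hadoop configuration on the cluster to see which key provider and key are resolved:

```scala
// Minimal sketch, assuming the placeholder account name "youraccount" used above.
// It only inspects the resolved Hadoop configuration; it does not decrypt anything.
import org.apache.hadoop.conf.Configuration

object InspectKeyProvider {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration() // loads core-site.xml from the classpath

    val provider = conf.get("fs.azure.account.keyprovider.youraccount")
    val key = conf.get("fs.azure.account.key.youraccount.blob.core.windows.net")

    println(s"key provider = $provider")       // expected: ...ShellDecryptionKeyProvider by default
    println(s"account key set = ${key != null}") // the stored value is the encrypted key
  }
}
```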

For a linked HDInsight cluster in IntelliJ, if we set the job upload storage account to the cluster's default storage account and fill in the storage account credentials, we pass the un-encrypted access key in the Spark configuration. The configuration looks like the following:

<property>
  <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
  <value>YOUR ACCESS KEY</value>
</property>

So when the Spark application tries to access the cluster's default storage account, it treats the access key as an encrypted key, because the value of fs.azure.account.keyprovider.youraccount is still org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider. That causes an error, because what we provided in the Spark configuration is an un-encrypted access key.

To fix this issue, we need to override the value of fs.azure.account.keyprovider.youraccount with org.apache.hadoop.fs.azure.SimpleKeyProvider. You can find the fix at #5073.
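
For illustration only (this is not the actual plugin change in #5073), a sketch of the same override expressed as Spark configuration, assuming the placeholder account name youraccount and a plain-text access key:

```scala
// Sketch only: force SimpleKeyProvider so the plain-text key is used as-is,
// instead of being handed to ShellDecryptionKeyProvider for decryption.
// "youraccount" and the key value are placeholders.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("AzureBlobSimpleKeyProviderExample")
  // spark.hadoop.* properties are forwarded into the Hadoop Configuration.
  .config("spark.hadoop.fs.azure.account.keyprovider.youraccount",
          "org.apache.hadoop.fs.azure.SimpleKeyProvider")
  .config("spark.hadoop.fs.azure.account.key.youraccount.blob.core.windows.net",
          "YOUR ACCESS KEY")
  .getOrCreate()
```

The same two properties could also be passed with --conf on spark-submit; the point is that the key provider override and the plain-text key are set together for the account.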


v-xiaochuan commented

Verified with build #5073; the issue no longer repros.

t-rufang closed this as completed Apr 6, 2021