Introduce SparkParameterComposerCollection #2774
import lombok.Setter;

/** Define Spark Submit Parameters. */
public class SparkSubmitParameters {
Moved from:
async-query-core/src/main/java/org/opensearch/sql/spark/asyncquery/model/SparkSubmitParameters.java
The builder was extracted into a separate class.
import org.opensearch.sql.spark.dispatcher.model.DispatchQueryRequest;
import org.opensearch.sql.spark.execution.statestore.OpenSearchStateStoreUtil;

public class SparkSubmitParametersBuilder {
The logic in this builder was originally in the SparkSubmitParameters class.
S3GLUE-related configs are extracted to S3GlueDataSourceSparkParameterComposer.
Is this a GeneralSparkParameterComposer? Shouldn't the Hive- and S3-related configuration be in a datasource composer?
...-core/src/main/java/org/opensearch/sql/spark/parameter/DataSourceSparkParameterComposer.java
/** Stores Spark parameter composers and dispatch compose request to each composer */
public class SparkParameterComposerCollection {
  Collection<GeneralSparkParameterComposer> generalComposers = new ArrayList<>();
  Map<DataSourceType, Collection<DataSourceSparkParameterComposer>> datasourceComposers =
Why do we need a map? How about a Collection of DataSourceSparkParameterComposer?
We want to look up composers by DataSourceType. Do you mean we should look them up using a linear search?
No, I mean why not use a Collection of DataSourceSparkParameterComposer.
Understood now.
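To make the lookup discussed in this thread concrete, here is a minimal sketch of a collection that keys datasource composers by type, so dispatching costs one hash lookup rather than a scan over every composer. All types below (DataSourceKind, ParamsSketch, the two composer interfaces, the method names and the config keys) are simplified, hypothetical stand-ins and not the classes from this PR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DataSourceType (an assumption, not the real class).
enum DataSourceKind { S3GLUE, OTHER }

/** Hypothetical, simplified parameter holder: just a config map. */
class ParamsSketch {
  final Map<String, String> config = new HashMap<>();
}

/** Assumed shape of a composer that applies to every request. */
interface GeneralComposer {
  void compose(ParamsSketch params);
}

/** Assumed shape of a composer that applies to one datasource type. */
interface DataSourceComposer {
  void compose(DataSourceKind kind, ParamsSketch params);
}

class ComposerCollectionSketch {
  private final Collection<GeneralComposer> generalComposers = new ArrayList<>();
  private final Map<DataSourceKind, Collection<DataSourceComposer>> datasourceComposers =
      new HashMap<>();

  void register(GeneralComposer composer) {
    generalComposers.add(composer);
  }

  void register(DataSourceKind kind, DataSourceComposer composer) {
    // Create the per-type list on first registration, then append.
    datasourceComposers.computeIfAbsent(kind, k -> new ArrayList<>()).add(composer);
  }

  /** Runs only the composers registered for the given type. */
  void composeByDataSource(DataSourceKind kind, ParamsSketch params) {
    for (DataSourceComposer composer : datasourceComposers.getOrDefault(kind, List.of())) {
      composer.compose(kind, params);
    }
  }

  /** Runs every general composer, regardless of datasource type. */
  void compose(ParamsSketch params) {
    for (GeneralComposer composer : generalComposers) {
      composer.compose(params);
    }
  }

  boolean isComposerRegistered(DataSourceKind kind) {
    return datasourceComposers.containsKey(kind);
  }
}
```

With a flat Collection, each datasource composer would need to declare which type it supports and the collection would filter on every call; the map moves that decision to registration time.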
...ain/java/org/opensearch/sql/spark/config/SparkExecutionEngineConfigClusterSettingLoader.java
SparkSubmitParameters sparkSubmitParameters,
DispatchQueryRequest dispatchQueryRequest,
AsyncQueryRequestContext context) {
settingLoader
Do we need the setting loader? Why not access the setting directly here?
Since the setting is accessed from two places, I extracted the logic into a class to avoid redundancy. (It includes AccessController.doPrivileged, etc.)
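As an illustration of the extraction described in this reply, the following sketch wraps a privileged settings read in one small class so that two callers can share it. The class name, the Function-based settings source, and the Optional return type are assumptions made for the example; only AccessController.doPrivileged is taken from the comment above. Note that AccessController has been deprecated for removal since Java 17, hence the suppressed warning.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical loader; the settings source is modeled as a plain lookup function. */
class ClusterSettingLoaderSketch {
  private final Function<String, String> settings;

  ClusterSettingLoaderSketch(Function<String, String> settings) {
    this.settings = settings;
  }

  /** Reads a setting inside a privileged block; blank or missing values become empty. */
  @SuppressWarnings("removal") // AccessController is deprecated for removal since Java 17
  Optional<String> load(String key) {
    String value =
        AccessController.doPrivileged((PrivilegedAction<String>) () -> settings.apply(key));
    return Optional.ofNullable(value).filter(v -> !v.isBlank());
  }
}
```

Each caller then depends on the loader rather than repeating the privileged block and the blank check.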
@AllArgsConstructor
public class OpenSearchSparkSubmitParameterModifier implements SparkSubmitParameterModifier {

private String extraParameters;

@Override
public void modifyParameters(SparkSubmitParameters parameters) {
parameters.setExtraParameters(this.extraParameters);
public void modifyParameters(SparkSubmitParametersBuilder builder) {
Could you explain more about Modifier and Composer and add that to the PR description? Are both necessary? Could we simplify the concept?
Added description. I am planning to delete Modifier since Generic composer will cover the use case.
...ry/src/main/java/org/opensearch/sql/spark/config/SparkExecutionEngineConfigSupplierImpl.java
...ery-core/src/main/java/org/opensearch/sql/spark/parameter/GeneralSparkParameterComposer.java
/**
 * Interface for extension point to allow modification of spark submit parameter. modifyParameter
 * method is called after the default spark submit parameter is build.
 * method is called after the default spark submit parameter is build. To be deprecated in favor of
 * {@link org.opensearch.sql.spark.parameter.GeneralSparkParameterComposer}
 */
public interface SparkSubmitParameterModifier {
When does the modifier come into the picture?
So we have two concepts, composer and modifier. Is the modifier applied after the composers have finished building the Spark submit parameters?
I am thinking of removing the Modifier once I can verify that the Composer satisfies the use cases.
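One way to see why a general composer could subsume the modifier is that a modifier which only sets extra parameters has the same shape as a composer that ignores the request. The sketch below uses simplified, hypothetical interfaces (the real ones take more arguments, and the names here are invented for illustration) and adapts a modifier into a composer with a method reference.

```java
/** Hypothetical, simplified stand-in for the parameter holder. */
class SubmitParamsSketch {
  String extraParameters;
}

/** Stand-in for the modifier extension point (assumed shape). */
interface LegacyModifierSketch {
  void modifyParameters(SubmitParamsSketch params);
}

/** Stand-in for a general composer (assumed shape; request and context omitted). */
interface GeneralComposerSketch {
  void compose(SubmitParamsSketch params);
}

/** Like the modifier quoted earlier in this review, it only sets extra parameters. */
class ExtraParametersModifierSketch implements LegacyModifierSketch {
  private final String extraParameters;

  ExtraParametersModifierSketch(String extraParameters) {
    this.extraParameters = extraParameters;
  }

  @Override
  public void modifyParameters(SubmitParamsSketch params) {
    params.extraParameters = extraParameters;
  }
}

class ModifierAsComposerDemo {
  public static void main(String[] args) {
    LegacyModifierSketch modifier = new ExtraParametersModifierSketch("--example-flag");
    // Adapt the modifier: the composer simply delegates to it.
    GeneralComposerSketch composer = modifier::modifyParameters;
    SubmitParamsSketch params = new SubmitParamsSketch();
    composer.compose(params);
    System.out.println(params.extraParameters);
  }
}
```

Under these assumptions the only remaining difference is ordering: a modifier registered as the last general composer still runs after the defaults are set.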
Please resolve the conflict.
Signed-off-by: Tomoyuki Morita <[email protected]>
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.x
# Create a new branch
git switch --create backport/backport-2774-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a151a7d4484134afb597473871a56a831ffbf323
# Push it to GitHub
git push --set-upstream origin backport/backport-2774-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.x
Then, create a pull request where the
* Introduce SparkParameterComposerCollection
Signed-off-by: Tomoyuki Morita <[email protected]>
* Fix comments
Signed-off-by: Tomoyuki Morita <[email protected]>
* Fix integ test
Signed-off-by: Tomoyuki Morita <[email protected]>
---------
Signed-off-by: Tomoyuki Morita <[email protected]>
(cherry picked from commit a151a7d)
* Introduce SparkParameterComposerCollection
Signed-off-by: Tomoyuki Morita <[email protected]>
* Fix comments
Signed-off-by: Tomoyuki Morita <[email protected]>
* Fix integ test
Signed-off-by: Tomoyuki Morita <[email protected]>
---------
Signed-off-by: Tomoyuki Morita <[email protected]>
Description
composeByDatasource and compose is called.
Issues Resolved
n/a
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.