[Improvement-13476][*] Improve DS load hdfs configuration automatically by using environment variable HADOOP_CONF_DIR #13478
Conversation
…using environment variable HADOOP_CONF_DIR. This closes issue apache#13476
I think we'd better avoid using the system env; maybe we can add `hadoop.conf.dir` in our property file.
In addition, I think there's nothing wrong with the current way, which moves configuration files into the classpath. I just wonder, do we really have to make this change?
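For concreteness, a minimal sketch of this property-file alternative, assuming a hypothetical `hadoop.conf.dir` key in `common.properties` (the class name and key are illustrative, not existing DS code):

```java
import java.io.InputStream;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class PropertyBasedConfLoader {

    public static Configuration load() throws Exception {
        // common.properties is DS's property file; the hadoop.conf.dir key
        // itself is the reviewer's proposal, not an existing DS option.
        Properties props = new Properties();
        try (InputStream in = PropertyBasedConfLoader.class
                .getClassLoader().getResourceAsStream("common.properties")) {
            if (in != null) {
                props.load(in);
            }
        }
        Configuration conf = new Configuration(); // classpath defaults still apply
        String confDir = props.getProperty("hadoop.conf.dir");
        if (confDir != null) {
            conf.addResource(new Path(confDir, "core-site.xml"));
            conf.addResource(new Path(confDir, "hdfs-site.xml"));
        }
        return conf;
    }
}
```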
Please send the detail to
It is not convenient enough. You have to copy hdfs-site.xml and core-site.xml to the classpath manually, and it brings maintenance costs: users have to remember to copy them again when the configuration files change.
It is a choice for users; you can still move configuration files into the classpath. This only takes effect if you specify the environment variable.
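For context, a minimal sketch (hypothetical class, not DS code) of what the manual classpath approach relies on: Hadoop's `Configuration` only discovers `core-site.xml` and `hdfs-site.xml` when they sit on the classpath, hence the manual copy step.

```java
import org.apache.hadoop.conf.Configuration;

public class ClasspathConfDemo {

    public static void main(String[] args) {
        // new Configuration() only picks up core-default.xml/core-site.xml
        // from the classpath, which is why the files must be copied there.
        Configuration conf = new Configuration();
        conf.addResource("hdfs-site.xml"); // also resolved via the classpath
        System.out.println(conf.get("fs.defaultFS", "file:/// (default)"));
    }
}
```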
Codecov Report
```
@@            Coverage Diff             @@
##               dev    #13478    +/-  ##
============================================
+ Coverage      39.62%    39.63%   +0.01%
- Complexity      4354      4364      +10
============================================
  Files           1097      1097
  Lines          41150     41184      +34
  Branches        4712      4721       +9
============================================
+ Hits           16304     16323      +19
- Misses         23036     23045       +9
- Partials        1810      1816       +6
```
I've already fixed it.
Environment variables
You are right,
What detail? Is it a mistake?
In practice, users always maintain a set of common global variables, and this set always includes `HADOOP_CONF_DIR`. Actually, all the components I know of use environment variables for configuration. For example, here is a source code fragment from Flink's `HadoopUtils`:

```java
public static String[] possibleHadoopConfPaths(
        org.apache.flink.configuration.Configuration flinkConfiguration) {
    String[] possiblePaths = new String[4];
    possiblePaths[0] = flinkConfiguration.getString(ConfigConstants.PATH_HADOOP_CONFIG, null);
    possiblePaths[1] = System.getenv("HADOOP_CONF_DIR");

    if (System.getenv("HADOOP_HOME") != null) {
        possiblePaths[2] = System.getenv("HADOOP_HOME") + "/conf";
        possiblePaths[3] = System.getenv("HADOOP_HOME") + "/etc/hadoop"; // hadoop 2.2
    }
    return Arrays.stream(possiblePaths).filter(Objects::nonNull).toArray(String[]::new);
}
```

At last, we should not put all configurations in the property file, especially when the configuration is already contained in common environment variables; this will make DS deployment fast. There is not much difference between putting it in the property file and reading it from the environment. I know very little about k8s; maybe you can use environment variables there as well.
@xxjingcd I think the
@xxjingcd Sorry for my mistake, we have already created environment variables with
In this PR, I think you don't have to change that part of the code; it's up to users whether to override the HDFS configuration or not.
For me, there's no problem; please update the docs on the resource center configuration page : ) Also, you don't have to add environment configuration in `dolphinscheduler_env.sh`. Basically LGTM, cc @zhongjiajie @EricGao888
@Radeity thanks for your code review work.
OK, I'll remove the
The purpose of PR #13028 is to remove the export of
OK, I'll sync the configuration.
We'd better only add DS's internal env configuration, which we don't have now, and let users/tenants manage external env configurations by themselves.
OK, I will remove
Kudos, SonarCloud Quality Gate passed!
This pull request has been automatically marked as stale because it has not had recent activity for 120 days. It will be closed in 7 days if no further activity occurs.
This pull request has been closed because it has not had recent activity. You can reopen it if you want to continue your work, and anyone who is interested is encouraged to continue working on this pull request.
Purpose of the pull request
Improve DS to load the HDFS configuration automatically by using the common environment variables `HADOOP_CONF_DIR` or `HADOOP_HOME`, which are usually already configured in the OS. This provides a more convenient choice: if the environment variable does not exist or points to a broken configuration, DS will load the configuration as before.

This closes #13476
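A minimal sketch of this fallback behaviour, assuming hypothetical helper names (this is not the actual `HdfsStorageOperator` code): resolve a conf directory from `HADOOP_CONF_DIR` or `HADOOP_HOME`, validate it, and keep the old classpath behaviour when neither is usable.

```java
import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class EnvAwareConfLoader {

    /** Resolve a usable Hadoop conf dir from the environment, or null. */
    static File resolveConfDir() {
        String confDir = System.getenv("HADOOP_CONF_DIR");
        if (confDir != null && new File(confDir, "hdfs-site.xml").exists()) {
            return new File(confDir);
        }
        String hadoopHome = System.getenv("HADOOP_HOME");
        if (hadoopHome != null) {
            File etcHadoop = new File(hadoopHome, "etc/hadoop"); // hadoop 2.x+ layout
            if (new File(etcHadoop, "hdfs-site.xml").exists()) {
                return etcHadoop;
            }
        }
        return null; // env vars missing or broken
    }

    public static Configuration load() {
        Configuration conf = new Configuration(); // classpath behaviour, as before
        File dir = resolveConfDir();
        if (dir != null) { // only override when the env points at real files
            File core = new File(dir, "core-site.xml");
            File hdfs = new File(dir, "hdfs-site.xml");
            if (core.exists()) {
                conf.addResource(new Path(core.getAbsolutePath()));
            }
            if (hdfs.exists()) {
                conf.addResource(new Path(hdfs.getAbsolutePath()));
            }
        }
        return conf;
    }
}
```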
Brief change log
- `HdfsStorageOperator`: load the HDFS configuration from `HADOOP_CONF_DIR` or `HADOOP_HOME` automatically
- `dolphinscheduler_env.sh`: add tips for the environment variables `HADOOP_HOME` and `HADOOP_CONF_DIR` (a possible form of these tips is sketched after this list)
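A possible form of those tips (illustrative; the exact wording and default paths in the PR may differ), following the commented-export style `dolphinscheduler_env.sh` already uses:

```shell
# Hadoop configuration discovery (optional). When one of these is set,
# DS can pick up core-site.xml/hdfs-site.xml from it automatically;
# when both are unset, DS loads the configuration from the classpath as before.
# export HADOOP_HOME=${HADOOP_HOME:-/opt/soft/hadoop}
# export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/opt/soft/hadoop/etc/hadoop}
```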
Verify this pull request
This change added tests and can be verified as follows:
- `HdfsStorageOperatorTest`
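One way such env-driven behaviour is commonly unit-tested (a sketch, not the actual `HdfsStorageOperatorTest`): since `System.getenv` is awkward to stub, the lookup can be injected as a function and faked with a map.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNull;

import java.util.Map;
import java.util.function.Function;

import org.junit.jupiter.api.Test;

class EnvLookupTest {

    /** Same candidate order as the loader, with an injectable env lookup. */
    static String confDir(Function<String, String> env) {
        String dir = env.apply("HADOOP_CONF_DIR");
        if (dir != null) {
            return dir;
        }
        String home = env.apply("HADOOP_HOME");
        return home != null ? home + "/etc/hadoop" : null;
    }

    @Test
    void prefersHadoopConfDir() {
        Map<String, String> fakeEnv = Map.of(
                "HADOOP_CONF_DIR", "/etc/hadoop/conf",
                "HADOOP_HOME", "/opt/hadoop");
        assertEquals("/etc/hadoop/conf", confDir(fakeEnv::get));
    }

    @Test
    void returnsNullWhenNothingIsSet() {
        Map<String, String> emptyEnv = Map.of();
        assertNull(confDir(emptyEnv::get));
    }
}
```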