[#772] fix(kerberos): cache proxy user ugi to avoid memory leak #773
PTAL @leixm
Codecov Report
@@ Coverage Diff @@
## master #773 +/- ##
============================================
+ Coverage 63.11% 63.33% +0.21%
- Complexity 1955 1986 +31
============================================
Files 230 231 +1
Lines 11346 11462 +116
Branches 1119 1126 +7
============================================
+ Hits 7161 7259 +98
- Misses 3789 3800 +11
- Partials 396 403 +7
... and 20 files with indirect coverage changes
common/src/main/java/org/apache/uniffle/common/security/HadoopSecurityContext.java
Could we modify the description about |
@@ -75,6 +78,7 @@ public HadoopSecurityContext(
    refreshIntervalSec,
    refreshIntervalSec,
    TimeUnit.SECONDS);
    proxyUserUgiPool = Maps.newConcurrentMap();
Should we use JavaUtils here?
Oh yes.
It seems that we need some extra tests.
+1. At a minimum, you should test that the filesystem instance is the same for the same proxy user.
common/src/test/java/org/apache/uniffle/common/security/HadoopSecurityContextTest.java
FileSystem fileSystem1 = context.runSecured("alex", () -> FileSystem.get(kerberizedHdfs.getConf()));
FileSystem fileSystem2 = context.runSecured("alex", () -> FileSystem.get(kerberizedHdfs.getConf()));
assertEquals(fileSystem1, fileSystem2);
assertTrue(fileSystem1 == fileSystem2)?
I believe the point is to check that they are the same instance.
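The distinction raised in this comment can be sketched in plain Java. The `Handle` class and `sameInstance` helper below are hypothetical and exist only for illustration; they are not part of the PR. The sketch shows that two objects can be equal by `equals()` without being the same instance, which is why an identity check (`==`, or JUnit's `assertSame`) states the intent of this test more directly.

```java
public class SameInstanceSketch {
  /** Hypothetical value type whose equals() compares contents, not identity. */
  static final class Handle {
    final String name;

    Handle(String name) {
      this.name = name;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Handle && ((Handle) o).name.equals(name);
    }

    @Override
    public int hashCode() {
      return name.hashCode();
    }
  }

  /** Identity check: true only when both references point at one object. */
  static boolean sameInstance(Object a, Object b) {
    return a == b;
  }

  public static void main(String[] args) {
    Handle first = new Handle("alex");
    Handle second = new Handle("alex");
    // Equal by content, yet two separate objects.
    System.out.println("equals:       " + first.equals(second));
    System.out.println("sameInstance: " + sameInstance(first, second));
    // A reference is always the same instance as itself.
    System.out.println("self:         " + sameInstance(first, first));
  }
}
```

If the class under test does not override `equals()`, an equality assertion already falls back to identity, so the two forms pass and fail together; the identity form simply makes the expectation explicit.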
LGTM, except one minor comment
LGTM, pending CI passes
…apache#773)

### What changes were proposed in this pull request?

1. Cache the proxy user UGI to avoid a memory leak.

### Why are the changes needed?

Fix: apache#772

Too many Hadoop filesystem instances are created and held in the filesystem cache, which causes a memory leak in the shuffle server. The filesystem cache key is built from the scheme, the authority, and the UGI. The scheme and authority do not change between calls, but `createProxyUser` always creates a new UGI, so every call to `FileSystem.get()` produces a different key and caches another instance.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

1. Existing UTs
2. Added tests
This is a fix, so it should also be merged into branch 0.7. However, it conflicts with branch 0.7. Could you raise a PR against branch 0.7?
…apache#773)
…773) (#824)
…ory leak (apache#773)" This reverts commit 89c2b92.
…apache#773)
…ory leak (apache#773)" This reverts commit 42d9b0a.
… leak (apache#773) (apache#824)
What changes were proposed in this pull request?
1. Cache the proxy user UGI to avoid a memory leak.
Why are the changes needed?
Fix: #772
Too many Hadoop filesystem instances are created and held in the filesystem cache,
which causes a memory leak in the shuffle server.
The filesystem cache key is built from the scheme, the authority, and the UGI.
The scheme and authority do not change between calls. The UGI does: createProxyUser
always creates a new one, so every call to FileSystem.get() produces a different key
and caches another filesystem instance.
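The leak described above can be illustrated with a minimal, Hadoop-free sketch. Everything in it is a stand-in invented for illustration: `FakeUgi`, `CacheKey`, `fsCache`, and `countCachedInstances` are hypothetical names, and the sketch assumes that UGI objects are compared by identity, as the description implies. Only the pooling idea (one cached proxy UGI per user name) mirrors the change in this PR.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class ProxyUgiCacheSketch {
  /** Stand-in for a UGI; no equals/hashCode override, so identity is used (assumption). */
  static final class FakeUgi {
    final String user;

    FakeUgi(String user) {
      this.user = user;
    }
  }

  /** Stand-in for the filesystem cache key: scheme, authority, and UGI. */
  static final class CacheKey {
    final String scheme;
    final String authority;
    final FakeUgi ugi;

    CacheKey(String scheme, String authority, FakeUgi ugi) {
      this.scheme = scheme;
      this.authority = authority;
      this.ugi = ugi;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof CacheKey)) {
        return false;
      }
      CacheKey other = (CacheKey) o;
      // The UGI is compared by reference, so a fresh UGI never matches an old key.
      return scheme.equals(other.scheme)
          && authority.equals(other.authority)
          && ugi == other.ugi;
    }

    @Override
    public int hashCode() {
      return Objects.hash(scheme, authority, System.identityHashCode(ugi));
    }
  }

  static final Map<CacheKey, Object> fsCache = new ConcurrentHashMap<>();
  static final Map<String, FakeUgi> proxyUserUgiPool = new ConcurrentHashMap<>();

  /** Mimics the problem: every call returns a brand-new UGI object. */
  static FakeUgi createProxyUser(String user) {
    return new FakeUgi(user);
  }

  /** Mimics the fix: reuse one UGI per proxy user name. */
  static FakeUgi pooledProxyUser(String user) {
    return proxyUserUgiPool.computeIfAbsent(user, ProxyUgiCacheSketch::createProxyUser);
  }

  /** Returns how many filesystem stand-ins end up cached after the given number of lookups. */
  static int countCachedInstances(boolean usePool, int calls) {
    fsCache.clear();
    proxyUserUgiPool.clear();
    for (int i = 0; i < calls; i++) {
      FakeUgi ugi = usePool ? pooledProxyUser("alex") : createProxyUser("alex");
      fsCache.computeIfAbsent(new CacheKey("hdfs", "namenode:8020", ugi), k -> new Object());
    }
    return fsCache.size();
  }

  public static void main(String[] args) {
    System.out.println("without pool: " + countCachedInstances(false, 5));
    System.out.println("with pool:    " + countCachedInstances(true, 5));
  }
}
```

Without the pool, the cache grows by one entry per lookup; with the pool, repeated lookups for the same user map to a single entry.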
Does this PR introduce any user-facing change?
No.
How was this patch tested?
1. Existing UTs
2. Added tests