multiple processes read the same file at the same time, cannot read file #16889

Open
liugs0213 opened this issue Feb 15, 2023 · 4 comments
Labels
area-fuse Alluxio fuse integration needs-response waiting on alluxio response priority-high type-bug This issue is about a bug

Comments

@liugs0213

Alluxio Version:
2.9.0

Describe the bug
Same issue as #14007.
One process reads the directory: 70 s.
Two processes read the directory: 160 s.
Three processes read the directory: two processes fail with errors and the third is slow.

To Reproduce
Three processes mount a FUSE and read the same directory.

@liugs0213 liugs0213 added the type-bug This issue is about a bug label Feb 15, 2023
@Kai-Zhang Kai-Zhang added area-fuse Alluxio fuse integration needs-response waiting on alluxio response labels Feb 15, 2023
@Kai-Zhang
Contributor

@LuQQiu can you share more information on the slow reading problem?

@LuQQiu
Contributor

LuQQiu commented Feb 18, 2023

Likely the same issue as before: multiple files share the same FileInStream and perform many seek() operations.
The current Alluxio system does not support seek() efficiently.

@LuQQiu
Contributor

LuQQiu commented Feb 18, 2023

One idea is to use BufferedSeekableInputStream, which trades memory for performance.
When an application does
seek() -> read 128KB -> seek() -> read 128KB
the stream can instead read a larger chunk up front, e.g. 2MB, so that any seek that lands inside that 2MB is served from memory and is efficient.
The cache size is tunable.
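
The read-ahead idea can be sketched roughly as below. This is only an illustration of the technique, not Alluxio's actual implementation: `PositionedSource`, `ByteArraySource`, and `ReadAheadStream` are hypothetical names made up for this example, and the in-memory source stands in for a remote stream where every repositioned read is expensive.

```java
/** Hypothetical source of bytes where each positioned read is assumed to be costly. */
interface PositionedSource {
    long length();

    /** Copies up to len bytes starting at pos into dst[off..]; returns the count, or -1 at end. */
    int readAt(long pos, byte[] dst, int off, int len);
}

/** In-memory stand-in for a remote file, used only to make the sketch runnable. */
final class ByteArraySource implements PositionedSource {
    private final byte[] data;

    ByteArraySource(byte[] data) {
        this.data = data;
    }

    public long length() {
        return data.length;
    }

    public int readAt(long pos, byte[] dst, int off, int len) {
        if (pos >= data.length) {
            return -1;
        }
        int n = (int) Math.min(len, data.length - pos);
        System.arraycopy(data, (int) pos, dst, off, n);
        return n;
    }
}

/** Illustrative read-ahead wrapper: seeks that land inside the buffer cost no source I/O. */
final class ReadAheadStream {
    private final PositionedSource source;
    private final byte[] buffer;   // tunable cache, e.g. 2MB
    private long bufferStart = 0;  // source offset of buffer[0]
    private int bufferLen = 0;     // number of valid bytes in buffer
    private long pos = 0;          // logical stream position
    private int sourceReads = 0;   // how many times the source was actually hit

    ReadAheadStream(PositionedSource source, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.source = source;
        this.buffer = new byte[bufferSize];
    }

    /** Only moves the logical position; the buffer is reused if newPos falls inside it. */
    void seek(long newPos) {
        if (newPos < 0 || newPos > source.length()) {
            throw new IllegalArgumentException("position out of range: " + newPos);
        }
        pos = newPos;
    }

    /** Reads up to len bytes; may return fewer than len when the buffer runs out. */
    int read(byte[] dst, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (pos >= source.length()) {
            return -1;
        }
        boolean inBuffer = pos >= bufferStart && pos < bufferStart + bufferLen;
        if (!inBuffer) {
            // Miss: fetch one large chunk starting at the current position.
            int fetched = source.readAt(pos, buffer, 0, buffer.length);
            sourceReads++;
            if (fetched <= 0) {
                return -1;
            }
            bufferStart = pos;
            bufferLen = fetched;
        }
        int offsetInBuffer = (int) (pos - bufferStart);
        int n = Math.min(len, bufferLen - offsetInBuffer);
        System.arraycopy(buffer, offsetInBuffer, dst, off, n);
        pos += n;
        return n;
    }

    int sourceReads() {
        return sourceReads;
    }
}
```

With a 2MB buffer, a seek() -> read 128KB -> seek() -> read 128KB pattern that stays within the same 2MB window reaches the underlying source once instead of twice. The cost is one buffer of memory per open stream, plus wasted read-ahead when the access pattern is truly random.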
