[FEA] Qualification tool: Improve memory consumption while processing large eventlogs. #815

Open
2 of 5 tasks
Tracked by #367
nartal1 opened this issue Oct 1, 2021 · 0 comments
Assignees
Labels
core_tools Scope the core module (scala) feature request New feature or request

Comments


nartal1 commented Oct 1, 2021

Is your feature request related to a problem? Please describe.
This is a follow-on from the discussion in NVIDIA/spark-rapids#3714 (comment). Currently, if an event log is large and exceeds the heap size, the tool throws an OOM error and exits.
The only documented workaround is to increase the heap size (for example, -Xmx8G) to avoid OOM errors.

Describe the solution you'd like
We would like to reduce memory consumption so that large event log files can be read without throwing an OOM error.
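
One possible direction, shown here only as a minimal sketch and not as the tool's actual implementation, is to process the event log as a stream and keep only small running aggregates instead of retaining every parsed line. The class `StreamingEventCounter` and the helper `extractEventName` are hypothetical names made up for this illustration. The sketch assumes each line of the event log is a JSON object with an `"Event"` field, and it uses naive string matching instead of a real JSON parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: not part of the qualification tool.
public class StreamingEventCounter {

    /**
     * Counts events by type while holding only one line in memory at a time.
     * The only state that grows is the map of distinct event names.
     */
    public static Map<String, Long> countEvents(Reader source) {
        Map<String, Long> counts = new TreeMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String event = extractEventName(line);
                if (event != null) {
                    counts.merge(event, 1L, Long::sum);
                }
                // The line becomes unreachable here and can be garbage collected.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    /**
     * Assumption: the line contains "Event":"<name>" with no whitespace
     * around the colon. Returns null when the pattern is not found.
     */
    static String extractEventName(String line) {
        String key = "\"Event\":\"";
        int start = line.indexOf(key);
        if (start < 0) {
            return null;
        }
        start += key.length();
        int end = line.indexOf('"', start);
        return end < 0 ? null : line.substring(start, end);
    }
}
```

With this shape, memory use is driven by the longest single line and the number of distinct event names, not by the total size of the file. Whether the tool's analysis can be expressed entirely as running aggregates is an open question for this issue.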

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Current OOM error:

21/09/30 15:15:32 ERROR Profiler: OOM error while processing large file file:/home/nartal/CPU_runs/application_1630450374626_0001_1.Increase heap size.
java.lang.OutOfMemoryError: GC overhead limit exceeded
	at java.util.Arrays.copyOf(Arrays.java:3332)
	at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
	at java.lang.StringBuilder.append(StringBuilder.java:190)
	at java.io.BufferedReader.readLine(BufferedReader.java:358)
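
The trace above fails while `BufferedReader.readLine` grows its internal `StringBuilder`. If a single very long line turns out to be a contributor (this is an assumption, since "GC overhead limit exceeded" can also mean the heap was already nearly full), one option is to cap how many characters of a line are ever buffered. The following is a rough sketch under that assumption; `BoundedLineReader` and `readBoundedLine` are hypothetical names, not existing APIs.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical illustration: not part of the qualification tool.
public class BoundedLineReader {

    /**
     * Reads the next line from the reader, keeping at most maxChars characters.
     * Characters beyond the cap are consumed but discarded, so the buffer never
     * grows past maxChars. Carriage returns are dropped. Returns null at end of input.
     */
    public static String readBoundedLine(Reader in, int maxChars) {
        try {
            StringBuilder sb = new StringBuilder();
            boolean sawAny = false;
            int c;
            while ((c = in.read()) != -1) {
                sawAny = true;
                if (c == '\n') {
                    break;
                }
                if (c == '\r') {
                    continue;
                }
                if (sb.length() < maxChars) {
                    sb.append((char) c);
                }
            }
            return sawAny ? sb.toString() : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A truncated line is no longer valid JSON, so a caller would likely need to detect oversized lines and skip them or handle them specially. For reasonable throughput, the `Reader` passed in should already be buffered (for example, wrapped in a `BufferedReader`), because this sketch reads one character at a time.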

Tasks

  1. bug core_tools
    amahussein bilalbari
  2. bug core_tools
    bilalbari
  3. core_tools
    amahussein
@nartal1 nartal1 added feature request New feature or request ? - Needs Triage core_tools Scope the core module (scala) labels Oct 1, 2021
@nartal1 nartal1 changed the title [FEA] Qualification tool: Improve memory consumption while reading large eventlogs. [FEA] Qualification tool: Improve memory consumption while processing large eventlogs. Oct 1, 2021
@jlowe jlowe transferred this issue from NVIDIA/spark-rapids Feb 28, 2024
@amahussein amahussein assigned amahussein and unassigned amahussein Jun 26, 2024
@amahussein amahussein self-assigned this Jul 30, 2024

4 participants