Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

While downloading file the sha256 changes using helidon 2.6.1 #7407

Closed
SkyGlancer opened this issue Aug 21, 2023 · 11 comments · Fixed by #7441
Closed

While downloading file the sha256 changes using helidon 2.6.1 #7407

SkyGlancer opened this issue Aug 21, 2023 · 11 comments · Fixed by #7441
Assignees
Labels
4.x Version 4.x bug Something isn't working

Comments

@SkyGlancer
Copy link

Environment Details

  • Helidon Version: 2.6.2
  • Helidon SE or Helidon MP se
  • JDK version: 17
  • OS: oracle linux 7
  • Docker version (if applicable):

Problem Description

While sending a file using StreamingOutput, when we try to download a large file(some specific file, around 511M) the sha256sum changes(binary file changes). (This is happening after upgrading to 2.6 from 1.x series)

`
private final long THREAD_SLEEP_TIME = 1000;
private final long EACH_CHUNK_SIZE = 10485760;
private final int MEMORY_LIMIT_MB = 1 * 1024 * 1024;

		StreamingOutput fileStream =  new StreamingOutput(){
			@Override
			public void write(java.io.OutputStream output) throws IOException{
				long fileSize = filePathFinal.toFile().length();

				try(FileInputStream inputStream = new FileInputStream(filePathFinal.toFile())){
					do {

						byte[] buffer;
						if(fileSize > EACH_CHUNK_SIZE) {
							buffer = new byte[(int) EACH_CHUNK_SIZE];
						} else {
							buffer = new byte[(int) fileSize];
						}
						int len = inputStream.read(buffer);
						try {
							if (PooledByteBufAllocator.DEFAULT.pinnedHeapMemory() > MEMORY_LIMIT_MB) {
								Thread.sleep(THREAD_SLEEP_TIME);
							}
							if (PooledByteBufAllocator.DEFAULT.pinnedDirectMemory() > MEMORY_LIMIT_MB) {
								Thread.sleep(THREAD_SLEEP_TIME);
							}
						} catch (InterruptedException e) {
					
						}
						output.write(buffer);
						output.flush();
						fileSize = fileSize - len;
					} while(fileSize > 0);
				
				} finally {
				}
			}
		};
		return Response.ok(fileStream, MediaType.APPLICATION_OCTET_STREAM).build();`

Steps to reproduce

@spericas
Copy link
Member

@SkyGlancer You should be using output.write(buffer, 0, len)

@SkyGlancer
Copy link
Author

SkyGlancer commented Aug 21, 2023

@spericas Hi, I changed the code to output.write(buffer, 0, len) but still no luck

@spericas
Copy link
Member

That's odd, someone would need to look at this in more detail. How does the file returned compare to the original? Is it of different length? If you could provide a runnable reproducible, it may be easier for someone to evaluate faster.

@spericas spericas self-assigned this Aug 21, 2023
@romain-grecourt
Copy link
Contributor

Reproduced with 2.x and 3.x:

@Path("/download")
public class DownloadResource {

    @GET
    @Path("{fname}")
    @Produces(MediaType.APPLICATION_OCTET_STREAM)
    public Response download(@PathParam("fname") String fname) {
        return Response.ok()
                .entity((StreamingOutput) output -> Files.copy(Paths.get(fname), output))
                .build();
    }
}
dd if=/dev/urandom of=sample.txt bs=100m count=1
for i in $(seq 10) ; do curl http://localhost:8080/api/sample.txt -o /tmp/sample.txt 2> /dev/null; cmp sample.txt /tmp/sample.txt || break ; sleep 1 ; done

@barchetta
Copy link
Member

barchetta commented Aug 21, 2023

I was able to reproduce as well. The failure is intermittent. File size is correct but content differs. In one case I examined: on a 100MB file the corruption started at byte 48,996,353 and continued for about 128KB. Then the rest of the file was OK.

Changing the value of server.backpressure-strategy= seems to alter behavior for me.

With both 2.6.2 and 3.2.2

  • server.backpressure-strategy=AUTO_FLUSH: fails sporadically
  • server.backpressure-strategy=UNBOUNDED: seems to work-around the problem for me (can't reproduce)

But, Romain said setting UNBOUNDED did not work-around the problem for him. So maybe it just altered timing.

@SkyGlancer
Copy link
Author

Hi @barchetta What does UNBOUNDED backpressure strategy mean, does it mean that whole file will be loaded in the memory?

@barchetta
Copy link
Member

@SkyGlancer my understanding is that it means the server's reactive layer will not apply back-pressure to whoever is writing the data, so with a slow (reading) client it can end up buffering in memory. Some info here: BackpressureStrategy. @danielkec might have more insights.

It looks like your code does some throttling itself. As a data point, could you try setting UNBOUNDED and see if it changes the symptoms you see?

@spericas
Copy link
Member

spericas commented Aug 22, 2023

@barchetta Sounds like buffer ordering is somehow not preserved as part of the backpressure logic. I've modified our AutoFlushTest to compute a hash and I see that it fails intermittently as well.

@spericas
Copy link
Member

>> mvn -Dserver.backpressure-strategy=AUTO_FLUSH -Dtest=AutoFlushTest clean test

[ERROR] Failures: 
[ERROR] io.helidon.microprofile.server.AutoFlushTest.testAutoFlush
[ERROR]   Run 1: AutoFlushTest.testAutoFlush:103 
Expected: is [<-39>, <75>, <111>, <-10>, <-20>, <-101>, <78>, <118>, <65>, <23>, <-11>, <-122>, <-13>, <67>, <-5>, <-127>, <78>, <-9>, <-108>, <76>, <18>, <117>, <60>, <105>, <-9>, <-96>, <123>, <-80>, <72>, <-25>, <12>, <79>]
     but: was [<39>, <-109>, <-71>, <11>, <114>, <-118>, <-47>, <28>, <54>, <-40>, <-84>, <-14>, <-50>, <29>, <-124>, <-83>, <18>, <38>, <4>, <-112>, <89>, <-59>, <-49>, <-123>, <77>, <-49>, <-108>, <-69>, <127>, <94>, <-54>, <-63>]
[INFO]   Run 2: PASS
[INFO]   Run 3: PASS
[INFO]   Run 4: PASS
[INFO]   Run 5: PASS
[ERROR]   Run 6: AutoFlushTest.testAutoFlush:103 
Expected: is [<103>, <-74>, <4>, <109>, <14>, <60>, <1>, <60>, <-97>, <110>, <43>, <24>, <69>, <80>, <-14>, <-115>, <-94>, <-24>, <85>, <41>, <-75>, <-43>, <-110>, <9>, <-38>, <44>, <67>, <17>, <-63>, <1>, <49>, <25>]
     but: was [<39>, <-109>, <-71>, <11>, <114>, <-118>, <-47>, <28>, <54>, <-40>, <-84>, <-14>, <-50>, <29>, <-124>, <-83>, <18>, <38>, <4>, <-112>, <89>, <-59>, <-49>, <-123>, <77>, <-49>, <-108>, <-69>, <127>, <94>, <-54>, <-63>]
[ERROR]   Run 7: AutoFlushTest.testAutoFlush:103 
Expected: is [<-104>, <-3>, <-28>, <-30>, <-3>, <-7>, <-20>, <-64>, <-84>, <71>, <59>, <-19>, <-113>, <-124>, <87>, <13>, <-68>, <8>, <73>, <65>, <15>, <-109>, <-78>, <95>, <60>, <82>, <-89>, <60>, <-83>, <112>, <-118>, <-95>]
     but: was [<39>, <-109>, <-71>, <11>, <114>, <-118>, <-47>, <28>, <54>, <-40>, <-84>, <-14>, <-50>, <29>, <-124>, <-83>, <18>, <38>, <4>, <-112>, <89>, <-59>, <-49>, <-123>, <77>, <-49>, <-108>, <-69>, <127>, <94>, <-54>, <-63>]
[INFO]   Run 8: PASS
[INFO]   Run 9: PASS
[ERROR]   Run 10: AutoFlushTest.testAutoFlush:103 
Expected: is [<-29>, <-14>, <-109>, <110>, <-62>, <56>, <101>, <49>, <65>, <-100>, <-95>, <-80>, <15>, <43>, <-104>, <15>, <-75>, <10>, <-82>, <104>, <37>, <-111>, <-16>, <65>, <99>, <111>, <78>, <18>, <-128>, <-116>, <26>, <-32>]
     but: was [<39>, <-109>, <-71>, <11>, <114>, <-118>, <-47>, <28>, <54>, <-40>, <-84>, <-14>, <-50>, <29>, <-124>, <-83>, <18>, <38>, <4>, <-112>, <89>, <-59>, <-49>, <-123>, <77>, <-49>, <-108>, <-69>, <127>, <94>, <-54>, <-63>]
[INFO] 
[INFO] 
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
 mvn -Dserver.backpressure-strategy=UNBOUNDED -Dtest=AutoFlushTest clean test

[INFO] Running io.helidon.microprofile.server.AutoFlushTest
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.496 s - in io.helidon.microprofile.server.AutoFlushTest
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0

@SkyGlancer
Copy link
Author

Hi @barchetta @spericas I am able to get around the problem with "backpressure-strategy": "UNBOUNDED" but we need to evaluate if this is correct. I am worried that it can cause OOM memory issues. Do we know why AUTOFLUSH is not working as expected?

@spericas
Copy link
Member

@SkyGlancer Using UNBOUNDED is a workaround for now, but we should really understand what is going on with AUTO_FLUSH. I will take another look at it today and report back.

danielkec added a commit to danielkec/helidon that referenced this issue Aug 24, 2023
danielkec added a commit to danielkec/helidon that referenced this issue Aug 24, 2023
danielkec added a commit to danielkec/helidon that referenced this issue Aug 24, 2023
Signed-off-by: Daniel Kec <[email protected]>
Co-authored-by: Santiago Pericas-Geertsen <[email protected]>
@edbratt edbratt added bug Something isn't working 4.x Version 4.x labels Aug 24, 2023
danielkec added a commit that referenced this issue Aug 24, 2023
Signed-off-by: Daniel Kec <[email protected]>
Co-authored-by: Santiago Pericas-Geertsen <[email protected]>
danielkec added a commit to danielkec/helidon that referenced this issue Aug 25, 2023
Signed-off-by: Daniel Kec <[email protected]>
Co-authored-by: Santiago Pericas-Geertsen <[email protected]>
romain-grecourt pushed a commit that referenced this issue Aug 25, 2023
* Fixed problem in AUTO_FLUSH backpressure strategy  (#6556)

* Fixed problem in AUTO_FLUSH strategy that may result in pub-sub deadlock. Increment buffer sum before checking watermark and flushing.

* Generate large binary file programmatically.

Signed-off-by: Santiago Pericasgeertsen <[email protected]>

* Use constant.

Signed-off-by: Santiago Pericasgeertsen <[email protected]>

---------

Signed-off-by: Santiago Pericasgeertsen <[email protected]>

* Fix intermittent out-of-order chunk #7407 (#7441)

Signed-off-by: Daniel Kec <[email protected]>
Co-authored-by: Santiago Pericas-Geertsen <[email protected]>

---------

Signed-off-by: Santiago Pericasgeertsen <[email protected]>
Signed-off-by: Daniel Kec <[email protected]>
Co-authored-by: Santiago Pericas-Geertsen <[email protected]>
@m0mus m0mus added this to Backlog Aug 12, 2024
@m0mus m0mus moved this to Closed in Backlog Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4.x Version 4.x bug Something isn't working
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

6 participants