java.lang.ArrayIndexOutOfBoundsException when indexing dataset #573

Open
1 of 5 tasks
ate47 opened this issue Jan 13, 2025 · 0 comments
Labels
bug Something isn't working

Comments

Collaborator

ate47 commented Jan 13, 2025

Part of the endpoint? (leave empty if you don't know)

  • Backend (qendpoint-backend)
  • Store (qendpoint-backend)
  • Core (qendpoint-core)
  • Frontend (qendpoint-frontend)
  • Other

Description of the issue

When indexing an RDF graph with 200000 nodes, an error is thrown. @hmottestad, it seems to be in your part of the code.

Expected behavior

No response

Obtained behavior


java.lang.ArrayIndexOutOfBoundsException: Index 131181 out of bounds for length 131072

	at com.the_qa_company.qendpoint.core.util.disk.AbstractLongArray.recalculateEstimatedValueLocation(AbstractLongArray.java:83)
	at com.the_qa_company.qendpoint.core.compact.bitmap.Bitmap375Big.updateIndex(Bitmap375Big.java:203)
	at com.the_qa_company.qendpoint.core.compact.bitmap.Bitmap375Big.load(Bitmap375Big.java:429)
	at com.the_qa_company.qendpoint.core.triples.impl.BitmapTriples.mapFromFile(BitmapTriples.java:500)
	at com.the_qa_company.qendpoint.core.hdt.impl.HDTImpl.mapFromHDT(HDTImpl.java:247)
	at com.the_qa_company.qendpoint.core.hdt.HDTManagerImpl.doMapHDT(HDTManagerImpl.java:84)
	at com.the_qa_company.qendpoint.core.hdt.HDTManager.mapHDT(HDTManager.java:195)
	at com.the_qa_company.qendpoint.core.hdt.HDTManager.mapHDT(HDTManager.java:260)
	at com.the_qa_company.qendpoint.core.hdt.impl.diskimport.MapOnCallHDT.mapOrGetHDT(MapOnCallHDT.java:45)
	at com.the_qa_company.qendpoint.core.hdt.impl.diskimport.MapOnCallHDT.getTriples(MapOnCallHDT.java:65)
	at com.the_qa_company.qendpoint.core.hdt.HDTManagerTest$StaticTest.calcErrorTest(HDTManagerTest.java:1122)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
	at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55)
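
A note on the numbers in the message: the reported length 131072 is 2^17, and the failing index 131181 is 109 more than that length. The sketch below is not the qEndpoint code. It is a hypothetical illustration, assuming a lookup table whose slot is derived from a value by integer division, of how an index can overshoot a fixed-size table and how clamping keeps it in range. The class name, the method names, and `bucketWidth` are all made up for the example. Whether a missing bound check is the actual cause here is an assumption.

```java
public class EstimatedLocationSketch {

    // Hypothetical: slot index by integer division, with no bound check.
    static long unclampedBucket(long value, long bucketWidth) {
        return value / bucketWidth;
    }

    // Same computation, clamped to the valid range [0, tableLength - 1].
    static int clampedBucket(long value, long bucketWidth, int tableLength) {
        long raw = value / bucketWidth;
        return (int) Math.max(0L, Math.min(raw, tableLength - 1L));
    }

    public static void main(String[] args) {
        int tableLength = 1 << 17;          // 131072, the length in the exception
        long bucketWidth = 16;              // assumed width, for illustration only
        long value = 131181L * bucketWidth; // chosen to reproduce the reported index

        long[] table = new long[tableLength];

        long raw = unclampedBucket(value, bucketWidth);
        System.out.println("table length    = " + tableLength);
        System.out.println("unclamped index = " + raw
                + " (" + (raw - tableLength) + " more than the length)");

        // table[(int) raw] would throw ArrayIndexOutOfBoundsException here.
        int safe = clampedBucket(value, bucketWidth, tableLength);
        table[safe]++;
        System.out.println("clamped index   = " + safe);
    }
}
```

Clamping only hides the overshoot. If the table is meant to cover every possible value, the sizing or the divisor would need to change instead.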


How to reproduce

	@Test
	public void calcErrorTest() throws ParserException, IOException, NotFoundException {
		Path root = tempDir.newFolder().toPath();

		// Cat loader backed by the disk loader, with disk-based bitmap
		// sequences and disk-based indexing
		HDTOptions s = HDTOptions.of(
				"loader.cattree.futureHDTLocation", root.resolve("cfuture.hdt"),
				"loader.cattree.loadertype", "disk",
				"loader.cattree.location", root.resolve("cattree"),
				"loader.cattree.memoryFaultFactor", "1",
				"loader.disk.futureHDTLocation", root.resolve("future_msd.hdt"),
				"loader.disk.location", root.resolve("gen"),
				"loader.type", "cat",
				"parser.ntSimpleParser", "true",
				"loader.disk.compressWorker", "3",
				"loader.cattree.kcat", "20",
				"hdtcat.location", root.resolve("catgen"),
				"hdtcat.location.future", root.resolve("catgen.hdt"),
				"bitmaptriples.sequence.disk", "true",
				"bitmaptriples.indexmethod", "disk",
				"bitmaptriples.sequence.disk.location", "bitmaptripleseq"
		);

		// Fake dataset capped at 200000 triples
		LargeFakeDataSetStreamSupplier sup = LargeFakeDataSetStreamSupplier.createSupplierWithMaxTriples(200000, 42)
				.withMaxElementSplit(100)
				.withMaxLiteralSize(20);

		Path outPath = root.resolve("t.hdt");

		long size;
		try (HDT hdt = HDTManager.generateHDT(sup.createTripleStringStream(), LargeFakeDataSetStreamSupplier.BASE_URI, s, ProgressListener.ignore())) {
			assertTrue(hdt instanceof MapOnCallHDT);
			// Per the stack trace above, the exception is thrown from this
			// getTriples() call, which maps the generated HDT on first use
			size = hdt.getTriples().getNumberOfElements();
			hdt.saveToHDT(outPath);
		}

		// Map the saved file again and check that the triple count is unchanged
		try (HDT hdt = HDTManager.mapHDT(outPath)) {
			assertEquals(size, hdt.getTriples().getNumberOfElements());
		}
	}

Endpoint version

2.4.0

Do I want to contribute to fix it?

Maybe

Something else?

No response

@ate47 ate47 added the bug Something isn't working label Jan 13, 2025
@ate47 ate47 mentioned this issue Jan 14, 2025
5 tasks
hmottestad added 6 commits to HASMAC-AS/qEndpoint that referenced this issue Jan 15, 2025
… actually somewhat faster to not use power of 2
ate47 added a commit that referenced this issue Jan 15, 2025