
Zarr streaming e2e test #8137

Merged (20 commits) on Nov 11, 2024

Conversation

frcroth
Member

@frcroth frcroth commented Oct 21, 2024

TODOs:

  • Reset changes to tools/test.sh
  • Zarr 3 tests (zarr.json and cube)

Issues:

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced error messages for user authentication, organization management, and dataset handling for clearer user feedback.
    • Introduced new end-to-end tests for Zarr data formats, expanding dataset functionality coverage.
  • Bug Fixes

    • Improved error handling messages related to dataset access and user management for better clarity.
  • Chores

    • Updated .gitignore so that the test-dataset.zip test fixture is tracked in version control.
    • Modified test setup to ensure necessary datasets are prepared before running tests, enhancing robustness.
    • Adjusted volume mapping for the e2e-tests service in the configuration.
    • Enhanced CircleCI configuration to ensure the directory for test data is created with appropriate permissions before tests run.
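The .gitignore chore above can be sketched as follows; the surrounding ignore rule is an assumption for illustration — only the negation entry for test-dataset.zip is implied by this PR:

```gitignore
# test/dataset/.gitignore (sketch; the *.zip rule is an assumed pre-existing pattern)
*.zip
!test-dataset.zip   # keep the checked-in e2e test fixture under version control
```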

@frcroth frcroth changed the title Add zip test dataset and extract it in e2e test setup Zarr streaming e2e test Oct 21, 2024
@frcroth frcroth force-pushed the test-zarr-streaming branch from f55a56b to 2d6065a Compare October 21, 2024 14:50
@frcroth frcroth requested a review from fm3 October 21, 2024 15:18
@frcroth frcroth marked this pull request as ready for review October 21, 2024 15:24
Member

@fm3 fm3 left a comment


Nice stuff! I added a couple of questions.

frontend/javascripts/test/e2e-setup.ts (outdated; resolved)
frontend/javascripts/test/e2e-setup.ts (outdated; resolved)
@@ -118,6 +118,8 @@ jobs:
- run:
name: Run end-to-end tests
command: |
mkdir -p "binaryData/Connectomics department"
chmod 777 "binaryData/Connectomics department"
Member


below it looks like the new test actually uses Organization_X. Why is it necessary to create the Connectomics department directory here?

Member Author


I don't know why, but the volumes for the docker compose are arranged in such a way:

- ./binaryData/Connectomics department:/home/${USER_NAME:-sbt-user}/webknossos/binaryData/Organization_X
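Per the PR's change summary, the host directory side of this mapping was later renamed to match the organization. A sketch of the resulting docker-compose.yml entry (the service nesting is an assumption; the container path follows the line above):

```yaml
# docker-compose.yml (sketch): host path renamed to match the organization
services:
  e2e-tests:
    volumes:
      - ./binaryData/Organization_X:/home/${USER_NAME:-sbt-user}/webknossos/binaryData/Organization_X
```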

Member


Hm, I think that is a legacy artifact, from when things were named differently. Could you name it Organization_X there too and see if that breaks anything?

test/e2e/End2EndSpec.scala (outdated; resolved)
test/e2e/End2EndSpec.scala (outdated; resolved)
@frcroth frcroth requested a review from fm3 October 23, 2024 08:29
Contributor

coderabbitai bot commented Nov 4, 2024

Walkthrough

The changes encompass modifications to user-facing error messages and status notifications in the conf/messages file, updates to the docker-compose.yml file for service volume mapping, enhancements to end-to-end tests in datasets.e2e.ts and End2EndSpec.scala, and adjustments in the e2e-setup.ts file for handling volatile values. A new entry was also added to the .gitignore file to track a specific zip file. These updates collectively improve error clarity, testing robustness, and configuration management.

Changes

File Change Summary
conf/messages Updated and added error messages related to user authentication, organization management, dataset handling, and annotations; removed messages for braintracing service.
docker-compose.yml Changed volume mapping for e2e-tests service from Connectomics department to Organization_X.
frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts Added replaceVolatileValues import; modified test cases to use this function; introduced new test cases for Zarr streaming.
frontend/javascripts/test/e2e-setup.ts Added sortingKey to volatileKeys; updated writeTypeCheckingFile function; modified global fetch function for URL handling.
test/dataset/.gitignore Added entry to track test-dataset.zip.
test/e2e/End2EndSpec.scala Added ensureTestDataset method to prepare test datasets; modified test case to invoke this method.
.circleci/config.yml Added step to create and set permissions for binaryData/Organization_X before running tests.
frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md Updated snapshot data structure with placeholder values for dataset properties; introduced new entries for Zarr streaming.

Assessment against linked issues

Objective Addressed Explanation
Add end to end tests for zarr streaming (#7997)
Ensure robust testing for dataset functionalities




Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
test/e2e/End2EndSpec.scala (2)

52-53: Consider making paths configurable

The hardcoded paths for test dataset and data directory could make the tests brittle. Consider:

  1. Moving these to configuration
  2. Making them relative to the project root
-    val testDatasetPath = "test/dataset/test-dataset.zip"
-    val dataDirectory = new File("binaryData/Organization_X")
+    val testDatasetPath = config.getString("test.dataset.path")
+    val dataDirectory = new File(config.getString("test.data.directory"))

51-71: Consider adding test data cleanup

The method creates test data but doesn't handle cleanup. This could lead to accumulated test data over multiple test runs.

Consider adding a cleanup method and using it in a after block:

private def cleanupTestDataset(): Unit = {
  // File.delete() fails silently on non-empty directories, so remove children first
  // (requires java.nio.file.{Files, Path, Paths} and java.util.Comparator)
  val dataDirectory = Paths.get("binaryData/Organization_X/test-dataset")
  if (Files.exists(dataDirectory)) {
    Files.walk(dataDirectory).sorted(java.util.Comparator.reverseOrder[Path]()).forEach(p => Files.delete(p))
  }
}

// Use in an after block
override def after = {
  cleanupTestDataset()
  super.after
}
frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (1)

73-110: Consider adding data content validation

While snapshot testing is good for regression testing, consider adding explicit assertions about the data content to catch potential data corruption issues early.

For example:

 test("Zarr streaming", async (t) => {
   const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
     headers: new Headers(),
   });
   const zattrs = await zattrsResp.text();
+  const zattrsJson = JSON.parse(zattrs);
+  t.truthy(zattrsJson.dimension_separator, "Missing dimension_separator in .zattrs");
   t.snapshot(zattrs);

   const rawDataResponse = await fetch(
     "/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
     {
       headers: new Headers(),
     },
   );
   const bytes = await rawDataResponse.arrayBuffer();
+  t.true(bytes.byteLength > 0, "Raw data should not be empty");
   const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
   t.snapshot(base64);
 });
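As an aside, the `btoa(String.fromCharCode(...new Uint8Array(bytes)))` conversion used in these tests can throw a RangeError for large buffers, because JavaScript engines cap the number of arguments a single call may receive. A chunked helper (hypothetical, not part of the PR) sidesteps that:

```typescript
// Hypothetical helper: ArrayBuffer -> base64 without spreading the whole
// array into one call, avoiding the engine's argument-count limit.
function arrayBufferToBase64(buffer: ArrayBuffer, chunkSize = 0x8000): string {
  const bytes = new Uint8Array(buffer);
  let binary = "";
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // subarray creates a view, not a copy; each chunk stays below the limit
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}
```

The snapshot assertions would then read `t.snapshot(arrayBufferToBase64(bytes))`.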
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f937be0 and 459ff77.

⛔ Files ignored due to path filters (2)
  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.snap is excluded by !**/*.snap
  • test/dataset/test-dataset.zip is excluded by !**/*.zip
📒 Files selected for processing (6)
  • conf/messages (0 hunks)
  • docker-compose.yml (1 hunks)
  • frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (4 hunks)
  • frontend/javascripts/test/e2e-setup.ts (2 hunks)
  • test/dataset/.gitignore (1 hunks)
  • test/e2e/End2EndSpec.scala (4 hunks)
💤 Files with no reviewable changes (1)
  • conf/messages
✅ Files skipped from review due to trivial changes (2)
  • docker-compose.yml
  • test/dataset/.gitignore
🔇 Additional comments (7)
test/e2e/End2EndSpec.scala (3)

3-3: LGTM: Import added for ZipIO utility

The import is correctly added to support the use of ZipIO.unzipToFolder as suggested in previous reviews.


12-13: LGTM: File handling imports added

The imports for File and Paths are appropriate for the file operations being performed.


33-33: LGTM: Test data preparation added

The ensureTestDataset() call is correctly placed to prepare test data before running the tests.

frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (3)

2-8: LGTM: Import and setup changes look good

The addition of replaceVolatileValues and the dataset check trigger in the setup enhance test stability by handling volatile values and ensuring proper dataset initialization.

Also applies to: 24-24


39-39: LGTM: Consistent handling of volatile values

The systematic application of replaceVolatileValues across all snapshot assertions ensures consistent and reliable test outcomes by normalizing volatile data.

Also applies to: 44-44, 51-51, 59-59


91-110: 🛠️ Refactor suggestion

Add error handling and verify experimental endpoint usage

The test implementation needs error handling, and we should verify the experimental endpoint usage.

  1. Add error handling:
 test("Zarr 3 streaming", async (t) => {
   const zarrJsonResp = await fetch(
     "/data/zarr3_experimental/Organization_X/test-dataset/segmentation/zarr.json",
     {
       headers: new Headers(),
     },
   );
+  t.true(zarrJsonResp.ok, `Failed to fetch zarr.json: ${zarrJsonResp.statusText}`);
   const zarrJson = await zarrJsonResp.text();
   t.snapshot(zarrJson);

   const rawDataResponse = await fetch(
     "/data/zarr3_experimental/Organization_X/test-dataset/segmentation/1/0.1.1.0",
     {
       headers: new Headers(),
     },
   );
+  t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
   const bytes = await rawDataResponse.arrayBuffer();
   const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
   t.snapshot(base64);
 });

Let's verify the experimental endpoint usage:

✅ Verification successful

Let me check the implementation details of these files to understand how the experimental endpoint is used.


Experimental endpoint usage is consistent with backend implementation

The experimental endpoint usage in the test is aligned with the backend implementation in DSRemoteTracingstoreClient.scala. The backend has proper handling for both Zarr v2 and experimental Zarr v3 endpoints through the getZarrVersionDependantSubPath helper. The test is correctly exercising this experimental endpoint.

However, the error handling suggestion in the review comment remains valid:

  • Add response status checks using t.true(response.ok) for both fetch calls
  • Include status text in error messages for better debugging
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other occurrences of the experimental endpoint
# to ensure consistent usage across the codebase
rg -l "zarr3_experimental" --type ts --type scala

Length of output: 220


Script:

#!/bin/bash
# Check the implementation in both files
echo "=== Frontend Test Implementation ==="
rg -A 10 "zarr3_experimental" frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts

echo -e "\n=== Scala Implementation ==="
rg -A 10 "zarr3_experimental" webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteTracingstoreClient.scala

Length of output: 2631

frontend/javascripts/test/e2e-setup.ts (1)

134-136: LGTM! Import statement addition improves type safety.

The addition of the type import statement in the generated TypeScript file ensures proper type checking and follows TypeScript best practices.

Comment on lines +62 to +70
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
  ZipIO.unzipToFolder(
    testDatasetZip,
    Paths.get(dataDirectory.toPath.toString, "test-dataset"),
    includeHiddenFiles = true,
    hiddenFilesWhitelist = List(),
    truncateCommonPrefix = true,
    excludeFromPrefix = None
  )
Contributor


🛠️ Refactor suggestion

Improve error handling in unzip operation

The unzip operation could fail silently. Consider:

  1. Adding error handling for the unzip operation
  2. Validating the unzipped content
-    if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
-      ZipIO.unzipToFolder(
+    if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
+      try {
+        ZipIO.unzipToFolder(
           testDatasetZip,
           Paths.get(dataDirectory.toPath.toString, "test-dataset"),
           includeHiddenFiles = true,
           hiddenFilesWhitelist = List(),
           truncateCommonPrefix = true,
           excludeFromPrefix = None
-      )
+        )
+        // Validate unzipped content
+        if (!new File(dataDirectory, "test-dataset").exists()) {
+          throw new Exception("Failed to unzip test dataset properly")
+        }
+      } catch {
+        case e: Exception => 
+          logger.error("Failed to prepare test dataset", e)
+          throw e
+      }
+    }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 73 to 89
test("Zarr streaming", async (t) => {
  const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
    headers: new Headers(),
  });
  const zattrs = await zattrsResp.text();
  t.snapshot(zattrs);

  const rawDataResponse = await fetch(
    "/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
    {
      headers: new Headers(),
    },
  );
  const bytes = await rawDataResponse.arrayBuffer();
  const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
  t.snapshot(base64);
});
Contributor


🛠️ Refactor suggestion

Add error handling for fetch requests

While the test implementation is good, it should handle potential network failures or invalid responses.

Consider adding error handling:

 test("Zarr streaming", async (t) => {
   const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
     headers: new Headers(),
   });
+  t.true(zattrsResp.ok, `Failed to fetch .zattrs: ${zattrsResp.statusText}`);
   const zattrs = await zattrsResp.text();
   t.snapshot(zattrs);

   const rawDataResponse = await fetch(
     "/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
     {
       headers: new Headers(),
     },
   );
+  t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
   const bytes = await rawDataResponse.arrayBuffer();
   const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
   t.snapshot(base64);
 });

@@ -39,6 +39,7 @@ const volatileKeys: Array<string | number | symbol> = [
"lastActivity",
"tracingTime",
"tracingId",
"sortingKey",
Contributor


💡 Codebase verification

Snapshots need to be updated due to "sortingKey" changes

The codebase contains snapshot files that reference "sortingKey", specifically in datasets.e2e.js.md. Adding "sortingKey" to volatileKeys means these snapshots need to be recomputed as they will now handle this property differently during testing.

  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md contains "sortingKey" references that will be affected
  • Other snapshot files may also need updates when tests are re-run with the new volatileKeys configuration
🔗 Analysis chain

Verify snapshot recomputation after adding "sortingKey".

The addition of "sortingKey" to volatileKeys is appropriate. As per the comment above the array, snapshots need to be recomputed when this array is modified.

Run the following script to check if there are any snapshot files that need updating:


Let me check the content of these snapshot files to see if they contain any instances of "sortingKey" that might be affected.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for snapshot files that might need updating
# Look for snapshot files in the test directory
fd -e snap -e snapshot . frontend/javascripts/test

Length of output: 1460


Script:

#!/bin/bash
# Search for "sortingKey" in snapshot files to understand the impact
rg "sortingKey" frontend/javascripts/test/snapshots/

Length of output: 193

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (4)
test/e2e/End2EndSpec.scala (4)

51-56: Consider making paths configurable

The hardcoded paths "test/dataset/test-dataset.zip" and "binaryData/Organization_X" should ideally be configuration values for better flexibility and maintainability.

+ private val TestDatasetPath = "test/dataset/test-dataset.zip"
+ private val DataDirectoryPath = "binaryData/Organization_X"

  private def ensureTestDataset(): Unit = {
-   val testDatasetPath = "test/dataset/test-dataset.zip"
-   val dataDirectory = new File("binaryData/Organization_X")
+   val testDatasetPath = TestDatasetPath
+   val dataDirectory = new File(DataDirectoryPath)

51-86: Consider breaking down the method into smaller, focused functions

The ensureTestDataset method has multiple responsibilities. Consider breaking it down into smaller, focused functions for better maintainability and testing:

  1. validatePaths
  2. createDirectories
  3. unzipDataset
  4. validateDataset
private def validatePaths(testDatasetZip: File): Unit = {
  if (!testDatasetZip.exists()) {
    throw new Exception("Test dataset zip file does not exist.")
  }
}

private def createDirectories(dataDirectory: File): Unit = {
  if (!dataDirectory.exists()) {
    dataDirectory.mkdirs()
  }
}

private def unzipDataset(testDatasetZip: File, dataDirectory: File): Unit = {
  if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
    ZipIO.unzipToFolder(
      testDatasetZip,
      Paths.get(dataDirectory.toPath.toString, "test-dataset"),
      includeHiddenFiles = true,
      hiddenFilesWhitelist = List(),
      truncateCommonPrefix = true,
      excludeFromPrefix = None
    )
  }
}

private def validateDataset(dataDirectory: File): Unit = {
  // ... validation logic ...
}

private def ensureTestDataset(): Unit = {
  val testDatasetZip = new File(TestDatasetPath)
  val dataDirectory = new File(DataDirectoryPath)
  
  validatePaths(testDatasetZip)
  createDirectories(dataDirectory)
  unzipDataset(testDatasetZip, dataDirectory)
  validateDataset(dataDirectory)
}

73-79: Improve error messages with more context

The error messages could be more descriptive to help with debugging.

-     throw new Exception("Test dataset was not unzipped successfully.")
+     throw new Exception(s"Failed to unzip test dataset to ${dataDirectory.getAbsolutePath}/test-dataset")
-     throw new Exception("Required file does not exist.")
+     throw new Exception(s"Required file ${testFile.getAbsolutePath} does not exist")

62-70: Consider adding logging for better debugging

Add logging statements to track the progress of dataset preparation, especially during CI/CD runs.

     if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
+      logger.info(s"Unzipping test dataset to ${dataDirectory.getAbsolutePath}/test-dataset")
       ZipIO.unzipToFolder(
         testDatasetZip,
         Paths.get(dataDirectory.toPath.toString, "test-dataset"),
         includeHiddenFiles = true,
         hiddenFilesWhitelist = List(),
         truncateCommonPrefix = true,
         excludeFromPrefix = None
       )
+      logger.info("Test dataset unzipped successfully")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 459ff77 and 152acca.

📒 Files selected for processing (1)
  • test/e2e/End2EndSpec.scala (4 hunks)
🔇 Additional comments (2)
test/e2e/End2EndSpec.scala (2)

3-3: LGTM: Imports are appropriate

The new imports are well-scoped and necessary for the file operations being performed.

Also applies to: 12-13


33-34: LGTM: Proper test data initialization

The dataset preparation is correctly placed before the test execution, ensuring prerequisites are met.

Comment on lines +80 to +85
val testFileSource = scala.io.Source.fromFile(testFile)
val testFileContent =
  try testFileSource.mkString
  finally testFileSource.close()
if (testFileContent.isEmpty) {
  throw new Exception("Required file is empty.")
}
Contributor


🛠️ Refactor suggestion

Use try-with-resources for file handling

The file resource handling could be improved using Scala's Using to ensure proper resource cleanup.

-   val testFileSource = scala.io.Source.fromFile(testFile)
-   val testFileContent = try testFileSource.mkString
-   finally testFileSource.close()
-   if (testFileContent.isEmpty) {
-     throw new Exception("Required file is empty.")
-   }
+   scala.util.Using(scala.io.Source.fromFile(testFile)) { source =>
+     val content = source.mkString
+     if (content.isEmpty) {
+       throw new Exception("Required file is empty.")
+     }
+     content
+   }.getOrElse(throw new Exception("Failed to read test file"))

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
test/e2e/End2EndSpec.scala (1)

51-87: Enhance error messages with more context.

The error messages could be more descriptive to help with debugging test failures.

-      throw new Exception("Test dataset zip file does not exist.")
+      throw new Exception(s"Test dataset zip file does not exist at path: $testDatasetPath")

-      throw new Exception("Test dataset was not unzipped successfully.")
+      throw new Exception(s"Failed to unzip test dataset to: ${dataDirectory.getAbsolutePath}/test-dataset")

-      throw new Exception("Required file does not exist.")
+      throw new Exception(s"Required file does not exist: ${testFile.getAbsolutePath}")

-      throw new Exception("Required file is empty.")
+      throw new Exception(s"Required file is empty: ${testFile.getAbsolutePath}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 152acca and 1d72230.

📒 Files selected for processing (1)
  • test/e2e/End2EndSpec.scala (4 hunks)
🔇 Additional comments (5)
test/e2e/End2EndSpec.scala (5)

3-3: LGTM: Imports are appropriate for the new functionality.

The added imports for ZipIO, File, and Paths are necessary for the dataset management functionality being implemented.

Also applies to: 12-13


33-34: LGTM: Proper test setup sequence.

The ensureTestDataset() call is correctly placed before the HTTP request and webdriver tests, ensuring proper test data initialization.


63-71: Improve error handling in unzip operation.

The unzip operation needs better error handling and validation.

The previous review comment about improving error handling in the unzip operation is still valid. Please implement the suggested changes for proper error handling and validation.


81-86: Use Scala's Using for resource management.

The file reading operation should use Scala's Using for proper resource management.

The previous review comment about using try-with-resources is still valid. Please implement the suggested changes for better resource management.


51-87: Verify test dataset structure in the codebase.

Let's verify the test dataset structure to ensure it contains the necessary Zarr files for streaming tests.

✅ Verification successful

Let me gather more information about the test dataset structure by examining the test files and related configurations.

This will help us understand the expected structure and validate if Zarr-related test configurations are properly set up.


Based on the gathered information, I can now provide the final response:

Test dataset structure verification is appropriate for its purpose

The test dataset setup in End2EndSpec.scala is correctly structured for its intended use. While it doesn't explicitly contain Zarr files, this is expected as:

  • The file datasource-properties.json is the primary requirement for the test setup
  • Zarr-specific tests are separately handled in Zarr3TestSuite.scala with mock data
  • The codebase shows extensive Zarr support with proper implementation files, but streaming tests don't necessarily require actual Zarr files in the test dataset
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify test dataset structure and content

# Check if the test dataset zip exists and examine its structure
if [ -f "test/dataset/test-dataset.zip" ]; then
    echo "Test dataset zip contents:"
    unzip -l test/dataset/test-dataset.zip
    
    # Look for Zarr-related files
    echo -e "\nSearching for Zarr files in the repository:"
    rg -l "\.zarr" || echo "No .zarr files found"
    rg -l "zarr.json" || echo "No zarr.json files found"
fi

Length of output: 4562


Script:

#!/bin/bash
# Check test files and configurations related to the test dataset

# Look for test dataset references and configurations
echo "Checking test dataset references:"
rg -l "test-dataset" test/

# Check content of test configuration files
echo -e "\nChecking Zarr3TestSuite.scala:"
cat test/backend/Zarr3TestSuite.scala

# Look for dataset properties file references
echo -e "\nChecking for dataset properties references:"
rg "datasource-properties.json" test/

Length of output: 2508

if (!dataDirectory.exists()) {
dataDirectory.mkdirs()
}
"chmod 777 binaryData/Organization_X".run()

⚠️ Potential issue

Replace shell command with Java API for permission management.

Using shell commands for permission management is platform-dependent and potentially unsafe. Consider using Java's file APIs instead.

-    "chmod 777 binaryData/Organization_X".run()
+    dataDirectory.setReadable(true, false)
+    dataDirectory.setWritable(true, false)
+    dataDirectory.setExecutable(true, false)

Committable suggestion skipped: line range outside the PR's diff.

@frcroth frcroth requested review from fm3 and removed request for fm3 November 4, 2024 10:38
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
test/e2e/End2EndSpec.scala (1)

51-70: Consider enhancing error handling and logging

While the implementation is generally good, consider adding more detailed error messages and logging:

  1. Add logging for directory creation and zip extraction steps
  2. Include file paths in error messages for better debugging
 private def ensureTestDataset(): Unit = {
   val testDatasetPath = "test/dataset/test-dataset.zip"
   val dataDirectory = new File("binaryData/Organization_X")
   if (!dataDirectory.exists()) {
+    logger.info(s"Creating directory: ${dataDirectory.getAbsolutePath}")
     dataDirectory.mkdirs()
   }
   val testDatasetZip = new File(testDatasetPath)
   if (!testDatasetZip.exists()) {
-    throw new Exception("Test dataset zip file does not exist.")
+    throw new Exception(s"Test dataset zip file does not exist at: ${testDatasetZip.getAbsolutePath}")
   }
   // Skip unzipping if the test dataset is already present
-  if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
+  if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
+    logger.info(s"Unzipping test dataset from ${testDatasetZip.getAbsolutePath}")
     ZipIO.unzipToFolder(
       testDatasetZip,
       Paths.get(dataDirectory.toPath.toString, "test-dataset"),
       includeHiddenFiles = true,
       hiddenFilesWhitelist = List(),
       truncateCommonPrefix = true,
       excludeFromPrefix = None
     )
+  }
.circleci/config.yml (1)

121-121: Consider using more restrictive permissions.

While 777 permissions work in CI, they are overly permissive. Consider using more restrictive permissions (e.g., 755) if possible, as this follows security best practices.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1d72230 and 595701e.

📒 Files selected for processing (2)
  • .circleci/config.yml (1 hunks)
  • test/e2e/End2EndSpec.scala (4 hunks)
🔇 Additional comments (3)
test/e2e/End2EndSpec.scala (2)

3-3: LGTM: Import statements are appropriate

The added imports are necessary for file operations and ZIP handling in the new dataset preparation functionality.

Also applies to: 12-13


33-34: LGTM: Proper test data initialization

The dataset preparation is correctly placed before the HTTP request and webdriver tests, ensuring test data availability.

.circleci/config.yml (1)

121-121: Verify the directory naming consistency.

Based on the previous review discussion, confirm if using Organization_X here aligns with the rest of the codebase, as there were mentions of legacy naming patterns (Connectomics).

✅ Verification successful

Directory naming Organization_X is consistent with the codebase

The naming pattern Organization_X is consistently used across the configuration files, specifically in:

  • .circleci/config.yml for directory creation
  • docker-compose.yml for volume mounting

There are no references to legacy naming patterns like "Connectomics", and the same directory structure is maintained throughout the configuration files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining references to Connectomics in the codebase
# to ensure naming consistency

echo "Searching for 'Connectomics' references in docker-compose.yml and e2e test files..."
rg -i "connectomics" docker-compose.yml "test/**/*.{ts,scala}"

Length of output: 306


Script:

#!/bin/bash
# Let's do a broader search for both Organization_X and any potential legacy naming patterns
echo "Searching for Organization_X references..."
rg -i "organization_x" --type-add 'config:*.{yml,yaml,json}' --type config

echo -e "\nSearching for directory structure..."
fd -t d -d 3 "Organization_X\|organization_x\|connectomics" -x ls -ld {}

echo -e "\nChecking CircleCI config for similar directory creation patterns..."
rg "mkdir.*binary" .circleci/config.yml

Length of output: 735

Comment on lines +72 to +85
// Test if the dataset was unzipped successfully
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
throw new Exception("Test dataset was not unzipped successfully.")
}
val testFile = new File(dataDirectory, "test-dataset/datasource-properties.json")
if (!testFile.exists()) {
throw new Exception("Required file does not exist.")
}
val testFileSource = scala.io.Source.fromFile(testFile)
val testFileContent = try testFileSource.mkString
finally testFileSource.close()
if (testFileContent.isEmpty) {
throw new Exception("Required file is empty.")
}

🛠️ Refactor suggestion

Improve resource management and validation robustness

The current implementation has two areas for improvement:

  1. Use Scala's Using for better resource management (as suggested in previous review)
  2. Add more comprehensive validation of the unzipped content
-    val testFileSource = scala.io.Source.fromFile(testFile)
-    val testFileContent = try testFileSource.mkString
-    finally testFileSource.close()
-    if (testFileContent.isEmpty) {
-      throw new Exception("Required file is empty.")
-    }
+    val testFileContent = scala.util.Using(scala.io.Source.fromFile(testFile)) { source =>
+      val content = source.mkString
+      if (content.isEmpty) {
+        throw new Exception(s"Required file is empty: ${testFile.getAbsolutePath}")
+      }
+      // Additional validation could be added here
+      if (!content.contains("\"dataFormat\"")) {
+        throw new Exception(s"Invalid datasource properties file format: ${testFile.getAbsolutePath}")
+      }
+      content
+    }.getOrElse(throw new Exception(s"Failed to read test file: ${testFile.getAbsolutePath}"))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Test if the dataset was unzipped successfully
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
throw new Exception("Test dataset was not unzipped successfully.")
}
val testFile = new File(dataDirectory, "test-dataset/datasource-properties.json")
if (!testFile.exists()) {
throw new Exception("Required file does not exist.")
}
val testFileSource = scala.io.Source.fromFile(testFile)
val testFileContent = try testFileSource.mkString
finally testFileSource.close()
if (testFileContent.isEmpty) {
throw new Exception("Required file is empty.")
}
// Test if the dataset was unzipped successfully
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
throw new Exception("Test dataset was not unzipped successfully.")
}
val testFile = new File(dataDirectory, "test-dataset/datasource-properties.json")
if (!testFile.exists()) {
throw new Exception("Required file does not exist.")
}
val testFileContent = scala.util.Using(scala.io.Source.fromFile(testFile)) { source =>
val content = source.mkString
if (content.isEmpty) {
throw new Exception(s"Required file is empty: ${testFile.getAbsolutePath}")
}
// Additional validation could be added here
if (!content.contains("\"dataFormat\"")) {
throw new Exception(s"Invalid datasource properties file format: ${testFile.getAbsolutePath}")
}
content
}.getOrElse(throw new Exception(s"Failed to read test file: ${testFile.getAbsolutePath}"))

Member

@fm3 fm3 left a comment

Very nice, glad to see that we now have some tests with actual on-disk image data. This will test a lot of other functionality too that the zarr streaming is built upon.

To keep the snapshot files short and readable, consider snapshotting not the whole 32³ bucket data but instead a slice or a checksum.
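One way to realize the checksum variant is sketched below in TypeScript using Node's built-in crypto module; the helper name and its use in the test are assumptions for illustration, not part of this PR:

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not part of this PR): snapshot a short, stable
// hex digest instead of the raw 32³ bucket bytes.
function bucketChecksum(bytes: ArrayBuffer): string {
  return createHash("sha256").update(new Uint8Array(bytes)).digest("hex");
}

// In the streaming test, the snapshot line could then become:
//   const bytes = await rawDataResponse.arrayBuffer();
//   t.snapshot(bucketChecksum(bytes));
```

A digest keeps the snapshot to a single 64-character line while still failing on any byte-level change; the trade-off versus the slice approach is that a mismatch no longer shows which bytes changed.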

@frcroth frcroth requested a review from fm3 November 11, 2024 13:30
Member

@fm3 fm3 left a comment

🎉

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (2)

87-87: Document the slice(-128) usage

The decision to snapshot only the last 128 bytes of the response should be documented.

Add a comment explaining why we're only keeping the last 128 bytes:

-  const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes.slice(-128))));
+  // Only snapshot the last 128 bytes to keep snapshots manageable while still catching potential data corruption
+  const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes.slice(-128))));

Also applies to: 108-108


73-110: Refactor duplicate test logic into helper function

Both Zarr streaming tests share similar structure and could benefit from a helper function.

Consider refactoring like this:

async function testZarrEndpoint(t: ExecutionContext, baseUrl: string) {
  // Test metadata endpoint
  const metadataResp = await fetch(`${baseUrl}/${baseUrl.includes('zarr3') ? 'zarr.json' : '.zattrs'}`, {
    headers: new Headers(),
  });
  t.true(metadataResp.ok, `Failed to fetch metadata: ${metadataResp.statusText}`);
  t.is(metadataResp.headers.get("content-type"), "application/json", "Expected JSON response");
  const metadata = await metadataResp.text();
  t.notThrows(() => JSON.parse(metadata), "Expected valid JSON response");
  t.snapshot(metadata);

  // Test data endpoint
  const rawDataResponse = await fetch(`${baseUrl}/1/0.1.1.0`, {
    headers: new Headers(),
  });
  t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
  const bytes = await rawDataResponse.arrayBuffer();
  // Only snapshot the last 128 bytes to keep snapshots manageable while still catching potential data corruption
  const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes.slice(-128))));
  t.snapshot(base64);
}

test("Zarr streaming", async (t) => {
  await testZarrEndpoint(t, "/data/zarr/Organization_X/test-dataset/segmentation");
});

test("Zarr 3 streaming", async (t) => {
  await testZarrEndpoint(t, "/data/zarr3_experimental/Organization_X/test-dataset/segmentation");
});
frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md (1)

Line range hint 14-143: LGTM! Good practice standardizing volatile values.

The replacement of volatile values (timestamps, IDs) with constants makes the snapshot tests more deterministic and maintainable. The addition of test datasets specifically for Zarr streaming tests is well structured.

Consider adding a comment in the snapshot file explaining the standardization of volatile values for future maintainers.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 595701e and 30ed5b1.

⛔ Files ignored due to path filters (1)
  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.snap is excluded by !**/*.snap
📒 Files selected for processing (2)
  • frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (4 hunks)
  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md (11 hunks)
🔇 Additional comments (5)
frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (2)

2-8: LGTM: Good use of replaceVolatileValues for test stability

The addition of replaceVolatileValues and its consistent usage throughout the tests helps manage volatile data in snapshots, making the tests more reliable and less prone to false failures.


39-39: LGTM: Consistent usage of replaceVolatileValues

The modifications to use replaceVolatileValues across all snapshot assertions maintain consistency and improve test stability.

Also applies to: 44-44, 51-51, 59-59

frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md (3)

Line range hint 291-324: LGTM! Consistent standardization of IDs.

The standardization of user and team IDs maintains consistency with the volatile value replacement approach.


342-361: Verify completeness of Zarr streaming test coverage.

Good addition of test cases for both Zarr and Zarr 3 formats. The test data includes:

  • Multiscales metadata with proper versioning
  • Coordinate transformations
  • Base64 encoded array data

However, we should verify if all streaming routes are covered as per issue #7997.

Let's check for streaming route coverage:

#!/bin/bash
# Search for Zarr streaming route definitions
rg -g '*.scala' -g '*.ts' -A 5 "zarr.*stream|stream.*zarr"

# Search for other Zarr-related tests
rg -g '*test*' -g '*.spec*' -g '*.e2e*' "zarr" --json | jq 'select(.data.lines.text | contains("zarr") or contains("Zarr"))'

Consider adding comments explaining:

  1. The test pattern in the base64 encoded data
  2. The differences between Zarr and Zarr 3 test cases

159-186: Verify the bounding box and scale configurations.

The test dataset configurations use specific values:

  • Bounding box: 100x100x100 with offset [50,50,25]
  • Scale factors: [11.24, 11.24, 28] nanometers

Let's verify these values match typical Zarr dataset configurations:

✅ Verification successful

Let's search for scale and bounding box configurations in both test and source files to compare the values used in the test with actual production configurations.


Based on the search results, I can now verify the bounding box and scale configurations. Let me generate the final response:

The bounding box and scale configurations are valid and consistent with typical Zarr dataset configurations.

The values in the test dataset:

  • Bounding box: 100x100x100 with offset [50,50,25]
  • Scale factors: [11.24, 11.24, 28] nanometers

These values align with real-world Zarr dataset configurations as evidenced by:

  • The scale factors match the format used in Zarr headers (e.g., scale = { factor: [11.24, 11.24, 28], unit: "nm" })
  • The bounding box dimensions follow the standard format with topLeft coordinates and width/height/depth values
  • Similar scale factors and bounding box configurations are used throughout the codebase for handling Zarr datasets
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other Zarr dataset configurations in the codebase
rg -g '!*.snap' -g '!*.md' "boundingBox|scale.*factor" --json | jq 'select(.data.lines.text | contains("boundingBox") or contains("scale") or contains("factor"))'

Length of output: 8887


Script:

#!/bin/bash
# Search for scale and bounding box configurations in test files
rg -g '*test*' -g '*.spec*' -g '*.e2e*' -A 5 'scale.*factor|boundingBox|topLeft.*\[.*\].*\]' --json | jq -c 'select(.data.lines.text | contains("scale") or contains("boundingBox") or contains("topLeft"))'

# Search for scale and bounding box configurations in source files
rg -g '!*test*' -g '!*.spec*' -g '!*.e2e*' -A 5 'scale.*factor|boundingBox|topLeft.*\[.*\].*\]' --json | jq -c 'select(.data.lines.text | contains("scale") or contains("boundingBox") or contains("topLeft"))'

Length of output: 267962

Comment on lines 23 to 25
setCurrToken(tokenUserA);
await api.triggerDatasetCheck("http://localhost:9000");
});

🛠️ Refactor suggestion

Add error handling for dataset check trigger

While triggering the dataset check is good for test setup, it should handle potential failures gracefully.

Consider adding error handling:

test.before("Reset database and change token", async () => {
  resetDatabase();
  setCurrToken(tokenUserA);
-  await api.triggerDatasetCheck("http://localhost:9000");
+  try {
+    await api.triggerDatasetCheck("http://localhost:9000");
+  } catch (error) {
+    console.error("Failed to trigger dataset check:", error);
+    throw error;
+  }
});

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +73 to +89
test("Zarr streaming", async (t) => {
const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
headers: new Headers(),
});
const zattrs = await zattrsResp.text();
t.snapshot(zattrs);

const rawDataResponse = await fetch(
"/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
{
headers: new Headers(),
},
);
const bytes = await rawDataResponse.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes.slice(-128))));
t.snapshot(base64);
});

⚠️ Potential issue

Add error handling and response validation

The Zarr streaming test should include proper error handling and response validation.

Consider applying these improvements:

 test("Zarr streaming", async (t) => {
   const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
     headers: new Headers(),
   });
+  t.true(zattrsResp.ok, `Failed to fetch .zattrs: ${zattrsResp.statusText}`);
+  t.is(zattrsResp.headers.get("content-type"), "application/json", "Expected JSON response");
   const zattrs = await zattrsResp.text();
+  t.notThrows(() => JSON.parse(zattrs), "Expected valid JSON response");
   t.snapshot(zattrs);

   const rawDataResponse = await fetch(
     "/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
     {
       headers: new Headers(),
     },
   );
+  t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
   const bytes = await rawDataResponse.arrayBuffer();
   const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes.slice(-128))));
   t.snapshot(base64);
 });

Suggested change
test("Zarr streaming", async (t) => {
const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
headers: new Headers(),
});
const zattrs = await zattrsResp.text();
t.snapshot(zattrs);
const rawDataResponse = await fetch(
"/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
{
headers: new Headers(),
},
);
const bytes = await rawDataResponse.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes.slice(-128))));
t.snapshot(base64);
});
test("Zarr streaming", async (t) => {
const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
headers: new Headers(),
});
t.true(zattrsResp.ok, `Failed to fetch .zattrs: ${zattrsResp.statusText}`);
t.is(zattrsResp.headers.get("content-type"), "application/json", "Expected JSON response");
const zattrs = await zattrsResp.text();
t.notThrows(() => JSON.parse(zattrs), "Expected valid JSON response");
t.snapshot(zattrs);
const rawDataResponse = await fetch(
"/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
{
headers: new Headers(),
},
);
t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
const bytes = await rawDataResponse.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes.slice(-128))));
t.snapshot(base64);
});

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md (1)

260-268: LGTM: Good implementation of Zarr streaming tests

The test cases appropriately cover both the JSON configuration and binary data aspects of Zarr streaming. Consider adding a comment explaining the expected content of the base64 encoded data for better maintainability.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 30ed5b1 and e871f00.

⛔ Files ignored due to path filters (1)
  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.snap is excluded by !**/*.snap
📒 Files selected for processing (1)
  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md (9 hunks)
🔇 Additional comments (4)
frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md (4)

14-17: LGTM: Good approach to handling volatile values in snapshots

The standardization of volatile values (created, id) and consistent dataset structure makes the snapshots more maintainable and reduces test flakiness.

Also applies to: 30-33, 46-49, 61-73, 77-89, 94-97, 109-123


208-225: LGTM: Appropriate access control configuration

The user and team access configurations are well-structured and maintain the necessary relationships for testing.

Also applies to: 234-241


270-278: LGTM: Comprehensive Zarr 3 test implementation

The test cases properly implement Zarr 3 format testing with appropriate OME specification version. The structure maintains consistency with the previous Zarr tests while accommodating the new format.

#!/bin/bash
# Verify that Zarr 3 specific handling exists in the codebase
rg -l "zarr_format.*3" --type ts --type js

141-146: Verify the scale factors and bounding box dimensions

The test uses specific values for scale factors ([11.24, 11.24, 28]) and bounding box dimensions. Please verify these values match the actual requirements for Zarr streaming tests.

Also applies to: 166-167

✅ Verification successful

The previous command had syntax issues. Let's try a broader search to find related test configurations and dataset specifications. This will help us verify if these values are consistent across the codebase or if they are specific to this test case.


Let's try a different approach by first locating all relevant test files and then searching for numeric configurations within them. This will help us understand if these values are test-specific or part of a broader pattern.


The previous search revealed several relevant files. Let's examine the specific test file and related fixtures to understand if these values are test-specific or derived from actual configuration constants.


Based on the search results, particularly examining frontend/javascripts/test/fixtures/dataset_server_object.ts, I can see that scale factors are indeed used in test fixtures. The file shows:

scale: { factor: [11.239999771118164, 11.239999771118164, 28], unit: UnitLong.nm }

This matches closely with the values in the test (11.24, 11.24, 28). Additionally, the bounding box dimensions in the fixture file show consistent patterns of using powers of 2 for dimensions (10240 for width, height, and depth).

Scale factors and bounding box dimensions are consistent with test fixtures

The scale factors [11.24, 11.24, 28] match the dataset fixture's configuration, and the bounding box dimensions follow the standard pattern used in test data. These values are intentionally set to test specific dataset configurations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for these specific scale factors in other test files or configuration
rg -l "11.24.*11.24.*28" --type ts --type js

Length of output: 138


Script:

#!/bin/bash
# Search for scale factors in a more flexible way
rg "11\.24.*11\.24" --type ts --type js

# Search for similar bounding box configurations
rg "depth:.*height:.*topLeft:" --type ts --type js -A 5

# Look for other dataset configuration tests
rg "datasets?.e2e" --type ts --type js

Length of output: 140


Script:

#!/bin/bash
# Find all test files related to datasets or Zarr
fd -e js -e ts -e jsx -e tsx | rg -i "(dataset|zarr|e2e)"

# Look for any numeric configurations in these files
fd "datasets" -e js -e ts | xargs rg "\[([\d.]+,\s*){2,}\d+\]"

# Search for bounding box or dimension configurations
fd "datasets" -e js -e ts | xargs rg "(depth|height|topLeft):\s*\d+"

Length of output: 3983


Script:

#!/bin/bash
# Let's examine the specific test file and its snapshot
cat frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts

# Also check the fixture data that might define these values
cat frontend/javascripts/test/fixtures/dataset_server_object.ts

# Look for any configuration files that might define these constants
fd -e json -e js -e ts | xargs rg "scaleFactors|resolution"

Length of output: 44271

Development

Successfully merging this pull request may close these issues.

Add end to end tests for zarr streaming
2 participants