Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async reading in FileSystemDataVault #8126

Merged
merged 7 commits into from
Nov 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion CHANGELOG.unreleased.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,12 +27,13 @@ For upgrade instructions, please check the [migration guide](MIGRATIONS.released
- Migrated nightly screenshot tests from CircleCI to GitHub actions. [#8134](https://github.com/scalableminds/webknossos/pull/8134)
- Migrated nightly screenshot tests for wk.org from CircleCI to GitHub actions. [#8135](https://github.com/scalableminds/webknossos/pull/8135)
- Thumbnails for datasets now use the selected mapping from the view configuration if available. [#8157](https://github.com/scalableminds/webknossos/pull/8157)
- Reading image files on datastore filesystem is now done asynchronously. [#8126](https://github.com/scalableminds/webknossos/pull/8126)

### Fixed
- Fixed a bug during dataset upload in case the configured `datastore.baseFolder` is an absolute path. [#8098](https://github.com/scalableminds/webknossos/pull/8098) [#8103](https://github.com/scalableminds/webknossos/pull/8103)
- Fixed bbox export menu item [#8152](https://github.com/scalableminds/webknossos/pull/8152)
- When trying to save an annotation opened via a link including a sharing token, the token is automatically discarded in case it is insufficient for update actions but the users token is. [#8139](https://github.com/scalableminds/webknossos/pull/8139)
- Fix that scrolling in the trees and segments tab did not work while dragging. [#8162](https://github.com/scalableminds/webknossos/pull/8162)
- Fix that scrolling in the trees and segments tab did not work while dragging. [#8162](https://github.com/scalableminds/webknossos/pull/8162)
- Fixed that uploading a dataset which needs a conversion failed when the angstrom unit was configured for the conversion. [#8173](https://github.com/scalableminds/webknossos/pull/8173)
- Fixed that the skeleton search did not automatically expand groups that contained the selected tree [#8129](https://github.com/scalableminds/webknossos/pull/8129)
- Fixed interactions in the trees and segments tab like the search due to a bug introduced by [#8162](https://github.com/scalableminds/webknossos/pull/8162). [#8186](https://github.com/scalableminds/webknossos/pull/8186)
Expand Down
Original file line number Diff line number Diff line change
@@ -1,15 +1,16 @@
package com.scalableminds.webknossos.datastore.datavault

import com.scalableminds.util.tools.Fox
import net.liftweb.common.Box.tryo
import com.scalableminds.util.tools.Fox.{bool2Fox, box2Fox}
import com.scalableminds.util.tools.Fox.bool2Fox
import com.scalableminds.webknossos.datastore.storage.DataVaultService
import net.liftweb.common.{Box, Full}
import org.apache.commons.lang3.builder.HashCodeBuilder

import java.nio.ByteBuffer
import java.nio.file.{Files, Path, Paths}
import java.nio.channels.{AsynchronousFileChannel, CompletionHandler}
import java.nio.file.{Files, Path, Paths, StandardOpenOption}
import java.util.stream.Collectors
import scala.concurrent.ExecutionContext
import scala.concurrent.{ExecutionContext, Promise}
import scala.jdk.CollectionConverters._

class FileSystemDataVault extends DataVault {
Expand All @@ -24,31 +25,55 @@ class FileSystemDataVault extends DataVault {
private def readBytesLocal(localPath: Path, range: RangeSpecifier)(implicit ec: ExecutionContext): Fox[Array[Byte]] =
if (Files.exists(localPath)) {
range match {
case Complete() => tryo(Files.readAllBytes(localPath)).toFox
case Complete() =>
readAsync(localPath, 0, Math.toIntExact(Files.size(localPath)))

case StartEnd(r) =>
tryo {
val channel = Files.newByteChannel(localPath)
val buf = ByteBuffer.allocateDirect(r.length)
channel.position(r.start)
channel.read(buf)
buf.rewind()
val arr = new Array[Byte](r.length)
buf.get(arr)
arr
}.toFox
readAsync(localPath, r.start, r.length)

case SuffixLength(length) =>
tryo {
val channel = Files.newByteChannel(localPath)
val buf = ByteBuffer.allocateDirect(length)
channel.position(channel.size() - length)
channel.read(buf)
buf.rewind()
val arr = new Array[Byte](length)
buf.get(arr)
arr
}.toFox
val fileSize = Files.size(localPath)
readAsync(localPath, fileSize - length, length)
Comment on lines +35 to +36
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add bounds checking for SuffixLength case

The calculation fileSize - length could result in a negative position if length is greater than fileSize. Add validation to prevent this scenario.

Apply this diff:

  case SuffixLength(length) =>
    val fileSize = Files.size(localPath)
+   val adjustedLength = Math.min(length, Math.toIntExact(fileSize))
+   val position = Math.max(0, fileSize - adjustedLength)
-   readAsync(localPath, fileSize - length, length)
+   readAsync(localPath, position, adjustedLength)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
val fileSize = Files.size(localPath)
readAsync(localPath, fileSize - length, length)
val fileSize = Files.size(localPath)
val adjustedLength = Math.min(length, Math.toIntExact(fileSize))
val position = Math.max(0, fileSize - adjustedLength)
readAsync(localPath, position, adjustedLength)

}
} else Fox.empty
} else {
Fox.empty
}

private def readAsync(path: Path, position: Long, length: Int)(implicit ec: ExecutionContext): Fox[Array[Byte]] = {
val promise = Promise[Box[Array[Byte]]]()
val buffer = ByteBuffer.allocateDirect(length)
var channel: AsynchronousFileChannel = null
Comment on lines +44 to +45
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider reusing DirectByteBuffer for performance

Creating a direct ByteBuffer for each read operation is expensive. Consider implementing a buffer pool or using a ThreadLocal buffer for better performance.

Would you like me to provide an implementation of a buffer pool?


try {
channel = AsynchronousFileChannel.open(path, StandardOpenOption.READ)

channel.read(
buffer,
position,
buffer,
new CompletionHandler[Integer, ByteBuffer] {
override def completed(result: Integer, buffer: ByteBuffer): Unit = {
buffer.rewind()
val arr = new Array[Byte](length)
buffer.get(arr)
promise.success(Full(arr))
channel.close()
}

override def failed(exc: Throwable, buffer: ByteBuffer): Unit = {
promise.failure(exc)
channel.close()
}
}
)
} catch {
case e: Throwable =>
promise.failure(e)
if (channel != null && channel.isOpen) channel.close()
}

promise.future
}

override def listDirectory(path: VaultPath, maxItems: Int)(implicit ec: ExecutionContext): Fox[List[VaultPath]] =
vaultPathToLocalPath(path).map(
Expand Down