Allow renaming datasets & datasets with duplicate names (#8075)
* WIP: Adjust schema to allow duplicate dataset names & implement new URI dataset addressing scheme

* reimplement proper dataset name checking route (still leaving out the check for an already existing name)

* WIP: implement wk core backend routes to only use datasetId and no orgaId

- Includes moving ObjectId to utils package

* WIP: finish using dataset id in wk core backend and dataset path in datastore & add legacy routes

- Undo renaming DataSourceId to LegacyDataSourceId

* WIP: Fix backend compilation

* Fix backend compilation

* WIP: Adapt frontend to new api

* WIP: adapt frontend to new routes

* WIP: Adjust frontend to newest api

* first kinda working version

* Try update schema and evolution

* fix evolution & add first version of reversion (needs to be tested)

* fix frontend tests

* format backend

* fix dataSets.csv

* fix e2e tests

* format backend

* fix frontend

* remove occurrences of displayName access / variables in context of a dataset object

* fixed version routes

* fix reserveUploadRoute

* rename orga_name in jobs to orga_id

* format code

* fix finishUploadRoute

* allow duplicate names when uploading a new dataset

* fix job list view

* fix some datastore requests

* further minor fixes

* make the add remote dataset route a POST request as it always creates a new dataset (even when the name is already taken)

* WIP: replace missed code parts where dataset address was still wrong / not backwards compatible

* WIP: replace missed code parts where dataset address was still wrong / not backwards compatible

* WIP: adapt annotation upload & task upload to use datasetId

* WIP: adjust backend part of task upload to use new dataset addressing

* Finish adapting task & annotation upload to new format

* Fix inserting dataset into database

* fix nml annotation upload

* format backend

* add hint about new parameter datasetId to csv / bulk task upload

* Move task api routes to a separate file in frontend

* add datasetName and datasetId to returned tasks

* add missing task api routes file (frontend)

* adapt frontend to new task return type

* remove unused imports

* fix frontend tests

* add datasetId to nml output and re-add datasetName to nml parsing for legacy support
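
A minimal sketch of that fallback, assuming hypothetical NML attribute names (`datasetId`, `name`); the real parser lives in the backend NML code:

```scala
// Illustrative only: prefer an explicit datasetId attribute when present,
// otherwise fall back to the legacy datasetName for older NML files.
import scala.xml.{Elem, XML}

object NmlDatasetRefSketch {
  def datasetReference(experiment: Elem): Either[String, String] = {
    val id = experiment \@ "datasetId" // empty string when the attribute is missing
    if (id.nonEmpty) Right(id) else Left(experiment \@ "name")
  }

  def main(args: Array[String]): Unit = {
    val legacy = XML.loadString("""<experiment name="my_dataset"/>""")
    val modern = XML.loadString("""<experiment name="my_dataset" datasetId="66f1a2b3c4d5e6f708091a0b"/>""")
    println(datasetReference(legacy)) // Left(my_dataset)
    println(datasetReference(modern)) // Right(66f1a2b3c4d5e6f708091a0b)
  }
}
```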

* add dataset id to frontend nml serialization

* fix parsing dataset id from nml in backend

* fix nml backend tests

* fix typing

* remove logging statement

* fix frontend dataset cache by using the dataset id as the identifier

* send dataset path as datasource.id.name to frontend

* remove unused code

* fix previous merge with newest master

* fix evolution and reversion

* remove objectid from UploadedVolumeLayer and delete SkeletonTracingWithDatasetId and make nml parser return its own result case class

* use new Notion-like URLs
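
For illustration only (the exact URL format and parser are not shown in this diff): a Notion-style segment such as `my_dataset-66f1a2b3c4d5e6f708091a0b` could be split into a display name and a trailing 24-character hex id roughly like this:

```scala
// Sketch, not the actual webknossos parser: split "<name>-<id>" on the last dash and
// accept the suffix only if it looks like a 24-character hex ObjectId.
object NotionLikeUrlSketch {
  private val idLength = 24
  private def isHex(s: String): Boolean = s.forall(c => c.isDigit || ('a' to 'f').contains(c.toLower))

  def splitNameAndId(segment: String): Option[(String, String)] = {
    val sep = segment.lastIndexOf('-')
    if (sep < 0) None
    else {
      val (name, id) = (segment.take(sep), segment.drop(sep + 1))
      if (id.length == idLength && isHex(id)) Some((name, id)) else None
    }
  }

  def main(args: Array[String]): Unit =
    println(splitNameAndId("my_dataset-66f1a2b3c4d5e6f708091a0b")) // Some((my_dataset,66f1a2b3c4d5e6f708091a0b))
}
```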

* rename datasetPath to datasetDirectoryName

* fix backend tests

* delete DatasetURLParser, rename package of ObjectId to objectid, update e2e snapshot tests

* small clean up, fix dataset public writes, fix dataset table highlighting

* fix e2e tests

* make datastore dataset update route http put method

* make datastore dataset update route http put method

* rename datasetParsedId to datasetIdValidated

* bump schema version after merge

* remove explicit invalid dataset id message when parsing a dataset id from string

* remove overwriting orga name and overwriting dataset name from annotation upload path

* WIP apply PR feedback

* remove unused method

* rely on datasetId in processing of task creation routes

* apply some more review feedback

* cleanup unused implicits

* make link generation for convert_to_wkw and compute_mesh_file backwards compatible

* adjust unfinished uploads to display correct dataset name in upload view

* send datasource id to compose dataset route (not dataset id)
- send datasource id in correct format to backend
- fix dataset renaming in dataset settings

* WIP apply review feedback
- add legacy route for task creation routes
- support job results links only via legacy routes
- WIP: refactor nml parsing code
- add dataset location to dataset settings advanced tab
- add comment to OpenGraphService about parsing new URI scheme

* Finish refactoring nml backend parsing

* fix nml typing

* fix nml upload

* apply frontend pr review feedback

* apply pr frontend feedback

* add new e2e test to check dataset disambiguation

* re-add backwards compatibility for legacy dataset links without organization id

* change screenshot test dataset id retrieval to be a test.before

* remove outdated comment
- the backend always sends an id for compacted datasets

* temp disable upload test

* fix linting

* fix datasetId expected length

* replace failure fox by returnError fox, fix json error msg

* fix upload test
- make new fields to reserve upload optional (for backward compatibility)
- fix find data request from core backend
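
A minimal sketch of the backward-compatibility pattern mentioned above, with a hypothetical case class and field names (not the actual reserveUpload payload): declaring the new fields as `Option` lets play-json treat absent keys as `None`, so older clients keep working:

```scala
import play.api.libs.json.{Json, OFormat}

// Hypothetical request body, not the actual webknossos reserveUpload parameters:
// the newly added field is an Option, so requests from older clients that omit it still parse.
case class ReserveUploadSketch(name: String,
                               totalFileCount: Long,
                               requireUniqueName: Option[Boolean]) // hypothetical new optional field

object ReserveUploadSketch {
  implicit val jsonFormat: OFormat[ReserveUploadSketch] = Json.format[ReserveUploadSketch]
}
```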

* remove debug logging from dataset upload test

* remove debug logs

* format backend

* apply various pr feedback

* fix frontend routing & include /view postfix in ds link in ds edit view

* add todo comments

* send separate case class to datastore upon reserveUpload request to core backend

* format backend

* try setting screenshot ci to check this branch

* use auth token in screenshot tests

* rename path to directoryName in response of disambiguate route

* hopefully fix screenshot tests

* reset application.conf

* remove disambiguate link

* switch screenshots back to use master.webknossos.xyz

* add comment to explain handling legacy urls
- fully revert nightly.yml to master

* remove outdated TODO comment

* fix single dataset dnd in dashboard
- improve typing of dnd arguments

* format backend

* add comment reasoning why dataset name setting is not synced with datasource.id.name

* rename some local variables

* pass missing parameter to infer_with_model worker job

* try not showing dataset as changed when it was migrated to the new renamable version

* format backend

* add changelog entry

* add info about ds being renamable to the docs

* undo misc snapshot changes

* make task creation form use new version of task creation parameters

* remove dataSet field from task json object returned from server -> add required legacy adaptation
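
A rough sketch of such a legacy adaptation, with assumed field names (`datasetName`, `dataSet`): the legacy controller could re-attach the removed key before returning task JSON to older API clients:

```scala
import play.api.libs.json.{JsObject, JsString}

// Illustrative only: re-add the removed "dataSet" field for old API versions,
// sourcing it from the "datasetName" field the server now returns per task.
object LegacyTaskJsonSketch {
  def addLegacyDataSetField(taskJs: JsObject): JsObject =
    (taskJs \ "datasetName").asOpt[String] match {
      case Some(name) => taskJs + ("dataSet" -> JsString(name))
      case None       => taskJs
    }
}
```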

* also rename dataSet in publication routes to dataset
- no legacy routes needed as not used by wklibs

* add changelog entry about dropping legacy routes

* bump api version

* remove old routes with api version lower than 5 & re-sort methods in legacy controller

* refresh snapshots

* fix legacy create task route
- also remove unused injections and imports

* fix annotation info legacy route

* remove unused import

* fixes in code comments
- fix changelog entry
- add migration entry about dropped api versions

---------

Co-authored-by: Michael Büßemeyer <[email protected]>
Co-authored-by: Michael Büßemeyer <[email protected]>
Co-authored-by: Florian M <[email protected]>
4 people authored Nov 27, 2024
1 parent 5d3d66d commit 6c0a472
Showing 231 changed files with 3,353 additions and 2,928 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.unreleased.md
@@ -15,6 +15,7 @@ For upgrade instructions, please check the [migration guide](MIGRATIONS.released

### Changed
- Reading image files on datastore filesystem is now done asynchronously. [#8126](https://github.com/scalableminds/webknossos/pull/8126)
- Datasets can now be renamed and can have duplicate names. [#8075](https://github.com/scalableminds/webknossos/pull/8075)
- Improved error messages for starting jobs on datasets from other organizations. [#8181](https://github.com/scalableminds/webknossos/pull/8181)
- Terms of Service for Webknossos are now accepted at registration, not afterward. [#8193](https://github.com/scalableminds/webknossos/pull/8193)
- Removed bounding box size restriction for inferral jobs for super users. [#8200](https://github.com/scalableminds/webknossos/pull/8200)
@@ -28,6 +29,7 @@ For upgrade instructions, please check the [migration guide](MIGRATIONS.released
- Fix a bug where dataset uploads would fail if the organization directory on disk is missing. [#8230](https://github.com/scalableminds/webknossos/pull/8230)

### Removed
- Removed support for HTTP API versions 3 and 4. [#8075](https://github.com/scalableminds/webknossos/pull/8075)
- Removed Google Analytics integration. [#8201](https://github.com/scalableminds/webknossos/pull/8201)

### Breaking Changes
2 changes: 2 additions & 0 deletions MIGRATIONS.unreleased.md
@@ -9,5 +9,7 @@ User-facing changes are documented in the [changelog](CHANGELOG.released.md).
[Commits](https://github.com/scalableminds/webknossos/compare/24.11.1...HEAD)

- The config option `googleAnalytics.trackingId` is no longer used and can be removed. [#8201](https://github.com/scalableminds/webknossos/pull/8201)
- Removed support for HTTP API versions 3 and 4. [#8075](https://github.com/scalableminds/webknossos/pull/8075)

### Postgres Evolutions:
- [124-decouple-dataset-directory-from-name](conf/evolutions/124-decouple-dataset-directory-from-name)
14 changes: 6 additions & 8 deletions app/controllers/AiModelController.scala
@@ -12,7 +12,7 @@ import play.api.libs.json.{Json, OFormat}
import play.api.mvc.{Action, AnyContent, PlayBodyParsers}
import play.silhouette.api.Silhouette
import security.WkEnv
import utils.ObjectId
import com.scalableminds.util.objectid.ObjectId

import javax.inject.Inject
import scala.concurrent.ExecutionContext
@@ -42,7 +42,7 @@ object RunTrainingParameters {

case class RunInferenceParameters(annotationId: Option[ObjectId],
aiModelId: ObjectId,
datasetName: String,
datasetDirectoryName: String,
organizationId: String,
colorLayerName: String,
boundingBox: String,
@@ -147,7 +147,7 @@ class AiModelController @Inject()(
jobCommand = JobCommand.train_model
commandArgs = Json.obj(
"training_annotations" -> Json.toJson(trainingAnnotations),
"organization_name" -> organization._id,
"organization_id" -> organization._id,
"model_id" -> modelId,
"custom_workflow_provided_by_user" -> request.body.workflowYaml
)
@@ -180,21 +180,19 @@
"organization.notFound",
request.body.organizationId)
_ <- bool2Fox(request.identity._organization == organization._id) ?~> "job.runInference.notAllowed.organization" ~> FORBIDDEN
dataset <- datasetDAO.findOneByNameAndOrganization(request.body.datasetName, organization._id) ?~> Messages(
"dataset.notFound",
request.body.datasetName)
dataset <- datasetDAO.findOneByDirectoryNameAndOrganization(request.body.datasetDirectoryName, organization._id)
dataStore <- dataStoreDAO.findOneByName(dataset._dataStore) ?~> "dataStore.notFound"
_ <- aiModelDAO.findOne(request.body.aiModelId) ?~> "aiModel.notFound"
_ <- datasetService.assertValidDatasetName(request.body.newDatasetName)
_ <- datasetService.assertNewDatasetName(request.body.newDatasetName, organization._id)
jobCommand = JobCommand.infer_with_model
boundingBox <- BoundingBox.fromLiteral(request.body.boundingBox).toFox
commandArgs = Json.obj(
"organization_name" -> organization._id,
"organization_id" -> organization._id,
"dataset_name" -> dataset.name,
"color_layer_name" -> request.body.colorLayerName,
"bounding_box" -> boundingBox.toLiteral,
"model_id" -> request.body.aiModelId,
"dataset_directory_name" -> request.body.datasetDirectoryName,
"new_dataset_name" -> request.body.newDatasetName,
"custom_workflow_provided_by_user" -> request.body.workflowYaml
)
26 changes: 8 additions & 18 deletions app/controllers/AnnotationController.scala
@@ -4,6 +4,7 @@ import org.apache.pekko.util.Timeout
import play.silhouette.api.Silhouette
import com.scalableminds.util.accesscontext.{DBAccessContext, GlobalAccessContext}
import com.scalableminds.util.geometry.BoundingBox
import com.scalableminds.util.objectid.ObjectId
import com.scalableminds.util.time.Instant
import com.scalableminds.util.tools.{Fox, FoxImplicits}
import com.scalableminds.webknossos.datastore.models.annotation.AnnotationLayerType.AnnotationLayerType
@@ -34,7 +35,7 @@ import play.api.libs.json._
import play.api.mvc.{Action, AnyContent, PlayBodyParsers}
import security.{URLSharing, UserAwareRequestLogging, WkEnv}
import telemetry.SlackNotificationService
import utils.{ObjectId, WkConf}
import utils.WkConf

import javax.inject.Inject
import scala.concurrent.ExecutionContext
@@ -242,15 +243,11 @@ class AnnotationController @Inject()(
} yield result
}

def createExplorational(organizationId: String, datasetName: String): Action[List[AnnotationLayerParameters]] =
def createExplorational(datasetId: String): Action[List[AnnotationLayerParameters]] =
sil.SecuredAction.async(validateJson[List[AnnotationLayerParameters]]) { implicit request =>
for {
organization <- organizationDAO.findOne(organizationId)(GlobalAccessContext) ?~> Messages(
"organization.notFound",
organizationId) ~> NOT_FOUND
dataset <- datasetDAO.findOneByNameAndOrganization(datasetName, organization._id) ?~> Messages(
"dataset.notFound",
datasetName) ~> NOT_FOUND
datasetIdValidated <- ObjectId.fromString(datasetId)
dataset <- datasetDAO.findOne(datasetIdValidated) ?~> Messages("dataset.notFound", datasetIdValidated) ~> NOT_FOUND
annotation <- annotationService.createExplorationalFor(
request.identity,
dataset._id,
@@ -262,19 +259,12 @@
} yield JsonOk(json)
}

def getSandbox(organization: String,
datasetName: String,
typ: String,
sharingToken: Option[String]): Action[AnyContent] =
def getSandbox(datasetId: String, typ: String, sharingToken: Option[String]): Action[AnyContent] =
sil.UserAwareAction.async { implicit request =>
val ctx = URLSharing.fallbackTokenAccessContext(sharingToken) // users with dataset sharing token may also get a sandbox annotation
for {
organization <- organizationDAO.findOne(organization)(GlobalAccessContext) ?~> Messages(
"organization.notFound",
organization) ~> NOT_FOUND
dataset <- datasetDAO.findOneByNameAndOrganization(datasetName, organization._id)(ctx) ?~> Messages(
"dataset.notFound",
datasetName) ~> NOT_FOUND
datasetIdValidated <- ObjectId.fromString(datasetId)
dataset <- datasetDAO.findOne(datasetIdValidated)(ctx) ?~> Messages("dataset.notFound", datasetIdValidated) ~> NOT_FOUND
tracingType <- TracingType.fromString(typ).toFox
_ <- bool2Fox(tracingType == TracingType.skeleton) ?~> "annotation.sandbox.skeletonOnly"
annotation = Annotation(
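
The `ObjectId.fromString(datasetId)` calls above validate incoming ids before the database lookup. A standalone approximation of that validation step, assuming ids are 24-character hex strings (the real `ObjectId` lives in `com.scalableminds.util.objectid` and differs in detail):

```scala
// Approximation only: reject anything that is not a 24-character hex string,
// mirroring the "datasetId expected length" check mentioned in the commit list.
final case class ObjectIdSketch(id: String) extends AnyVal

object ObjectIdSketch {
  def fromString(s: String): Either[String, ObjectIdSketch] =
    if (s.matches("[0-9a-fA-F]{24}")) Right(ObjectIdSketch(s))
    else Left(s"invalid object id: $s")
}
```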