Batch nml task upload to avoid uploading too many nmls with one request #6216

Merged
merged 6 commits into from
May 18, 2022

Contributor

@MichaelBuessemeyer MichaelBuessemeyer commented May 17, 2022

This PR batches task creation via NML upload to a maximum of 100 NMLs per request, which is the limit set by the backend. Previously, trying to upload more than 100 NMLs at once resulted in an error. The batching prevents this error by sending one request after another until all NMLs are uploaded.
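
A minimal sketch of the batching idea described above, not the code from this PR. `uploadInBatches` and its `sendBatch` parameter are hypothetical names, and the response shape (`tasks`, `warnings`) is an assumption modelled on the fields used in the diff under review. Batches are sent strictly one after another, and the per-batch results are merged into a single response.

```typescript
// Assumed backend limit per request (the PR description states 100).
const NUM_TASKS_PER_BATCH = 100;

// Assumed response shape, modelled on the fields used in the reviewed diff.
type BatchResponse = { tasks: string[]; warnings: string[] };

// Hypothetical helper: sends `files` in sequential batches and merges the results.
async function uploadInBatches<T>(
  files: T[],
  sendBatch: (batch: T[]) => Promise<BatchResponse>,
  batchSize: number = NUM_TASKS_PER_BATCH,
): Promise<BatchResponse> {
  let tasks: string[] = [];
  let warnings: string[] = [];
  for (let i = 0; i < files.length; i += batchSize) {
    const batch = files.slice(i, i + batchSize);
    // Awaiting inside the loop is intentional: the next request
    // starts only after the previous one has finished.
    const response = await sendBatch(batch);
    tasks = tasks.concat(response.tasks);
    warnings = warnings.concat(response.warnings);
  }
  // The same warning may be returned by several batches, so deduplicate.
  return { tasks, warnings: Array.from(new Set(warnings)) };
}
```

With 150 files and the default batch size, `sendBatch` would be called twice, first with 100 files and then with the remaining 50.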

URL of deployed dev instance (used for testing):

Steps to test:

  • This test requires a lot of nml files. You can use the nml file contained in this zip:
    tracing.nml.zip. Extract this nml into an empty folder. In that folder, create a .sh script and paste the following code into it: for i in {1..150}; do cp tracing.nml "tracing$i.nml"; done. Then execute the script. It creates 150 copies of the nml file.
  • Create a new task type for hybrid annotations (with skeleton and volume annotation). If there exists already one, this step can be skipped.
  • Open the task admin view, click on `Add Task` in the top right corner and use the `Create Task` tab. There, fill out the form with the newly created hybrid task type.
  • Under Task Specification in the form select the Upload NML File option.
  • Then click on the nml upload zone and select all the nml files created during the first step of this list. You might need to wait a little because of the high count of files.
  • Next try to upload all those files as tasks via the Create Task button. This should succeed.
  • If you try to upload the same number of tasks on master, an error should occur that prevents the tasks from being created.

Issues:


(Please delete unneeded items, merge only when none are left open)

Comment on lines 344 to +372
  try {
    if (this.state.specificationType === SpecificationEnum.Nml) {
      // Workaround: Antd replaces file objects in the formValues with a wrapper file
      // The original file object is contained in the originFileObj property
      // This is most likely not intentional and may change in a future Antd version
      // @ts-expect-error ts-migrate(7006) FIXME: Parameter 'wrapperFile' implicitly has an 'any' ty... Remove this comment to see the full error message
-     formValues.nmlFiles = formValues.nmlFiles.map((wrapperFile) => wrapperFile.originFileObj);
-     response = await createTaskFromNML(formValues);
+     const nmlFiles = formValues.nmlFiles.map((wrapperFile) => wrapperFile.originFileObj);
+     for (let i = 0; i < nmlFiles.length; i += NUM_TASKS_PER_BATCH) {
+       const batchOfNmls = nmlFiles.slice(i, i + NUM_TASKS_PER_BATCH);
+       formValues.nmlFiles = batchOfNmls;
+       // eslint-disable-next-line no-await-in-loop
+       const response = await createTaskFromNML(formValues);
+       taskResponses = taskResponses.concat(response.tasks);
+       warnings = warnings.concat(response.warnings);
+     }
    } else {
      if (this.state.specificationType !== SpecificationEnum.BaseAnnotation) {
        // Ensure that the base annotation field is null, if the specification mode
        // does not include that field.
        formValues.baseAnnotation = null;
      }

-     response = await createTasks([formValues]);
+     ({ tasks: taskResponses, warnings } = await createTasks([formValues]));
    }

-   handleTaskCreationResponse(response);
+   handleTaskCreationResponse({
+     tasks: taskResponses,
+     warnings: _.uniq(warnings),
+   });
Contributor Author

@MichaelBuessemeyer MichaelBuessemeyer May 17, 2022

Note that this duplicates a fair amount of code from the task_create_bulk_view.
I chose to keep the duplication because the code diverges in a few places, such as the kind of request that is triggered. I thought it would be quite difficult to merge these two behaviors neatly while keeping the code understandable.
=> But feel free to argue with my decision and suggest how this behavior could be unified and deduplicated.

Member

@daniel-wer daniel-wer left a comment

Works like a charm 🎉

I agree that avoiding the bit of code duplication would be rather difficult and is not worth it.

@MichaelBuessemeyer MichaelBuessemeyer self-assigned this May 18, 2022
@MichaelBuessemeyer MichaelBuessemeyer merged commit 03db25f into master May 18, 2022
@MichaelBuessemeyer MichaelBuessemeyer deleted the batch-nml-task-upload branch May 18, 2022 12:38
Development

Successfully merging this pull request may close these issues.

Task creation should batch requests not only in bulk mode, but also in from-files mode
2 participants