Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New: MultiputUpload part2 #127

Merged
merged 9 commits into from
Dec 12, 2017
Merged

New: MultiputUpload part2 #127

merged 9 commits into from
Dec 12, 2017

Conversation

wenboyu2
Copy link
Contributor

@wenboyu2 wenboyu2 commented Dec 5, 2017

  • create session (initiation, error, success, retry)
  • upload part (initiation, error, success, retry, progress)

@boxcla
Copy link

boxcla commented Dec 5, 2017

Verified that @wenboyu2 has signed the CLA. Thanks for the pull request!


const PART_STATE_NOT_STARTED: 0 = 0;
const PART_STATE_COMPUTING_DIGEST: 1 = 1;
const PART_STATE_DIGEST_READY: 2 = 2;
const PART_STATE_UPLOADING: 3 = 3;
const PART_STATE_UPLOADED: 4 = 4;
const noop = () => {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use lodash.noop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

numUploadRetriesPerformed: this.numUploadRetriesPerformed,
numDigestRetriesPerformed: this.numDigestRetriesPerformed,
sha256: this.sha256,
xhr: this.xhr.xhr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this would result in.
Might just be {}

Copy link
Contributor

@priyajeet priyajeet Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a toJSON() method in XHR.js if you want more useful info. Then calling JSON.stringify(this.xhr) will automatically use that method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole method can be the toJSON method for this class too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like currently in the webapp xhr field printed is always null or {}, removing that from being printed

'Content-Type': 'application/octet-stream',
Digest: `sha-256=${this.sha256}`,
'Content-Range': `bytes ${this.offset}-${this.rangeEnd}/${this.size}`,
'X-Box-Client-Event-Info': JSON.stringify(clientEventInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double check if this header is whitelisted in our headers for app.box.com

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, it is not. @wenboyu2 you're going to need to to add this header to our CORS whitelist which I'll message you.

I always go back and forth about whether or not to remove this shortcut... This is gonna cause a site issue or at a minimum block some release someday.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if whitelisting is necessary as this request hits UPXY directly (and UPXY also generates the CORS response)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the CORS shortcut we have in nginx (since nginx responds a lot faster than PHP):

if ($http_origin ~* (^https://(.+\.)?(app|ent)\.(box\.com|boxcn\.net|boxenterprise\.net)$)) {
    set $cors "box";
}

if ($request_method = 'OPTIONS') {
    set $cors "${cors}options";
}

# If CORS pre-flight request has subdomain app or ent on the root domain box.com, boxcn.net,
# or boxenterprise.net, handle in nginx to improve performance
if ($cors = "boxoptions") {
    // Set CORS headers + list of allowed headers. The list of headers needs to be manually specified since not all browser support Access-Control-Allow-Headers: * yet
}

This conf is included in the api.box.com nginx template, so you're maybe right that it won't get triggered if we're hitting UPXY directly.

offset: this.offset
},
xhr_ready_state: this.xhr.xhr.readyState,
xhr_status_text: this.xhr.xhr.statusText
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just the event format directly ported over from WebApp code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah

try {
if (this.uploadedBytes < this.size) {
// Not all bytes were uploaded to the server. So upload part again.
throw new Error();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a message to this error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return;
}
this.consoleLog(`Part ${this.stringify()} is not available on server. Re-uploading.`);
throw new Error();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a message to the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

src/util/url.js Outdated
* @param {mixed} value - the parameter value to be updated/added
* @return {string}
*/
function updateQueryStringParameter(uri: string, key: string, value: any): string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to not handle query string params yourself, lots of edge cases - Preview uses jsuri, see: box/box-content-preview#405

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dev/peerDep it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copied these over from the monolith. will look into jsuri which sounds like a better idea

* @param {function} msgFunc
* @return {void}
*/
consoleLogFunc = (msgFunc: Function): void => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats the use case of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klk- do you remember why this was needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we use this in event/error handlers where something needs to be dynamically calculated then for logging, see: https://github.com/box/box-ui-elements/pull/127/files#diff-ea9a71cb823147e0f8100d32ad629368R203

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't this function evaluates the msgFunc immediately and console.log the returned string? Why do we need to wrap it in another function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, maybe this is to be consistent with consoleLog above and wrap the logic for checking whether the config allows logging. I was thinking of just having one log method that takes either a string log or a log function:

logToConsole(msgOrMsgFunc: string | Function): void => {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at all the use cases for consoleLogFunc I'm still not sure what do we get from it.

this.consoleLogFunc(() => `Upload completed: ${this.stringify()}. Parts state: ${this.getPartsState()}`);

is the same as

this.consoleLog(`Upload completed: ${this.stringify()}. Parts state: ${this.getPartsState()}`);

right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right...

Copy link
Contributor Author

@wenboyu2 wenboyu2 Dec 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wait maybe it's because getPartsState() (or whatever needs to be called to compose the message) is expensive, and the point of having consoleLogFunc is to avoid calling that unnecessarily

* @return {void}
*/
consoleLog = (msg: string): void => {
if (this.config.console && window.console) {
Copy link
Contributor

@priyajeet priyajeet Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be moved to Base.js, with the console boolean config being part of options. Maybe something like:

// Base.js
this.logInfo = !!options.console && !!window.console ? window.console.log || noop : noop;
this.logError = !!options.console && !!window.console ? window.console.error || noop : noop;

// Child classes
this.logInfo(some string);

In the future, might change this to emit() log or error events as opposed to console logging.
So the consumer of this API, can listen and either console.log or bind it to an onError() react prop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh nice

initialRetryDelayMs: 5000, // Base for exponential backoff on retries
maxRetryDelayMs: 60000, // Upper bound for time between retries
parallelism: 5, // Maximum number of parts to upload at a time
requestTimeoutMS: 120000, // Idle timeout on part upload, overall request timeout on other requests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You use Ms everywhere else, use it here too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

});

/**
* Returns file part associated with this Part.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify what exactly "file part" means here, it's not clear from context

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


const newUploadedBytes = parseInt(event.loaded, 10);
this.onProgress(this.uploadedBytes, newUploadedBytes);
this.uploadedBytes = newUploadedBytes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it makes sense to set the internal bookkeeping state before calling the onProgress callback

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated for clarity

return;
}

this.consoleLog(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consoleLogFunc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh good catch

* @param {number} limit - Number of parts to be listed. Optional.
* @return {Promise<Array<Object>>} Array of parts
*/
listParts = async (partIndex: number, limit: number): Promise<Array<Object>> => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go in BaseMultiput?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm for now this function is only used when retrying part upload. We can move it to the base class if we need it for creating/committing session or something else, but I doubt it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, yeah eventually we'd probably call this from other places when resuming an upload (not sure if we're ever planning to support that functionality), so we can move it then

this.size = size;
this.state = PART_STATE_NOT_STARTED;
this.timing = {};
this.uploadedBytes = 0;
this.data = {};
this.uploadUrl = `${this.uploadHost}/api/2.0/files/upload_sessions/${sessionId}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The uploadUrl should come from sessionEndpoints

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

this.state = PART_STATE_UPLOADED;
this.consoleLogFunc(() => `Upload completed: ${this.toJSON()}. Parts state: ${this.getPartsState()}`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need to pass in the getPartsState function into this class just for this log line, it may not be worth it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean we don't need to log the parts states here? Or are you suggesting some other approach to make an individual part aware of the big picture?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not needed, except for potential debugging purposes in the future. If you want to simplify the code to get rid of it, I'd say go for it

});

describe('upload()', () => {
it('should noop if sha256 is not available', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to upload the part before the SHA-256 is ready would imply a logic error in the code, which is really bad. In this case I think it makes sense to fail more loudly (with an exception) as the expectation is that it should never happen, and we definitely want to know if it does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

MultiputPartTest.upload();
});

it('should noop if blob is not available', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* @param {Object} queryParams
* @return {string}
*/
function updateQueryParameters(url: string, queryParams: Object): string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can sessionEndpoints' values all be of type Uri instead of String, so that we can just call addQueryParam/replaceQueryParam directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only update the query params for commit and listParts, not sure if it's worth it. I'll consider it when we find ourselves messing with the urls more

onProgress: Function;
onSuccess: Function;
onError: Function;
data: Object;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use a more descriptive type for this variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

thiccar
thiccar previously approved these changes Dec 11, 2017
@@ -26,38 +30,299 @@ class MultiputPart extends Base {
| typeof PART_STATE_UPLOADED;
timing: Object;
uploadedBytes: number;
uploadUrl: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] It's redundant to have this information in both this variable and session_endpoints, let's simplify it. SInce you use session_endpoints.list_part for the URL in implementation of listParts(), I would say just keep session_endpoints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

onError: Function;
data: Object;
config: MultiputConfig;
id: number;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we use strings for part IDs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


this.config = config || DEFAULT_MULTIPUT_CONFIG;
this.sessionEndpoints = sessionEndpoints;
this.canConsoleLog = !!options.consoleLog && !!window.console && !!window.console.log;
Copy link
Contributor

@priyajeet priyajeet Dec 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed? Since this.consoleLog will be a noop, you can just call it without check of existance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because if you can't console log, then we won't call the function to generate the console message which could potentially be costly. see consoleLogFunc

return;
}

this.consoleLogFunc(
Copy link
Contributor

@priyajeet priyajeet Dec 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still confused, why are consoleLogFunc() needed?
Why can't this just be this.consoleLog(string) like line 240 below?

Copy link
Contributor Author

@wenboyu2 wenboyu2 Dec 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah previously this message generator function had a costly function call, so we only want to evaluate that when we can actually console log something. but since it's now removed we can probably just do this.consoleLog


MultiputPartTest.uploadProgressHandler(event);

assert.equal(MultiputPartTest.uploadedBytes, 1);
Copy link
Contributor

@priyajeet priyajeet Dec 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of our tests should use chai expectations/asserts.
So expect().to.be...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, I'll convert those with the next multiput PR which is coming very soon

@wenboyu2 wenboyu2 merged commit f8c9821 into box:master Dec 12, 2017
@wenboyu2 wenboyu2 deleted the multiput-upload-2 branch December 12, 2017 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants