-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(5xx): should return 4xx when file not found #1291
fix(5xx): should return 4xx when file not found #1291
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. Join @kishore03109 and the rest of your teammates on Graphite |
7679174
to
4a2610a
Compare
@@ -333,6 +333,12 @@ export default class GitFileSystemService { | |||
) | |||
|
|||
if (error instanceof GitError) { | |||
if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you know why the request even contains things like //filename?
are these actually broken links that cms tries to resolve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you know why the request even contains things like //filename?
we have a markdown editor, so there are possibility for typo + the logs suggest this has occurred at least once.
we dont debounce too, to this could occur while the user is typing the correct string.
are these actually broken links that cms tries to resolve?
yes, intuitively should be able to resolve them gracefully by 4xx rather than 5xx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add comments for this check here so that we don't lose context of when this error statements occur?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also have tests out for this PR since it's fixing a bug and we want to avoid a regression?
error.message.includes("fatal: path") && | ||
error.message.includes("exists on disk, but not in 'HEAD'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a blocker for now, because we can raise a PR to improve this + bug is fixed here but i wonder if error.message
is a robust enough property for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like best effort, since this is directly reliant on the error message that git returns us
do have an alternative in mind?
@@ -1121,7 +1127,11 @@ export default class GitFileSystemService { | |||
) | |||
) | |||
.andThen((directoryContents) => { | |||
const resultAsyncs = directoryContents.map((directoryItem) => { | |||
const IGNORED_FILES = [".git"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the pattern we might wanna aim for here is to respect the filesystem's convention! in here, this means that hidden files/folders (prefixed by a `.) should be ignored rather than a specific instance.
wwdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm i rather keep this as .git since this actually is not the file system convention, but rather the .git system's convention. since they dont have a direct mapping, opting for the stricter check here.
4a2610a
to
6b1f857
Compare
Problem
Described in linear:
missing media files in pages leads to a 5xx when the resource name itself cannot be processed
as an example, consider a link (some_link)[/%2fa/b] -> this resolves to //a/b, which is not processable by our backend as it results in an error like so: Error when getting Git blob hash: Error: HEAD:images/technews//tvws-part2.JPG and in turn causes a 5xx.
this 5xx is caused because GitFileService (iirc) will try to determine the blob hash (see if can find the file on disk) and in doing so, it will get an error (as the file doesn't exist), leading to the 500.
see here for an example where this occurs
in order to reproduce this:
Additionally, we also see a
fatal: path '.git' does not exist in 'HEAD'
in the logs every time the workspace is refreshed.This is because we rely on
fileStats
to get a list of directories, but forget to remove special directories such as.git
.Solution
Reason for this is that git under the hood does a file path normalision for checking if the file exists in local disk, but does not when checking when it exists in head.
consider these three examples:
We use fileStats to see if a file exists before passing it to
GetGitBlobHash
. issue with this is that while the file could exist in disk and still return a correct folder stats, this should still return a file not found error as the path given by the user is still incorrect. (am opting for this rather than normalising the path in the functionGetGitBlobHash
to keep the implementation ofGetGitBlobHash
lean and not think about potential path traversal attacks.)for the special files, we simply do a filter to ignore files like
.git
. We know that the class of errors with the stringsexists on disk
andError when getting Git blob hash
mostly comes from the .git folder by looking at dd logs here.ie all instances of
exists on disk
andError when getting Git blob hash
errors in prod for the past two weeks came from trying to access the .git folders.Breaking Changes
Before & After Screenshots
BEFORE:
AFTER:
Tests
in order to reproduce this:
fatal: path '.git' exists on disk, but not in 'HEAD'