Changed depth perception to point around nose bridge #87
Conversation
Mostly looks good, see requested changes below.
I think we should test it in-person before merging, just to verify everything makes sense.
depth_sum += closest_depth[int(point[1])][int(point[0])]
depth = depth_sum / float(len(img_mouth_points))
for x in range(u - 4, u + 4):
    for y in range(v - 4, v + 4):
Hmm, what do you think about making this 4 configurable by a parameter, to allow us to easily tune it?
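A minimal sketch of that parameterization (the helper and the name window_half_size are hypothetical; the loop bounds and row-major [y][x] indexing follow the surrounding discussion):

```python
def average_depth(closest_depth, u, v, window_half_size=4):
    """Average depth over a square window centered near pixel (u, v).

    window_half_size is exposed as a parameter (e.g., loadable from a
    ROS parameter) so the window size can be tuned without code changes.
    """
    depth_sum = 0.0
    for x in range(u - window_half_size, u + window_half_size):
        for y in range(v - window_half_size, v + window_half_size):
            depth_sum += closest_depth[int(y)][int(x)]
    # Each axis spans 2 * window_half_size samples.
    return depth_sum / float((2 * window_half_size) ** 2)
```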
depth = depth_sum / float(len(img_mouth_points))
for x in range(u - 4, u + 4):
    for y in range(v - 4, v + 4):
        depth_sum += closest_depth[int(x)][int(y)]
Suggested change:

- depth_sum += closest_depth[int(x)][int(y)]
+ depth_sum += closest_depth[int(y)][int(x)]

Y and X should be swapped, right?
Another suggestion: int(x) rounds down. But if the landmark point's position is, e.g., 51.99, that means the detector thought it was closer to 52 than to 51. To account for this, I think we should round the float before casting it to an int.
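A minimal sketch of the suggested change (with the index swap from the previous comment also applied):

```python
# round() maps 51.99 to 52 instead of truncating to 51; note that
# Python 3 rounds exact .5 values to the nearest even integer.
depth_sum += closest_depth[int(round(y))][int(round(x))]
```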
depth = depth_sum / float(len(img_mouth_points))
for x in range(u - 4, u + 4):
    for y in range(v - 4, v + 4):
        depth_sum += closest_depth[int(x)][int(y)]
So, as I was testing bite transfer, I began running into issues where the points used here are out-of-bounds of the depth image. What I realized is that the LBF landmark detector can detect a face even if it is only partially in the image. If it does, it extrapolates where the points outside of the image are, so landmark points output by that detector may be out-of-bounds of the image.
Therefore, can you add a check here for whether y and x are in-bounds? Maybe keep track of the number of (x, y) pairs that are out of bounds, and if it is greater than a threshold proportion (0.5? perhaps configurable by a parameter?) then do not publish a depth, since our depth estimate would be unreliable. Something along the lines of the sketch below.
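A minimal sketch of that check (window_depth and max_oob_prop are hypothetical names; the 0.5 default and row-major [y][x] indexing follow the suggestions above):

```python
def window_depth(closest_depth, u, v, max_oob_prop=0.5):
    """Average depth over the window around (u, v), or None if too many
    points fall outside the image (making the estimate unreliable)."""
    height, width = len(closest_depth), len(closest_depth[0])
    num_points, num_out_of_bounds = 0, 0
    depth_sum = 0.0
    for x in range(u - 4, u + 4):
        for y in range(v - 4, v + 4):
            num_points += 1
            if not (0 <= x < width and 0 <= y < height):
                num_out_of_bounds += 1
                continue
            depth_sum += closest_depth[y][x]
    if num_out_of_bounds > max_oob_prop * num_points:
        return None  # caller treats None as "do not publish a depth"
    return depth_sum / float(num_points - num_out_of_bounds)
```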
for x in range(u - 4, u + 4):
    for y in range(v - 4, v + 4):
        depth_sum += closest_depth[int(x)][int(y)]
depth = depth_sum / float(81)
I don't like this magic number. I know it comes from 9**2, but I don't think it is intuitive to readers.
Given that some of the (x, y) pairs will get rejected for being out-of-bounds anyway, we can't assume that all 81 points will be in the depth_sum. Therefore, I'd recommend counting the points that actually go into depth_sum, and using that number to compute the average.
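A minimal sketch of averaging by the actual count (width and height are assumed to be the depth image dimensions from an in-bounds check like the one above; as an aside, range(u - 4, u + 4) spans only 8 offsets per axis, so a true 9x9 window needs u + 5 and v + 5):

```python
depth_sum = 0.0
num_valid = 0
for x in range(u - 4, u + 5):  # 9 columns for a true 9x9 window
    for y in range(v - 4, v + 5):  # 9 rows
        if 0 <= x < width and 0 <= y < height:
            depth_sum += closest_depth[y][x]
            num_valid += 1
if num_valid > 0:
    depth = depth_sum / float(num_valid)  # average over points actually summed
```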
img_mouth_center = landmarks[largest_face[1]][0][66]
img_mouth_points = landmarks[largest_face[1]][0][48:68]
# Find marker between eyes. This is used to estimate stomion depth in the
# case that the stomion is hidden behind the fork.
Nit: this is not done "in the case that the stomion is hidden behind the fork", it is done in all cases :)
Perhaps reword to "This is used as a proxy to estimate stomion depth, because the stomion is often hidden behind the fork."
Also, I just realized that this code returns the entire 3D point for the bridge of the nose. It shouldn't do that -- the x and y should still be the stomion x and y, and only the depth should come from the bridge of the nose.
However, I also realized a problem with the assumption that the depth of the bridge is the depth of the stomion: it depends on the camera pose relative to the face (because depth is in the camera frame). For example, if the RealSense is below the mouth, then the bridge of the nose will inherently be farther from the camera than the stomion is. The assumption only really holds if the camera is halfway between the bridge and the stomion, facing the user's face. Thoughts @egordon? The above is how the old code did it, but given that we'll be playing around with the staging location, maybe we want a more reliable approach?
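A minimal sketch of the fix described in the first paragraph, keeping the stomion's pixel coordinates but substituting the nose-bridge depth (nose_bridge_depth is a hypothetical name for the depth averaged around the bridge):

```python
# Keep the stomion's (x, y) from the mouth landmarks, but take the depth
# sampled around the bridge of the nose, since the stomion is often occluded.
stomion_point = (img_mouth_center[0], img_mouth_center[1], nose_bridge_depth)
```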
@amalnanavati Relaying our discussion on the potential solution to this:
- Outlier rejection: remove face points that are too close (i.e., right around the fork).
- If there are at least a few mouth points remaining, average those to get the stomion depth.
- If not: fit a plane to the rest of the face points (i.e., depth = np.dot(theta, [u, v, 1]); solve for theta), then use np.dot(theta, [u_stomion, v_stomion, 1]) as the depth for the mouth.
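A minimal sketch of that plane-fit fallback (assuming face_points is an (N, 3) NumPy array of (u, v, depth) rows that survived the outlier rejection; the function and variable names are illustrative):

```python
import numpy as np

def fit_depth_plane(face_points):
    """Least-squares fit of depth = theta . [u, v, 1] to face landmark points."""
    # Design matrix: one [u, v, 1] row per landmark point.
    A = np.column_stack(
        [face_points[:, 0], face_points[:, 1], np.ones(len(face_points))]
    )
    theta, *_ = np.linalg.lstsq(A, face_points[:, 2], rcond=None)
    return theta

# Usage: estimate the stomion depth from the fitted plane.
# theta = fit_depth_plane(kept_face_points)
# stomion_depth = np.dot(theta, [u_stomion, v_stomion, 1.0])
```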
Closed by #130 |
Description
This PR addresses issue #67. Occasionally, the fork is positioned in front of the user's mouth while face detection runs, which can lead to the camera detecting the depth of the fork rather than the user's mouth. This pull request updates the depth perception to average depth over a 9x9 pixel square around the bridge of the nose, which does not get covered by the fork.
Testing procedure
Test and run using the same procedure as #36 (comment). I have not run this code on the robot yet.
Before opening a pull request
- Format the code: python3 -m black .
- In the ada_feeding directory, run: pylint --recursive=y --rcfile=.pylintrc .
Before Merging
- Squash & Merge