
3D projective geometry warping using ground-truth depth and pose is not working #6

NagabhushanSN95 opened this issue Jun 25, 2021 · 3 comments


NagabhushanSN95 commented Jun 25, 2021

Hi,

Thanks for sharing your dataset. I'm trying to use it in a view-synthesis task.
Given two frames, I'm trying to reconstruct one from the other using 3D projective-geometry-based warping from here.

For depth, I'm using your code to get the disparity map and then computing the depth map as 1/disparity, clipping values to the range [0, 1000]:

depth1 = numpy.clip(1 / depth1, a_min=0, a_max=1000)  # depth1 holds the disparity map before this line

For camera poses, I'm using the viewMatrix you've provided in the JSON files.

I'm constructing the camera intrinsic matrix using the FOV you've mentioned (50°):

import math
import numpy

def camera_intrinsic_transform(vfov=50, hfov=50, capture_width=1680, capture_height=1050, pixel_width=1680,
                               pixel_height=1050):
    # Pinhole intrinsic matrix: focal lengths derived from the FOV, principal point at the image centre
    camera_intrinsics = numpy.eye(3)
    camera_intrinsics[0, 0] = (capture_width / 2.0) / math.tan(math.radians(hfov / 2.0))
    camera_intrinsics[0, 2] = pixel_width / 2.0
    camera_intrinsics[1, 1] = (capture_height / 2.0) / math.tan(math.radians(vfov / 2.0))
    camera_intrinsics[1, 2] = pixel_height / 2.0
    return camera_intrinsics
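
With the default arguments this evaluates to:

intrinsic = camera_intrinsic_transform()
# fx = 840 / tan(25°) ≈ 1801.39, fy = 525 / tan(25°) ≈ 1125.87,
# principal point at (840, 525)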

However, the reconstructed image does not match the rendered image.
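
For reference, this is roughly the warping I'm applying (a minimal sketch, not my exact code: warp_frame is my own name, and it assumes the view matrices are world-to-camera, the camera looks down +z, and depth is in the same units as the pose translations):

import numpy

def warp_frame(frame1, depth1, intrinsic, pose1, pose2):
    # Forward-warp frame1 (h, w, 3) into the view of frame2.
    h, w = depth1.shape
    y, x = numpy.mgrid[0:h, 0:w]
    pixels = numpy.stack([x.ravel(), y.ravel(), numpy.ones(h * w)])  # (3, h*w) homogeneous pixel grid

    # Back-project to camera-1 3D points, then transform into camera-2 coordinates
    points1 = numpy.linalg.inv(intrinsic) @ pixels * depth1.ravel()
    relative = pose2 @ numpy.linalg.inv(pose1)  # camera1 -> camera2, assuming world-to-camera view matrices
    points2 = (relative @ numpy.vstack([points1, numpy.ones(h * w)]))[:3]

    # Project into frame2's image plane and round to the nearest pixel
    projected = intrinsic @ points2
    u = numpy.round(projected[0] / projected[2]).astype(int)
    v = numpy.round(projected[1] / projected[2]).astype(int)

    # Splat the pixels that land inside the image (nearest-neighbour, no z-buffering)
    warped = numpy.zeros_like(frame1)
    valid = (projected[2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    warped[v[valid], u[valid]] = frame1.reshape(h * w, -1)[valid]
    return warped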


NagabhushanSN95 commented Jun 25, 2021

For example, I'm considering frames 1 and 2 of 67b90283-627b-45cf-9ff2-63dcb95bfc67

The corresponding camera poses are below

transformation1 = numpy.array(
    [0.3948856089590146, 0.9186789480807126, 0.009707391790442067, 133.64471308856065,
     -0.06591960053103078, 0.017792904372381235, 0.9976662115822422, -135.3130767061915,
     0.9163622882946337, -0.3946039834633436, 0.06758511385934628, 917.6464175831973,
     0.0, 0.0, 0.0, 1.0]
).reshape(4, 4)
transformation2 = numpy.array(
    [0.3538838093846013, 0.9352637448678055, 0.0069155367658538455, 92.4380718493328,
     -0.004427438570940968, -0.005718767304706697, 0.999973886901392, -74.00623990858556,
     0.9352788591735129, -0.35390518568112983, 0.0021170437260363707, 949.783528073028,
     0.0, 0.0, 0.0, 1.0]
).reshape(4, 4)
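
From these I compute the relative pose as follows (assuming the viewMatrix entries are world-to-camera transforms):

relative = transformation2 @ numpy.linalg.inv(transformation1)  # maps camera-1 coordinates to camera-2 coordinates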

I've attached frame1, frame2, and frame2_warped (frame1 warped to the view of frame2). Ideally, frame2_warped should match frame2 (except for object motion).

frame1 [image]

frame2 [image]

frame2_warped [image]

Am I missing something?

oscarmcnulty (Owner) commented

It looks like the polarity of the transform you are applying is somehow reversed. Shouldn't frame2_warped be more zoomed in than the original frame1?

You could try using the visualization script at https://github.com/oscarmcnulty/gta-3d-dataset/blob/master/test_vis.py to debug.

NagabhushanSN95 (Author) commented

Thanks for the lightning-quick reply. That was my thought too, so I tried changing the sign of the z-translation after computing the relative transformation. With that change the camera did zoom into the scene, but the amount of zoom was very small compared to frame2, so I suspect something else is wrong.
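
Concretely, the sign flip I tried was along these lines, using the relative pose from my earlier comment:

relative[2, 3] *= -1  # negate the z-component of the relative translation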

Can you please confirm whether my computations of the depth, camera intrinsics, and camera poses are correct?
I'll also try to debug using the test_vis.py code.
