Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT: add validation in python for NormalAccessor #328

Closed
wants to merge 3 commits into from

Conversation

naomatheus
Copy link
Contributor

Draft PR to add NormalAccessor validation to Python experimental traits in lonboard.

I had some unrelated git history issues as I renamed the local repository to help stay organized, hence the branch name.

@naomatheus
Copy link
Contributor Author

naomatheus commented Jan 26, 2024

This could potentially be quite off base. But just took a stab at it for a draft.

This PR was created in response to reference here: #322 (comment)

we can go through how to add a new validator for this NormalAccessor data type, which allows us to create a new Python PointCloudLayer.

@kylebarron
Copy link
Member

We should add a mention of the pre-commit setup in DEVELOP.md, but you need to run

pip install -U pre-commit
pre-commit run --all-files

Your PR is erroring on formatting.

lonboard/experimental/traits.py Outdated Show resolved Hide resolved
lonboard/experimental/traits.py Outdated Show resolved Hide resolved
super().__init__(*args, **kwargs)
self.tag(sync=True, **NORMAL_SERIALIZATION)

def validate(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend writing some tests first, which will make it easier to sort out the logic in this function. You can find some related tests here.

The purpose of this validation function is to take whatever user input and both validate it and conform to a standard expectation of how we want to store it internally.

To align with deck.gl and other traits, we should allow a list with three scalar values, or a numpy array, or a pyarrow FixedSizeListArray.

But here the first thing you check is value.shape. Which won't exist for a list type. So it'll crash

In [1]: l = [1, 23, 50]

In [2]: l.shape
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 l.shape

AttributeError: 'list' object has no attribute 'shape'

Hence why adding tests first might help

@@ -64,6 +64,14 @@ Anywidget and its dependency ipywidgets handles the serialization from Python in

Push a new tag to the main branch of the format `v*`. A new version will be published to PyPI automatically.

## Contribute
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole doc is geared towards contributors; maybe put this section above ## Publishing and name it ## Linting?

lonboard/_serialization.py Show resolved Hide resolved
lonboard/experimental/traits.py Show resolved Hide resolved
lonboard/experimental/traits.py Show resolved Hide resolved
lonboard/experimental/traits.py Show resolved Hide resolved
if not all(isinstance(item, float) for item in value):
self.error("All elements of Normal array must be floating point type")

return pa.FixedSizeListArray.from_arrays(value, fixed_list_size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work...

The conceptual difference to understand is between a scalar which will be used as the value for all rows and a column, where a different value per row can be passed.

It's a bit confusing because a normal value has three elements, but an input value looking like [0, 0, 1] is still a scalar because that value applies to all rows.

To pass a scalar into @geoarrow/deck.gl-layers, we want to pass a JSON-like object, like [0, 0, 1], which it will interpret as a single scalar. Passing a pyarrow Array will cause @geoarrow/deck.gl-layers to interpret the input as a column instead. But then it will error because this "column" has only one row, whereas the table has multiple rows.

Instead you should be returning the input list or tuple

return value

Comment on lines +153 to +156
if not np.any(value == 1):
self.error(
obj, value, info="One of the normal vector elements must be 1."
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea for a validation step, but I don't think it's correct. There's no good documentation in deck.gl of the exact format expected. Most likely, these normal vectors are expected to be "normalized" which means that their length is expected to be 1. I.e. x^2 + y^2 + z^2 = 1. This means that almost never will one of the elements themselves be 1.

In any case I'd tend to think this is something we shouldn't validate, because deck.gl might be lenient and allow non-normalized vectors.

lonboard/experimental/traits.py Show resolved Hide resolved
lonboard/experimental/traits.py Show resolved Hide resolved
lonboard/experimental/traits.py Show resolved Hide resolved
lonboard/experimental/traits.py Show resolved Hide resolved
@naomatheus
Copy link
Contributor Author

naomatheus commented Feb 20, 2024

Hey @kylebarron I will start a new branch that's better synced with main
I'll retain the branch just for reference

@naomatheus naomatheus closed this Feb 20, 2024
@naomatheus naomatheus mentioned this pull request Feb 20, 2024
kylebarron added a commit that referenced this pull request Feb 29, 2024
Adds NormalAccessor and draft tests
Respond to comments on #328 and move recent work here. Branch #328 will
be deleted

---------

Co-authored-by: Kyle Barron <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants