Add more datatypes in convert DataFrame to numpy #4444

Closed
wants to merge 3 commits into from


@hkvision hkvision commented Apr 19, 2022

Currently, many datatypes are not converted correctly, which may cause errors in the subsequent model forward pass.

@hkvision hkvision requested a review from shanyu-sys April 19, 2022 03:03

@shanyu-sys shanyu-sys left a comment


LGTM

@hkvision

In our unit tests, much of the code looks like this:

df = rdd.map(lambda x: (np.random.randn(50).astype(float).tolist(),
                        [int(np.random.randint(0, 2, size=()))])
             ).toDF(["feature", "label"])

Since Python only has float (64-bit, i.e. float64) and int, which Spark infers as double and bigint, the schema after converting to a DataFrame will be array<double> and array<bigint>. In this case the model throws RuntimeError: expected scalar type Float but found Double.
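To make the dtype issue concrete, here is a minimal NumPy-only sketch. The mapping table and the `column_to_numpy` helper are hypothetical names made up for illustration; they are not BigDL's actual conversion code. The sketch only shows that Python floats become float64 by default and that an explicit cast is needed to get the float32 that a default PyTorch model expects.

```python
import numpy as np

# Hypothetical mapping from Spark element type names to NumPy dtypes.
# This is an assumption for illustration, not the table used in BigDL.
SPARK_TO_NUMPY = {
    "float": np.float32,
    "double": np.float64,
    "int": np.int32,
    "bigint": np.int64,
    "boolean": np.bool_,
}


def column_to_numpy(values, spark_type):
    """Convert a list of Python values to an array whose dtype matches spark_type."""
    return np.asarray(values, dtype=SPARK_TO_NUMPY[spark_type])


# Python floats carry no 32-bit variant, so NumPy defaults to float64 here.
feature = np.random.randn(50).tolist()
default_arr = np.asarray(feature)

# An array<double> column therefore stays float64 after conversion ...
as_double = column_to_numpy(feature, "double")

# ... and must be cast explicitly if the model weights are float32.
as_float = as_double.astype(np.float32)

print(default_arr.dtype, as_double.dtype, as_float.dtype)
```

Whether such a cast belongs in the DataFrame-to-NumPy conversion or in the test data itself is the design question this PR raises.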

A few unit tests, for example test.bigdl.orca.learn.ray.pytorch.test_estimator_pytorch_backend.TestPyTorchEstimator testMethod=test_data_parallel_sgd_correctness, throw the opposite error, RuntimeError: expected scalar type Double but found Float, which is really weird; it is probably a bug in PyTorch's error reporting...

@yushan111 Any comments on this?


hkvision commented Apr 21, 2022

After commenting out the data conversion for array[float/double], Jenkins passed: http://10.112.231.51:18888/job/BigDL-Orca-PR-Validation/599/


2 participants