Various Data Template generation issues on Windows #287

Open
mars-lan opened this issue May 4, 2020 · 13 comments

Comments

@mars-lan

mars-lan commented May 4, 2020

The generateDataTemplate task fails with the following error when run on Windows.

Caused by: java.lang.IllegalArgumentException: 'other' has different root

The same Pegasus models build fine on Mac & Linux. See datahub-project/datahub#1640 for more details.
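For context, "'other' has different root" is the message the JDK's Windows implementation of Path.relativize raises when the two paths do not share a root (for example, different drive letters). Whether relativize is the call involved in this particular failure is an assumption; the trace above does not show it. A minimal sketch of a defensive wrapper, with a hypothetical helper name:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;

final class SafeRelativize {
    // Hypothetical helper, not part of rest.li: relativize only when both
    // paths share a root; otherwise return the target unchanged.
    static Path relativizeOrKeep(Path base, Path target) {
        if (!Objects.equals(base.getRoot(), target.getRoot())) {
            // On Windows this covers paths on different drives, where
            // base.relativize(target) would throw IllegalArgumentException.
            return target;
        }
        return base.relativize(target);
    }

    public static void main(String[] args) {
        Path base = Paths.get("/a/b");
        System.out.println(relativizeOrKeep(base, Paths.get("/a/b/c/D.pdl")));
        System.out.println(relativizeOrKeep(base, Paths.get("x/y")));
    }
}
```

Falling back to the unchanged target is only one possible policy; a generator might prefer to fail with a clearer message instead.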

@hgcummings

hgcummings commented Jul 2, 2020

(Adding to this issue as it falls under "Various" Data Template generation issues, but I'm not seeing the exact same error; let me know if you'd prefer me to split this out as a separate issue).

I'm seeing errors of the following form on Windows when building the linkedin/datahub project, for every single PDL file:

[main] ERROR com.linkedin.restli.tools.data.SchemaFormatTranslator - Parsed top-level schema does not match the schema file name. File: C:\Work\datahub\li-utils\src\main\pegasus\com\linkedin\avro2pegasus\events\common\datamonitor\PlatformName.pdl

The earlier debug log outputs the following:

[main] DEBUG com.linkedin.restli.tools.data.SchemaFormatTranslator - Loaded source schema: com/linkedin/avro2pegasus/events/common/datamonitor/PlatformName, from location: C:\Work\datahub\li-utils\src\main\pegasus\com\linkedin\avro2pegasus\events\common\datamonitor\PlatformName.pdl

... Which reveals the problem. schemaFullname at this point is com/linkedin/avro2pegasus/events/common/datamonitor/PlatformName but it should be com.linkedin.avro2pegasus.events.common.datamonitor.PlatformName, since this is the value that's going to be compared against the parsed top-level schema later.

The root cause is the line that replaces File.separatorChar with '.'. At this point in the code, it's working with URIs (which always use forward slashes, by definition) rather than file paths. On Windows, File.separatorChar is '\', so the forward slashes in the URI path are never replaced. I think it should replace '/' with '.' on all platforms, rather than replacing the platform-specific file path separator with '.'.
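As an illustration of that suggestion, here is a minimal sketch that derives a dotted schema name from a URI path by always replacing '/'. The class and method names are hypothetical and this is not the actual SchemaFormatTranslator code; the paths are taken from the log above.

```java
import java.net.URI;

final class SchemaNameFromUri {
    private static final String PDL_SUFFIX = ".pdl";

    // Hypothetical helper: maps a schema file URI under a resolver root
    // to a dotted schema name, independent of the host platform.
    static String toSchemaFullName(URI root, URI schemaFile) {
        // URI paths use '/' on every platform, including Windows.
        String relative = root.relativize(schemaFile).getPath();
        if (relative.endsWith(PDL_SUFFIX)) {
            relative = relative.substring(0, relative.length() - PDL_SUFFIX.length());
        }
        // Replace '/', not File.separatorChar, which is '\' on Windows.
        return relative.replace('/', '.');
    }

    public static void main(String[] args) {
        URI root = URI.create("file:///C:/Work/datahub/li-utils/src/main/pegasus/");
        URI file = URI.create("file:///C:/Work/datahub/li-utils/src/main/pegasus/"
            + "com/linkedin/avro2pegasus/events/common/datamonitor/PlatformName.pdl");
        System.out.println(toSchemaFullName(root, file));
    }
}
```

Because the separator is hard-coded to '/', the result is the same on Windows, Mac, and Linux.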

@mars-lan
Author

@evanw555 seems like @hgcummings has found the root cause. Could you look into a fix?

@junchuanwang
Contributor

Thanks Mars, we already have an internal task tracking this issue.

@mars-lan
Author

Any update @junchuanwang? Another customer is encountering a similar issue when building DataHub.

@junchuanwang
Contributor

@mars-lan Last time our team reviewed the task, we de-prioritized it.

But now that more customers are asking for the fix, we will review it again to see whether we have the resources to prioritize it.

@evanw555
Contributor

The specific error mentioned by @hgcummings looks like it may have been solved in https://github.com/linkedin/rest.li/pull/448/files#diff-1ccaa495740d8d078ae73b8dcaff97f17d465f1403ed2c2b8a2342472db7e372L218, which was committed in the last day or two.

@nagarjunakanamarlapudi

nagarjunakanamarlapudi commented Oct 28, 2020

@evanw555
When I updated rest.li to version 29.7.15, I came across a different error.

This is most likely related to how Windows treats paths in the file system.

> Task :metadata-models:generateDataTemplate
There are 133 data schema input files. Using input root folder: C:\Users\nkanamar\Desktop\git-public\main-datahub\datahub\metadata-models\src\main\pegasus
[main] INFO com.linkedin.pegasus.generator.TemplateSpecGenerator - Class name: com.linkedin.data.template.StringArray, bound to schema:{ "type" : "array", "items" : "string" }, instead of schema: { "type" : "array", "items" : { "type" : "typeref", "name" : "SchemaFieldPath", "namespace" : "com.linkedin.dataset", "doc" : "Schema field path as described by schema normalizations rules: http://go/tms-schema", "ref" : "string" } }
[main] INFO com.linkedin.pegasus.generator.TemplateSpecGenerator - Class name: com.linkedin.data.template.StringArray, bound to schema:{ "type" : "array", "items" : "string" }, instead of schema: { "type" : "array", "items" : { "type" : "typeref", "name" : "SchemaFieldPath", "namespace" : "com.linkedin.dataset", "doc" : "Schema field path as described by schema normalizations rules: http://go/tms-schema", "ref" : "string" } }
Exception in thread "main" java.nio.file.InvalidPathException: Illegal char <:> at index 104: C:\Users\nkanamar\Desktop\git-public\main-datahub\datahub\li-utils\build\libs\li-utils-data-template.jar:pegasus/com/linkedin/common/FabricType.pdl
        at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
        at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
        at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
        at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
        at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
        at java.nio.file.Paths.get(Paths.java:84)
        at com.linkedin.pegasus.generator.JavaCodeUtil.annotate(JavaCodeUtil.java:89)
        at com.linkedin.pegasus.generator.JavaDataTemplateGenerator.populateClassContent(JavaDataTemplateGenerator.java:1266)
        at com.linkedin.pegasus.generator.JavaDataTemplateGenerator.generate(JavaDataTemplateGenerator.java:274)
        at com.linkedin.pegasus.generator.PegasusDataTemplateGenerator.run(PegasusDataTemplateGenerator.java:138)
        at com.linkedin.pegasus.generator.PegasusDataTemplateGenerator.main(PegasusDataTemplateGenerator.java:110)

> Task :metadata-models:generateDataTemplate FAILED

FAILURE: Build failed with an exception.

* What went wrong:
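In the exception message above, index 104 is the ':' that joins the jar file path to the entry inside it (li-utils-data-template.jar:pegasus/...). Windows only allows ':' after the drive letter, so passing the combined string to Paths.get fails there, while it is accepted on Mac and Linux. One possible way to handle such a string, shown as a sketch with hypothetical names (this is not the rest.li code or its eventual fix), is to split the location before building any Path:

```java
final class JarEntryLocation {
    final String jarPath;   // null when the location is a plain file
    final String entryPath; // entry inside the jar, or the plain file path

    private JarEntryLocation(String jarPath, String entryPath) {
        this.jarPath = jarPath;
        this.entryPath = entryPath;
    }

    // Hypothetical parser for "<jar file>:<entry>" strings like the one
    // in the exception message; the ".jar:" marker is an assumption.
    static JarEntryLocation parse(String location) {
        String marker = ".jar:";
        int idx = location.indexOf(marker);
        if (idx < 0) {
            return new JarEntryLocation(null, location);
        }
        int split = idx + marker.length() - 1; // position of the ':'
        return new JarEntryLocation(location.substring(0, split),
                                    location.substring(split + 1));
    }

    public static void main(String[] args) {
        JarEntryLocation loc = parse(
            "C:\\Users\\nkanamar\\Desktop\\git-public\\main-datahub\\datahub\\li-utils"
            + "\\build\\libs\\li-utils-data-template.jar:pegasus/com/linkedin/common/FabricType.pdl");
        System.out.println(loc.jarPath);
        System.out.println(loc.entryPath);
    }
}
```

Only the jar portion is then a real file system path; the entry portion is a '/'-separated name inside the archive and should not go through Paths.get on Windows together with the jar path.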

@nickibi
Contributor

nickibi commented Oct 28, 2020

We acknowledge the issue. We do not officially support Windows, and we don't plan to support it at the moment. If this becomes urgent, please feel free to discuss the priority.

@xdl

xdl commented Aug 12, 2021

I'd also like to voice a preference for getting this data template generation working on Windows, as this is preventing us from building DataHub natively on Windows.

@hgcummings

On the point of support for Windows, I think there's some value in template generation working, even if DataHub itself isn't expected to run on Windows.

It's fine to run DataHub itself in a container, but being able to populate it from a Windows machine can be quite useful in some scenarios, and the ingestion scripts depend on the data templates. (I have previously resorted to generating the templates in a container then copying them back to the Windows host.)

@gy20121221

The generateDataTemplate task fails with the following error when run on Windows.

Caused by: java.lang.IllegalArgumentException: 'other' has different root

The same Pegasus models build fine on Mac and Linux. See linkedin/datahub#1640 for more details.

Can this problem be solved in the future?

@jkl0898

jkl0898 commented Dec 13, 2022

Having the same problem on Windows. I hope compiling and building on Windows will be supported.

@ssyue

ssyue commented Mar 2, 2023

Having the same problem on Windows. I hope compiling and building on Windows will be supported.
