io_win32 does not handle non-ascii chars in paths. #3951

Closed
ml232528 opened this issue Nov 27, 2017 · 11 comments

@ml232528

protoc.exe --cpp_out=. ./helloworld.proto
This problem arises when the path contains Chinese characters.
The GetCurrentDirectoryA function returns the path in GB18030 encoding.
Calling MultiByteToWideChar(CP_UTF8, 0, s.c_str(), s.size(), result.get(), len + 1) on that path is therefore wrong; CP_ACP should be used instead.

File: src\google\protobuf\stubs\io_win32.cc
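To illustrate the mismatch described in the report, here is a minimal, portable sketch (not protobuf code) that checks whether a byte string is well-formed UTF-8. `IsValidUtf8` is a hypothetical helper introduced only for this example. The sample bytes are an assumption: `D6 D0 CE C4` is the GBK/GB18030 encoding of 中文, and `E4 B8 AD E6 96 87` is the same text in UTF-8. Bytes in the first form are not valid UTF-8, so decoding them with CP_UTF8 cannot yield the intended characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is well-formed UTF-8
// (valid lead/continuation bytes, no overlong forms, no surrogates,
// no code point above U+10FFFF).
bool IsValidUtf8(const std::string& s) {
  // Smallest code point that legitimately needs a sequence of each length.
  static const unsigned long kMinForLength[] = {0, 0, 0x80, 0x800, 0x10000};
  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    unsigned long cp = 0;
    if (lead < 0x80) {
      len = 1;
      cp = lead;
    } else if ((lead & 0xE0) == 0xC0) {
      len = 2;
      cp = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {
      len = 3;
      cp = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {
      len = 4;
      cp = lead & 0x07;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (i + len > s.size()) return false;  // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char cont = static_cast<unsigned char>(s[i + k]);
      if ((cont & 0xC0) != 0x80) return false;  // not a 10xxxxxx byte
      cp = (cp << 6) | (cont & 0x3F);
    }
    if (len > 1 && cp < kMinForLength[len]) return false;  // overlong form
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;        // surrogate
    if (cp > 0x10FFFF) return false;                       // out of range
    i += len;
  }
  return true;
}
```

With the sample bytes above, `IsValidUtf8("\xE4\xB8\xAD\xE6\x96\x87")` is true while `IsValidUtf8("\xD6\xD0\xCE\xC4")` is false, because `0xD6` announces a two-byte sequence but `0xD0` is not a continuation byte.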

@xfxyjwf
Contributor

xfxyjwf commented Nov 27, 2017

Can you avoid non-ASCII characters in your dir path? The protobuf code base assumes UTF-8 for char encoding so it likely doesn't work well with non-ASCII characters in non-UTF-8 environment.

@liujisi
Contributor

liujisi commented Nov 27, 2017

@laszlocsomor Could you please take a look at this?

@liujisi changed the title from "helloworld.proto: No such file or directory" to "io_win32 does not handle non-ascii chars in paths." on Nov 27, 2017
@laszlocsomor
Contributor

I can take a look later today; I have another important bug on my plate.

@laszlocsomor
Contributor

Indeed, this is a bug. I could reproduce it: it works on Linux but not on Windows.
Let me take a closer look.

@laszlocsomor
Contributor

@pherl : I can't self-assign this bug. Could you assign it to me please?

@liujisi liujisi assigned liujisi and unassigned liujisi Nov 29, 2017
@liujisi
Contributor

liujisi commented Nov 29, 2017

Hmm, looks like I can only assign the issue to one of the owners of the project.

BTW, what do you think about the other issue, where wildcards no longer work in protoc on Windows (#3957)?

@raphael3207

raphael3207 commented Nov 30, 2017 via email

@laszlocsomor
Contributor

@pherl : I'm still trying to understand whether the other issue is related to this one or not. Maybe you can tell this: do you know how the CLI expands wildcard patterns to a list of filenames? (I could figure it out but asking is easier. ;)

Also, it seems that the main method already receives a bad octet stream for non-ASCII characters.
I modified https://github.com/google/protobuf/blob/master/src/google/protobuf/compiler/main.cc to print the arguments as string and as octet-stream. Even though I seem to be able to write non-ascii text into cmd.exe, the protoc.exe's main method already receives question marks.

C:\tempdir>protoc.exe --cpp_out c:\tempdir\cyr\Привет\foo hello_there.proto
DEBUG[main] argv=4
DEBUG[main] argv[0]=(protoc.exe)
  0x70 0x72 0x6f 0x74 0x6f 0x63 0x2e 0x65 0x78 0x65
DEBUG[main] argv[1]=(--cpp_out)
  0x2d 0x2d 0x63 0x70 0x70 0x5f 0x6f 0x75 0x74
DEBUG[main] argv[2]=(c:\tempdir\cyr\??????\foo)
  0x63 0x3a 0x5c 0x74 0x65 0x6d 0x70 0x64 0x69 0x72 0x5c 0x63 0x79 0x72 0x5c 0x3f 0x3f 0x3f 0x3f 0x3f 0x3f 0x5c 0x66 0x6f 0x6f
DEBUG[main] argv[3]=(hello_there.proto)
  0x68 0x65 0x6c 0x6c 0x6f 0x5f 0x74 0x68 0x65 0x72 0x65 0x2e 0x70 0x72 0x6f 0x74 0x6f
c:\tempdir\cyr\??????\foo/: Invalid argument
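One plausible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the narrow `main(int, char**)` entry point receives arguments that were already converted to the process's ANSI code page, and characters with no representation there are replaced by `?` (0x3f). The six question marks match the six letters of Привет. The sketch below only simulates that substitution in portable C++. `ReplaceNonAsciiWithQuestionMarks` is a hypothetical helper, not part of protobuf or the Windows API, and it treats every non-ASCII code point as unrepresentable.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given well-formed UTF-8 input, keeps ASCII bytes and
// emits a single '?' for each non-ASCII code point, mimicking a lossy
// conversion to a code page that cannot represent those characters.
std::string ReplaceNonAsciiWithQuestionMarks(const std::string& utf8) {
  std::string out;
  for (std::size_t i = 0; i < utf8.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(utf8[i]);
    if (c < 0x80) {
      out.push_back(static_cast<char>(c));  // plain ASCII passes through
    } else if ((c & 0xC0) != 0x80) {
      out.push_back('?');  // lead byte: one '?' per code point
    }
    // Continuation bytes (10xxxxxx) are skipped; their lead byte
    // already produced the '?'.
  }
  return out;
}
```

Feeding it the UTF-8 form of `c:\tempdir\cyr\Привет\foo` yields `c:\tempdir\cyr\??????\foo`, the same string that appears as argv[2] in the debug output.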

@laszlocsomor
Contributor

For the record, my edit is:

$ git d
diff --git a/src/google/protobuf/compiler/main.cc b/src/google/protobuf/compiler/main.cc
index 680d642..78d32c7 100644
--- a/src/google/protobuf/compiler/main.cc
+++ b/src/google/protobuf/compiler/main.cc
@@ -48,6 +48,14 @@
 #endif  // ! OPENSOURCE_PROTOBUF_CPP_BOOTSTRAP

 int main(int argc, char* argv[]) {
+  fprintf(stdout, "DEBUG[main] argv=%d\n", argc);
+  for (int i = 0; i < argc; ++i) {
+    fprintf(stdout, "DEBUG[main] argv[%d]=(%s)\n ", i, argv[i]);
+    for (char* p = argv[i]; *p; ++p) {
+      fprintf(stdout, " 0x%2x", *p);
+    }
+    fputc('\n', stdout);
+  }

   google::protobuf::compiler::CommandLineInterface cli;
   cli.AllowPlugins("protoc-");
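For anyone repeating this experiment, the same byte dump can be sketched as a small standalone function. This is an illustrative variant, not the code from the diff: `HexDump` is a hypothetical name, and it casts each byte to `unsigned char` before formatting. In the diff, `*p` is a plain `char`; where `char` is signed, a byte of 0x80 or above is promoted to a negative `int` and `%x` would print it sign-extended (for example `0xffffffd0`). The output posted earlier contains only ASCII bytes, so it is unaffected.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats each byte of `bytes` as "0xNN", separated by
// single spaces. Bytes are cast to unsigned char first so that values
// >= 0x80 are not sign-extended when promoted to int.
std::string HexDump(const std::string& bytes) {
  std::string out;
  char buf[8];
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    const unsigned char b = static_cast<unsigned char>(bytes[i]);
    std::snprintf(buf, sizeof(buf), "0x%02x", static_cast<unsigned int>(b));
    if (i > 0) out.push_back(' ');
    out += buf;
  }
  return out;
}
```

For example, `HexDump("foo")` gives `0x66 0x6f 0x6f`, and the two UTF-8 bytes of П (`"\xD0\x9F"`) give `0xd0 0x9f`.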

@laszlocsomor
Contributor

FYI, I have a bugfix that I'll send a PR for shortly, that fixes the original issue in this thread:

C:\tempdir\cyr\Привет\foo>dir /b foo

C:\tempdir\cyr\Привет\foo>protoc-dev.exe --cpp_out foo hello.proto

C:\tempdir\cyr\Привет\foo>dir /b foo
hello.pb.cc
hello.pb.h

@laszlocsomor
Contributor

Created a PR with my bugfix that I mentioned in my previous comment: #3978

laszlocsomor added a commit to laszlocsomor/protobuf that referenced this issue Dec 7, 2017
Do not use "googletest.h", apparently that leads to
linking errors on Windows which I couldn't figure
out how to solve, and decided to just go with
plain gTest instead.

See protocolbuffers#3951
laszlocsomor added a commit to laszlocsomor/protobuf that referenced this issue Dec 7, 2017