Fail with non-ascii char #37

Open
ikus060 opened this issue Dec 30, 2015 · 8 comments

Comments

@ikus060

ikus060 commented Dec 30, 2015

Running j2py on a Java file containing non-ASCII characters fails:

Traceback (most recent call last):
  File "/usr/local/bin/j2py", line 120, in runTransform
    tree = buildAST(source)
  File "/usr/local/lib/python2.7/dist-packages/java2python/compiler/__init__.py", line 15, in buildAST
    lexer = Lexer(StringStream(source))
  File "/usr/local/lib/python2.7/dist-packages/antlr_python_runtime-3.1.3-py2.7.egg/antlr3/streams.py", line 336, in __init__
    self.strdata = unicode(data)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3345: ordinal not in range(128)

I guess j2py is not handling Unicode strings properly.
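
The traceback is consistent with Python 2 decoding a byte string implicitly: `unicode(data)` on a `str` uses the default ASCII codec, and byte 0xc3 is the lead byte of many two-byte UTF-8 sequences (for example "é"). Here is a minimal sketch of the same failure, written so it runs on both Python 2 and Python 3. The sample text is made up for illustration and is not taken from the failing Java file.

```python
# UTF-8 encodes "é" (U+00E9) as the two bytes 0xc3 0xa9.
raw = u"caf\u00e9".encode("utf-8")

# Decoding those bytes as ASCII fails at the first byte above 0x7f,
# which is what an implicit unicode(data) call does on Python 2.
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Naming the real encoding explicitly avoids the error.
decoded = raw.decode("utf-8")
```

Here `error.start` is the offset of the offending byte, the same kind of value as "position 3345" in the traceback.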

@ikus060
Author

ikus060 commented Dec 30, 2015

I didn't review all the code to make sure my modification doesn't introduce any side effects, but the following modification is working for me.

diff --git a/bin/j2py b/bin/j2py
index 6eb1a40..34f1548 100755
--- a/bin/j2py
+++ b/bin/j2py
@@ -6,12 +6,16 @@
 a file, translate it, and write it out.

 """
+from __future__ import unicode_literals
+
 import sys
 from argparse import ArgumentParser, ArgumentTypeError
 from collections import defaultdict
+from io import open
 from logging import _levelNames as logLevels, exception, warning, info, basicConfig
 from os import path, makedirs
 from time import time

 from java2python.compiler import Module, buildAST, transformAST
 from java2python.config import Config
@@ -107,7 +111,7 @@

     try:
         if filein != '-':
-            source = open(filein).read()
+            source = open(filein, encoding='utf-8').read()
         else:
             source = sys.stdin.read()
     except (IOError, ), exc:
diff --git a/java2python/compiler/__init__.py b/java2python/compiler/__init__.py
index 4325201..b16c51f 100644
--- a/java2python/compiler/__init__.py
+++ b/java2python/compiler/__init__.py
@@ -5,6 +5,7 @@
 # This module provides a simpler facade over the rest of the compiler
 # subpackage.  Client code should use the values in this module
 # instead of using directly referencing items within the subpackage.
+from __future__ import unicode_literals

 from java2python.compiler.block import Module
 from java2python.lang import Lexer, Parser, StringStream, TokenStream, TreeAdaptor
diff --git a/java2python/compiler/block.py b/java2python/compiler/block.py
index 4cf7b09..185df39 100644
--- a/java2python/compiler/block.py
+++ b/java2python/compiler/block.py
@@ -11,6 +11,7 @@
 # This means they're very tightly coupled and that the classes are not
 # very reusable.  The module split does allow for grouping of related
 # methods and does hide the cluttered code.
+from __future__ import unicode_literals

 from sys import modules
 from java2python.compiler import template, visitor
@@ -19,7 +20,7 @@
 def addTypeToModule((className, factoryName)):
     """ Constructs and adds a new type to this module. """
     bases = (getattr(template, className), getattr(visitor, className))
-    newType = type(className, bases, dict(factoryName=factoryName))
+    newType = type(str(className), bases, dict(factoryName=factoryName))
     setattr(modules[__name__], className, newType)


diff --git a/java2python/compiler/template.py b/java2python/compiler/template.py
index 4f4dfe1..d7b80bb 100644
--- a/java2python/compiler/template.py
+++ b/java2python/compiler/template.py
@@ -12,8 +12,9 @@
 # the compiler subpackage into multiple modules.  So-called patterns
 # are usually a sign of a bad design and/or language limitations, and
 # this case is no exception.
+from __future__ import unicode_literals

-from cStringIO import StringIO
+from io import StringIO
 from functools import partial
 from itertools import chain, ifilter, imap

diff --git a/java2python/compiler/visitor.py b/java2python/compiler/visitor.py
index f62e53e..9d3bcaf 100644
--- a/java2python/compiler/visitor.py
+++ b/java2python/compiler/visitor.py
@@ -10,7 +10,7 @@
 # at runtime.  These classes use their factory callable more often than their
 # template counterparts; during walking, the typical behavior is to either define
 # the specific Python source, or to defer it to another block, or both.
-
+from __future__ import unicode_literals

 from functools import reduce, partial
 from itertools import ifilter, ifilterfalse, izip, tee
diff --git a/java2python/config/__init__.py b/java2python/config/__init__.py
index 2aa8387..74ce4a3 100644
--- a/java2python/config/__init__.py
+++ b/java2python/config/__init__.py
@@ -1,6 +1,7 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 # java2python.config -> subpackage for run-time configuration.
+from __future__ import unicode_literals

 from functools import reduce
 from imp import load_source
diff --git a/java2python/config/default.py b/java2python/config/default.py
index 92c4a27..da51fe5 100644
--- a/java2python/config/default.py
+++ b/java2python/config/default.py
@@ -4,6 +4,7 @@
 # This is the default configuration file for java2python.  Unless
 # explicity disabled with the '-n' or '--nodefaults' option, the j2py
 # script will import this module for runtime configuration.
+from __future__ import unicode_literals

 from java2python.mod import basic, transform
 from java2python.lang.selector import *
diff --git a/java2python/lang/JavaLexer.py b/java2python/lang/JavaLexer.py
index 9c1725a..3f3f5fe 100644
--- a/java2python/lang/JavaLexer.py
+++ b/java2python/lang/JavaLexer.py
@@ -1,4 +1,5 @@
 # $ANTLR 3.1.3 Mar 18, 2009 10:09:25 Java.g 2012-01-29 13:54:05
+from __future__ import unicode_literals

 import sys
 from antlr3 import *
diff --git a/java2python/lang/JavaParser.py b/java2python/lang/JavaParser.py
index 28b9c64..cd3ff20 100644
--- a/java2python/lang/JavaParser.py
+++ b/java2python/lang/JavaParser.py
@@ -1,4 +1,5 @@
 # $ANTLR 3.1.3 Mar 18, 2009 10:09:25 Java.g 2012-01-29 13:54:04
+from __future__ import unicode_literals

 import sys
 from antlr3 import *
diff --git a/java2python/lang/base.py b/java2python/lang/base.py
index 0633b8e..f0202a1 100644
--- a/java2python/lang/base.py
+++ b/java2python/lang/base.py
@@ -46,6 +46,7 @@
 # Tree objects.  Our adaptor, TreeAdaptor, creates the LocalTree
 # instances.
 #
+from __future__ import unicode_literals

 from cStringIO import StringIO

diff --git a/java2python/lang/selector.py b/java2python/lang/selector.py
index 22b531a..ba9ca54 100644
--- a/java2python/lang/selector.py
+++ b/java2python/lang/selector.py
@@ -14,6 +14,7 @@
 # Projects using java2python should regard this subpackage as
 # experimental.  While the interfaces are not expected to change, the
 # semantics may.  Use with caution.
+from __future__ import unicode_literals

 from java2python.lang import tokens

diff --git a/java2python/lib/__init__.py b/java2python/lib/__init__.py
index efd7b1a..365578f 100644
--- a/java2python/lib/__init__.py
+++ b/java2python/lib/__init__.py
@@ -1,6 +1,7 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 # java2python.lib -> common library bits.
+from __future__ import unicode_literals

 from functools import partial

diff --git a/java2python/mod/basic.py b/java2python/mod/basic.py
index 02e2f57..3a125ae 100644
--- a/java2python/mod/basic.py
+++ b/java2python/mod/basic.py
@@ -1,6 +1,7 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 # java2python.mod.basic -> functions to revise generated source strings.
+from __future__ import unicode_literals

 from itertools import count
 from logging import info, warn
diff --git a/java2python/mod/transform.py b/java2python/mod/transform.py
index 9b2e567..5a8ade6 100644
--- a/java2python/mod/transform.py
+++ b/java2python/mod/transform.py
@@ -10,6 +10,7 @@
 #
 # See the java2python.config.default and java2python.lang.selector modules to
 # understand how and when selectors are associated with these callables.
+from __future__ import unicode_literals

 import re
 from logging import warn
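
The change that matters most in the diff above is in bin/j2py: the source file is read through `io.open` with an explicit `encoding='utf-8'`, so the lexer receives text rather than raw bytes. Below is a small standalone sketch of that idea, assuming the Java file is UTF-8 encoded. The file contents and the temporary path are invented for illustration and are not part of java2python.

```python
import io
import os
import tempfile

# Hypothetical Java source containing non-ASCII characters.
java_source = u'class Demo { String s = "\u00e9t\u00e9"; }\n'

# Write it to a temporary file as UTF-8 bytes.
fd, path = tempfile.mkstemp(suffix=".java")
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(java_source)

    # io.open with an explicit encoding returns decoded text on both
    # Python 2 and Python 3, so no implicit ASCII decoding happens later.
    with io.open(path, encoding="utf-8") as handle:
        source = handle.read()
finally:
    os.remove(path)
```

If the input is not UTF-8, the same call raises `UnicodeDecodeError` at read time, which at least points at the file rather than at the ANTLR runtime.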

@luipugs

luipugs commented Apr 9, 2016

@ikus060 I tried patching with the diff you posted but encountered the following error:

$ patch -u -p1 < fix.diff 
patching file bin/j2py
patch: **** malformed patch at line 21: @@ -107,7 +111,7 @@

How did you apply your diff?
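
One possible explanation, offered as a guess rather than a confirmed diagnosis: `patch` reads as many lines as each hunk header declares, and in the diff as posted on this page the first hunk header for bin/j2py (`@@ -6,12 +6,16 @@`) declares 16 new lines while the body shows only 15, so the next `@@` header would be read as hunk content. The sketch below counts the lines; the helper names `declared_counts` and `actual_counts` are made up for this example.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def declared_counts(header):
    """Return (old_lines, new_lines) as declared by a unified-diff hunk header."""
    match = HUNK_HEADER.match(header)
    if match is None:
        raise ValueError("not a hunk header: %r" % header)
    # A missing count means 1 in the unified diff format.
    old = int(match.group(2)) if match.group(2) is not None else 1
    new = int(match.group(4)) if match.group(4) is not None else 1
    return old, new


def actual_counts(body):
    """Count old and new lines in a hunk body (ignores '\\ No newline' markers)."""
    old = new = 0
    for line in body:
        marker = line[:1]
        if marker == "-":
            old += 1
        elif marker == "+":
            new += 1
        else:
            # Context line; blank lines are treated as context too.
            old += 1
            new += 1
    return old, new


header = "@@ -6,12 +6,16 @@"
body = [
    " a file, translate it, and write it out.",
    "",
    ' """',
    "+from __future__ import unicode_literals",
    "+",
    " import sys",
    " from argparse import ArgumentParser, ArgumentTypeError",
    " from collections import defaultdict",
    "+from io import open",
    " from logging import _levelNames as logLevels, exception, warning, info, basicConfig",
    " from os import path, makedirs",
    " from time import time",
    "",
    " from java2python.compiler import Module, buildAST, transformAST",
    " from java2python.config import Config",
]

declared = declared_counts(header)
actual = actual_counts(body)
```

With the hunk as it appears here, `declared` and `actual` disagree on the new-line count, which would put the failure at the second hunk header, line 21 of the patch file.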

@mahi83

mahi83 commented Feb 18, 2017

I am facing the same error. Could you guys tell me what you did to fix it?

@alisonreboud

Still have problems with that as well :)

@mazz

mazz commented May 14, 2018

Use this git patch:

git apply unicode.patch

https://gist.github.com/mazz/8924b1d93cb3d16790e39da001823435

@pascalJakobs

Hello all,
I can't apply your patch, mazz. Here are the errors I got:

pi@raspberrypi:~/java2python-0.5.1 $ git apply unicode.patch
error: patch failed: java2python/mod/transform.py:10
error: java2python/mod/transform.py: patch does not apply

Can anybody help, please? I'm just a worm.

Thanks

@mazz

mazz commented Aug 20, 2018

@pascalJakobs

virtualenv j2p
cd j2p
source bin/activate
pip install http://antlr3.org/download/Python/antlr_python_runtime-3.1.3.tar.gz
git clone https://github.com/natural/java2python.git
pip install -e java2python
cd java2python
<copy the raw patch to your clipboard>
cat >> unicode.patch
<paste>
<ctrl-d>
git apply unicode.patch

@pascalJakobs

pascalJakobs commented Aug 20, 2018 via email
