Allow SimpleMRS to read from noisy input #92
Comments
On Fri, Jan 13, 2017 at 10:10 AM, Michael Wayne Goodman < ***@***.***> wrote:
> Alternatively, if we expect SENT: ... to contain the input string, we
> could use that to fill the XMRS object's surface field.
That is a good idea.
--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University
Since this issue is really about converting from ACE output and ignoring the non-MRS data, I added the --from ace option shown in this command:
$ echo -e "One dog slept.\nTwo dogs slept." | ace -g ~/grammars/erg-1214-x86-64-0.9.27.dat | ./delphin.sh convert --from ace --pretty-print
NOTE: 1 readings, added 595 / 96 edges to chart (43 fully instantiated, 52 actives used, 27 passives used) RAM: 1524k
NOTE: 2 readings, added 615 / 118 edges to chart (49 fully instantiated, 68 actives used, 32 passives used) RAM: 1649k
NOTE: parsed 2 / 2 sentences, avg 1586k, time 0.01775s
[ "One dog slept."
TOP: h0
INDEX: e2 [ e SF: prop TENSE: past MOOD: indicative PROG: - PERF: - ]
RELS: < [ udef_q<0:3> LBL: h4 ARG0: x3 [ x PERS: 3 NUM: sg IND: + ] RSTR: h5 BODY: h6 ]
[ card<0:3> LBL: h7 ARG0: e9 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x3 CARG: "1" ]
[ _dog_n_1<4:7> LBL: h7 ARG0: x3 ]
[ _sleep_v_1<8:14> LBL: h1 ARG0: e2 ARG1: x3 ] >
HCONS: < h0 qeq h1 h5 qeq h7 > ]
[ "Two dogs slept."
TOP: h0
INDEX: e2 [ e SF: prop TENSE: past MOOD: indicative PROG: - PERF: - ]
RELS: < [ udef_q<0:3> LBL: h4 ARG0: x3 [ x PERS: 3 NUM: pl IND: + ] RSTR: h5 BODY: h6 ]
[ card<0:3> LBL: h7 ARG0: e9 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x3 CARG: "2" ]
[ _dog_n_1<4:8> LBL: h7 ARG0: x3 ]
[ _sleep_v_1<9:15> LBL: h1 ARG0: e2 ARG1: x3 ] >
HCONS: < h0 qeq h1 h5 qeq h7 > ]
[ "Two dogs slept."
TOP: h0
INDEX: e2 [ e SF: prop TENSE: past MOOD: indicative PROG: - PERF: - ]
RELS: < [ focus_d<0:15> LBL: h1 ARG0: e4 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: e2 ARG2: e5 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ]
[ loc_nonsp<0:3> LBL: h1 ARG0: e5 ARG1: e2 ARG2: x6 [ x PERS: 3 NUM: sg ] ]
[ number_q<0:3> LBL: h7 ARG0: x6 RSTR: h8 BODY: h9 ]
[ card<0:3> LBL: h10 ARG0: x6 ARG1: i12 CARG: "2" ]
[ udef_q<4:8> LBL: h13 ARG0: x3 [ x PERS: 3 NUM: pl IND: + ] RSTR: h14 BODY: h15 ]
[ _dog_n_1<4:8> LBL: h16 ARG0: x3 ]
[ _sleep_v_1<9:15> LBL: h1 ARG0: e2 ARG1: x3 ] >
HCONS: < h0 qeq h1 h8 qeq h10 h14 qeq h16 > ]
This change reads both regular ACE parsing output and with the […]. This does not detect the MRS data for generation (when using […]). Also note that this does not use the […]
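The `<from:to>` annotations on the predicates in the output above are character offsets into the input string, which is one reason keeping the sentence alongside the MRS is useful. The following is a minimal sketch of resolving those offsets, not pyDelphin's own code; the function name `predicate_surfaces` and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical pattern: a predicate name followed by a <from:to> character span.
LNK = re.compile(r'([^\s<\[\]]+)<(\d+):(\d+)>')


def predicate_surfaces(surface, mrs_text):
    """Return (predicate, substring) pairs for each <from:to> span in mrs_text."""
    pairs = []
    for match in LNK.finditer(mrs_text):
        pred = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        pairs.append((pred, surface[start:end]))
    return pairs


if __name__ == '__main__':
    rels = ('RELS: < [ udef_q<0:3> LBL: h4 ARG0: x3 ] '
            '[ _dog_n_1<4:7> LBL: h7 ARG0: x3 ] '
            '[ _sleep_v_1<8:14> LBL: h1 ARG0: e2 ARG1: x3 ] >')
    for pred, text in predicate_surfaces('One dog slept.', rels):
        print(pred, repr(text))
```

With the first sentence from the output above, the span on `_dog_n_1` resolves to "dog" and the span on `_sleep_v_1` to "slept." (the final punctuation is inside the span).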
The -q option is currently used in pipelines involving ACE and pyDelphin to suppress the input sentence from the output stream, which (along with -T, which suppresses derivations) allows pyDelphin to read the stream containing only MRS data. This option may not be supported in the future, so perhaps the SimpleMRS reader should allow noisy data (see how the Penman package handles noisy data).
Non-MRS data can probably be discarded. Alternatively, if we expect SENT: ... to contain the input string, we could use that to fill the XMRS object's surface field.
(edited to fix Penman URL)
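One way the proposal could work is sketched below: skip any line that does not open a bracketed MRS block, remember the most recent `SENT:` line as the surface string, and collect lines until the brackets balance. This is only an illustration under stated assumptions, not the SimpleMRS reader itself. The name `extract_mrs_blocks` is hypothetical, and the sketch assumes each MRS starts on a line beginning with `[` and that noise lines such as `NOTE: ...` never start with `[`.

```python
def extract_mrs_blocks(lines):
    """Yield (surface, mrs_text) pairs from a noisy, ACE-like stream of lines.

    Assumptions (not taken from pyDelphin): an MRS block begins on a line
    whose first non-space character is '[', and a 'SENT: ...' line, when
    present, precedes the readings for that sentence.
    """
    surface = None      # most recent SENT: string, shared by all its readings
    buf = []            # lines of the MRS block being collected
    depth = 0           # current square-bracket nesting depth
    in_string = False   # True while inside a double-quoted string
    for line in lines:
        if depth == 0:
            stripped = line.strip()
            if stripped.startswith('SENT:'):
                surface = stripped[len('SENT:'):].strip()
                continue
            if not stripped.startswith('['):
                continue  # non-MRS noise is discarded
        escaped = False
        for ch in line:
            if in_string:
                if escaped:
                    escaped = False
                elif ch == '\\':
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == '[':
                depth += 1
            elif ch == ']':
                depth -= 1
        buf.append(line.rstrip('\n'))
        if depth == 0:
            yield surface, '\n'.join(buf)
            buf = []
            in_string = False
```

Brackets inside quoted strings (for example in a CARG value) are ignored when counting depth, so they do not end a block early. Each yielded `mrs_text` could then be handed to the existing SimpleMRS reader, with `surface` used to fill the surface field.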