-
Notifications
You must be signed in to change notification settings - Fork 0
/
Babel_Guide.txt
275 lines (194 loc) · 10.7 KB
/
Babel_Guide.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
Babel is a knowledge representation language that aims to be as expressive as
most natural language.
Babel is modeled after the C family of computer programming languages which have
syntax that provide terse, human readable and easy to write code.
Parsing natural language is a hard problem, but it can be avoided by writing
Babel. By using this format natural language processing (NLP) can be separated
from the real artificial intelligence work.
Babel is not a representation of parsed natural language. It is designed to
represent specific meaning and does not include many of the subtleties of
natural language. Babel does not try to model natural language completely. The
aim is simply to be able to represent the meaning of most natural language.
Uses of Babel
The primary use of Babel is a format for talking to computers. Babel could
allow for an easier Turing Test to be attempted and may help move beyond the
current crude chat bot designs.
Babel could be used as a generalized goal definition system to define objectives
for an intelligent agent. It could describe problems, potential solutions and
actions to be performed by an agent.
Babel could be of use in translating between natural languages or in other areas
of linguistics. Learning and using Babel might improve your grammar since it
makes you think about the structure of what you are saying.
Current State of Babel
This is the first release of Babel. The language is capable of representing
simple sentences. More complex sentences may have to be broken into multiple
sentences. There are still many areas where the language can be expanded to
make it more expressive and easier to use.
The Babel grammar is written in a BNF variant that is used by the Gold Parsing
System (see http://www.devincook.com/goldparser/). The Gold Parser compiles the
grammar into tables which can then be used be loaded by an parsing engine, which
have been created for a variety of different development platforms. This will
help enable the language to be used on a variety of platforms. The reference
implementation is written in C# and requires the .Net Framework 2.0 or later to
run.
Babel to English translation is not a hard problem. Included in the project is
a translator that gives a reasonably understandable equivalent in English (see
http://translator.babelproject.com/). Getting the translation to a point where
it could pass for human for Turing Test scenarios may be much more difficult.
The English translator is very handy for finding mistakes in Babel code and acts
as a check to make sure you said what you meant to.
A parser and the translator is built into a program imaginatively called Babel
Editor. The Babel Editor environment provides a view of the parsed result that
shows how Babel is represented as objects in the reference implementation.
Great care has been taken to keep the parsed object structures simple.
Babel Editor Screenshot: http://babelproject.com/Babel-Editor-Screenshot.gif
Babel was created by me, Bryan Livingston. I also own and operate
http://cooltext.com. I'm more of a computer programmer than a linguist. I
searched long and hard for a language such as this but found none, so I created
Babel over several years as a side project.
Comments and suggestions are welcome on the mailing list at
http://groups.google.com/group/babelproject. The web site for the project is at
http://babelproject.com.
Learn Babel by Example
Subject Verb Object Pattern
// Joe saw Jane.
Joe.see-(Jane);
Babel is composed of sentences. Each sentence ends with a semi-colon. The "//"
mark indicates a line comment, which means that the text following it will be
ignored by the parser. Whitespace is ignored by the parser which means that you
could put spaces or line breaks anywhere and it will still parse the same.
Sentences follow a Subject-Verb-Object pattern. The subject of the sentence is
usually first. This is followed by a dot to separate the verb. Then the object
of the sentence is enclosed in parenthesis. This follows a C++, C# or Java
pattern of object.method(argument).
In natural language we use morphology to change the sound and spelling of words
to indicate things like plurals and tensing . In Babel this information is
pulled out of the words and added as special characters. The past tensing of
the verb "saw" in the example is done with a minus sign attached in a suffix
like manner. A plus sign is used for future tense words and if no suffix is
present then present tense is assumed for the verb.
Nouns are always capitalized. Currently there is no way to indicate that a noun
a proper noun. In spoken English this is specified by the absence of an article
(e.g. "the", "a" or "an".)
Implicit Subjects and Sentence as an Object
// See Jane run.
You.see(Jane.run());
see(Jane.run());
Whenever the initial subject is left off, an implicit noun "You" is used as the
subject. The above sentences are logical equivalents. In English we often drop
and make implicit the subject of a sentence when it is the person we are
speaking to.
The object of the sentence is another sentence. The object of a sentence can be
empty, a noun, a set of nouns, another sentence, or just an adjective. Unlike
computer programming languages there is max of only one object of a sentence.
Information that you would think would be used for multiple arguments would
probably be attached directly to the verb as a prepositional phrase.
Adjectives and Adverbs
// The tall man walked slowly.
Man tall.walk- slow();
In this sentence the adjective "tall" is attached to the noun "Man" and adverb
"slow" is attached to the verb "Walk".
The suffix "-ly" is dropped from “slowly”. When English is spoken the suffix is
used to indicate that it's an adverb, but in Babel that is made explicit by the
formatting.
There is no object of this particular sentence so the parentheses are left
empty.
Prepositional Phrases and Restrictive Clauses
// The quick brown fox jumped over the lazy dog.
Fox quick brown.jump- over[Dog lazy]();
// The horse that I rode died.
Horse that[I.ride-()].die-();
The prepositional phrase "over the lazy dog" is attached to the verb "jump".
Prepositions are attached as if they were adverbs followed by the object of the
proposition contained in square brackets.
Restrictive clauses are handled much like a prepositional phrase, but they are
instead attached to a noun.
In the reference implementation when these prepositional phrases and restrictive
clauses are parsed they are created as adverb and adjective objects but have the
contents of the brackets attached.
Declarations
// Joe is tall.
Joe.is(tall);
Joe.tall;
// Joe was tall.
Joe.is-(tall);
// Joe is a human.
Joe.Human;
The top two statements are logically identical. When the verb is left off the
verb is assumed to be “is”.
Unfortunately there is no shorthand for was, so in order to express the past
tense you have to explicitly use the “is” verb with a past tense suffix.
When the object in a declaration is a noun then the sentence is saying that the
subject is a subclass of the object.
Noun Sets
// Tashonda sent e-mail, cards, and letters.
Tashonda.send-(Email & Card* & Letter*);
// Tashonda sent e-mail, cards, or letters.
Tashonda.send-(Email | Card* | Letter*);
// Mrs. Doubtfire gave Tabitha and Samantha quizzes.
MrsDoubtfire.give- to[Tabitha & Samantha](Quiz*);
// Juanita and Celso worked hard.
Juanita & Celso.work- hard();
Anywhere that a noun can be used a set of nouns can be used instead. Simply
chain them together with ampersands for an and set and using pipes for or set.
You can not mix and and or in a single set.
Also note that the asterisk is used as a suffix to specify a plural.
Gerunds
// I love running.
I.love(Run=);
// Charles is working in the garden.
Charles.Work= in[Garden];
A gerund is a noun that is built from a verb by attaching an “-ing” suffix.
Note that the gerund words are capitalized since they are nouns.
The “-ing” suffix is also attached to present tense participles, which are verbs
that are used as adjectives, but there is no support in Babel yet for
participles.
Questions
// Where are you going?
where(You.are(Go=));
// What were you reading this morning?
what(You.read- this[Morning]());
// May I postpone this assignment?
may(I.postpone(Assignment));
// Is Joe tall?
is tall(Joe);
// Where is Joe?
where(Joe);
// What is the color of horse that I am riding?
what color(Horse that[I.is-(Ride=)]);
Questions are written with the interrogative word as a verb.
Questions usually use an implicit subject, so a subject of “You” is attached to
the question when parsed. Asking a question is technically giving an order
since an answer is being demanded.
Currently the parser is hard coded to recognize specific interrogatives. This
is less than ideal since it makes the parser English specific and in the future
a symbol may be introduced to indicate that a statement is a question.
Babel Limitations
Babel is very much a work in progress. Below are some of the areas that may be
improved upon.
Conjunctions
Most conjunctions are not currently available in the language. Complex
sentences can not easily be expressed without them.
Articles
The language does not support articles (a, an, and the) right now. In natural
language these are used to indicate that a noun is proper. It's questionable
whether this information is important enough be included in the language.
If support for articles was added it might take the form of adding an underscore
before or after the noun. Or the articles could be added as adjectives,
something which could be done right now.
Questions
Currently the parser is hard coded to recognize specific interrogatives. This
is less than ideal since it makes the parser English specific and in the future
a symbol may be introduced to indicate that a statement is a question.
Modifiers
Words that modify an adjective or adverb (e.g. very, slightly, rather, quite)
can not be expressed properly. For instance, it is not really possible to say
"the very black dog" since the modifier "very" needs to be attached to the
adjective "black" and not the noun "dog".
Negatives (e.g. The dog that is not running.) would also need to be attached as
a modifier.
Babel Emitter
If computers are going to be working with Babel they will need to be able to
generate it in object form and then save it as written Babel code. This is not
a seriously difficult task and will be part of the reference implementation
someday.