-
-
Notifications
You must be signed in to change notification settings - Fork 372
/
Cookbook.scalatex
440 lines (357 loc) · 20.6 KB
/
Cookbook.scalatex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
@import Main._
@import readme.Sample._
@import ammonite.ops._
@val ammoniteTests = 'amm/'repl/'src/'test/'scala/'ammonite/'session
@val replSource = 'amm/'src/'main/'scala/'ammonite
@val advancedTests = ammoniteTests/"AdvancedTests.scala"
@sect("Ammonite Cookbook", "Fun with Scala-Scripting!")
@p
The Ammonite Scala REPL and Scripts are meant to be extended: you can load
in arbitrary Java/Scala modules from the internet via
@sect.ref{import $ivy}. Using this third-party code, you
extend the REPL to do anything you wish to do. Simple
install Java, @sect.ref("Ammonite-REPL", "download Ammonite") onto any
Linux/OSX machine, and try out one of these fun snippets! They work
directly in the @sect.ref{Ammonite-REPL}, or you can save them to
@sect.ref{Scala Scripts} if you want something more permanent.
@ul
@li
@sect.ref{HTTP Requests}
@li
@sect.ref{Scraping HTML}
@li
@sect.ref{GUI Applications}
@li
@sect.ref{Office Automation}
@li
@sect.ref{Image Processing}
@li
@sect.ref{Machine Learning}
@sect{HTTP Requests}
@p
Ammonite does not come with a built-in way to make HTTP requests, but there are Java /Scala modules that do this quite well! Here's an example:
@hl.scala
Welcome to the Ammonite Repl
@@ val resp = requests.get("https://api.github.com/repos/scala/scala")
@@ val parsed = upickle.json.read(resp.text()).asInstanceOf[upickle.Js.Obj]
parsed: upickle.Js.Obj = Obj(
ArrayBuffer(
("id", Num(2888818.0)),
("name", Str("scala")),
("full_name", Str("scala/scala")),
(
"owner",
Obj(
ArrayBuffer(
("login", Str("scala")),
...
@@ for((k, v) <- parsed.value) os.write(os.pwd/'target/'temp/k, upickle.json.write(v))
@@ os.list(os.pwd/'target/'temp)
res6: LsSeq = LsSeq(
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/temp/archive_url,
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/temp/assignees_url,
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/temp/blobs_url,
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/temp/branches_url,,
...
@p
In this example, we use the @lnk("Requests-Scala", "https://github.com/lihaoyi/requests-scala") library to download a URL, and we use @lnk("uPickle", "http://lihaoyi.github.io/upickle-pprint/upickle/") and OS-Lib to parse the JSON and write it into files. uPickle and Ammonite-Ops are bundled with the Ammonite REPL and are used internally.
@p
This is a small example, but it illustrates the potential: if you find yourself needing to scrape some website or bulk-download large quantities of data from some website's HTTP/JSON API, you can start doing so within a matter of seconds using Ammonite. The results are given to you in nicely structured data, and you can deal with them using any Java or Scala libraries or tools you are used to rather than being forced to munge around in Bash. Sometimes, you may find that you need to get data from somewhere without a nice JSON API, which means you'd need to fall back to @sect.ref{Scraping HTML}...
@sect{Scraping HTML}
@p
Not every website has an API, and not every website is meant to be accessed programmatically. That doesn't mean you can't do it! Using libraries like @lnk("JSoup", "http://jsoup.org/"), you can quickly and easily get the computer to extract useful information from HTML that was meant to humans. Using the Ammonite REPL, you can do it interactively and without needing to set up annoying project scaffolding.
@hl.scala
@@ import $ivy.`org.jsoup:jsoup:1.7.2`
@@ import org.jsoup._ // import Jsoup
@@ import collection.JavaConversions._ // make Java collections easier to use
@@ val doc = Jsoup.connect("http://en.wikipedia.org/").get()
@@ doc.select("h1")
res54: select.Elements = <h1 id="firstHeading" class="firstHeading" lang="en">Main Page</h1>
@@ doc.select("h2") // this is huge and messy
res55: select.Elements = <h2 id="mp-tfa-h2" style="margin:3px; background:#cef2e0; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #a3bfb1; text-align:left; color:#000; padding:0.2em 0.4em;"><span class="mw-headline" id="From_today.27s_featured_article">From today's featured article</span></h2>
<h2 id="mp-dyk-h2" style="margin:3px; background:#cef2e0; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #a3bfb1; text-align:left; color:#000; padding:0.2em 0.4em;"><span class="mw-headline" id="Did_you_know...">Did you know...</span></h2>
...
@@ doc.select("h2").map(_.text) // but you can easily pull out the bits you want
res56: collection.mutable.Buffer[String] = ArrayBuffer(
"From today's featured article",
"Did you know...",
"In the news",
"On this day...",
"From today's featured list",
"Today's featured picture",
"Other areas of Wikipedia",
"Wikipedia's sister projects",
"Wikipedia languages",
"Navigation menu"
)
@p
If you wanted to scrape headlines off some news-site or scrape video game reviews off of some gaming site, you don't need to worry about setting up a project and installing libraries and all that stuff. You can simply load libraries like Jsoup right into the Ammonite REPL, copy some example from their website, and start scraping useful information in less than a minute.
@sect{GUI Applications}
@p
The Ammonite REPL runs on the Java virtual machine, which means you can use it to create Desktop GUI applications like anyone else who uses Java! Here's an example of how to create a hello-world interactive desktop app using Swing
@hl.scala
@@ {
import javax.swing._, java.awt.event._
val frame = new JFrame("Hello World Window")
val button = new JButton("Click Me")
button.addActionListener(new ActionListener{
def actionPerformed(e: ActionEvent) = button.setText("You clicked the button!")
})
button.setPreferredSize(new java.awt.Dimension(200, 100))
frame.getContentPane.add(button)
frame.pack()
frame.setVisible(true)
}
@p
This can be run inside the Ammonite REPL without installing anything, and will show the following window with a single button:
@img(src:="Swing1.png", marginLeft.auto, marginRight.auto, loadingLazy)
@p
When clicked, it changes text:
@img(src:="Swing2.png", marginLeft.auto, marginRight.auto, loadingLazy)
@p
Although this is just a small demo, you can use Ammonite yourself to experiment with GUI programming without needing to go through the hassle of setting up an environment and project and all that rigmarole. Just run the code right in the console! You can even interact with the GUI live in the console, e.g. running this snippet of code to add another action listener to keep count of how many times you clicked the button
@hl.scala
@@ {
var count = 0
button.addActionListener(new ActionListener{
def actionPerformed(e: ActionEvent) = {
count += 1
frame.setTitle("Clicks: " + count)
}
})
}
@p
Which immediately becomes visible in the title of the window:
@img(src:="Swing3.png", marginLeft.auto, marginRight.auto, loadingLazy)
@p
Even while you're clicking on the button, you can still access @code{count} in the console:
@hl.scala
@@ count
res12: Int = 6
@p
This is a level of live interactivity which is traditionally hard to come by in the world of desktop GUI applications, but with the Ammonite REPL, it's totally seamless
@p
You can also import JavaFX applications, using ScalaFX. This is easiest if you create a file `jfx.sc` to contain the relevant imports:
@hl.scala
@@ {
import $ivy.{
`org.scalafx::scalafx:16.0.0-R24`
}
lazy val osName = System.getProperty("os.name") match {
case n if n.startsWith("Linux") => "linux"
case n if n.startsWith("Mac") => "mac"
case n if n.startsWith("Windows") => "win"
case _ => throw new Exception("Unknown platform!")
}
val javaFXModules = Seq("base", "controls", "fxml", "graphics", "media", "swing", "web")
javaFXModules.map(m =>
interp.load.ivy(coursierapi.Dependency.of("org.openjfx", s"javafx-$m", "15.0.1").withClassifier(osName))
)
}
@p
And then you can use `import $file.jfx` with the ScalaFX DSL.
@hl.scala
@@ {
import $file.jfx
import scalafx.application.JFXApp3
import scalafx.application.JFXApp3.PrimaryStage
import scalafx.geometry.Insets
import scalafx.scene.Scene
import scalafx.scene.layout.HBox
import scalafx.scene.paint.Color._
import scalafx.scene.paint._
import scalafx.scene.text.Text
import scala.language.implicitConversions
@@main
def main() {
JFXApp3Demo.main(Array())
}
object JFXApp3Demo extends JFXApp3 {
override def start(): Unit = {
stage = new PrimaryStage {
title = "ScalaFX Hello World!"
scene = new Scene {
fill = Color.rgb(38, 38, 38)
content = new HBox {
padding = Insets(50, 80, 50, 80)
children = Seq(
new Text {
text = "Hello World!"
style = "-fx-font: normal bold 100pt sans-serif"
fill = new LinearGradient(endX = 0, stops = Stops(Red, DarkRed))
}
)
}
}
}
}
}
}
@sect{Office Automation}
@p
Apart from writing code, you very often find yourself dealing with documents and spreadsheets of various sorts. This is often rather tedious. Wouldn't it be cool if you could deal with these things programmatically? It turns out that there are open-source Java libraries such as @lnk("Apache POI", "https://poi.apache.org/") that let you do this, and with the Ammonite-REPL you can quickly and easily load these libraries and get to work on your favorite documents. Here's an example extracting some data from my old Resume, in @code{.docx} format:
@hl.scala
@@ import $ivy.`org.apache.poi:poi-ooxml:3.13`
@@ import org.apache.poi.xwpf.usermodel._ // Bring Ms-Word APIs into scope
@@ import collection.JavaConversions._ // Make use of Java collections easier
@@ val path = pwd/'amm/'src/'test/'resources/'testdata/"Resume.docx"
@@ val docx = new XWPFDocument(new java.io.ByteArrayInputStream(os.read.bytes(path)))
@@ docx.get<tab>
getAllEmbedds getParagraphArray
getAllPackagePictures getParagraphPos
getAllPictures getParagraphs
getBodyElements getParagraphsIterator
getBodyElementsIterator getParent
getClass getPart
getCommentByID getPartById
getComments getPartType
getDocument getPictureDataByID
...
@@ docx.getParagraphs.map(_.getText)
res28: collection.mutable.Buffer[String] = ArrayBuffer(
"""
Haoyi Li
""",
"""
Education Massachusetts Institute of Technology Cambridge, MA
""",
"""
Bachelor of Science degree in Computer Science, GPA 4.8/5.0 Sep 2010 - Jun 2013
""",
"""
Work Dropbox San Francisco, CA
"""
...
@@ docx.getHyperlinks.map(_.getURL)
res27: Array[String] = Array(
"http://vimeo.com/87845442",
"http://www.github.com/lihaoyi/scalatags",
"http://www.github.com/lihaoyi/scala.rx",
"http://www.github.com/lihaoyi/scala-js-fiddle",
"http://www.github.com/lihaoyi/metascala",
"https://www.github.com/lihaoyi/macropy",
"http://www.github.com/lihaoyi",
"http://www.github.com/lihaoyi"
)
@p
As you can see, loading the Apache POI library is just a single command, reading in my resume file is just one or two more, and then you can immediately start exploring the document's data model to see what inside interests you. You even get tab-completion on the methods of the document, making it really easy for you to interactively explore all the things that a word document has to offer!
@p
This is just a small example, but you can easily do more things in the same vein: Apache POI lets you create/modify/save @code{.docx} files in addition to reading from them, meaning you can automatically perform batch operations on large numbers of documents. The library also provides mechanisms to load in Excel spreadsheets and Powerpoint slide decks, meaning you have easy, programmable access to the great bulk of any Microsoft-Office files you find yourself dealing with.
@sect{Image Processing}
@p
You can perform lots of image operations in Java. You can use @lnk("BufferedImage", "https://docs.oracle.com/javase/7/docs/api/java/awt/image/BufferedImage.html") if you want to access the low-level details or read/write individual pixels, and using @lnk("Java2D", "http://zetcode.com/gfx/java2d/") you can draw shapes, perform transforms, or do anything you could possibly want to do with the images.
@p
There are also simple libraries like @lnk("Thumbnailator", "https://github.com/coobird/thumbnailator") if you're doing basic things like renaming/resizing/rotating and don't need pixel-per-pixel access. This is an example of using Thumbnailator to resize a folder of images and put them somewhere else:
@hl.scala
@@ import $ivy.`net.coobird:thumbnailator:0.4.8`
@@ import net.coobird.thumbnailator._
import net.coobird.thumbnailator._
@@ val images = ls! pwd/'amm/'src/'test/'resources/'testdata/'images
images: LsSeq = LsSeq(
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/repl/src/test/resources/testdata/images/GettingStarted.png,
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/repl/src/test/resources/testdata/images/Highlighting.png,
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/repl/src/test/resources/testdata/images/SystemShell.png
)
@@ val dest = pwd/'target/'thumbnails
dest: Path = /Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/thumbnails
@@ mkdir! dest
@@ for(i <- images) {
Thumbnails.of(i.toString).size(200, 200).toFile(dest/i.last toString)
}
@@ val thumbnails = ls! dest
thumbnails: LsSeq = LsSeq(
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/thumbnails/GettingStarted.png,
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/thumbnails/Highlighting.png,
/Users/haoyi/Dropbox (Personal)/Workspace/Ammonite/target/thumbnails/SystemShell.png
)
@@ images.map(_.size) // Original image files are pretty big
res44: Seq[Long] = List(180913L, 208328L, 227570L)
@@ thumbnails.map(_.size) // Thumbnailed image files are much smaller
res45: Seq[Long] = List(11129L, 11790L, 11893L)
@sect{Machine Learning}
@p
The word "Machine Learning" sounds big and intimidating, like something you'd need to spend 6 years getting a PhD before you understand. What if you could "do some machine-learning" (whatever that means) in your spare time, in a minute or two? It turns out there are many Java libraries that can help you with basics, and with the Ammonite REPL getting started is easy.
@p
Here's one example of how you can get started using the @lnk("OpenNLP", "http://opennlp.apache.org/") project to do some natural-language processing in just a few minutes. The example was found @lnk("online", "http://www.programcreek.com/2012/05/opennlp-tutorial/#sentence detector"), and shows how to extract English names from a raw @code{String} using NLP:
@hl.scala
@@ import $ivy.`org.apache.opennlp:opennlp-tools:1.6.0` // load OpenNLP
@@ val tokenDataUrl = "http://opennlp.sourceforge.net/models-1.5/en-token.bin"
@@ val tokenData = requests.get(tokenDataUrl)
@@ import opennlp.tools.tokenize._ // let's get started with OpenNLP!
@@ val str = "Hi. How are you? This is Mike. Did you see book about Peter Smith?"
@@ import java.io.ByteArrayInputStream
@@ val tokenModel = new TokenizerModel(new ByteArrayInputStream(tokenData.text))
@@ val tokenizer = new TokenizerME(tokenModel)
@@ val tokens = tokenizer.tokenize(str)
tplems: Array[String] = Array(
"Hi",
".",
"How",
"are",
"you"
...
@@ import opennlp.tools.namefind._ // Turns out we need more test data...
@@ val nameDataUrl = "http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin"
@@ val nameData = Http(nameDataUrl).asBytes
nameData: HttpResponse[Array[Byte]] = HttpResponse(
Array(
80,
75,
3,
4,
20,
...
@@ val nameModel = new TokenNameFinderModel(new ByteArrayInputStream(nameData.body))
@@ val nameFinder = new NameFinderME(nameModel)
nameFinder: NameFinderME = opennlp.tools.namefind.NameFinderME@@491eb5ef
@@ val names = nameFinder.find(tokens)
names: Array[opennlp.tools.util.Span] = Array([8..9) person, [15..17) person)
@@ opennlp.tools.util.Span.spansToStrings(names, tokens) // Woohoo, names!
res96: Array[String] = Array("Mike", "Peter Smith")
@p
This took a while, but only in comparison to the earlier cookbook recipes: this one is still less than 20 steps, which is not bad for something that installs multiple third-party modules, pulls down training data off the internet, and then does natural language processing to extract the English names from a text blob!
@p
Obviously we did not go very deep into the field. If you did, it would definitely be a lot more reading and understanding than just blindly following tutorials like I did above, and you probably would find it worth the time to set up a proper project. Nevertheless, this quick 5-minute run through of how to perform the basics of NLP is a fun way to get started whether or not you decide to take it further, and is only possible because of the Ammonite REPL!
@sect{Play Framework Server}
@hl.ref("integration"/"src"/"test"/"resources"/"ammonite"/"integration"/"basic"/"PlayFramework.sc")
@p
Ammonite's script-running capabilities can also be used as a way to set up lightweight Scala projects without needing SBT or an IDE to get started. For example, here is a single-file @lnk("Play Framework", "https://www.playframework.com/") test that
@ul
@li
Spins up a HTTP server and
@li
Makes a single HTTP request against it and prints the response
@li
Shuts down the server.
@p
And can be run via
@code
./amm PlayFramework.sc
@p
Although this is just a hello world example, you can easily keep the server running (instead of exiting after a test request) and extend it with more functionality, possibly splitting it into multiple @sect.ref{Script Files}.
@sect{SQL Database}
@p
Ammonite is great for those database jobs that are too complicated for SQL alone. This example uses @lnk("ScalikeJDBC", "http://scalikejdbc.org/") to update some rows.
@hl.scala
@@ import $ivy.{
`org.scalikejdbc::scalikejdbc:3.0.0`,
`ch.qos.logback:logback-classic:1.2.3`,
`mysql:mysql-connector-java:5.1.6`
}
@@ Class.forName("com.mysql.jdbc.Driver")
@@ import scalikejdbc._
@@ ConnectionPool.singleton("jdbc:mysql://localhost/database", "root", "")
@@ implicit val session = AutoSession
@@ val users =
sql"""select id, email from users"""
.map(rs => rs.long("id") -> rs.string("email")).list.apply()
users: List[(Long, String)] = List((1L, "foo@@bar"), (2L, "bar@@baz"))
@@ def isNormalised(email: String): Boolean = ???
@@ def normaliseEmail(email: String): String = ???
@@ val usersWithInvalidEmail =
users.filterNot { case (id, email) => isNormalised(email) }
usersWithInvalidEmail: List[(Long, String)] = List((1L, "foo@@bar"), (2L, "bar@@baz"))
@@ usersWithInvalidEmail.foreach { case (id, email) =>
val updatedEmail = normaliseEmail(email)
sql"""update users set email = ${updatedEmail} where id = ${id};""".update.apply()
}