Getting started
The following documentation assumes you're familiar with, and already using, both the Scala programming language and the sbt build tool.
In Scala 2.11 and later, add the following to your build.sbt file's libraryDependencies:
"org.scala-lang.modules" %% "scala-xml" % "1.0.6"
You can then, for example, use XML literals:
val book: scala.xml.Elem = <book id="b20234">Magic of scala-xml</book>
You can query XML values with an XPath-like syntax:
val id = book \@ "id"
id: String = b20234
val text = book.text
text: String = Magic of scala-xml
More often, XML has sub-elements:
val books = <books>
<book id="b1615">Don Quixote</book>
<book id="b1867">War and Peace</book>
</books>
Retrieving the child elements is possible, but a little more complicated:
val titles = (books \ "book").map(_.text).toList
titles: List[String] = List(Don Quixote, War and Peace)
Many return types in scala-xml are Scala collections. If you aren't familiar with Scala collections, you should read the documentation for Scala collections.
Finding the text of an XML element by its id:
val quixote = (books \ "book").find(book => (book \@ "id") == "b1615").map(_.text)
quixote: Option[String] = Some(Don Quixote)
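The examples above use `\`, while the purchase order example further down uses `\\`. As a minimal sketch of the difference, assuming a hypothetical `library` document that wraps the books in an extra `shelf` element (REPL-style, like the other snippets on this page):

```scala
import scala.xml.Elem

// Hypothetical sample data, modelled on the books example above.
val library: Elem =
  <library>
    <shelf>
      <book id="b1615">Don Quixote</book>
      <book id="b1867">War and Peace</book>
    </shelf>
  </library>

// `\` only looks at direct children, so no book is found from the root.
val direct = (library \ "book").map(_.text).toList
// direct: List[String] = List()

// `\\` searches all descendants, so the nested books are found.
val nested = (library \\ "book").map(_.text).toList
// nested: List[String] = List(Don Quixote, War and Peace)

// `\@` reads an attribute value from each matching element.
val ids = (library \\ "book").map(_ \@ "id").toList
// ids: List[String] = List(b1615, b1867)
```

As a rule of thumb, `\` fits when you know the exact path, and `\\` when the element may sit at any depth.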
Most operations on collections can use Scala's for-comprehension. For example, consider the following XML data representing a purchase order:
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
<shipTo country="US">
<name>Alice Smith</name> <street>123 Maple Street</street>
<city>Mill Valley</city> <state>CA</state> <zip>90952</zip>
</shipTo>
<billTo country="US">
<name>Robert Smith</name> <street>8 Oak Avenue</street>
<city>Old Town</city> <state>PA</state> <zip>95819</zip>
</billTo>
<comment>Hurry, my lawn is going wild!</comment>
<items>
<item partNum="872-AA">
<productName>Lawnmower</productName> <quantity>1</quantity>
<USPrice>148.95</USPrice> <comment>Confirm this is electric</comment>
</item>
<item partNum="926-AA">
<productName>Baby Monitor</productName> <quantity>1</quantity>
<USPrice>39.98</USPrice> <shipDate>1999-05-21</shipDate>
</item>
</items>
</purchaseOrder>
This example is derived from similar code in the article "Scalable Programming Abstractions for XML Services" by Burak Emir, Sebastian Maneth and Martin Odersky. Here a file is loaded, and prices are retrieved for each item and summed together.
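import scala.xml.XML

val doc = XML.loadFile("po.xml")
var total = BigDecimal(0).setScale(2, scala.math.BigDecimal.RoundingMode.HALF_UP)
// Visit every item in the document, printing its part number and adding its price.
for {
  item <- doc \\ "item"
  price <- item \ "USPrice"
} {
  println("partnum: " + (item \@ "partNum"))
  // Parse the price text as an exact decimal rather than a Double.
  total += BigDecimal(price.text)
}
println("Grand total " + total)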
The program will output:
partnum: 872-AA
partnum: 926-AA
Grand total 188.93
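The same total can also be computed without a mutable variable. The following is only a sketch: `order` is a hypothetical in-memory stand-in for po.xml, trimmed to the fields used here, and the names `prices` and `grandTotal` are illustrative.

```scala
import scala.xml.Elem

// Hypothetical in-memory stand-in for po.xml, reduced to part numbers and prices.
val order: Elem =
  <purchaseOrder>
    <items>
      <item partNum="872-AA"><USPrice>148.95</USPrice></item>
      <item partNum="926-AA"><USPrice>39.98</USPrice></item>
    </items>
  </purchaseOrder>

// Pair each part number with its price, parsed as an exact decimal.
val prices: Seq[(String, BigDecimal)] =
  for {
    item  <- order \\ "item"
    price <- item \ "USPrice"
  } yield (item \@ "partNum", BigDecimal(price.text))

// Sum the prices; no var is needed.
val grandTotal: BigDecimal = prices.map(_._2).sum
// grandTotal: BigDecimal = 188.93
```

Yielding a value from the for-comprehension keeps the printing and the arithmetic separate, which tends to be easier to test.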
To open XML from files, use scala.xml.XML:
val books = scala.xml.XML.loadFile("books.xml")
To write XML to a file:
scala.xml.XML.save("books.xml", books)
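As a rough sketch of a full round trip, the snippet below saves a document and loads it back. The temporary file and the names `catalog`, `path`, `reloaded` and `parsed` are assumptions made for illustration; `XML.loadString` is shown as one way to parse XML that is already in memory.

```scala
import java.nio.file.Files
import scala.xml.{Elem, XML}

val catalog: Elem =
  <books>
    <book id="b1615">Don Quixote</book>
    <book id="b1867">War and Peace</book>
  </books>

// Hypothetical temporary file, so the sketch does not overwrite a real books.xml.
val path = Files.createTempFile("books", ".xml").toString

// Save with an explicit encoding and an XML declaration at the top of the file.
XML.save(path, catalog, "UTF-8", xmlDecl = true)

// Load the file back and read the titles again.
val reloaded: Elem = XML.loadFile(path)
val reloadedTitles = (reloaded \ "book").map(_.text).toList

// Parse XML held in a string instead of a file.
val parsed: Elem = XML.loadString("<book id=\"b20234\">Magic of scala-xml</book>")
val parsedId = parsed \@ "id"
```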
To format XML, use scala.xml.PrettyPrinter, which lets you configure the line width and indentation step:
val pp = new scala.xml.PrettyPrinter(24, 4)
pp.format(books)
<books>
    <book id="b1615">
        Don Quixote
    </book>
    <book id="b1867">
        War and Peace
    </book>
</books>
To transform your XML based on pattern matches, use scala.xml.transform.RuleTransformer in combination with one or more scala.xml.transform.RewriteRule definitions.
For example, consider the following XML value for calendar data:
val doc = <calendar>
<week>
<day>Monday</day>
<day>Tuesday</day>
<day>Wednesday</day>
<day>Thursday</day>
<day>Friday</day>
</week>
<year>
<month>January</month>
<month>February</month>
<month>March</month>
</year>
</calendar>
Here's a rule for abbreviating just the days of the week:
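import scala.xml._
import scala.xml.transform._

val abbreviateDayRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    // Keep only the text children of each day, shortened to three characters.
    case elem: Elem if elem.label == "day" =>
      elem.copy(child = elem.child collect {
        case Text(data) => Text(data.take(3))
      })
    case n => n
  }
}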
You can then create a transformer, and transform the document:
val transformer = new RuleTransformer(abbreviateDayRule)
transformer(doc)
Producing:
<calendar>
<week>
<day>Mon</day>
<day>Tue</day>
<day>Wed</day>
<day>Thu</day>
<day>Fri</day>
</week>
<year>
<month>January</month>
<month>February</month>
<month>March</month>
</year>
</calendar>
Multiple rules can be combined. Here are two rules: one that adds Saturday to the week, and one that removes Friday.
val addSaturdayRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case elem: Elem if elem.label == "week" =>
      elem.copy(child = (elem.child ++ <day>Saturday</day>))
    case n => n
  }
}
val deleteFridayRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case elem: Elem if elem.label == "day" && elem.text == "Friday" => NodeSeq.Empty
    case n => n
  }
}
val transformer = new RuleTransformer(addSaturdayRule, deleteFridayRule)
transformer(doc)
Here the day Friday is removed and Saturday is added:
<calendar>
<week>
<day>Monday</day>
<day>Tuesday</day>
<day>Wednesday</day>
<day>Thursday</day>
<day>Saturday</day>
</week>
<year>
<month>January</month>
<month>February</month>
<month>March</month>
</year>
</calendar>
Keep in mind that rewrite rules won't compose if they modify children, or modify values that other rewrite rules depend on.
For example, abbreviating the days, and then trying to delete Friday won't work since "Friday" no longer exists.
val transformer = new RuleTransformer(abbreviateDayRule, deleteFridayRule)
transformer(doc)
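The effect of rule order can be checked on a small input. This is a sketch only: `week` is a hypothetical two-day document, and the rules are shortened copies of the ones above, renamed `abbreviate` and `deleteFriday` so the snippet stands alone.

```scala
import scala.xml.{Elem, Node, NodeSeq, Text}
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Hypothetical reduced input: a week with only two days.
val week: Elem = <week><day>Thursday</day><day>Friday</day></week>

val abbreviate = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem if e.label == "day" =>
      e.copy(child = e.child.collect { case Text(data) => Text(data.take(3)) })
    case other => other
  }
}

val deleteFriday = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem if e.label == "day" && e.text == "Friday" => NodeSeq.Empty
    case other => other
  }
}

// Deleting first still sees the full name "Friday", so the day is removed.
val deleteFirst = new RuleTransformer(deleteFriday, abbreviate)
val daysDeleteFirst = (deleteFirst(week) \ "day").map(_.text).toList
// daysDeleteFirst: List[String] = List(Thu)

// Abbreviating first turns "Friday" into "Fri", so the delete rule never matches.
val abbreviateFirst = new RuleTransformer(abbreviate, deleteFriday)
val daysAbbreviateFirst = (abbreviateFirst(week) \ "day").map(_.text).toList
// daysAbbreviateFirst: List[String] = List(Thu, Fri)
```

When one rule depends on values another rule rewrites, putting the dependent rule first is one way to keep both effects.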
Also, creating a new element with the copy method means its children won't be transformed again. So building new children under the week element with Friday removed first, and then adding Saturday (or vice versa), will skip the second rewrite rule:
val transformer = new RuleTransformer(addSaturdayRule, deleteFridayRule)
transformer(doc)