Nikita Zonov edited this page May 15, 2018 · 24 revisions

The following documentation assumes you're already familiar with, and using, both the Scala programming language and the sbt build tool.

In Scala 2.11 and later, add the following to your build.sbt file's libraryDependencies:

"org.scala-lang.modules" %% "scala-xml" % "1.1.0"

You also need sbt version 1.1.2 or later.

For earlier versions of sbt, you need to add the following:

fork := true
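
As a rough sketch, the settings above might sit together in a build.sbt like the one below. The project name and the Scala version are assumptions chosen for illustration; only the scala-xml dependency line and the fork setting come from this page.

```scala
// build.sbt (sketch; the project name and Scala version are assumptions)
name := "xml-example"
scalaVersion := "2.12.6"

// The scala-xml dependency from this page.
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.1.0"

// Only needed on sbt versions earlier than 1.1.2:
// fork := true
```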

Once you've added the dependency, you can use XML literals, for example:

val book: scala.xml.Elem = <book id="b20234">Magic of scala-xml</book>

You can query XML values with an XPath-like syntax:

val id = book \@ "id"
id: String = b20234

val text = book.text
text: String = Magic of scala-xml

More often, XML has sub-elements:

val books = <books>
  <book id="b1615">Don Quixote</book>
  <book id="b1867">War and Peace</book>
</books>

Retrieving the child elements is possible, but a little more complicated:

val titles = (books \ "book").map(_.text).toList
titles: List[String] = List(Don Quixote, War and Peace)
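
The `\` operator used above only looks at direct children, while `\\` searches all descendants. The following minimal sketch shows the difference on a hypothetical `library` document (not part of the original example) in which the books are nested one level deeper:

```scala
import scala.xml.Elem

// Hypothetical sample data, nested one level deeper than the example above.
val library: Elem =
  <library>
    <shelf>
      <book id="b1615">Don Quixote</book>
      <book id="b1867">War and Peace</book>
    </shelf>
  </library>

// `\` looks only at direct children, so no <book> is found under <library>.
val direct = (library \ "book").map(_.text).toList

// `\\` searches all descendants, so both books are found.
val nested = (library \\ "book").map(_.text).toList
```

Here `direct` is empty, while `nested` contains both titles.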

Many return types in scala-xml are Scala collections. If you aren't familiar with them, read the Scala collections documentation first.

To find the text of an XML element by its id:

val quixote = (books \ "book").find(book => (book \@ "id") == "b1615").map(_.text)
quixote: Option[String] = Some(Don Quixote)
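
If you need several lookups, one possible approach is to build an index once and query it by id. This is only a sketch: the name `titlesById` and the id `b0000` are made up for illustration.

```scala
val books =
  <books>
    <book id="b1615">Don Quixote</book>
    <book id="b1867">War and Peace</book>
  </books>

// Build an id -> title index once, then look titles up by id.
val titlesById: Map[String, String] =
  (books \ "book").map(book => (book \@ "id") -> book.text).toMap

val quixote = titlesById.get("b1615") // Some(Don Quixote)
val missing = titlesById.get("b0000") // None, since no such id exists
```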

Most operations on collections can use Scala's for-comprehension. For example, consider the following XML data representing a purchase order:

<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name> <street>123 Maple Street</street>
    <city>Mill Valley</city> <state>CA</state> <zip>90952</zip>
  </shipTo>
  <billTo country="US">
    <name>Robert Smith</name> <street>8 Oak Avenue</street>
    <city>Old Town</city> <state>PA</state> <zip>95819</zip>
  </billTo>
  <comment>Hurry, my lawn is going wild!</comment>
  <items>
    <item partNum="872-AA">
      <productName>Lawnmower</productName> <quantity>1</quantity>
      <USPrice>148.95</USPrice> <comment>Confirm this is electric</comment>
    </item>
    <item partNum="926-AA">
      <productName>Baby Monitor</productName> <quantity>1</quantity>
      <USPrice>39.98</USPrice> <shipDate>1999-05-21</shipDate>
    </item>
  </items>
</purchaseOrder>

This example is derived from similar code in the article "Scalable Programming Abstractions for XML Services" by Burak Emir, Sebastian Maneth and Martin Odersky. Here a file is loaded, and prices are retrieved for each item and summed together.

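import scala.xml.XML

val doc = XML.loadFile("po.xml")
var total = BigDecimal(0).setScale(2, scala.math.BigDecimal.RoundingMode.HALF_UP)
// The body only has side effects, so use a plain for loop rather than for/yield.
for {
  item  <- doc \\ "item"
  price <- item \ "USPrice"
} {
  println("partnum: " + (item \@ "partNum"))
  // Parse the price as an exact decimal instead of going through Double.
  total += BigDecimal(price.text)
}
println("Grand total " + total)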

The program will output:

partnum: 872-AA
partnum: 926-AA
Grand total 188.93
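
The same total can also be computed without a mutable variable. The sketch below is one possible variant rather than the code from the article. It uses a cut-down inline copy of the purchase order (only the part numbers and prices) so that it runs without the po.xml file.

```scala
// Cut-down inline copy of the purchase order: only the items and their prices.
val order =
  <purchaseOrder>
    <items>
      <item partNum="872-AA"><USPrice>148.95</USPrice></item>
      <item partNum="926-AA"><USPrice>39.98</USPrice></item>
    </items>
  </purchaseOrder>

// Parse each price as an exact decimal; the for/yield produces a sequence of prices.
val prices = for {
  item  <- order \\ "item"
  price <- item \ "USPrice"
} yield BigDecimal(price.text)

// Sum the prices without mutable state.
val grandTotal = prices.sum
println(s"Grand total $grandTotal") // prints "Grand total 188.93"
```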

To open XML from files use scala.xml.XML:

val books = scala.xml.XML.loadFile("books.xml")

To write XML to a file:

scala.xml.XML.save("books.xml", books)
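
XML can also be parsed from a string, and `save` accepts an encoding and a flag for writing the XML declaration. The following sketch writes to a temporary file (an assumption, so that no real file is overwritten) and reads it back:

```scala
import java.nio.file.Files
import scala.xml.XML

// Parse XML from a string instead of a file.
val books = XML.loadString(
  """<books><book id="b1615">Don Quixote</book></books>""")

// Hypothetical target: a temporary file, so nothing real is overwritten.
val path = Files.createTempFile("books", ".xml").toString

// Write with an explicit encoding and an XML declaration.
XML.save(path, books, "UTF-8", xmlDecl = true)

// Read the file back and recover the book title.
val reloaded = XML.loadFile(path)
val title = (reloaded \ "book").text
```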

To format XML, use scala.xml.PrettyPrinter, which takes the maximum line width and the indentation step:

val pp = new scala.xml.PrettyPrinter(24, 4)
pp.format(books)
<books>
    <book id="b1615">
        Don Quixote
    </book>
    <book id="b1867">
        War and Peace
    </book>
</books>
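
The two constructor arguments can be varied to suit the output you want. As a small sketch, the following uses a wider line (80) and a smaller indentation step (2), both arbitrary values chosen for illustration, and checks that parsing the formatted string back still finds the same titles:

```scala
import scala.xml.{PrettyPrinter, XML}

val books =
  <books>
    <book id="b1615">Don Quixote</book>
    <book id="b1867">War and Peace</book>
  </books>

// Width 80 and indentation step 2 are arbitrary example values.
val wide = new PrettyPrinter(80, 2)
val formatted: String = wide.format(books)
println(formatted)

// Parse the formatted string back; trimming guards against added whitespace.
val titles = (XML.loadString(formatted) \ "book").map(_.text.trim).toList
```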

To transform your XML based on pattern matches, use scala.xml.transform.RuleTransformer in combination with one or more scala.xml.transform.RewriteRule definitions.

For example, consider the following XML value for calendar data:

val doc = <calendar>
  <week>
    <day>Monday</day>
    <day>Tuesday</day>
    <day>Wednesday</day>
    <day>Thursday</day>
    <day>Friday</day>
  </week>
  <year>
    <month>January</month>
    <month>February</month>
    <month>March</month>
  </year>
</calendar>

Here's a rule for abbreviating just the days of the week:

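import scala.xml.{Elem, Node, NodeSeq, Text}
import scala.xml.transform.{RewriteRule, RuleTransformer}

val abbreviateDayRule = new RewriteRule {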
  override def transform(n: Node): Seq[Node] = n match {
    case elem: Elem if elem.label == "day" =>
      elem.copy(child = elem.child collect {
        case Text(data) => Text(data.take(3))
      })
    case n => n
  }
}

You can then create a transformer, and transform the document:

val transform = new RuleTransformer(abbreviateDayRule)
transform(doc)

Producing:

<calendar>
  <week>
    <day>Mon</day>
    <day>Tue</day>
    <day>Wed</day>
    <day>Thu</day>
    <day>Fri</day>
  </week>
  <year>
    <month>January</month>
    <month>February</month>
    <month>March</month>
  </year>
</calendar>

Multiple rules can be combined. Here are two rules: one adds Saturday to the week, and the other removes Friday.

val addSaturdayRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case elem: Elem if elem.label == "week" =>
      elem.copy(child = (elem.child ++ <day>Saturday</day>))
    case n => n
  }
}
val deleteFridayRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case elem: Elem if elem.label == "day" && elem.text == "Friday" => NodeSeq.Empty
    case n => n
  }
}
val transform = new RuleTransformer(addSaturdayRule, deleteFridayRule)
transform(doc)

Here the day Friday is removed and Saturday is added:

<calendar>
  <week>
    <day>Monday</day>
    <day>Tuesday</day>
    <day>Wednesday</day>
    <day>Thursday</day>
    <day>Saturday</day>
  </week>
  <year>
    <month>January</month>
    <month>February</month>
    <month>March</month>
  </year>
</calendar>

Keep in mind that rewrite rules may not compose as expected if one rule modifies children, or values, that another rule depends on.

For example, abbreviating the days and then trying to delete Friday won't work, because by then the text "Friday" has already been shortened to "Fri".

val transform = new RuleTransformer(abbreviateDayRule, deleteFridayRule)
transform(doc)

This still produces a calendar with Fridays:

<calendar>
  <week>
    <day>Mon</day>
    <day>Tue</day>
    <day>Wed</day>
    <day>Thu</day>
    <day>Fri</day>
  </week>
  <year>
    <month>January</month>
    <month>February</month>
    <month>March</month>
  </year>
</calendar>
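
One way around this, sketched below, is to list the rules in the other order so that the deletion runs while the full day name is still present. The rules are repeated here so the block is self-contained. The value name `reordered` is made up for illustration.

```scala
import scala.xml.{Elem, Node, NodeSeq, Text}
import scala.xml.transform.{RewriteRule, RuleTransformer}

val doc =
  <calendar>
    <week>
      <day>Monday</day>
      <day>Tuesday</day>
      <day>Wednesday</day>
      <day>Thursday</day>
      <day>Friday</day>
    </week>
  </calendar>

// Shorten the text of every <day> element to its first three characters.
val abbreviateDayRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case elem: Elem if elem.label == "day" =>
      elem.copy(child = elem.child collect {
        case Text(data) => Text(data.take(3))
      })
    case n => n
  }
}

// Remove the <day> element whose text is exactly "Friday".
val deleteFridayRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case elem: Elem if elem.label == "day" && elem.text == "Friday" => NodeSeq.Empty
    case n => n
  }
}

// Delete first, then abbreviate: "Friday" is still spelled out when deletion runs.
val reordered = new RuleTransformer(deleteFridayRule, abbreviateDayRule)
val days = (reordered(doc) \\ "day").map(_.text).toList
```

With this ordering, `days` contains Mon, Tue, Wed and Thu, and Friday is gone.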