This library provides two functions to create an Elixir Map data structure from an XML string.
Usage:
XmlToMap.naive_map("<foo><bar>123</bar></foo>")
Results in:
%{"foo" => %{"bar" => "123"}}
The result is an Elixir map with strings for keys, not atoms, since atoms are not garbage collected.
This tool is inspired by Rails Hash.from_xml().
I call the function "naive" because it has known shortcomings, and because there is some controversy around using a conversion tool like this: XML and maps are non-isomorphic, and there is no standard way to convert all the information from one format to the other. The recommended way to pull specific, well-structured information from XML is to use something like XPath. But if you understand the risks and still prefer to convert the whole XML string to a map, then this tool is for you!
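For comparison, here is a minimal sketch of the XPath route using the xmerl application that ships with Erlang/OTP. It is not part of this package; the module name XPathSketch and the function texts/2 are made up for illustration, and the sketch assumes ASCII input and that xmerl is available on the code path.

```elixir
defmodule XPathSketch do
  require Record

  # Pull the xmlText record definition from the xmerl header shipped with OTP.
  Record.defrecord(:xmlText, Record.extract(:xmlText, from_lib: "xmerl/include/xmerl.hrl"))

  # Hypothetical helper: returns the text nodes selected by an XPath expression.
  def texts(xml, path) do
    # xmerl works on charlists, not Elixir binaries.
    {doc, _rest} = xml |> String.to_charlist() |> :xmerl_scan.string()

    path
    |> String.to_charlist()
    |> :xmerl_xpath.string(doc)
    |> Enum.map(fn node -> node |> xmlText(:value) |> List.to_string() end)
  end
end

XPathSketch.texts("<foo><bar>123</bar></foo>", "/foo/bar/text()")
# => ["123"]
```

This pulls out exactly one value without committing to a map shape for the whole document, which is the trade-off described above.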
The naive_map function is currently not able to parse XML namespaces. It also can't determine whether a child should be a list unless it sees a repeated child: nodes become a list if and only if they are repeated at the same level.
# there are two points inside foo, so the value of "point" becomes a list. Had "foo" only contained one point then there would be no list but instead one nested map
XmlToMap.naive_map("<foo><point><x>1</x><y>5</y></point><point><x>2</x><y>9</y></point></foo>")
# => %{"foo" => %{"point" => [%{"x" => "1", "y" => "5"}, %{"x" => "2", "y" => "9"}]}}
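Because the same key can hold either a single map or a list, calling code usually has to handle both shapes. One possible way to do that, sketched here with literal maps in the shapes described above rather than real library output, is to normalize with List.wrap/1 from the Elixir standard library. The points function is a hypothetical helper.

```elixir
# Shapes as described above: one <point> gives a nested map, two give a list.
one = %{"foo" => %{"point" => %{"x" => "1", "y" => "5"}}}

many = %{
  "foo" => %{"point" => [%{"x" => "1", "y" => "5"}, %{"x" => "2", "y" => "9"}]}
}

# Hypothetical helper: always returns a list of points.
# List.wrap/1 leaves lists untouched, wraps any other term, and turns nil into [].
points = fn map -> map |> get_in(["foo", "point"]) |> List.wrap() end

Enum.map(points.(one), & &1["x"])
# => ["1"]
Enum.map(points.(many), & &1["x"])
# => ["1", "2"]
```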
Previously this package did not handle XML node attributes. The current version takes inspiration from the Go package goxml2json and exports attributes in the map, prepending "-" to their names so you know they are attributes.
Whenever we encounter an XML node with BOTH attributes and children, we wrap "#content" around the node's inner value.
For example, this snippet has a Height leaf node with attribute Units and a value of 0.50:
<ItemDimensions>
<Height Units="inches">0.50</Height>
</ItemDimensions>
This would become this snippet:
...
"ItemDimensions": {
"Height": {
"#content": "0.50",
"-Units": "inches"
}
}
Empty tags will have a value of nil.
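A consequence of the "#content" wrapping is that the same node may come back as a plain string or as a map, depending on whether it carries attributes. The following is a small sketch of one way to read the value in both cases. ContentSketch.text/1 is a hypothetical helper, not part of this library, and the two maps are written by hand in the shapes described above.

```elixir
defmodule ContentSketch do
  # Hypothetical helper: unwraps "#content" when present, otherwise
  # returns the value unchanged (a plain string, or nil for an empty tag).
  def text(%{"#content" => value}), do: value
  def text(value), do: value
end

# Hand-written maps; the attribute-free shape is an assumption based on
# the first naive_map example in this README.
with_attr = %{"ItemDimensions" => %{"Height" => %{"#content" => "0.50", "-Units" => "inches"}}}
without_attr = %{"ItemDimensions" => %{"Height" => "0.50"}}

ContentSketch.text(get_in(with_attr, ["ItemDimensions", "Height"]))
# => "0.50"
ContentSketch.text(get_in(without_attr, ["ItemDimensions", "Height"]))
# => "0.50"
```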
XmlToMap.nested_map/2 produces an arguably more verbose and less straightforward result, but it preserves the order of sequences, so the result might later be turned back into XML in an isomorphic way.
In general, it reflects the tree structure of the input XML as is.
Usage:
XmlToMap.nested_map("<foo><bar>123</bar></foo>")
Results in:
%{
attributes: [],
name: "foo",
content: %{attributes: [], name: "bar", content: "123"}
}
The function does not make any assumptions about the content and recursively builds the result. Take this snippet again:
<ItemDimensions>
<Height Units="inches">0.50</Height>
</ItemDimensions>
This would become this snippet:
%{
attributes: [],
name: "ItemDimensions",
content: %{attributes: [{"Units", "inches"}], name: "Height", content: "0.50"}
}
This result might then be converted to JSON of the same shape.
To make it less verbose, pass purge_empty: true as the second parameter to XmlToMap.nested_map/2:
XmlToMap.nested_map("<foo><bar arg='yes'>123</bar><baz/></foo>", purge_empty: true)
Results in:
%{
name: "foo",
content: [
%{attributes: [{"arg", "yes"}], name: "bar", content: "123"},
%{name: "baz"}
]
}
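Since the nested form is a plain tree, it can be walked recursively. The sketch below collects element names in document order. It assumes only the shapes shown in the examples above: content is a string, a single node, a list of nodes, or absent when purge_empty: true is used. NestedWalk.names/1 is a hypothetical helper, and the input is the literal result shown above, not a live call to the library.

```elixir
defmodule NestedWalk do
  # A node: emit its name, then descend into its content (which may be missing).
  def names(%{name: name} = node), do: [name | names(Map.get(node, :content))]

  # A list of sibling nodes: keep their order.
  def names(children) when is_list(children), do: Enum.flat_map(children, &names/1)

  # Text content or no content at all: nothing to add.
  def names(_text_or_nil), do: []
end

tree = %{
  name: "foo",
  content: [
    %{attributes: [{"arg", "yes"}], name: "bar", content: "123"},
    %{name: "baz"}
  ]
}

NestedWalk.names(tree)
# => ["foo", "bar", "baz"]
```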
This library depends on Erlsom to parse the XML, and then converts the resulting 'simple_form' structure into a map.
I prefer Erlsom because it is the best-documented Erlang XML parser and because its documentation mentions that it does not produce new atoms during scanning.
See tests for some example usage.
The package can be installed as:
- Add :elixir_xml_to_map to your list of dependencies in mix.exs:

    def deps do
      [{:elixir_xml_to_map, "~> 2.0"}]
    end

- Ensure :elixir_xml_to_map is started before your application:

    def application do
      [extra_applications: [:elixir_xml_to_map]]
    end

Copyright (c) 2016-present, Homan Chou
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.