Working with XML
Encode and decode to and from XML. Whitespace is not conserved for round trips - but the order of the fields are.
Consecutive xml nodes with the same name are assumed to be arrays.
XML content data, attributes processing instructions and directives are all created as plain fields.
This can be controlled by:
Flag | Default | Sample XML |
---|---|---|
--xml-attribute-prefix | + (changing to +@ soon) | Legs in <cat legs="4"/> |
--xml-content-name | +content | Meow in <cat>Meow <fur>true</true></cat> |
--xml-directive-name | +directive | <!DOCTYPE config system "blah"> |
--xml-proc-inst-prefix | +p_ | <?xml version="1"?> |
Default Attribute Prefix will be changing in v4.30! In order to avoid name conflicts (e.g. having an attribute named "content" will create a field that clashes with the default content name of "+content") the attribute prefix will be changing to "+@".
This will affect users that have not set their own prefix and are not roundtripping XML changes.
In addition to the above flags, there are the following xml encoder/decoder options controlled by flags:
Flag | Default | Description |
---|---|---|
--xml-strict-mode | false | Strict mode enforces the requirements of the XML specification. When switched off the parser allows input containing common mistakes. See the Golang xml decoder for more details. |
--xml-keep-namespace | true | Keeps the namespace of attributes |
--xml-raw-token | true | Does not verify that start and end elements match and does not translate name space prefixes to their corresponding URLs. |
--xml-skip-proc-inst | false | Skips over processing instructions, e.g. <?xml version="1"?> |
--xml-skip-directives | false | Skips over directives, e.g. <!DOCTYPE config system "blah"> |
See below for examples
Notice how all the values are strings, see the next example on how you can fix that.
Given a sample.xml file of:
<?xml version="1.0" encoding="UTF-8"?>
<cat>
<says>meow</says>
<legs>4</legs>
<cute>true</cute>
</cat>
then
yq -oy '.' sample.xml
will output
+p_xml: version="1.0" encoding="UTF-8"
cat:
says: meow
legs: "4"
cute: "true"
All values are assumed to be strings when parsing XML, but you can use the
from_yaml
operator on all the strings values to autoparse into the correct type.Given a sample.xml file of:
<?xml version="1.0" encoding="UTF-8"?>
<cat>
<says>meow</says>
<legs>4</legs>
<cute>true</cute>
</cat>
then
yq -oy ' (.. | select(tag == "!!str")) |= from_yaml' sample.xml
will output
+p_xml: version="1.0" encoding="UTF-8"
cat:
says: meow
legs: 4
cute: true
Consecutive nodes with identical xml names are assumed to be arrays.
Given a sample.xml file of:
<?xml version="1.0" encoding="UTF-8"?>
<animal>cat</animal>
<animal>goat</animal>
then
yq -oy '.' sample.xml
will output
+p_xml: version="1.0" encoding="UTF-8"
animal:
- cat
- goat
In XML, if your array has a single item, then yq doesn't know its an array. This is how you can consistently force it to be an array. This handles the 3 scenarios of having nothing in the array, having a single item and having multiple.
Given a sample.xml file of:
<zoo><animal>cat</animal></zoo>
then
yq -oy '.zoo.animal |= ([] + .)' sample.xml
will output
zoo:
animal:
- cat
Attributes are converted to fields, with the default attribute prefix '+'. Use '--xml-attribute-prefix` to set your own.
Given a sample.xml file of:
<?xml version="1.0" encoding="UTF-8"?>
<cat legs="4">
<legs>7</legs>
</cat>
then
yq -oy '.' sample.xml
will output
+p_xml: version="1.0" encoding="UTF-8"
cat:
+@legs: "4"
legs: "7"
Content is added as a field, using the default content name of
+content
. Use --xml-content-name
to set your own.Given a sample.xml file of:
<?xml version="1.0" encoding="UTF-8"?>
<cat legs="4">meow</cat>
then
yq -oy '.' sample.xml
will output
+p_xml: version="1.0" encoding="UTF-8"
cat:
+content: meow
+@legs: "4"
Multiple content texts are collected into a sequence.
Given a sample.xml file of:
<root> value <!-- comment-->anotherValue <a>frog</a> cool!</root>
then
yq -oy '.' sample.xml
will output
root:
+content: # comment
- value
- anotherValue
- cool!
a: frog
DTD entities are processed as directives.
Given a sample.xml file of:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY writer "Blah.">
<!ENTITY copyright "Blah">
]>
<root>
<item>&writer;©right;</item>
</root>
then
yq '.' sample.xml
will output
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY writer "Blah.">
<!ENTITY copyright "Blah">
]>
<root>
<item>&writer;&copyright;</item>
</root>
DTDs are directives, skip over directives to skip DTDs.
Given a sample.xml file of:
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY writer "Blah.">
<!ENTITY copyright "Blah">
]>
<root>
<item>&writer;©right;</item>
</root>
then
yq --xml-skip-directives '.' sample.xml
will output
<?xml version="1.0"?>
<root>
<item>&writer;&copyright;</item>
</root>
A best attempt is made to preserve comments.
Given a sample.xml file of:
<!-- before cat -->
<cat>
<!-- in cat before -->
<x>3<!-- multi
line comment
for x --></x>
<!-- before y -->
<y>
<!-- in y before -->
<d><!-- in d before -->z<!-- in d after --></d>
<!-- in y after -->
</y>
<!-- in_cat_after -->
</cat>
<!-- after cat -->
then
yq -oy '.' sample.xml
will output
# before cat
cat:
# in cat before
x: "3" # multi
# line comment
# for x
# before y
y:
# in y before
# in d before
d: z # in d after
# in y after
# in_cat_after
# after cat
Defaults to true
Given a sample.xml file of:
<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url"></map>
then
yq --xml-keep-namespace=false '.' sample.xml
will output
<?xml version="1.0"?>
<map xmlns="some-namespace" xsi="some-instance" schemaLocation="some-url"></map>
instead of
<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url"></map>
Defaults to true
Given a sample.xml file of:
<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url"></map>
then
yq --xml-raw-token=false '.' sample.xml
will output
<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" some-instance:schemaLocation="some-url"></map>
instead of
<?xml version="1.0"?>
<map xmlns="some-namespace" xmlns:xsi="some-instance" xsi:schemaLocation="some-url"></map>
Given a sample.yml file of:
cat: purrs
then
yq -o=xml '.' sample.yml
will output
<cat>purrs</cat>
Given a sample.yml file of:
pets:
cat:
- purrs
- meows
then
yq -o=xml '.' sample.yml
will output
<pets>
<cat>purrs</cat>
<cat>meows</cat>
</pets>
Fields with the matching xml-attribute-prefix are assumed to be attributes.
Given a sample.yml file of:
cat:
+@name: tiger
meows: true
then
yq -o=xml '.' sample.yml
will output
<cat name="tiger">
<meows>true</meows>
</cat>
Fields with the matching xml-content-name is assumed to be content.
Given a sample.yml file of:
cat:
+@name: tiger
+content: cool
then
yq -o=xml '.' sample.yml
will output
<cat name="tiger">cool</cat>
A best attempt is made to copy comments to xml.
Given a sample.yml file of:
#
# header comment
# above_cat
#
cat: # inline_cat
# above_array
array: # inline_array
- val1 # inline_val1
# above_val2
- val2 # inline_val2
# below_cat
then
yq -o=xml '.' sample.yml
will output
<!--
header comment
above_cat
-->
<!-- inline_cat -->
<cat><!-- above_array inline_array -->
<array>val1<!-- inline_val1 --></array>
<array><!-- above_val2 -->val2<!-- inline_val2 --></array>
</cat><!-- below_cat -->
Use the special xml names to add/modify proc instructions and directives.
Given a sample.yml file of:
+p_xml: version="1.0"
+directive: 'DOCTYPE config SYSTEM "/etc/iwatch/iwatch.dtd" '
apple:
+p_coolioo: version="1.0"
+directive: 'CATYPE meow purr puss '
b: things
then
yq -o=xml '.' sample.yml
will output
<?xml version="1.0"?>
<!DOCTYPE config SYSTEM "/etc/iwatch/iwatch.dtd" >
<apple><?coolioo version="1.0"?><!CATYPE meow purr puss >
<b>things</b>
</apple>
A best effort is made, but comment positions and white space are not preserved perfectly.
Given a sample.xml file of:
<!-- before cat -->
<cat>
<!-- in cat before -->
<x>3<!-- multi
line comment
for x --></x>
<!-- before y -->
<y>
<!-- in y before -->
<d><!-- in d before -->z<!-- in d after --></d>
<!-- in y after -->
</y>
<!-- in_cat_after -->
</cat>
<!-- after cat -->
then
yq '.' sample.xml
will output
<!-- before cat -->
<cat><!-- in cat before -->
<x>3<!-- multi
line comment
for x --></x><!-- before y -->
<y><!-- in y before
in d before -->
<d>z<!-- in d after --></d><!-- in y after -->
</y><!-- in_cat_after -->
</cat><!-- after cat -->
yq parses XML proc instructions and directives into nodes. Unfortunately the underlying XML parser loses whitespace information.
Given a sample.xml file of:
<?xml version="1.0"?>
<!DOCTYPE config SYSTEM "/etc/iwatch/iwatch.dtd" >
<apple>
<?coolioo version="1.0"?>
<!CATYPE meow purr puss >
<b>things</b>
</apple>
then
yq '.' sample.xml
will output
<?xml version="1.0"?>
<!DOCTYPE config SYSTEM "/etc/iwatch/iwatch.dtd" >
<apple><?coolioo version="1.0"?><!CATYPE meow purr puss >
<b>things</b>
</apple>
Last modified 23d ago