node-xml2js
===========

Ever had the urge to parse XML? And wanted to access the data in some sane,
easy way? Don't want to compile a C parser, for whatever reason? Then xml2js is
what you're looking for!

Description
===========

Simple XML to JavaScript object converter. It supports bi-directional conversion.
Uses sax-js and
[xmlbuilder-js](https://github.com/oozcitak/xmlbuilder-js).

Note: If you're looking for a full DOM parser, you probably want
[JSDom](https://github.com/tmpvar/jsdom).

Installation
============

The simplest way to install `xml2js` is to use [npm](http://npmjs.org): just run
`npm install xml2js`, which will download xml2js and all dependencies.

xml2js is also available via [Bower](http://bower.io/): just run `bower install
xml2js`, which will download xml2js and all dependencies.

Usage
=====

No extensive tutorials required because you are a smart developer! The task of
parsing XML should be an easy one, so let's make it so! Here are some examples.

Shoot-and-forget usage
----------------------

You want to parse XML as simply and easily as possible? It's dangerous to go
alone, take this:

```javascript
var parseString = require('xml2js').parseString;
var xml = "<root>Hello xml2js!</root>";
parseString(xml, function (err, result) {
    console.dir(result);
});
```

Can't get easier than this, right? This works starting with `xml2js` 0.2.3.
With CoffeeScript it looks like this:

```coffeescript
{parseString} = require 'xml2js'
xml = "<root>Hello xml2js!</root>"
parseString xml, (err, result) ->
    console.dir result
```

If you need some special options, fear not, `xml2js` supports a number of
options (see below). You can specify these as the second argument:

```javascript
parseString(xml, {trim: true}, function (err, result) {
    // result was parsed with the `trim` option enabled
});
```

Simple as pie usage
-------------------

That's right, if you have been using xml-simple or a home-grown
wrapper, this was added in 0.1.11 just for you:

```javascript
var fs = require('fs'),
    xml2js = require('xml2js');

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/foo.xml', function(err, data) {
    parser.parseString(data, function (err, result) {
        console.dir(result);
        console.log('Done');
    });
});
```

Look ma, no event listeners!

You can also use `xml2js` from CoffeeScript, further reducing the clutter:

```coffeescript
fs = require 'fs'
xml2js = require 'xml2js'

parser = new xml2js.Parser()
fs.readFile __dirname + '/foo.xml', (err, data) ->
    parser.parseString data, (err, result) ->
        console.dir result
        console.log 'Done.'
```

But what happens if you forget the `new` keyword to create a new `Parser`? In
the middle of a nightly coding session, it might get lost, after all. Worry
not, we got you covered! Starting with 0.2.8 you can also leave it out, in
which case `xml2js` will helpfully add it for you, no bad surprises and
inexplicable bugs!

Parsing multiple files
----------------------

If you want to parse multiple files, you have multiple possibilities:

  * You can create one `xml2js.Parser` per file. That's the recommended one
    and is promised to always *just work*.
  * You can call `reset()` on your parser object.
  * You can hope everything goes well anyway. This behaviour is not
    guaranteed to work always, if ever. Use option #1 if possible. Thanks!
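
A minimal sketch of option #1 could look like the following. The `parseEach`
helper and its `createParser` argument are made up for this example and are not
part of `xml2js`; the only thing it assumes about a parser is the
`parseString(xml, callback)` method shown above.

```javascript
// Hypothetical helper (not part of xml2js): parse several XML strings,
// creating a fresh parser for every document.
// `createParser` is supplied by the caller and must return an object with a
// parseString(xml, callback) method.
function parseEach(xmlStrings, createParser, callback) {
    var results = [];
    var pending = xmlStrings.length;
    var failed = false;

    if (pending === 0) {
        return callback(null, results);
    }

    xmlStrings.forEach(function (xml, index) {
        var parser = createParser(); // one parser per document
        parser.parseString(xml, function (err, result) {
            if (failed) {
                return; // an earlier document already reported an error
            }
            if (err) {
                failed = true;
                return callback(err);
            }
            results[index] = result; // keep results in input order
            pending -= 1;
            if (pending === 0) {
                callback(null, results);
            }
        });
    });
}
```

With the real library you would call it along the lines of
`parseEach(xmlStrings, function () { return new xml2js.Parser(); }, callback)`.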

So you want some JSON?
----------------------

Just wrap the `result` object in a call to `JSON.stringify`, like this:
`JSON.stringify(result)`. You get a string containing the JSON representation
of the parsed object that you can feed to JSON-hungry consumers.
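
As a small sketch: the `result` object below is hand-written to stand in for a
parsed document (it is an assumption about the shape, not output captured from
the parser), and the rest is plain `JSON.stringify`.

```javascript
// Stand-in for a parsed result; in real code this comes from parseString.
var result = { root: 'Hello xml2js!' };

// Compact form, e.g. for sending over the wire.
var compact = JSON.stringify(result);

// Indented form (third argument = number of spaces), easier to read in logs.
var pretty = JSON.stringify(result, null, 2);

console.log(compact);
console.log(pretty);
```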

Displaying results
------------------

You might wonder why, when using `console.dir` or `console.log`, the output at
some level is only `[Object]`. Don't worry, this is not because `xml2js` got lazy.
That's because Node uses `util.inspect` to convert the object into strings and
that function stops after `depth=2`, which is a bit low for most XML.

To display the whole deal, you can use `console.log(util.inspect(result, false,
null))`, which displays the whole result.
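
To see the difference without parsing anything, here is a sketch that uses a
hand-written nested object as a stand-in for a parse result (the object itself
is an assumption made up for this example):

```javascript
var util = require('util');

// Hand-written stand-in for a deeply nested parse result.
var result = { a: { b: { c: { d: ['deep'] } } } };

// Default depth: nesting beyond the limit is collapsed to [Object].
var shallow = util.inspect(result);

// Positional form: (object, showHidden, depth); a depth of null means no limit.
var full = util.inspect(result, false, null);

console.log(shallow);
console.log(full);
```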

So much for that, but what if you use
[eyes](https://github.com/cloudhead/eyes.js) for nice colored output and it
truncates the output with `…`? Don't fear, there's also a solution for that:
you just need to increase the `maxLength` limit by creating a custom inspector
`var inspect = require('eyes').inspector({maxLength: false})` and then you can
easily `inspect(result)`.

XML builder usage
-----------------

Since 0.4.0, objects can also be used to build XML:

```javascript
var xml2js = require('xml2js');

var obj = {name: "Super", Surname: "Man", age: 23};

var builder = new xml2js.Builder();
var xml = builder.buildObject(obj);
```

At the moment, a one-to-one bi-directional conversion is guaranteed only for the
default configuration, except for the `attrkey`, `charkey` and `explicitArray`
options, which you can redefine to your taste. Writing CDATA is supported by
setting the `cdata` option to `true`.

Processing attribute, tag names and values
------------------------------------------

Since 0.4.1 you can optionally provide the parser with attribute name and tag
name processors as well as element value processors. Since 0.4.14, you can also
optionally provide the parser with attribute value processors:

```javascript
var parseString = require('xml2js').parseString;

function nameToUpperCase(name){
    return name.toUpperCase();
}

//transform all attribute and tag names and values to uppercase
parseString(xml, {
  tagNameProcessors: [nameToUpperCase],
  attrNameProcessors: [nameToUpperCase],
  valueProcessors: [nameToUpperCase],
  attrValueProcessors: [nameToUpperCase]},
  function (err, result) {
    // processed data
});
```

The `tagNameProcessors`, `attrNameProcessors`, `attrValueProcessors` and `valueProcessors` options
accept an `Array` of functions with the following signature:

```javascript
function (name){
  //do something with `name`
  return name
}
```
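
Because each option takes an array, several processors can be combined. The
sketch below assumes the processors run in array order, each one receiving the
return value of the previous one. `stripDashes`, `toUpperCase` and
`applyProcessors` are made up for this example and are not part of `xml2js`.

```javascript
// Hypothetical custom processors; each takes a string and returns a string.
function stripDashes(name) {
    return name.replace(/-/g, '');
}

function toUpperCase(name) {
    return name.toUpperCase();
}

// Illustrative helper (not part of xml2js): feed the name through each
// processor in array order, passing the output of one into the next.
function applyProcessors(processors, name) {
    return processors.reduce(function (current, processor) {
        return processor(current);
    }, name);
}

console.log(applyProcessors([stripDashes, toUpperCase], 'my-tag-name'));
```

The same two functions could then be handed to the parser, for example as
`{tagNameProcessors: [stripDashes, toUpperCase]}`.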

Some processors are provided out-of-the-box and can be found in `lib/processors.js`:

- `normalize`: transforms the name to lowercase.
  (Automatically used when `options.normalizeTags` is set to `true`)

- `firstCharLowerCase`: transforms the first character to lower case.
  E.g. 'MyTagName' becomes 'myTagName'

- `stripPrefix`: strips the xml namespace prefix. E.g. `<foo:Bar/>` will become 'Bar'.
  (N.B.: the `xmlns` prefix is NOT stripped.)

- `parseNumbers`: parses integer-like strings as integers and float-like strings as floats.
  E.g. "0" becomes 0 and "15.56" becomes 15.56

- `parseBooleans`: parses boolean-like strings to booleans.
  E.g. "true" becomes true and "False" becomes false
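
To give a feeling for what such processors do, here are rough re-implementations
of three of them. These are sketches written for this README from the
descriptions above, not the code in `lib/processors.js`, and the `...Sketch`
names are made up; edge cases may well differ from the real thing.

```javascript
// Sketch of firstCharLowerCase: 'MyTagName' -> 'myTagName'.
function firstCharLowerCaseSketch(name) {
    return name.charAt(0).toLowerCase() + name.slice(1);
}

// Sketch of stripPrefix: 'foo:Bar' -> 'Bar', but names starting with
// 'xmlns' are left alone.
function stripPrefixSketch(name) {
    if (name.indexOf('xmlns') === 0) {
        return name;
    }
    var colon = name.indexOf(':');
    return colon === -1 ? name : name.slice(colon + 1);
}

// Sketch of parseBooleans: 'true'/'false' in any letter case become
// booleans, everything else is returned unchanged.
function parseBooleansSketch(value) {
    if (/^(?:true|false)$/i.test(value)) {
        return value.toLowerCase() === 'true';
    }
    return value;
}

console.log(firstCharLowerCaseSketch('MyTagName'));
console.log(stripPrefixSketch('foo:Bar'));
console.log(stripPrefixSketch('xmlns:foo'));
console.log(parseBooleansSketch('False'));
```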

Options
=======

Apart from the default settings, there are a number of options that can be
specified for the parser. Options are specified by
``new Parser({optionName: value})``. Possible options are:

  * `attrkey` (default: `$`): Prefix that is used to access the attributes.
    Version 0.1 default was `@`.
  * `charkey` (default: `_`): Prefix that is used to access the character
    content. Version 0.1 default was `#`.
  * `explicitCharkey` (default: `false`)
  * `trim` (default: `false`): Trim the whitespace at the beginning and end of
    text nodes.
  * `normalizeTags` (default: `false`): Normalize all tag names to lowercase.
  * `normalize` (default: `false`): Trim whitespaces inside text nodes.
  * `explicitRoot` (default: `true`): Set this if you want to get the root
    node in the resulting object.
  * `emptyTag` (default: `''`): The value that empty nodes will have.
  * `explicitArray` (default: `true`): Always put child nodes in an array if
    true; otherwise an array is created only if there is more than one.
  * `ignoreAttrs` (default: `false`): Ignore all XML attributes and only create
    text nodes.
  * `mergeAttrs` (default: `false`): Merge attributes and child elements as
    properties of the parent, instead of keying attributes off a child
    attribute object. This option is ignored if `ignoreAttrs` is `true`.
  * `validator` (default: `null`): You can specify a callable that validates
    the resulting structure somehow, however you want. See unit tests
    for an example.
  * `xmlns` (default: `false`): Give each element a field usually called '$ns'
    (the first character is the same as attrkey) that contains its local name
    and namespace URI.
  * `explicitChildren` (default: `false`): Put child elements into a separate
    property. Doesn't work with `mergeAttrs = true`. If an element has no
    children then "children" won't be created. Added in 0.2.5.
  * `childkey` (default: `$$`): Prefix that is used to access child elements if
    `explicitChildren` is set to `true`. Added in 0.2.5.
  * `preserveChildrenOrder` (default: `false`): Modifies the behavior of
    `explicitChildren` so that the value of the "children" property becomes an
    ordered array. When this is `true`, every node will also get a `#name` field
    whose value will correspond to the XML nodeName, so that you may iterate
    the "children" array and still be able to determine node names. The named
    (and potentially unordered) properties are also retained in this
    configuration at the same level as the ordered "children" array. Added in
    0.4.9.
  * `charsAsChildren` (default: `false`): Determines whether chars should be
    considered children if `explicitChildren` is on. Added in 0.2.5.
  * `includeWhiteChars` (default: `false`): Determines whether whitespace-only
    text nodes should be included. Added in 0.4.17.
  * `async` (default: `false`): Should the callbacks be async? This *might* be
    an incompatible change if your code depends on sync execution of callbacks.
    Future versions of `xml2js` might change this default, so the recommendation
    is to not depend on sync execution anyway. Added in 0.2.6.
  * `strict` (default: `true`): Set sax-js to strict or non-strict parsing mode.
    Defaults to `true` which is *highly* recommended, since parsing HTML which
    is not well-formed XML might yield just about anything. Added in 0.2.7.
  * `attrNameProcessors` (default: `null`): Allows the addition of attribute
    name processing functions. Accepts an `Array` of functions with the
    following signature:

    ```javascript
    function (name){
        //do something with `name`
        return name
    }
    ```

    Added in 0.4.1
  * `attrValueProcessors` (default: `null`): Allows the addition of attribute
    value processing functions. Accepts an `Array` of functions with the
    following signature:

    ```javascript
    function (name){
        //do something with `name`
        return name
    }
    ```

    Added in 0.4.14
  * `tagNameProcessors` (default: `null`): Allows the addition of tag name
    processing functions. Accepts an `Array` of functions with the following
    signature:

    ```javascript
    function (name){
        //do something with `name`
        return name
    }
    ```

    Added in 0.4.1
  * `valueProcessors` (default: `null`): Allows the addition of element value
    processing functions. Accepts an `Array` of functions with the following
    signature:

    ```javascript
    function (name){
        //do something with `name`
        return name
    }
    ```

    Added in 0.4.6
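
To make `attrkey` and `charkey` a bit more concrete, here is a sketch of how
attributes and character content are reached in a result. The `result` object is
hand-written to mimic what `<item id="1">text</item>` is expected to look like
with the default keys; it is an assumption for illustration, not output captured
from the parser.

```javascript
// Hand-written stand-in for the result of parsing
//   <item id="1">text</item>
// with the default attrkey ('$') and charkey ('_').
var result = { item: { _: 'text', $: { id: '1' } } };

var attrkey = '$';
var charkey = '_';

var id = result.item[attrkey].id; // attribute value
var text = result.item[charkey];  // character content

console.log(id);
console.log(text);
```

If you redefine `attrkey` or `charkey` in the parser options, the same lookups
apply with your own key names.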

Options for the `Builder` class
-------------------------------

These options are specified by ``new Builder({optionName: value})``.
Possible options are:

  * `rootName` (default: `root` or the root key name): root element name to be used in case
    `explicitRoot` is `false` or to override the root element name.
  * `renderOpts` (default: `{ 'pretty': true, 'indent': ' ', 'newline': '\n' }`):
    Rendering options for xmlbuilder-js.
    * pretty: prettify generated XML
    * indent: whitespace for indentation (only when pretty)
    * newline: newline char (only when pretty)
  * `xmldec` (default: `{ 'version': '1.0', 'encoding': 'UTF-8', 'standalone': true }`):
    XML declaration attributes.
    * `xmldec.version` A version number string, e.g. 1.0
    * `xmldec.encoding` Encoding declaration, e.g. UTF-8
    * `xmldec.standalone` standalone document declaration: true or false
  * `doctype` (default: `null`): optional DTD. E.g. `{'ext': 'hello.dtd'}`
  * `headless` (default: `false`): omit the XML header. Added in 0.4.3.
  * `allowSurrogateChars` (default: `false`): allows using characters from the Unicode
    surrogate blocks.
  * `cdata` (default: `false`): wrap text nodes in `<![CDATA[ ... ]]>` instead of
    escaping when necessary. Does not add `<![CDATA[ ... ]]>` if it is not required.
    Added in 0.4.5.

`renderOpts`, `xmldec`, `doctype` and `headless` pass through to
[xmlbuilder-js](https://github.com/oozcitak/xmlbuilder-js).

Updating to new version
=======================

Version 0.2 changed the default parsing settings, but version 0.1.14 introduced
the default settings for version 0.2, so these settings can be tried before the
migration.

```javascript
var xml2js = require('xml2js');
var parser = new xml2js.Parser(xml2js.defaults["0.2"]);
```

To get the 0.1 defaults in version 0.2 you can just use
`xml2js.defaults["0.1"]` in the same place. This provides you with enough time
to migrate to the saner way of parsing in `xml2js` 0.2. We try to make the
migration as simple and gentle as possible, but some breakage cannot be
avoided.

So, what exactly did change and why? In 0.2 we changed some defaults to parse
the XML in a more universal and sane way. So we disabled `normalize` and `trim`
so `xml2js` does not cut out any text content. You can re-enable this at will, of
course. A more important change is that we return the root tag in the resulting
JavaScript structure via the `explicitRoot` setting, so you need to access the
first element. This is useful for anybody who wants to know what the root node
is and preserves more information. The last major change was to enable
`explicitArray`, so every time it is possible that one might embed more than one
sub-tag into a tag, xml2js >= 0.2 returns an array even if the array just
includes one element. This is useful when dealing with APIs that return
variable amounts of subtags.
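
As a rough illustration of what this means for calling code, the two objects
below are hand-written to show the expected result shapes for
`<root><item>a</item></root>` under each set of defaults (they are assumptions
for illustration, not captured parser output). The `asArray` helper is
hypothetical and only there to show one way of coping with both shapes during a
migration.

```javascript
// 0.1-style defaults: no root key, single children are not wrapped in arrays.
var oldStyle = { item: 'a' };

// 0.2-style defaults: explicitRoot and explicitArray are both enabled.
var newStyle = { root: { item: ['a'] } };

// Hypothetical helper: always hand back an array, whatever the setting was.
function asArray(value) {
    if (value === undefined) {
        return [];
    }
    return Array.isArray(value) ? value : [value];
}

console.log(asArray(oldStyle.item));
console.log(asArray(newStyle.root.item));
```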

Running tests, development
==========================

The development requirements are handled by npm; you just need to install them.
We also have a number of unit tests, which can be run using `npm test` directly
from the project root. This runs zap to discover all the tests and execute
them.

If you would like to contribute, keep in mind that `xml2js` is written in
CoffeeScript, so don't develop on the JavaScript files that are checked into
the repository for convenience reasons. Also, please write some unit tests to
check your behaviour and, if it is some user-facing thing, add some
documentation to this README, so people will know it exists. Thanks in advance!

Getting support
===============

Please, if you have a problem with the library, first make sure you read this
README. If you read this far, thanks, you're good. Then, please make sure your
problem really is with `xml2js`. It is? Okay, then I'll look at it. Send me a
mail and we can talk. Please don't open issues, as I don't think that is the
proper forum for support problems. Some problems might really be bugs in
`xml2js`; if so, I'll let you know to open an issue instead :)

But if you know you really found a bug, feel free to open an issue instead.