Simbody 3.7
This class provides a minimalist capability for reading and writing XML documents, as files or strings.
Public Member Functions | |
Construction | |
You can start with an empty Xml::Document or initialize it from a file. | |
Document () | |
Create an empty XML Document with default declaration and default root element with tag "_Root". | |
Document (const String &pathname) | |
Create a new XML document and initialize it from the contents of the given file name. | |
Document (const Document &source) | |
Copy constructor makes a deep copy of the entire source document; nothing is shared between the source and the copy. | |
Document & | operator= (const Document &source) |
Copy assignment frees all heap space associated with the current Xml::Document and then makes a deep copy of the source document; nothing is shared between the source and the copy. | |
~Document () | |
The destructor cleans up all heap space associated with this document. | |
void | clear () |
Restore this document to its default-constructed state. | |
Top-level node manipulation | |
These methods provide access to the top-level nodes, that is, those that are directly owned by the Xml::Document. Comment and Unknown nodes are allowed anywhere at the top level, but Text nodes are not allowed and there is just one distinguished Element node, the root element. If you want to add Text or Element nodes, add them to the root element rather than at the document level. | |
Element | getRootElement () |
Return an Element handle referencing the top-level element in this Xml::Document, known as the "root element". | |
const String & | getRootTag () const |
Shortcut for getting the tag word of the root element which is usually the document type. | |
void | setRootTag (const String &tag) |
Shortcut for changing the tag word of the root element which is usually the document type. | |
void | insertTopLevelNodeAfter (const node_iterator &afterThis, Node insertThis) |
Insert a top-level Comment or Unknown node just after the location indicated by the node_iterator, or at the end of the list if the iterator is node_end(). | |
void | insertTopLevelNodeBefore (const node_iterator &beforeThis, Node insertThis) |
Insert a top-level Comment or Unknown node just before the location indicated by the node_iterator. | |
void | eraseTopLevelNode (const node_iterator &deleteThis) |
Delete the indicated top-level node, which must not be the root element, and must not be node_end(). | |
Node | removeTopLevelNode (const node_iterator &removeThis) |
Remove the indicated top-level node from the document, returning it as an orphan rather than erasing it. | |
Iteration through top-level nodes (rarely used) | |
If you want to run through this document's top-level nodes (of which the "root element" is one), these methods provide begin and end iterators. By default you'll see all the nodes (types Comment, Unknown, and the lone top-level Element) but you can restrict the node types that you'll see via the NodeType mask. Iteration is rarely used at this top level since you almost never care about the Comment and Unknown nodes here and you can get to the root element directly using getRootElement(). | |
node_iterator | node_begin (NodeType allowed=AnyNodes) |
Obtain an iterator to all the top-level nodes or a subset restricted via the allowed NodeType mask. | |
node_iterator | node_end () const |
This node_end() iterator indicates the end of a sequence of nodes regardless of the NodeType restriction on the iterator being used. | |
XML Declaration attributes (rarely used) | |
These methods deal with the mysterious XML "declaration" line that comes at the beginning of every XML document; that is the line that begins with "<?xml" and ends with "?>". There are at most three of these attributes and they have well-defined names that are always the same (default values shown): version="1.0", encoding="UTF-8", and standalone="yes". You can examine and change these attributes with the methods in this section; however, unless you really know what you're doing you should just leave the declaration alone, and you'll get reasonable behavior automatically. | |
String | getXmlVersion () const |
Returns the Xml "version" attribute as a string (from the declaration line at the beginning of the document). | |
String | getXmlEncoding () const |
Returns the Xml "encoding" attribute as a string (from the declaration line at the beginning of the document). | |
bool | getXmlIsStandalone () const |
Returns the Xml "standalone" attribute as a bool (from the declaration line at the beginning of the document); default is true ("yes" in a file), meaning that the document can be parsed correctly without any other documents. | |
void | setXmlVersion (const String &version) |
Set the Xml "version" attribute; this will be written to the "declaration" line which is first in any Xml document. | |
void | setXmlEncoding (const String &encoding) |
Set the Xml "encoding" attribute; this doesn't affect the in-memory representation but can affect how the document gets written out. | |
void | setXmlIsStandalone (bool isStandalone) |
Set the Xml "standalone" attribute; this is normally true (corresponding to standalone="yes") and won't appear in the declaration line in that case when we write it out. | |
Friends | |
class | Node |
Related Functions | |
(Note that these are not member functions.) | |
std::ostream & | operator<< (std::ostream &o, const Document &doc) |
Output a "pretty printed" textual representation of the given Xml::Document to an std::ostream, using the document's current indent string for formatting. | |
Serializing and I/O | |
These methods convert between the in-memory representation of the XML document and files or strings. | |
void | readFromFile (const String &pathname) |
Read the contents of this Xml::Document from the file whose pathname is supplied. | |
void | writeToFile (const String &pathname) const |
Write the contents of this in-memory Xml::Document to the file whose pathname is supplied. | |
void | readFromString (const String &xmlDocument) |
Read the contents of this Xml::Document from the supplied string. | |
void | readFromString (const char *xmlDocument) |
Alternate form that reads from a null-terminated C string (char*) rather than a C++ string object. | |
void | writeToString (String &xmlDocument, bool compact=false) const |
Write the contents of this in-memory Xml::Document to the supplied string. | |
void | setIndentString (const String &indent) |
Set the string to be used for indentation when we produce a "pretty-printed" serialized form of this document. The default is to use four spaces for each level of indentation. | |
const String & | getIndentString () const |
Return the current value of the indent string. The default is four spaces. | |
static void | setXmlCondenseWhiteSpace (bool shouldCondense) |
Set global mode to control whether white space is preserved or condensed down to a single space (affects all subsequent document reads; not document specific). | |
static bool | isXmlWhiteSpaceCondensed () |
Return the current setting of the global "condense white space" option. | |
This class provides a minimalist capability for reading and writing XML documents, as files or strings.
This is based with gratitude on the excellent open source XML parser TinyXML (http://www.grinninglizard.com/tinyxml/). Note that this is a non-validating parser, meaning it deals only with the XML file itself and not with a Document Type Definition (DTD), XML Schema, or any other description of the XML file's expected contents. Instead, the structure of your code that uses this class encodes the expected structure and contents of the XML document.
Our in-memory model of an XML document is simplified even further than TinyXML's. There is a lot to know about XML; you could start here: http://en.wikipedia.org/wiki/XML. However, everything you need to know in order to read and write XML documents with the SimTK::Xml::Document class is described below.
Much of the detailed documentation is in the class Xml::Element; be sure to look there as well as at this overview.
We consider an XML document to be a tree of "Nodes". There are only four types of nodes: Comments, Unknowns, Text, and Elements. Only Elements can contain Text and other nodes, including recursively child Element nodes. Elements can also have "Attributes" which are name:value pairs (not nodes).
The XML document as a whole is represented by an object of class Xml::Document. The Xml::Document object directly contains a short list of nodes, consisting only of Comments, Unknowns, and a single Element called the "root element". The tag word associated with the root element is called the "root tag" and conventionally identifies the kind of document this is. For example, XML files produced by VTK begin with a root tag "<VTKFile>".
We go to some pain to make sure every Xml::Document fits the above model so that you don't have to think about anything else. For example, if the file as read in has multiple root-level elements, or has document-level text, we will enclose all the element and text nodes within document start tag "<_Root>" and end tag "</_Root>" thus making it fit the description above. We call this "canonicalizing" the document.
Element nodes can be classified into "value elements" and "compound elements". A value element is a "leaf" element (no child elements) that contains at most one Text node. For example, a document might contain value elements such as a "<vector>" element holding three numbers, or an empty "<preferences/>" element.
Each such element has a unique value, so it makes sense to talk about "the" value of the element (the empty "preferences" element has a null value). Value elements are very common in XML documents, and the Xml::Element class makes them very easy to work with. For example, if Element elt is such a "<vector>" element, you could retrieve its value as a Vec3 with a single call such as elt.getValueAs<Vec3>().
This would automatically throw an error if the element wasn't a value element or if its value didn't have the right format to convert to a Vec3.
Note that it is okay for a value element to have attributes; those are ignored in determining the element's value. Any element that is not a value element is a "compound element", meaning it has either child elements and/or more than one Text node.
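As a rough, library-independent illustration of what converting a value element's text to a three-component vector involves, the sketch below parses text such as the contents of a "<vector>" element into three numbers and throws if the format is wrong. The function name parseVec3Text and the use of std::array are assumptions made for this sketch only; they are not part of the Simbody API, which performs this conversion for you.

```cpp
#include <array>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not Simbody API): convert the text of a value element,
// for example the "1.2 -4 2e-3" inside a <vector> element, into three doubles.
std::array<double, 3> parseVec3Text(const std::string& text) {
    std::istringstream in(text);
    std::array<double, 3> v{};
    for (double& x : v) {
        if (!(in >> x))
            throw std::runtime_error("expected three numbers in '" + text + "'");
    }
    // Anything left over other than white space means the format is wrong.
    std::string extra;
    if (in >> extra)
        throw std::runtime_error("unexpected trailing text '" + extra + "'");
    return v;
}
```

A caller would pass the element's text and catch std::runtime_error to detect a badly formatted value, which mirrors the "throw an error on the wrong format" behavior described above.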
To read an XML document, you create an Xml::Document object and tell it to read in the document from a file or from a string. The document will be parsed and canonicalized into the in-memory model described above. Then to rummage around in the document, you ask the Xml::Document object for its root element, and check the root tag to see that it is the type of document you are expecting. You can check the root element's attributes, and then process its contents (child nodes). Iterators are provided for running through all the attributes, all the child nodes contained in the element, or all the child nodes of a particular type. For a child node that is an element, you check the tag and then pass the element to some piece of code that knows how to deal with that kind of element and its children recursively.
A typical program reads in an Xml file such as "example.xml", prints the root tag, and then prints the types of all the document-level nodes by looping, in STL iterator style, from node_begin() to node_end().
Exactly one of those document-level nodes will have type "ElementNode"; that is the root element. To print out the types of nodes contained in the root element, you would obtain it with getRootElement() and iterate over its child nodes in the same way.
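The effect of restricting iteration with a node-type mask can be pictured with the standalone sketch below. It does not use Simbody; SketchNodeType, SketchNode, and selectNodes are hypothetical names, and the bit values are assumptions chosen for this illustration rather than Simbody's actual NodeType constants.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for this sketch only; the bit values are assumptions,
// not Simbody's actual NodeType constants.
enum SketchNodeType : unsigned {
    SketchUnknown  = 1u,
    SketchComment  = 2u,
    SketchText     = 4u,
    SketchElement  = 8u,
    SketchAnyNodes = 15u
};

struct SketchNode {
    SketchNodeType type;
    std::string    content;
};

// Return, in document order, the nodes whose type is permitted by the
// 'allowed' bit mask. The default mask lets every node through.
std::vector<SketchNode> selectNodes(const std::vector<SketchNode>& nodes,
                                    unsigned allowed = SketchAnyNodes) {
    std::vector<SketchNode> out;
    for (const SketchNode& n : nodes)
        if (n.type & allowed) out.push_back(n);
    return out;
}
```

With a top-level list holding a comment, the root element, and an unknown, the default mask yields all three nodes while a mask of SketchElement yields only the root element.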
You can insert, remove, and modify nodes and attributes in a document, or create a document from scratch. Then you can write the results in a "pretty-printed" or compact format to a file or a string; for pretty-printing you can override the default indentation string (four spaces). Whenever we write an XML document, we write it in canonical format, regardless of how it looked when we found it.
At the document level, you can only insert Comment and Unknown nodes. Text and Element nodes can be inserted only at the root element level and below.
This section provides detailed information about the syntax of XML files as we accept and produce them. You won't have to know these details to read and write XML files using the SimTK::Xml::Document class, but you may find this helpful for when you have to look at an XML file in a text editor.
(Ignore the quote characters below; those are present so I can get this text through Doxygen.)
An XML file contains a single document which consists at the top level of a declaration, optional comments and unknowns, and a single root element.
Elements can be containers of other nodes and are thus the basis for the tree structure of XML files. Elements can contain comments, unknowns, text, child elements (recursively), and attributes.
A declaration (see below) also has attributes, but there are only three: version, encoding, and standalone ('yes' or 'no'). Unknowns are constructs found in the file that are not recognized; they might be errors but they are likely to be more sophisticated uses of XML that our feeble parser doesn't understand. Unknowns are tags where the tag word doesn't begin with a letter or underscore and isn't one of the very few other tags we recognize, like comments. As an example, a DTD tag (one beginning with "<!DOCTYPE") would come through as an Unknown node here.
The top-level structure we expect of a well-formed XML document is a declaration line followed by a single root element, written here as "<roottag> contents </roottag>", with comments and unknowns allowed before and after the root element. We will impose this structure on XML documents that don't have it, which allows us to simplify the in-memory model as discussed above.
That is, the first line should be a declaration, most commonly exactly the characters <?xml version="1.0" encoding="UTF-8"?>, without the "standalone" attribute which will default to "yes". If we don't see a declaration when reading an XML document, we'll assume we read that one. Then the document should contain exactly one root element representing the type of document and document-level attributes. The tag for the root element is not literally "roottag" but some name that makes sense for the given document. Note that the root element is an ordinary element so "contents" can contain text and child elements (as well as comments and unknowns).
When reading an XML document, if it has exactly one document-level element and no document-level text, we'll take the document as-is. If there is more than one document-level element, or we find some document-level text, we'll assume that the root element is missing and act as though we had seen a root element "<_Root>" at the beginning and "</_Root>" at the end so the root tag will be "_Root". Note that this means that we will interpret even a plain text file as a well-formed XML document, as though its text had been enclosed between "<_Root>" and "</_Root>".
The resulting XML document has a single document-level element and that element contains one Text node whose value is the original text.
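The rule for deciding whether a "_Root" wrapper is supplied can be summarized in a few lines of standalone code. This is only a sketch of the decision as described in the text, not the parser's actual implementation; TopLevelItem and canonicalRootTag are hypothetical names, and treating a document with no top-level element at all as "_Root" is an assumption based on the default-constructed document.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one document-level item, for illustration only.
struct TopLevelItem {
    enum Kind { Comment, Unknown, Text, Element } kind;
    std::string tag;   // meaningful only when kind == Element
};

// Decide which root tag a document ends up with under the rule described in
// the text: a lone top-level element with no top-level text is kept as-is;
// anything else is treated as if it were wrapped in "_Root".
std::string canonicalRootTag(const std::vector<TopLevelItem>& items) {
    int numElements = 0;
    bool hasText = false;
    std::string loneTag;
    for (const TopLevelItem& item : items) {
        if (item.kind == TopLevelItem::Element) {
            ++numElements;
            loneTag = item.tag;
        } else if (item.kind == TopLevelItem::Text) {
            hasText = true;
        }
    }
    if (numElements == 1 && !hasText) return loneTag;  // document taken as-is
    return "_Root";                                    // wrapper is supplied
}
```

Comments and unknowns at the top level do not affect the outcome, which matches the statement that they are allowed anywhere at the document level.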
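SimTK::Xml::Document::Document ()
Create an empty XML Document with default declaration and default root element with tag "_Root".
If you were to print out this document now you would see only the default declaration line followed by an empty "_Root" element.
explicit SimTK::Xml::Document::Document (const String &pathname)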
Create a new XML document and initialize it from the contents of the given file name.
An exception will be thrown if the file doesn't exist or can't be parsed.
SimTK::Xml::Document::Document (const Document &source)
Copy constructor makes a deep copy of the entire source document; nothing is shared between the source and the copy.
SimTK::Xml::Document::~Document ()
The destructor cleans up all heap space associated with this document.
Document& SimTK::Xml::Document::operator= (const Document &source)
Copy assignment frees all heap space associated with the current Xml::Document and then makes a deep copy of the source document; nothing is shared between the source and the copy.
void SimTK::Xml::Document::clear ()
Restore this document to its default-constructed state.
void SimTK::Xml::Document::readFromFile (const String &pathname)
Read the contents of this Xml::Document from the file whose pathname is supplied.
This first clears the current document so the new one completely replaces the old one.
void SimTK::Xml::Document::writeToFile (const String &pathname) const
Write the contents of this in-memory Xml::Document to the file whose pathname is supplied.
The file will be created if it doesn't exist, overwritten if it does exist. The file will be "pretty-printed" using the current indent string.
void SimTK::Xml::Document::readFromString (const String &xmlDocument)
Read the contents of this Xml::Document from the supplied string.
This first clears the current document so the new one completely replaces the old one.
void SimTK::Xml::Document::readFromString (const char *xmlDocument)
Alternate form that reads from a null-terminated C string (char*) rather than a C++ string object.
This would otherwise be implicitly converted to string first which would require copying.
void SimTK::Xml::Document::writeToString (String &xmlDocument, bool compact=false) const
Write the contents of this in-memory Xml::Document to the supplied string.
The string is cleared first, so it will be completely overwritten. Normally the output is "pretty-printed" as it is for a file, but if you set compact to true the tabs and newlines will be suppressed to make a more compact representation.
void SimTK::Xml::Document::setIndentString (const String &indent)
Set the string to be used for indentation when we produce a "pretty-printed" serialized form of this document. The default is to use four spaces for each level of indentation.
const String& SimTK::Xml::Document::getIndentString () const
Return the current value of the indent string. The default is four spaces.
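To make the roles of the indent string and the compact flag concrete, here is a small standalone sketch of a serializer that applies one copy of the indent string per nesting level, or nothing at all in compact mode. SketchElt and serialize are hypothetical names for this illustration; the exact text Simbody emits (for instance, how an empty element is written) may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical element model for this sketch; not the Simbody Element class.
struct SketchElt {
    std::string tag;
    std::vector<SketchElt> children;
};

// Append a textual form of 'e' to 'out'. Each nesting level is prefixed with
// one copy of 'indent'. With compact=true no indentation or newlines are added.
void serialize(const SketchElt& e, const std::string& indent, bool compact,
               int level, std::string& out) {
    const std::string nl = compact ? "" : "\n";
    std::string pad;
    if (!compact)
        for (int i = 0; i < level; ++i) pad += indent;
    if (e.children.empty()) {
        out += pad + "<" + e.tag + "/>" + nl;
        return;
    }
    out += pad + "<" + e.tag + ">" + nl;
    for (const SketchElt& c : e.children)
        serialize(c, indent, compact, level + 1, out);
    out += pad + "</" + e.tag + ">" + nl;
}
```

Calling serialize with a two-space indent on an element "a" that has one empty child "b" produces three lines with the child indented by two spaces; with compact set to true the same tree comes out on a single line.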
static void SimTK::Xml::Document::setXmlCondenseWhiteSpace (bool shouldCondense)
Set global mode to control whether white space is preserved or condensed down to a single space (affects all subsequent document reads; not document specific).
The default is to condense.
static bool SimTK::Xml::Document::isXmlWhiteSpaceCondensed ()
Return the current setting of the global "condense white space" option.
Note that this option affects all Xml reads; it is not document specific.
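What "condensed down to a single space" means can be shown with a short standalone function. This is a sketch of the idea only: condenseWhiteSpace is a hypothetical name, and whether the real parser also trims leading or trailing white space is not covered here.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: collapse every run of white space (spaces, tabs,
// newlines) into a single space. Leading and trailing runs are collapsed
// too, not removed; the real parser's treatment of those may differ.
std::string condenseWhiteSpace(const std::string& text) {
    std::string out;
    bool inRun = false;
    for (unsigned char c : text) {
        if (std::isspace(c)) {
            if (!inRun) out += ' ';
            inRun = true;
        } else {
            out += static_cast<char>(c);
            inRun = false;
        }
    }
    return out;
}
```

When condensing is turned off, the text would instead be kept exactly as it appears in the file.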
Element SimTK::Xml::Document::getRootElement ()
Return an Element handle referencing the top-level element in this Xml::Document, known as the "root element".
The tag word of this element is usually the type of document. This is the only top-level element; all others are its children and descendants. Once you have the root Element handle, you can also use any of the Element methods to manipulate it. If you need a node_iterator that refers to the root element (perhaps to use one of the top-level insert methods), use node_begin() with a NodeType filter that admits only element nodes, for example node_begin(Xml::ElementNode).
That works since there is only one element at this level.
const String& SimTK::Xml::Document::getRootTag () const
Shortcut for getting the tag word of the root element which is usually the document type.
This is the same as getRootElement().getElementTag().
void SimTK::Xml::Document::setRootTag (const String &tag)
Shortcut for changing the tag word of the root element which is usually the document type.
This is the same as getRootElement().setElementTag(tag).
void SimTK::Xml::Document::insertTopLevelNodeAfter (const node_iterator &afterThis, Node insertThis)
Insert a top-level Comment or Unknown node just after the location indicated by the node_iterator, or at the end of the list if the iterator is node_end().
The iterator must refer to a top-level node. The Xml::Document takes over ownership of the Node, which must be a Comment or Unknown node and must have been an orphan. The supplied Node handle will retain a reference to the node within the document and can still be used to make changes, but will no longer be an orphan.
void SimTK::Xml::Document::insertTopLevelNodeBefore (const node_iterator &beforeThis, Node insertThis)
Insert a top-level Comment or Unknown node just before the location indicated by the node_iterator.
See insertTopLevelNodeAfter() for details.
void SimTK::Xml::Document::eraseTopLevelNode (const node_iterator &deleteThis)
Delete the indicated top-level node, which must not be the root element, and must not be node_end().
That is, it must be a top-level Comment or Unknown node which will be removed from the Xml::Document and deleted. The iterator is invalid after this call; be sure not to use it again. Also, there must not be any handles referencing the now-deleted node.
Node SimTK::Xml::Document::removeTopLevelNode (const node_iterator &removeThis)
Remove the indicated top-level node from the document, returning it as an orphan rather than erasing it.
The node must not be the root element, and must not be node_end(). That is, it must be a top-level Comment or Unknown node which will be removed from the Xml::Document and returned as an orphan Node. The iterator is invalid after this call; be sure not to use it again.
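The difference between erasing a node and removing it as an orphan is essentially a difference in who owns the node afterwards. The standalone sketch below models that with std::unique_ptr in a std::list; this ownership model, and the names OwnedNode, NodeList, eraseNode, and removeNode, are assumptions for illustration and do not reflect how Simbody's Node handles are implemented.

```cpp
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical node and container types for this sketch only.
struct OwnedNode { std::string text; };
using NodeList = std::list<std::unique_ptr<OwnedNode>>;

// "Erase": take the node out of the list and destroy it. The iterator passed
// in is invalid afterwards.
void eraseNode(NodeList& nodes, NodeList::iterator where) {
    nodes.erase(where);   // the unique_ptr deletes the node
}

// "Remove": take the node out of the list but hand it back to the caller as
// an orphan that no container owns. The iterator is invalid afterwards.
std::unique_ptr<OwnedNode> removeNode(NodeList& nodes, NodeList::iterator where) {
    std::unique_ptr<OwnedNode> orphan = std::move(*where);
    nodes.erase(where);
    return orphan;
}
```

In both cases the iterator must not be reused after the call, which is the same caution given for the top-level erase and remove methods.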
node_iterator SimTK::Xml::Document::node_begin (NodeType allowed=AnyNodes)
Obtain an iterator to all the top-level nodes or a subset restricted via the allowed NodeType mask.
node_iterator SimTK::Xml::Document::node_end () const
This node_end() iterator indicates the end of a sequence of nodes regardless of the NodeType restriction on the iterator being used.
String SimTK::Xml::Document::getXmlVersion () const
Returns the Xml "version" attribute as a string (from the declaration line at the beginning of the document).
String SimTK::Xml::Document::getXmlEncoding () const
Returns the Xml "encoding" attribute as a string (from the declaration line at the beginning of the document).
bool SimTK::Xml::Document::getXmlIsStandalone () const
Returns the Xml "standalone" attribute as a bool (from the declaration line at the beginning of the document); default is true ("yes" in a file), meaning that the document can be parsed correctly without any other documents.
We won't include "standalone" in the declaration line for any Xml documents we generate unless the value is false ("no" in a file).
void SimTK::Xml::Document::setXmlVersion (const String &version)
Set the Xml "version" attribute; this will be written to the "declaration" line which is first in any Xml document.
void SimTK::Xml::Document::setXmlEncoding (const String &encoding)
Set the Xml "encoding" attribute; this doesn't affect the in-memory representation but can affect how the document gets written out.
void SimTK::Xml::Document::setXmlIsStandalone (bool isStandalone)
Set the Xml "standalone" attribute; this is normally true (corresponding to standalone="yes") and won't appear in the declaration line in that case when we write it out.
If you set this to false then standalone="no" will appear in the declaration line when it is written.
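The way the three declaration attributes combine into the declaration line, including the omission of "standalone" when it has its default value, can be sketched as follows. makeDeclarationLine is a hypothetical helper written for this illustration, not a Simbody function, and the exact spacing of the line Simbody writes is an assumption.

```cpp
#include <string>

// Hypothetical helper (not Simbody API): build the declaration line from the
// three attributes, leaving out "standalone" when it has its default value
// of true ("yes").
std::string makeDeclarationLine(const std::string& version,
                                const std::string& encoding,
                                bool isStandalone) {
    std::string line = "<?xml version=\"" + version + "\""
                     + " encoding=\"" + encoding + "\"";
    if (!isStandalone) line += " standalone=\"no\"";
    return line + "?>";
}
```

With version "1.0", encoding "UTF-8", and isStandalone true, the result contains only the version and encoding attributes; passing false adds standalone="no" before the closing "?>".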
friend class Node
std::ostream& operator<< (std::ostream &o, const Document &doc) (related function)
Output a "pretty printed" textual representation of the given Xml::Document to an std::ostream, using the document's current indent string for formatting.