mdz_xml Overview and Reference

mdz_xml is a very lightweight, versatile and fast C library for parsing XML documents and building a DOM structure. The source code is highly portable and conforms to the ANSI C 89/90 standard. Builds are available for Win32/Win64, Linux, FreeBSD, Android and macOS. Please refer to the mdz_xml Wiki for API details.

mdz_xml Advantages

1. High portability: the whole code conforms to the ANSI C 89/90 standard.
2. Few dependencies: mdz_xml functions depend only on the standard C library memory-management/access functions. You can use the library in your code without any further dependencies except the standard platform libraries/APIs.
3. Very fast: parsing an XML document is ca. 40% faster than with "pugiXML" (which is considered one of the fastest XML parsers) and ca. 20 times faster than with "MSXML" from Microsoft.
4. Parsing of very large XML documents: for the 64-bit version the only limitation is your available RAM. The parser uses memory very efficiently, so parsing very large files (2 GB and bigger) is possible with 50% less RAM consumption than "pugiXML" or "MSXML" (which are very memory-efficient too).
5. Parse error position: if there is a parse error, you get the position of the invalid (non-parsed) character in the XML document.
6. Flexibility: you can parse read-only XML documents. Writable XML documents can be parsed even faster and with less memory consumption than read-only ones. You can also parse files; file reading is very fast too.
7. Navigation: you can navigate through the XML DOM structure to collect the necessary information from elements/children, attributes or texts.
8. Control: you can allow/disallow characters to be parsed in element names, attributes, texts, etc. The parser itself handles <?..?> instructions, <!--...--> comments, and mixed content.
mdz_xml API Reference
mdz_xml is a library for parsing XML documents and building a DOM structure.

The parser processes only ASCII/ANSI single-byte XML documents. The following rules and limitations currently apply:
- "Processing Instructions" (instructions inside <? ... ?>) are parsed but not processed and not inserted into the DOM
- Comments (XML code inside <!-- ... -->) are parsed but not processed and not inserted into the DOM
- Mixed content is supported
- The following characters are treated as space characters: '\n' (aka LF, code 0x0a), '\r' (aka CR, code 0x0d), '\t' (aka TAB, code 0x09), ' ' (aka SPACE, code 0x20)
- An element name may start with the following characters: ':', '_', [A..Z], [a..z]
- An element name may contain the following characters: ':', '_', '-', '.', [A..Z], [a..z], [0..9]
- Restrictions on element names also apply to attribute names
- Element text may contain every ANSI/ASCII character except 0 ('\0', code 0x00) and '<'
- Restrictions on element text also apply to attribute values
- CDATA sections are not parsed and not processed (a parsing error is returned)
- Whitespace in element text/attribute values is preserved
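The character rules listed above can be expressed as a few small predicates. The following is only a sketch of the default rules as described here, not the library's implementation; the function names (is_xml_space, is_name_start, is_name_char, is_valid_name, is_text_char) are hypothetical and are not part of the mdz_xml API. Settings changed through mdz_xml_allowLetter() are not reflected.

```c
#include <stddef.h>

/* Space characters per the list above: LF, CR, TAB, SPACE. */
int is_xml_space(unsigned char c)
{
    return c == '\n' || c == '\r' || c == '\t' || c == ' ';
}

/* Allowed first character of an element/attribute name: ':', '_', A-Z, a-z. */
int is_name_start(unsigned char c)
{
    return c == ':' || c == '_' ||
           (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Allowed character inside a name: name-start set plus '-', '.', 0-9. */
int is_name_char(unsigned char c)
{
    return is_name_start(c) || c == '-' || c == '.' ||
           (c >= '0' && c <= '9');
}

/* Returns 1 if name[0..length) satisfies the default name rules, else 0. */
int is_valid_name(const char* name, size_t length)
{
    size_t i;

    if (name == NULL || length == 0) {
        return 0;
    }
    if (!is_name_start((unsigned char)name[0])) {
        return 0;
    }
    for (i = 1; i < length; ++i) {
        if (!is_name_char((unsigned char)name[i])) {
            return 0;
        }
    }
    return 1;
}

/* Element text / attribute value: any character except '\0' and '<'. */
int is_text_char(unsigned char c)
{
    return c != '\0' && c != '<';
}
```

A pre-check like this may be useful for validating names before generating XML that is later fed to the parser; the parser itself remains the authority on what it accepts.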
Library init/uninit functions: mdz_xml_init, mdz_xml_uninit
Parser create/destroy functions: mdz_xml_create, mdz_xml_destroy
Parse functions: mdz_xml_parse, mdz_xml_parseWritable, mdz_xml_parseFile
Service functions: mdz_xml_allowLetter, mdz_xml_isAllowedLetter, mdz_xml_setParseCallback, mdz_xml_removeParseCallback, mdz_xml_getTotalElements
Navigation functions: mdz_xml_getRootElement, mdz_xml_getElement, mdz_xml_getAttribute

mdz_xml_init

Initializes the mdz_xml library. This function should be called before any other function of the library.

mdz_bool mdz_xml_init(const uint32_t* pFirstNameHash, const uint32_t* pLastNameHash, const uint32_t* pEmailHash, const uint32_t* pLicenseHash);

Parameter | Description |
---|---|
pFirstNameHash | user first name hash code |
pLastNameHash | user last name hash code |
pEmailHash | user e-mail hash code |
pLicenseHash | license hash code |
Return | Description |
---|---|
mdz_true | if the initialization has succeeded, otherwise mdz_false |

mdz_xml_uninit

Un-initializes the mdz_xml library and frees the corresponding memory allocations.

void mdz_xml_uninit(void);

mdz_xml_create

Creates an XML-parser instance.

struct mdz_Xml* mdz_xml_create(void);

Return | Description |
---|---|
NULL | if the library is not initialized with a mdz_xml_init() call |
NULL | if memory allocation failed |
Result | pointer to the instance for use in other mdz_xml functions |

mdz_xml_destroy

Destroys the XML-parser instance, including the underlying data. After destroying, the pointer to the instance is set to NULL.

void mdz_xml_destroy(struct mdz_Xml** ppXml);

Parameter | Description |
---|---|
ppXml | pointer to pointer to XML-parser instance returned by mdz_xml_create() |

mdz_xml_parse

Read-only parsing of the XML document at pcText with length nTextLength. After successful parsing, the DOM structure is built with its root at mdz_Xml.m_pRootElement.

mdz_bool mdz_xml_parse(const struct mdz_Xml* pXml, const char* pcText, size_t nTextLength, mdz_bool bTrimSpaces);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
pcText | XML document to parse (single-byte string) |
nTextLength | length of the XML document to parse, in bytes |
bTrimSpaces | whether leading/trailing spaces in element text should be trimmed/deleted (makes the in-memory size of the DOM structure smaller) |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_false | if license is invalid/expired (MDZ_XML_ERROR_LICENSE) |
mdz_false | if pcText is NULL or empty or nTextLength is too small (MDZ_XML_ERROR_PARAM) |
mdz_false | if there are memory-allocation problems (MDZ_XML_ERROR_MEMORY) |
mdz_false | if there are parsing problems (MDZ_XML_ERROR_PARSE). In this case mdz_Xml.m_nErrorPos contains the position of the unexpected/erroneous character in the XML document |
mdz_true | parsing succeeded |

mdz_xml_parseWritable

Writable parsing of the XML document at pcText with length nTextLength. During this parse the XML document can be modified (without losing XML information) to significantly decrease the in-memory size of the DOM structure compared to the mdz_xml_parse() result.

mdz_bool mdz_xml_parseWritable(const struct mdz_Xml* pXml, char* pcText, size_t nTextLength, mdz_bool bTrimSpaces);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
pcText | XML document to parse (single-byte string) |
nTextLength | length of the XML document to parse, in bytes |
bTrimSpaces | whether leading/trailing spaces in element text should be trimmed/deleted (makes the in-memory size of the DOM structure smaller) |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_false | if license is invalid/expired (MDZ_XML_ERROR_LICENSE) |
mdz_false | if pcText is NULL or empty or nTextLength is too small (MDZ_XML_ERROR_PARAM) |
mdz_false | if there are memory-allocation problems (MDZ_XML_ERROR_MEMORY) |
mdz_false | if there are parsing problems (MDZ_XML_ERROR_PARSE). In this case mdz_Xml.m_nErrorPos contains the position of the unexpected/erroneous character in the XML document |
mdz_true | parsing succeeded |

mdz_xml_parseFile

Reads an XML document from a file and parses it, using writable parsing if bParseWritable allows it.

mdz_bool mdz_xml_parseFile(const struct mdz_Xml* pXml, const char* pcFileName, mdz_bool bParseWritable, mdz_bool bTrimSpaces);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
pcFileName | XML document filename, including path. The file content is treated as a single-byte string |
bParseWritable | whether writable parsing of the XML document is allowed. The file content is not changed, only the XML document loaded from the file |
bTrimSpaces | whether leading/trailing spaces in element text should be trimmed/deleted (makes the in-memory size of the DOM structure smaller) |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_false | if license is invalid/expired (MDZ_XML_ERROR_LICENSE) |
mdz_false | if pcFileName is NULL or empty (MDZ_XML_ERROR_PARAM) |
mdz_false | if there are file-reading problems (MDZ_XML_ERROR_FILE) |
mdz_false | if there are memory-allocation problems (MDZ_XML_ERROR_MEMORY) |
mdz_false | if there are parsing problems (MDZ_XML_ERROR_PARSE). In this case mdz_Xml.m_nErrorPos contains the position of the unexpected/erroneous character in the XML document |
mdz_true | parsing succeeded |

mdz_xml_allowLetter

Allows/disallows the letter cLetter in the kind of text defined by enTextType. Please note: there are no checks on letters, so invalid settings (like allowing '<' in an element name) may break parsing functionality. This function changes global settings that are used by all parser instances.

mdz_bool mdz_xml_allowLetter(enum mdz_xml_text_type enTextType, unsigned char cLetter, mdz_bool bAllow);

Parameter | Description |
---|---|
enTextType | text type for allowing/disallowing |
cLetter | letter to allow/disallow |
bAllow | mdz_false to disallow, otherwise allow |
Return | Description |
---|---|
mdz_false | if enTextType is out of range |
mdz_true | setting succeeded |

mdz_xml_isAllowedLetter

Checks if the letter cLetter is allowed in the kind of text defined by enTextType.

mdz_bool mdz_xml_isAllowedLetter(enum mdz_xml_text_type enTextType, unsigned char cLetter);

Parameter | Description |
---|---|
enTextType | text type to check |
cLetter | letter to check |
Return | Description |
---|---|
mdz_false | if enTextType is out of range or the letter is disallowed |
mdz_true | letter is allowed |

mdz_xml_setParseCallback

Sets a callback for reporting the percentage of the XML document processed during parsing.

mdz_bool mdz_xml_setParseCallback(const struct mdz_Xml* pXml, mdz_ProgressCallbackFunc pCallbackFunc, unsigned char nPercentProCall);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
pCallbackFunc | callback function, reports the percentage of processing [0..100] in parameter nPercent |
nPercentProCall | how many percent of parsing should pass between callback calls. The value should be in [1..50] |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_false | if pCallbackFunc == NULL, or nPercentProCall == 0, or nPercentProCall > 50 (MDZ_XML_ERROR_PARAM) |
mdz_true | callback is set |

mdz_xml_removeParseCallback

Removes the callback set in mdz_xml_setParseCallback().

mdz_bool mdz_xml_removeParseCallback(const struct mdz_Xml* pXml);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_true | callback is removed |

mdz_xml_getTotalElements

Gets the total number of parsed XML elements (may be used for progress-reporting services).

size_t mdz_xml_getTotalElements(const struct mdz_Xml* pXml);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
Return | Description |
---|---|
SIZE_MAX | if pXml == NULL |
count of elements | otherwise |

mdz_xml_getRootElement

Gets the Root-Element of the parser. The Root-Element always exists after successful parsing and includes the root element of the XML document as a child, or multiple elements as children (if the XML document contains multiple roots).

mdz_bool mdz_xml_getRootElement(const struct mdz_Xml* pXml, struct mdz_XmlElement* pRootElement);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
pRootElement | where the Root-Element data should be written |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_false | if license is invalid/expired (MDZ_XML_ERROR_LICENSE) |
mdz_false | if pRootElement is NULL (MDZ_XML_ERROR_PARAM) |
mdz_false | if last parsing failed (MDZ_XML_ERROR_PARSE) |
mdz_true | operation succeeded |

mdz_xml_getElement

Gets an Element (or Text, for XML with mixed content) from the parser.

mdz_bool mdz_xml_getElement(const struct mdz_Xml* pXml, const void* pWhereElement, struct mdz_XmlElement* pElement);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
pWhereElement | pointer to the Element (or Text) to get. Use Element.m_pFirstChild to retrieve the first element in the children list, or Element.m_pNext to retrieve the next element in the list |
pElement | where the Element (or Text) data should be written |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_false | if license is invalid/expired (MDZ_XML_ERROR_LICENSE) |
mdz_false | if pWhereElement is NULL or pElement is NULL (MDZ_XML_ERROR_PARAM) |
mdz_false | if last parsing failed (MDZ_XML_ERROR_PARSE) |
mdz_false | if pWhereElement is invalid (MDZ_XML_ERROR_ELEMENT) |
mdz_true | operation succeeded |

mdz_xml_getAttribute

Gets an element attribute from the parser.

mdz_bool mdz_xml_getAttribute(struct mdz_Xml* pXml, const void* pWhereElement, size_t nAttributeIndex, struct mdz_XmlAttribute* pAttribute);

Parameter | Description |
---|---|
pXml | pointer to XML-parser instance returned by mdz_xml_create() |
pWhereElement | pointer to the Element (or Text) to get the attribute for. Use Element.m_pThis to retrieve an attribute of the element |
nAttributeIndex | 0-based index of the attribute to get |
pAttribute | where the attribute data should be written |
Return | Description |
---|---|
mdz_false | if pXml == NULL |
mdz_false | if license is invalid/expired (MDZ_XML_ERROR_LICENSE) |
mdz_false | if pWhereElement is NULL or pAttribute is NULL (MDZ_XML_ERROR_PARAM) |
mdz_false | if last parsing failed (MDZ_XML_ERROR_PARSE) |
mdz_false | if pWhereElement is invalid (MDZ_XML_ERROR_ELEMENT) |
mdz_false | if nAttributeIndex is out of attributes range (MDZ_XML_ERROR_INDEX) |
mdz_false | if there is some memory-management problem (MDZ_XML_ERROR_MEMORY) |
mdz_true | operation succeeded |
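The navigation functions describe a first-child/next-sibling layout (Element.m_pFirstChild, Element.m_pNext). The sketch below shows how such a layout is typically walked. It is a self-contained illustration only: struct demo_node and demo_count_nodes are hypothetical stand-ins, not the library's types or functions. With mdz_xml itself, each step would presumably go through mdz_xml_getElement() rather than dereferencing pointers directly.

```c
#include <stddef.h>

/* Hypothetical stand-in for a DOM node; NOT the library's mdz_XmlElement. */
struct demo_node {
    const char* name;
    const struct demo_node* first_child; /* plays the role of m_pFirstChild */
    const struct demo_node* next;        /* plays the role of m_pNext */
};

/* Counts the given node plus all of its descendants (depth-first). */
size_t demo_count_nodes(const struct demo_node* node)
{
    size_t count;
    const struct demo_node* child;

    if (node == NULL) {
        return 0;
    }
    count = 1; /* the node itself */
    /* Visit the first child, then follow the sibling chain. */
    for (child = node->first_child; child != NULL; child = child->next) {
        count += demo_count_nodes(child);
    }
    return count;
}
```

The same loop shape (start at the first child, advance through the next pointer until it is NULL) applies to any per-element work, such as collecting names or reading attributes.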
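When a parse function fails with MDZ_XML_ERROR_PARSE, the position of the offending character is available in mdz_Xml.m_nErrorPos. For error messages it is often handy to turn that position into a line/column pair. The sketch below assumes the position is a 0-based byte offset into the document, which is an assumption and not stated in this reference; demo_offset_to_line_col is a hypothetical helper, not part of the mdz_xml API.

```c
#include <stddef.h>

/*
 * Converts a byte offset into 1-based line and column numbers.
 * Assumes 'offset' is 0-based; offsets past the end are clamped to 'length'.
 * Lines are separated by '\n'.
 */
void demo_offset_to_line_col(const char* text, size_t length, size_t offset,
                             size_t* line, size_t* column)
{
    size_t i;
    size_t ln = 1;
    size_t col = 1;

    if (offset > length) {
        offset = length;
    }
    /* Walk all characters before the offset, tracking line breaks. */
    for (i = 0; i < offset; ++i) {
        if (text[i] == '\n') {
            ++ln;
            col = 1;
        } else {
            ++col;
        }
    }
    *line = ln;
    *column = col;
}
```

For the document "<a>\n<b<\n</a>" and offset 6 (the second '<' on the second line), the helper reports line 2, column 3.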