“The Utility AppleScript Code page gives more sample code showing how to access information from the data structure returned by the parse XML command.”
Sorry, that page has long since been lost. The code, if memory serves, was really just a set of AS handlers that iterated through the lists of items produced by XML Tools looking for particular element names.
I have AS code that uses Mark’s XML Tools.osax to parse sample XML, which contains two records of 60-some fields each, with three levels of nesting.
I’d be happy to send it to you (or anyone else who might have use for it). Request it off-list by email. Click on my name to open my profile on the Debugger forum web site. Then click the “Expand” button to reveal my email address.
The poster was originally using System Events, which is fine for very small jobs but notoriously slow. With a bit of tweaking and reworking in ASObjC, the time to parse the XML went from 17+ minutes to <0.4 seconds.
I was wondering when you would pop in with an ASObjC solution.
I would prefer to use a native solution (like ASObjC) rather than a 3rd party osax or script lib, but ease of use and good documentation play a huge role.
While I’m sure the ASObjC solution you linked to is a great solution for that very specific use case, it didn’t really help me much in learning how to use ASObjC for XML processing.
Satimage has provided great documentation and tutorials for their XMLLib.
Is there anything equivalent for ASObjC XML processing?
There isn’t any ASObjC documentation, and the Objective-C documentation is not particularly deep (search for Apple’s Introduction to Tree-Based XML Programming Guide for Cocoa).
However, XPath is a W3C language, so there’s a mass of stuff about it available on the Web. And if what you wish to do can be done with XPath, it’s generally very quick and involves minimal code.
As a taster, in ASObjC you mostly deal with three main classes: NSXMLDocument, NSXMLElement and NSXMLNode. The first two are subclasses of the third.
XML is a pretty broad topic, but I suspect the most common requirement in scripts is XML parsing. The script in the thread “Converting HTML (or xml) table to AppleScript list” is a good example of XPath in action.
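To make the three classes a little more concrete, here is a minimal sketch of how they tend to show up together with an XPath query. The sample XML and the variable names are invented for illustration and are not taken from the linked thread; treat it as a starting point rather than a definitive approach.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- sample XML invented for this sketch
set theXML to "<library>
<book id=\"b1\"><title>First</title></book>
<book id=\"b2\"><title>Second</title></book>
</library>"

-- NSXMLDocument holds the whole parsed tree
set {theXMLDoc, theError} to current application's NSXMLDocument's alloc()'s initWithXMLString:theXML options:0 |error|:(reference)
if theXMLDoc = missing value then error (theError's localizedDescription() as text)

-- XPath query: every book element, wherever it sits in the tree
set {theBooks, theError} to theXMLDoc's nodesForXPath:"//book" |error|:(reference)
if theBooks = missing value then error (theError's localizedDescription() as text)

set theResult to {}
repeat with i from 0 to (theBooks's |count|()) - 1
	-- each match is an NSXMLElement
	set aBook to (theBooks's objectAtIndex:i)
	-- an attribute comes back as an NSXMLNode
	set theID to (aBook's attributeForName:"id")'s stringValue() as text
	-- child elements are NSXMLElements too
	set theTitle to ((aBook's elementsForName:"title")'s firstObject())'s stringValue() as text
	set end of theResult to {theID, theTitle}
end repeat
get theResult
--> {{"b1", "First"}, {"b2", "Second"}}
```

The index-based loop is a deliberately conservative choice: it avoids relying on how an NSArray is bridged when used directly in a repeat statement.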
I really have to second Shane’s suggestion to use XPath if possible. Back in the mists of time when I created XML Tools, I moved on to something called XSLTTools (now discontinued) which allowed XPath queries into an XML document. This was so much more useful for pulling information out of XML and making it usable within AppleScript. Especially true for deep or wide XML hierarchies.
I should probably mention that there’s also a streaming parser, NSXMLParser. It’s not for the faint-hearted and not exactly fast in ASObjC, but it might be of interest to some. Search for Introduction to Event-Driven XML Programming Guide for Cocoa to read more.
-- Based on <http://troybrant.net/blog/2010/09/simple-xml-to-nsdictionary-converter/>
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"
property dictStack : missing value -- stack to hold array of dictionaries
property textInProgress : "" -- string to collect text as it is found
property anError : missing value -- if we get an error, store it here

on makeRecordWithXML:xmlString
	-- set up properties
	set my dictStack to current application's NSMutableArray's array() -- empty mutable array
	dictStack's addObject:(current application's NSMutableDictionary's |dictionary|()) -- add empty mutable dictionary
	set my textInProgress to current application's NSMutableString's |string|() -- empty mutable string
	-- convert XML from string to data
	set anNSString to current application's NSString's stringWithString:xmlString
	set theData to anNSString's dataUsingEncoding:(current application's NSUTF8StringEncoding)
	-- initialize an XML parser with the data
	set theNSXMLParser to current application's NSXMLParser's alloc()'s initWithData:theData
	-- set this script to be the parser's delegate
	theNSXMLParser's setDelegate:me
	-- tell it to parse the XML
	set theResult to theNSXMLParser's parse()
	if theResult then -- went OK, get first item on stack
		return ((my dictStack)'s firstObject()) as record
	else -- error, so return error
		error (my anError's localizedDescription() as text)
	end if
end makeRecordWithXML:

-- this is an XML parser delegate method. Called when new element found
on parser:anNSXMLParser didStartElement:elementName namespaceURI:aString qualifiedName:qName attributes:aRecord
	-- store reference to last item on the stack
	set parentDict to my dictStack's lastObject()
	-- make new child
	set childDict to current application's NSMutableDictionary's |dictionary|()
	-- if there are attributes, add them as a record with key "attributes"
	if aRecord's |count|() > 0 then
		childDict's setValue:aRecord forKey:"attributes"
	end if
	-- see if there's already an item for this key
	set existingValue to parentDict's objectForKey:elementName
	if existingValue is not missing value then
		-- there is, so if it's an array, store it...
		if (existingValue's isKindOfClass:(current application's NSMutableArray)) as boolean then
			set theArray to existingValue
		else
			-- otherwise create an array and add it
			set theArray to current application's NSMutableArray's arrayWithObject:existingValue
			parentDict's setObject:theArray forKey:elementName
		end if
		-- then add the new dictionary to the array
		theArray's addObject:childDict
	else
		-- add new dictionary directly to the parent
		parentDict's setObject:childDict forKey:elementName
	end if
	-- also add the new dictionary to the end of the stack
	(my dictStack)'s addObject:childDict
end parser:didStartElement:namespaceURI:qualifiedName:attributes:

-- this is an XML parser delegate method. Called at the end of an element
on parser:anNSXMLParser didEndElement:elementName namespaceURI:aString qualifiedName:qName
	-- if any text has been stored, add it as a record with key "contents"
	if my textInProgress's |length|() > 0 then
		set dictInProgress to my dictStack's lastObject()
		dictInProgress's setObject:textInProgress forKey:"contents"
		-- reset textInProgress property for next element
		set my textInProgress to current application's NSMutableString's |string|()
	end if
	-- remove last item from the stack
	my dictStack's removeLastObject()
end parser:didEndElement:namespaceURI:qualifiedName:

-- this is an XML parser delegate method. Called when string is found. May be called repeatedly
on parser:anNSXMLParser foundCharacters:aString
	-- only append string if it's not solely made of space characters (which should be, but aren't, caught by another delegate method)
	if (aString's stringByTrimmingCharactersInSet:(current application's NSCharacterSet's whitespaceAndNewlineCharacterSet()))'s |length|() > 0 then
		(my textInProgress)'s appendString:aString
	end if
end parser:foundCharacters:

-- this is an XML parser delegate method. Called when there's an error
on parser:anNSXMLParser parseErrorOccurred:anNSError
	set my anError to anNSError
end parser:parseErrorOccurred:

set xmlString to "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<character>
<firstName>Saga</firstName>
<lastName>Norén</lastName>
<city>Malmö</city>
<partner approach=\"dogged\">
<firstName>Martin</firstName>
<lastName>Rohde</lastName>
<city>København</city>
</partner>
</character>"
its makeRecordWithXML:xmlString
--> {|character|:{firstName:{|contents|:"Saga"}, lastName:{|contents|:"Norén"}, city:{|contents|:"Malmö"}, partner:{firstName:{|contents|:"Martin"}, lastName:{|contents|:"Rohde"}, city:{|contents|:"København"}, attributes:{approach:"dogged"}}}}
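For anyone wondering how to get values back out of a record shaped like this, a small sketch follows. The record the script returns for the sample XML is written out literally so the snippet runs on its own; the variable names are just for illustration.

```applescript
-- the record returned for the sample XML, written out literally
set theRecord to {|character|:{firstName:{|contents|:"Saga"}, lastName:{|contents|:"Norén"}, city:{|contents|:"Malmö"}, partner:{firstName:{|contents|:"Martin"}, lastName:{|contents|:"Rohde"}, city:{|contents|:"København"}, attributes:{approach:"dogged"}}}}

-- element text sits under the "contents" key of each element's record
set partnerRecord to partner of |character| of theRecord
set partnerName to (|contents| of firstName of partnerRecord) & space & (|contents| of lastName of partnerRecord)

-- attributes sit under the "attributes" key
set partnerApproach to approach of attributes of partnerRecord

get {partnerName, partnerApproach}
--> {"Martin Rohde", "dogged"}
```

One thing to watch for: when an element name is repeated under the same parent, the didStartElement handler stores the children in an array, so that key will hold a list of records rather than a single record.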
While I will jump into just about anything with ASObj-C and at least try it, I am very hesitant to try the ASObj-C XML stuff. ASObj-C can be cryptic at times, and trying to debug code for an XML parse can be one gigantic headache. If I ever did learn it, I would start off with simpler examples and move up to more complex ones. Like Jim says, it is hard to generalize for a more complex example. XML parsing is very tedious to check and debug when it is complex.
I’ve thought about addressing that in the ASObj-C database but I would only do that if I was willing to start simple and work up to complex.
Yes, you can. It can be a bit more complicated, depending what you want to do, but it can be done.
Here’s a simple example (I’ve trimmed the XML for space reasons):
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions
set theXML to "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\" x:xmptk=\"Adobe XMP Core 5.6-c137 79.159768, 2016/08/11-13:24:42 \">
<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">
<rdf:Description rdf:about=\"\"
xmlns:xmp=\"http://ns.adobe.com/xap/1.0/\"
xmlns:xmpMM=\"http://ns.adobe.com/xap/1.0/mm/\"
xmlns:stEvt=\"http://ns.adobe.com/xap/1.0/sType/ResourceEvent#\"
xmlns:stRef=\"http://ns.adobe.com/xap/1.0/sType/ResourceRef#\"
xmlns:dc=\"http://purl.org/dc/elements/1.1/\"
xmlns:pdf=\"http://ns.adobe.com/pdf/1.3/\">
<xmp:CreatorTool>Adobe InDesign CC 2017 (Macintosh)</xmp:CreatorTool>
<xmp:CreateDate>2017-03-19T12:21:51+11:00</xmp:CreateDate>
<xmp:MetadataDate>2017-03-19T12:21:51+11:00</xmp:MetadataDate>
<xmp:ModifyDate>2017-03-19T12:21:51+11:00</xmp:ModifyDate>
<xmpMM:InstanceID>uuid:e826b00e-8f3d-d448-be8f-1b4a3e36b6e4</xmpMM:InstanceID>
<dc:format>application/pdf</dc:format>
<pdf:Producer>Adobe PDF Library 15.0</pdf:Producer>
<pdf:Trapped>False</pdf:Trapped>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>"
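set {theXMLDoc, theError} to current application's NSXMLDocument's alloc()'s initWithXMLString:theXML options:0 |error|:(reference)
-- stop here if the XML could not be parsed at all
if theXMLDoc = missing value then error (theError's localizedDescription() as text)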
-- ignoring namespace
set {theNodes, theError} to theXMLDoc's nodesForXPath:"//*[local-name()='InstanceID']" |error|:(reference)
-- or, more simply:
set {theNodes, theError} to theXMLDoc's nodesForXPath:"//*:InstanceID" |error|:(reference)
-- these are more specific but return the same thing
set {theNodes, theError} to theXMLDoc's nodesForXPath:"//*[local-name()='Description']/*[local-name()='InstanceID']" |error|:(reference)
set {theNodes, theError} to theXMLDoc's nodesForXPath:"//*[local-name()='RDF']/*[local-name()='Description']/*[local-name()='InstanceID']" |error|:(reference)
set {theNodes, theError} to theXMLDoc's nodesForXPath:"/*[local-name()='xmpmeta']/*[local-name()='RDF']/*[local-name()='Description']/*[local-name()='InstanceID']" |error|:(reference)
if theNodes = missing value then error (theError's localizedDescription() as text)
return theNodes's valueForKey:"stringValue"
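If you want more than one value at a time, the same local-name() trick can be pointed at every child of the Description element. The sketch below uses a further cut-down version of the XMP sample, and the variable names are assumptions for illustration; it pairs each element's name (minus its namespace prefix) with its text.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- cut-down XMP sample, trimmed further for this sketch
set theXML to "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">
<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">
<rdf:Description rdf:about=\"\"
xmlns:dc=\"http://purl.org/dc/elements/1.1/\"
xmlns:pdf=\"http://ns.adobe.com/pdf/1.3/\">
<dc:format>application/pdf</dc:format>
<pdf:Trapped>False</pdf:Trapped>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>"

set {theXMLDoc, theError} to current application's NSXMLDocument's alloc()'s initWithXMLString:theXML options:0 |error|:(reference)
if theXMLDoc = missing value then error (theError's localizedDescription() as text)

-- every child element of Description, whatever its namespace
set {theNodes, theError} to theXMLDoc's nodesForXPath:"//*[local-name()='Description']/*" |error|:(reference)
if theNodes = missing value then error (theError's localizedDescription() as text)

set thePairs to {}
repeat with i from 0 to (theNodes's |count|()) - 1
	set aNode to (theNodes's objectAtIndex:i)
	-- localName() drops the namespace prefix; |name|() would keep it
	set end of thePairs to {aNode's localName() as text, aNode's stringValue() as text}
end repeat
get thePairs
--> {{"format", "application/pdf"}, {"Trapped", "False"}}
```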