Efficient way to reorganize lists

Me from 3 years ago was pretty smart, as far as I can tell from browsing this forum…

I’m working on “aligning” the contents of multiple TextEdit documents. “Aligning” means building a data set, an array whose first item is the list of the first paragraphs of all documents, whose second item is the list of the second paragraphs, and so on.

Basically, I want the same data structure as Excel’s “string value of used range”. Excel creates that array super quickly, but having to loop over the “paragraphs” in TextEdit slows things down considerably…

To not have to work in a TE tell block, I create a raw data set that consists of the list of paragraphs from the first document, followed by the list of paragraphs from the second document, etc.

Then I leave the tell block (hoping it will make a difference) and loop over the data. But it’s still far too slow: reorganizing ~500 x 2 items takes 4 secs and ~6000 x 2 takes 70 secs, while the same data set takes only 2 secs in Excel…

There are plenty of reasons why working in Excel is not an option (first of all, it is not an “out of the box” solution), so I’d like to drastically optimize the list reorganization and keep it in the same order of magnitude as what Excel does (i.e., < 10s).

Is that an ASObjC job?

Are the documents all open?

You are writing something like:

tell application "TextEdit"
    first paragraph of documents
end tell

?

Jean. There are at least three possible options, which are an ASObjC array, an ASObjC set, and a script-object-enhanced AppleScript list. It would be helpful if you posted a simple example of “aligning”.

The code is here:

Lines 60-62 fill an array with the contents of the various windows:

 {{1st line of first window, 2nd line of first window, etc.}
  {1st line of second window, 2nd line of second window, etc.}
  {etc.}}

Then I leave the tell block and process the data in lines 81-92, where I go to:

{{1st line of first window, 1st line of second window, etc.}
 {2nd line of first window, 2nd line of second window, etc.}
 {etc.}}

There is a word for that when you deal with matrices (which the thing is, basically).

And that is only to get to Excel’s “used range” data structure that you can find here and that takes just seconds:

Basically, the 2 scripts share the same XML processing, but the TextEdit script has to go through the whole data structuring process, while the Excel script doesn’t.

The XML processing takes time too, but a more reasonable amount. (Both scripts also run much faster as apps than from within SD, so my estimates are not very accurate there, and the “progress” part does not seem to slow things down much.)

Jean. I’ve written a handler which will do what you want with an ASObjC array. It takes exactly the same approach as you use, so I suspect it won’t be any faster and could be slower. It’s easy to test, though. A script-object-enhanced list might be faster, or perhaps there’s an entirely different approach that hasn’t occurred to me.

use framework "Foundation"
use scripting additions

set rawDataList to {{"1st line of first window", "2nd line of first window", "3rd line of first window"}, {"1st line of second window", "2nd line of second window", "3rd line of second window"}, {"1st line of third window", "2nd line of third window", "3rd line of third window"}}

set theAlignedList to makeAlignedList(rawDataList)

on makeAlignedList(rawDataList)
	-- bridge the AppleScript list of lists to an NSArray of NSArrays
	set rawDataArray to current application's NSArray's arrayWithArray:rawDataList
	set arrayCount to rawDataArray's |count|() -- number of sublists (documents)
	-- items per sublist; assumes every sublist is as long as the first one
	set dataCount to (rawDataArray's objectAtIndex:0)'s |count|()
	
	set combinedArray to current application's NSMutableArray's new()
	repeat with i from 0 to (dataCount - 1)
		-- gather item i of every sublist into one new row
		set subArray to current application's NSMutableArray's new()
		repeat with j from 0 to (arrayCount - 1)
			(subArray's addObject:((rawDataArray's objectAtIndex:j)'s objectAtIndex:i))
		end repeat
		(combinedArray's addObject:subArray)
	end repeat
	
	-- convert back to an AppleScript list of lists
	return combinedArray as list
end makeAlignedList

--> {{"1st line of first window", "1st line of second window", "1st line of third window"}, {"2nd line of first window", "2nd line of second window", "2nd line of third window"}, {"3rd line of first window", "3rd line of second window", "3rd line of third window"}}

Thank you! I’ll give that a try and will report on speed.

Matrix transposition (sometimes loosely called rotation).

If you’re happy to use my BridgePlus script library, this should be pretty quick:

use scripting additions
use framework "Foundation"
use script "BridgePlus"
load framework

set theLists to {{1.1, 2, 3}, {4, 5, 6}, {7, 8, 9}}
set theResult to current application's SMSForder's colsToRowsIn:theLists |error|:(missing value)
theResult as list --> {{1.1, 4, 7}, {2, 5, 8}, {3, 6, 9}}

I ran some timing tests and the array handler performed poorly. The results were:

Using an array - 891 milliseconds
Using a list - 2.370 seconds
Using a script-object-enhanced list - 23 milliseconds

I’ve included one of the testing scripts below; it uses a script-object-enhanced list.

use framework "Foundation"
use scripting additions

-- untimed code
set theMatrix to {}
repeat with i from 1 to 100
	set theSublist to {}
	repeat with j from 1 to 100
		set end of theSublist to "Line " & j & " of sublist " & i
	end repeat
	set end of theMatrix to theSublist
end repeat

-- start time
set startTime to current application's CACurrentMediaTime()

-- timed code
set rotatedMatrix to getRotatedMatrix(theMatrix)

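on getRotatedMatrix(theMatrixList)
	-- keep both lists as properties of a script object;
	-- referring to them as "o's ..." is what speeds up item access
	script o
		property theMatrix : theMatrixList
		property rotatedMatrix : {}
	end script
	
	-- assumes every sublist is as long as the first one
	repeat with i from 1 to length of (item 1 of o's theMatrix)
		-- gather item i of every sublist into one new row
		set aList to {}
		repeat with j from 1 to length of o's theMatrix
			set end of aList to item i of item j of o's theMatrix
		end repeat
		set end of o's rotatedMatrix to aList
	end repeat
	return o's rotatedMatrix
end getRotatedMatrix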

-- elapsed time
set elapsedTime to (current application's CACurrentMediaTime()) - startTime
set numberFormatter to current application's NSNumberFormatter's new()
if elapsedTime > 1 then
	numberFormatter's setFormat:"0.000"
	set elapsedTime to ((numberFormatter's stringFromNumber:elapsedTime) as text) & " seconds"
else
	(numberFormatter's setFormat:"0")
	set elapsedTime to ((numberFormatter's stringFromNumber:(elapsedTime * 1000)) as text) & " milliseconds"
end if

-- result
elapsedTime --> 23 milliseconds

## theMatrix
## rotatedMatrix
## count rotatedMatrix

Indeed, it is super quick.

Now, when I run the code from SD it works fine, but when I run it as an app, it fails… The progress report dialog appears in a flash and then it’s gone. When I run the code from SD, all the lines are handled, as I can see the progress in the window…

I’m not sure how to debug such a problem. Any suggestions?

Indeed, there is little difference between an array and a list.

I guess I must understand how to use script objects now… I just found a thread on MacScripter dating from 2003-2004, but it’s way over my head…

How does the app fail?

I’m not sure. As I wrote, the progress dialog shows for a very short time and then disappears.

Maybe there is something in the manual about debugging a running app?

Can you send me the app and some sample files to test?

I was getting things ready to send to you, and I tested again, and it worked… I guess it was a fumble on my side. Thank you for the help.

I’ve looked for info about script objects (I also have the Neuburg book and the Apress book) but it’s clearly way above what I can parse. Is there a nice introduction to what they are and how I can use them in the context of list manipulation?

Jean. The AppleScript Language Guide contains a good discussion of script objects, although it doesn’t address the specific topic of interest to you.

https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/conceptual/ASLR_script_objects.html#//apple_ref/doc/uid/TP40000983-CH207-BAJJCIAA

The documented method for speeding the manipulation of lists is the “a reference to” operator. This works well in my experience but is somewhat limited.

https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/reference/ASLR_classes.html#//apple_ref/doc/uid/TP40000983-CH1g-DontLinkElementID_574
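As a rough illustration of the “a reference to” approach, here is a minimal sketch modeled on the pattern in the Language Guide. The names bigList, bigListRef and numItems are made up for the example, and it assumes the list is a script property, since a reference to a handler-local variable generally does not work. Actual speed gains depend on the list size and should be measured.

```applescript
-- illustrative names only; bigList is declared as a property so that
-- "a reference to" has something persistent to point at
property bigList : {}

set bigList to {}
set bigListRef to a reference to bigList
set numItems to 5000

-- append through the reference rather than through the bare variable
repeat with i from 1 to numItems
	set end of bigListRef to i
end repeat

-- read items back through the reference as well
set total to 0
repeat with i from 1 to numItems
	set total to total + (item i of bigListRef)
end repeat

total --> 12502500
```

The limitation mentioned above shows up here: the reference has to be set up for each list, which gets awkward for nested lists built inside handlers.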

I’ve never found any documentation that pertains to the use of script objects to speed manipulation of lists, but it clearly works. It’s often used in scripts posted on the MacScripter site.

A third approach is an implicit script object, which involves preceding lists with the word my, although this only works at the top level of a script.
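A minimal sketch of the my approach, applied to the same kind of list reorganization discussed in this thread. It assumes the lists are declared as properties and that the loops run at the top level of the script; the property names and sample data are invented for the example.

```applescript
-- illustrative properties; "my" makes AppleScript address them
-- as properties of the script itself
property theMatrix : {}
property rotatedMatrix : {}

set theMatrix to {{"a1", "a2", "a3"}, {"b1", "b2", "b3"}}
set rotatedMatrix to {}

-- assumes every sublist is as long as the first one
repeat with i from 1 to length of (item 1 of my theMatrix)
	set aRow to {}
	repeat with j from 1 to length of my theMatrix
		set end of aRow to item i of item j of my theMatrix
	end repeat
	set end of my rotatedMatrix to aRow
end repeat

my rotatedMatrix --> {{"a1", "b1"}, {"a2", "b2"}, {"a3", "b3"}}
```

This is the same loop as in the script-object handler posted earlier, with my taking the place of o's.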

Some time back I quantified the speed advantages of the above approaches and posted the results in the thread linked below. These tests involve a list of lists, but similar results would occur with a simple (flat) list.

https://macscripter.net/viewtopic.php?id=48401

It should be noted that all of the above approaches work in some instances but not others, and the only way to know for sure is to test. I normally use Script Geek for this purpose, but on occasion I use the testing script posted earlier in this thread.

Thank you so much for the reference. I also have the 2015 edition of the Language Reference as a PDF. I’ll take some time to read it and try to make sense of all that!