Hi, I’d like to create lines with a Prefix, a Counter and a Suffix, e.g.
test 1 bla
test 2 bla
test 3 bla
test 4 bla
test 5 bla
For low counter values, plain AppleScript works fine, but it gets very slow with higher values. So I tried:
AppleScript with script object (to speed up the repeat)
ASObjC
ASObjC with script object
Questions:
1. Why does AppleScript get slower with higher values? (10,000 lines in 2.5 seconds, 100,000 lines in 160 seconds)
2. In which situations can a script object be used to speed up a repeat?
3. Is there a faster way to create “Prefix Counter Suffix” lines?
-- Create lines with Prefix, Counter and Suffix
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions
#script s
#property theRepetitions : 1000 --> ScriptGeek: AppleScript 3,3 x faster than ASObjC
#property theRepetitions : 10000 --> ScriptGeek: AppleScript 1,5 x faster than ASObjC
#property theRepetitions : 20000 --> ScriptGeek: AppleScript 1,6 x faster than ASObjC
#property theRepetitions : 25000 --> ScriptGeek: ASObjC 1,23 x faster than AppleScript
#property theRepetitions : 30000 --> ScriptGeek: ASObjC 1,6 x faster than AppleScript
property theRepetitions : 50000 --> ScriptGeek: ASObjC 1,9 x faster than AppleScript
#end script
set theLine_Start to "test "
set theLine_End to " bla"
--------------------------------------------------------------- 1. AppleScript ----------------------------------------------------------------
set theText to ""
repeat with i from 1 to theRepetitions
set theText to theText & ((theLine_Start & i & theLine_End & linefeed) as string)
end repeat
return theText
-------------------------------------------------------- 1.1 AppleScript (with script) ---------------------------------------------------------
set theText to ""
repeat with i from 1 to s's theRepetitions
set theText to theText & ((theLine_Start & i & theLine_End & linefeed) as string)
end repeat
return theText
------------------------------------------------------------------ 2. ASObjC -------------------------------------------------------------------
set theString to (current application's NSMutableString's stringWithString:"")
repeat with i from 1 to theRepetitions
(theString's appendString:((theLine_Start & i & theLine_End & linefeed) as string))
end repeat
set theText to theString as string
return theText
----------------------------------------------------- 2.1 AppleScriptObjC (with script) ------------------------------------------------------
set theString to (current application's NSMutableString's stringWithString:"")
repeat with i from 1 to s's theRepetitions
(theString's appendString:((theLine_Start & i & theLine_End & linefeed) as string))
end repeat
set theText to theString as string
return theText
There is a faster way, if you’re open to using a script library:
use scripting additions
use framework "Foundation"
use script "BridgePlus"
load framework
set patternString to "test %@ blah"
set theResult to ((current application's SMSForder's arrayWithPattern:patternString startNumber:1 endNumber:100000 minDigits:1)'s componentsJoinedByString:linefeed) as text
Pete. I don’t know the answer to your first question, although basic AppleScript has typically done poorly when handling very large quantities of data.
As to your second question, a script object is normally used with lists when the goal is to increase execution speed. To illustrate, I ran your original script 1 and two variations of that script in Script Geek with 100,000 repetitions, and the timing results were:
A) Your original script - 65 seconds
B) Your original script modified to use a list - 121 seconds
C) Your original script modified to use both a list and an implicit script object - 6 seconds.
As to your third question, your ASObjC script is faster than script C) above, but the difference is small. I thought the use of an array might speed things up, but it didn’t.
D) Your script 2. - 6 seconds
E) My ASObjC script utilizing an array (just FWIW) - 11 seconds
My script C) above is:
set theRepetitions to 100000
set theLine_Start to "test "
set theLine_End to " bla"
set theText to {}
repeat with i from 1 to theRepetitions
set end of my theText to ((theLine_Start & i & theLine_End & linefeed) as string)
end repeat
return (my theText as text) -- 6.195 seconds
My script E) above is:
use framework "Foundation"
use scripting additions
set theRepetitions to 100000
set theLine_Start to "test "
set theLine_End to " bla"
set theArray to current application's NSMutableArray's new()
repeat with i from 1 to theRepetitions
set joinedString to current application's NSString's stringWithFormat_("%@%@%@%@", theLine_Start, i as text, theLine_End, linefeed)
(theArray's addObject:joinedString)
end repeat
return theArray as list as text -- 11.225 seconds
Just for the sake of completeness, the following is the same as my script C) above but with an explicit script object, which is more commonly used. The timing result was 6 seconds.
testHandler()
on testHandler()
	script o
		property theText : {}
	end script
	set theRepetitions to 100000
	set theLine_Start to "test "
	set theLine_End to " bla"
	repeat with i from 1 to theRepetitions
		set end of o's theText to (theLine_Start & i & theLine_End & linefeed)
	end repeat
	return (o's theText as text) -- 6.280 seconds
end testHandler
Although referencing a long list through a script object can greatly speed up access to its items, there can still be a bit of a bottleneck when growing it to the required length by appending items to its end. I think this is something to do with the number of times fresh memory has to be allocated to it, but I’m not entirely sure.
A faster approach here is to grow the list to the required length by list concatenation (larger memory allocations less often) and then to replace its placeholder items with the relevant texts — using script object referencing of course:
on testHandler()
	script o
		property theText : {missing value}
	end script
	set theRepetitions to 100000
	set theLine_Start to "test "
	set theLine_End to " bla"
	-- Progressively double the text list's length by concatenating
	-- it to itself until it's more than half the required length.
	set len to 1
	set halfCount to theRepetitions div 2
	repeat until (len > halfCount)
		set o's theText to o's theText & o's theText
		set len to len + len
	end repeat
	-- Make up the length to what's required, if necessary.
	set diff to theRepetitions - len
	if (diff > 0) then set o's theText to o's theText & items 1 thru diff of o's theText
	-- Replace each 'missing value' in the list with the appropriate text.
	repeat with i from 1 to theRepetitions
		set item i of o's theText to theLine_Start & i & theLine_End
	end repeat
	-- Coerce the list to text with a linefeed delimiter and return the result.
	return join(o's theText, linefeed)
end testHandler
on join(lst, delim)
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delim
	set txt to lst as text
	set AppleScript's text item delimiters to astid
	return txt
end join
testHandler()
At first I questioned why creating a list of 100,000 missing-value items and then replacing the missing-value items with the desired text would make this faster. I then reread your post, which clearly explained this as “list concatenation (larger memory allocations less often)”.
I tested your script the same as in my post above and the result was 2.555 seconds. So, to summarize:
My first post above contains the following timing result:
B) Your original script modified to use a list - 121 seconds
This is not correct, and the actual timing result is 6.282 seconds. This means that the use of a script object in my scripts has minimal impact on the timing result (a fifth of a second if that). Sorry for the error.
The script tested was:
set theRepetitions to 100000
set theLine_Start to "test "
set theLine_End to " bla"
set theText to {}
repeat with i from 1 to theRepetitions
set end of theText to (theLine_Start & i & theLine_End & linefeed)
end repeat
return theText as text -- 6.282 seconds
The benefits of using a script object to reduce execution time are unpredictable. Sometimes they are of great benefit, but on other occasions they do little, and I’ve never been able to find a clear pattern. Fortunately, they are easy to test.
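A rough way to run such a comparison is sketched below. The handler names and the repetition count are made up for illustration, and timings will differ between machines and macOS versions, so the two figures are only meaningful relative to each other.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical handler: builds the list through an ordinary local variable.
on buildPlainList(lineCount)
	set theLines to {}
	repeat with i from 1 to lineCount
		set end of theLines to "test " & i & " bla"
	end repeat
	return theLines
end buildPlainList

-- Hypothetical handler: builds the same list as a script object property.
on buildScriptObjectList(lineCount)
	script o
		property theLines : {}
	end script
	repeat with i from 1 to lineCount
		set end of o's theLines to "test " & i & " bla"
	end repeat
	return o's theLines
end buildScriptObjectList

set lineCount to 20000

-- NSDate gives sub-second resolution, unlike 'current date'.
set t0 to current application's NSDate's timeIntervalSinceReferenceDate()
set plainList to buildPlainList(lineCount)
set t1 to current application's NSDate's timeIntervalSinceReferenceDate()
set objectList to buildScriptObjectList(lineCount)
set t2 to current application's NSDate's timeIntervalSinceReferenceDate()

{t1 - t0, t2 - t1} --> {seconds for the plain list, seconds for the script object list}
```

Running each variant several times and comparing the averages gives a steadier picture than a single run.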
BTW, it took me a bit of time to understand Nigel’s script, which is ingenious, but having reached that point its operation is fairly straightforward. I would never be able to write a script like that on my own, but I will probably use that approach in scripts I write in the future. This has been an interesting thread.
Back in days of yore, Apple’s now largely defunct AppleScript-Users mailing list had a handful of AppleScript engineers on it who often explained roughly how things worked and would also take feedback.
The AppleScript Language Guide’s coverage of lists contains a section revealing that using a variable set to a reference to the list variable can greatly speed up inserting items into the list. But it doesn’t explain why. (In practice, it’s only relevant with really large lists.) One of the AppleScript engineers on the mailing list did explain it once, but I must admit I never fully understood the explanation. Something to do with certain safety checks being bypassed when lists are addressed this way.
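For anyone who hasn't seen it, the technique looks roughly like the sketch below, adapted to the lines from this thread. The variable names are illustrative assumptions. It has to run at the top level of a script, because a reference to a handler's local variable can't be resolved.

```applescript
-- Minimal sketch: append to a list through a reference to the list variable.
set theRepetitions to 10000
set theLine_Start to "test "
set theLine_End to " bla"

set theLines to {}
set theLinesRef to a reference to theLines

repeat with i from 1 to theRepetitions
	-- The list is addressed through the reference, not the variable itself.
	set end of theLinesRef to theLine_Start & i & theLine_End
end repeat

-- Join the items with linefeeds, restoring the delimiters afterwards.
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed
set theText to theLines as text
set AppleScript's text item delimiters to astid

theText
```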
Someone else discovered by chance that a similar speed-up occurs with lists in script objects, and analysis by yours truly revealed that the specifiers required to address such lists from outside are similar to those in the ‘references’ to lists created by a reference to, i.e. «list variable» of «script». And they work slightly faster because they’re compiled directly into the running code instead of being stored in a variable.
The memory allocation business was easier to understand. When an item’s concatenated to a list, it’s coerced to a list itself and the two lists are concatenated to create a third list. Memory has to be allocated for this third list (and probably for the second one too) each time this happens. However, when a list’s initialised, it’s given enough memory for the addition of a certain number of items. (By “items”, I mean here pointers in the list rather than the items themselves.) So it can be faster to append items to its end than to concatenate items to it, since the same list is simply made longer within the memory already allocated instead of new lists being created. Once the allocated memory’s filled though, more has to be allocated before further items can be appended, more when that’s used up, and so on. This can happen many times when adding hundreds of thousands of items.
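The difference between lengthening the same list and creating a new one can be seen from the script side with a small sketch like this one. The variable names are illustrative, and it only shows the observable behaviour, not the memory allocation itself.

```applescript
set theList to {"a", "b"}
set sameList to theList -- 'set' shares the list between both variables; it doesn't copy it

-- Appending lengthens the existing list, so sameList sees the new item too.
set end of theList to "c"

-- Concatenation coerces "d" to a list and builds a new list from the two operands.
-- Only theList is pointed at the result; sameList still holds the earlier list.
set theList to theList & "d"

{theList, sameList} --> {{"a", "b", "c", "d"}, {"a", "b", "c"}}
```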
The workaround a few posts up works by concatenating the list to itself as it grows longer, effectively concatenating long lists to long lists, which is faster than appending or concatenating hundreds of thousands of individual items to a single list. Once a list with the required number of slots is obtained, setting them to the required values using script object referencing is itself very fast.
Thanks Nigel for the explanations. I’ve used on occasion both the a-reference-to operator and script objects to make working with lists faster. I assumed there was some connection between the two, but it’s good to know what that is.