Duplicating tables in Word with AppleScript takes too long

I have a script that duplicates tables in a Word document, but it’s very slow. I just ran it for 215 tables, and after 10 minutes it had only done 87.

There is not much difference between Word 2019 (current) and Word 2011. Switching off all Sharing options (in the hope that it had something to do with remote Apple Events) didn’t speed things up either. I noticed Word was using a lot of CPU.

The time it takes seems to increase exponentially (or at least not linearly). In the first minute it duplicated around 35 tables, and it gets slower over time.

The code:

using terms from application "Microsoft Word"
	tell document report_name
		--tablenr: the table to duplicate
		--nr_needed: how many copies we need
		set my progress description to "Duplicate foto-tables"
		set table_list to tables -- all tables from document
		set tables_needed to tablenr + nr_needed - 1
		set table_counter to tablenr
		repeat while table_counter < tables_needed
			duplicate item tablenr of table_list
			set table_counter to table_counter + 1
			set progress completed steps to progress completed steps + 1
		end repeat
	end tell
end using terms from

Is there another way to duplicate this table? Any ideas?

I don’t have or use Microsoft Word, so may not be able to address your issue head on. But I do notice a couple of oddities in your script, which will be worth correcting:

  1. Your use of using terms from application appears unnecessary, and it is also not a sufficient substitute for a tell application block. using terms from allows passive access to the scripting terminology used by an application (mostly in situations where terminologies between multiple dictionaries might clash), but none of the class objects/elements (document) or commands (duplicate) are ever told that they belong to, or are instructed to target, application "Microsoft Word".

    I did a brief simplified experiment in Script Debugger:

    using terms from application "Script Editor"
    	tell document 1
    		return open
    	end tell
    end using terms from
    

    While you might have expected the open command to be received by Script Editor, it is actually received (correctly) by Script Debugger, which is evident from the path that gets returned following the open command (in the context of an error message stating the document is already open and presently executing a script, which is immaterial).

  2. Here, you set a variable:

    set table_list to tables
    

    Here, you access a variable with a similar-sounding identifier, but one that hasn’t been previously declared and/or set:

    duplicate item tablenr of tableList
    

Following on from this is a quick neatening up of the code, mostly for my benefit, as there were a lot of variables being set only to be used to set another. There’s nothing wrong with this, and in fact, many would say it demonstrates logical methodology and aids readability. My dyslexia requires me to minimise my code as much as possible to make it readable for me, and in so doing, you’ll see a couple of functional edits I made as I went along:

tell application "Microsoft Word"
	tell document report_name
		--tablenr: the table to duplicate
		--nr_needed: how many copies we need
		set my progress description to "Duplicate foto-tables"
		
		set [i, j] to [tablenr, tablenr + nr_needed - 1]
		set tableList to a reference to its tables i thru j -- the subset of tables to be duplicated

		repeat with T in my tableList
			duplicate T
			set progress completed steps to progress completed steps + 1
		end repeat
	end tell
end tell

  1. After removing the sea of identifiers, it emerged that, between them, they hold a pair of indices, demarcating the start and end of the range of tables you are wanting to duplicate. I assigned these to variables i and j, respectively, then narrowed the list of tables stored in tableList accordingly. If, say, you only wished to copy four tables out of 300,000, then having the script retrieve and store the properties for the other 299,996 tables is a waste of storage space (memory) and time (execution speed), and is probably the bottleneck in your original script, depending on how many tables you have in total.

  2. I also added a reference to for the assignment of tableList. I won’t go into it now, but basically it means that a table object in the list is only retrieved at the point of use, i.e. during the command duplicate. Therefore, only one table is being accessed and stored at any one time as the repeat loop works its way through the tableList.

    There’s a small caveat to this, which might mean that you see tables being duplicated more than once and others not at all. This will possibly happen if the duplicate command by default positions the newly-duplicated table immediately next to its progenitor (rather than at the very end of all the tables), and if the new table is assigned an index number based on its position rather than the order in which it was created. The former would imply that duplicating table 3 creates a new table 4, and the old tables numbered 4 onwards would have their index values incremented, which would be annoying; the latter mechanism would duplicate table 3 by creating table 945 (in a 944-long list of tables). So watch out for this, and if it occurs, you’ll need to remove a reference to, which would be a shame.

  3. I changed the nature of the repeat loop to iterate through its object references, rather than using a counter. The implementation you have in your original script is going to be slower: every iteration forces it to perform a mathematical operation to increment a variable’s value (it sounds trivial, but it’s demonstrable; and it involves three operations, as it has to access the variable, increment it, then assign it again, although that’s probably not precisely what’s occurring as this isn’t QBasic, but anyway…). Moreover, the while condition needs evaluating every single iteration as well. Then having to access the right table using an index-based reference may or may not be slower than a seemingly more “direct” object-to-hand approach (I say “seemingly” because the event log will still show AppleScript making an index-based call to access the list item, but this isn’t always transparent or reflective of what’s necessarily happening below).

  4. Finally, I inserted the word my to reference tableList during the definition of the repeat loop. It may not agree with AppleScript at run time, but you’ll know straight away as it will throw an error and halt execution, in which case you can just remove my. Its presence allows AppleScript to access the items of a list fast, compared to a direct access, which is slow. It’s equivalent to a construct like this:

    script
        property array : myList
    end script
    

    in order to access the items in myList by going through array that belongs to the script object. Using my allows me to do a similar detour, by accessing the list by going through the top-level script object (though it’s less about going through script objects, and more about forcing access to the list via references to references. As for how that confers speed gains, some of it is down to what I touched on with a reference to earlier; but mostly, I have no idea, and it’s counter-intuitive in my mind).
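    The three access styles described in point 4 can be put side by side in a small sketch that needs no Word document. Everything in it (bigList, wrapper, the list size) is invented for illustration. It only shows that the three forms are interchangeable in result; on large lists the second and third forms are commonly reported to be much faster, but that is worth timing on your own machine rather than taking on trust.

```applescript
-- Illustrative only: bigList, wrapper and the list size are invented for this sketch.
property bigList : {}

script wrapper
	property array : missing value
end script

-- Build a throwaway list of 2000 integers.
set bigList to {}
repeat with n from 1 to 2000
	set end of my bigList to n
end repeat

-- 1. Direct access to the list.
set directTotal to 0
repeat with n from 1 to 2000
	set directTotal to directTotal + (item n of bigList)
end repeat

-- 2. Access through the top-level script object, using "my".
set viaMyTotal to 0
repeat with n from 1 to 2000
	set viaMyTotal to viaMyTotal + (item n of my bigList)
end repeat

-- 3. Access through an explicit script object's property.
set wrapper's array to bigList
set viaScriptTotal to 0
repeat with n from 1 to 2000
	set viaScriptTotal to viaScriptTotal + (item n of wrapper's array)
end repeat

-- All three totals are the same; only the lookup path differs.
return {directTotal, viaMyTotal, viaScriptTotal}
```

    Note that my looks the name up on the script object, so it is likely to fail for a variable that is local to a handler; in that case the explicit script object in form 3 is the usual workaround.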


Summary

  • Use a tell application block instead of a using terms from block
  • Check your variable names are spelled correctly (if you have any try blocks in your script, get rid of them. Errors are your friend)
  • tableList now only retrieves the tables to be duplicated (likely the biggest influencer)
  • a reference to for speed and space efficiency, but if you notice tables duplicated repeatedly and others omitted, remove this
  • repeat loop now doing less work and using object references rather than index searches
  • my for indirect list access ⇒ faster
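
The caveat about some tables being duplicated repeatedly while others are skipped can be reproduced without Word. The sketch below uses a plain list of names as a stand-in for the document's tables and assumes (this is the unverified part) that a copy is inserted directly after its progenitor and that items are looked up by their current position. All names are invented for illustration.

```applescript
-- Plain-list stand-in for Word's tables; the names are invented for illustration.
set liveList to {"A", "B", "C"}
set originalCount to count of liveList

repeat with i from 1 to originalCount
	-- Look the item up by its current position, as a late-resolved reference would.
	set thisName to item i of liveList
	set copyName to thisName & "'"
	-- Insert the copy immediately after its progenitor.
	if i < (count of liveList) then
		set liveList to (items 1 thru i of liveList) & {copyName} & (items (i + 1) thru -1 of liveList)
	else
		set liveList to liveList & {copyName}
	end if
end repeat

return liveList -- {"A", "A'", "A''", "A'''", "B", "C"}
```

Under those assumptions "A" and its copies are duplicated three times, while "B" and "C" are never touched, which is the symptom to look out for.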

After-thought:

Did you ever experiment with duplicating a list of objects instead of just one? Might be worth trying, because if you can do this:

set [i, j] to [tablenr, tablenr + nr_needed - 1]
tell application "Microsoft Word" to ¬
    tell document report_name to ¬
        duplicate its tables i thru j

that could be a game-changer.

About the using terms: this code is used within a handler, and the tell document report_name is actually a tell doc, where doc is passed as an argument. I think I used the using terms mostly so I could test both Word 2011 and Word 2016 by changing the application identifier in only one place. It might not be very useful now, but it works, and I don’t have a reason to change it at the moment.

The line set table_list to tables is a weird one I still don’t really understand (I forgot to rename one instance of tableList to table_list; that’s fixed now).

If I replace table_list with tables I get an error message on the duplicate line (the variable table_thing is not defined).

set table_thing to item table_counter of tables
duplicate table_thing --fails

When I declare table_thing as a local variable, it fails on the set line.

local table_thing
set table_thing to item table_counter of tables --fails
duplicate table_thing

This code works fine, but I’m not sure why… One thing I notice is that, when duplicating, the variable table_list is not updated. One needs to explicitly run the first line again for it to be updated.

set table_list to tables
set table_thing to item table_counter of table_list
duplicate table_thing

I tried your enhanced version of the code. However, it fails with something like item does not exist. This is probably because you reference tables that don’t exist because they haven’t been duplicated yet…

As you can see in my example above, I now duplicate the duplicated table instead of duplicating the first table, in the hope appending is faster than inserting, but it doesn’t make much difference.

I do think your after-thought is promising. If I keep doubling the number of tables, at one stage I have enough. This should be much faster. However, I haven’t been able to duplicate a selection/range.

duplicate its tables table_nr thru (table_nr + doubling_factor - 1) --fails

I probably should create a range or something. Could you make sense of Word’s scripting dictionary? I put the sdef files on GitHub.
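
To put a number on why the doubling idea looks promising, here is a small sketch of the arithmetic only. It does not touch Word and assumes that a batch of tables can somehow be duplicated in one call, which is exactly the part that has not worked yet. nr_needed is the name from the script above; existing_count, batch and passes are made up for the example.

```applescript
-- Arithmetic only: how many duplicate calls would doubling need?
-- nr_needed is from the original script; the other names are hypothetical.
set nr_needed to 215
set existing_count to 1 -- the one table we start with
set passes to 0

repeat while existing_count < nr_needed
	-- Duplicate everything we have so far, or just enough to reach the target.
	set batch to existing_count
	if existing_count + batch > nr_needed then set batch to nr_needed - existing_count
	set existing_count to existing_count + batch
	set passes to passes + 1
end repeat

return {passes, existing_count} -- {8, 215}
```

So 215 tables would take 8 batch duplications instead of 214 single ones, provided each batch call does not itself slow down with the size of the batch.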

I have not tested this, but generally I prefer to use Word VBA Macros when the operation is entirely within Word. I have not checked, but if VBA provides a table duplicate method, then it might be worth a try.

If you still want to manage it from AppleScript, then just call the VBA macro from your script.

I never realized VBA was an option on the Mac. I have some questions:

  • Does the VBA macro need to be embedded in the document, or can it run standalone?
  • How do you call a VBA macro from AppleScript?

Thanks.
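
A minimal sketch of calling a macro from AppleScript, for reference. It assumes Word's scripting dictionary exposes the run VB macro command with a macro name parameter (worth confirming in the dictionary of the Word version in use), and "DuplicateTable" is a hypothetical macro name that would have to exist already, in the document itself or in a template such as Normal.

```applescript
-- Assumption: Word's dictionary provides "run VB macro"; check it in Script Editor.
-- "DuplicateTable" is a hypothetical macro name, not something from this thread.
tell application "Microsoft Word"
	run VB macro macro name "DuplicateTable"
end tell
```

As far as I know, VBA macros are stored inside a document or template rather than as standalone files, so the macro would travel with whichever of those it was saved in.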