# Using map, filter and reduce or fold in Applescript

#1

(Branching from a discussion with Phil http://forum.latenightsw.com/t/converting-an-nsattributedstring-into-an-html-string/1048/14)

Map filter and ‘reduce’ or ‘fold’ are useful when we need to assemble scripts quickly for ourselves, reducing the risks of haste and accident that loops are heir to.

They are essentially pasteable pre-tested loops of generic patterns:

• map for obtaining transformed lists of unchanged length
• fold (or ‘reduce’ in JS) for obtaining single values from a list
• filter for defined pruning

First some examples, and then a note on doing it in AppleScript.

Examples:

``````on double(x)
x * 2
end double

on half(x)
x / 2
end half

on even(x)
x mod 2 = 0
end even

a + b

on mult(a, b)
a * b
end mult

on sum(xs)
end sum

on product(xs)
foldl(mult, 1, xs)
end product

-- TEST ------------------------------------------------------------------
on run
map(double, {1, 2, 3, 4, 5})
--> {2, 4, 6, 8, 10}

map(half, {1, 2, 3, 4, 5})
--> {0.5, 1.0, 1.5, 2.0, 2.5}

filter(even, {1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
--> {2, 4, 6, 8, 10}

sum({1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
--> 55

product({1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
--> 3628800
end run

-- GENERICS --------------------------------------------------------------

-- filter :: (a -> Bool) -> [a] -> [a]
on filter(f, xs)
tell mReturn(f)
set lst to {}
set lng to length of xs
repeat with i from 1 to lng
set v to item i of xs
if |λ|(v, i, xs) then set end of lst to v
end repeat
return lst
end tell
end filter

-- foldl :: (a -> b -> a) -> a -> [b] -> a
on foldl(f, startValue, xs)
tell mReturn(f)
set v to startValue
set lng to length of xs
repeat with i from 1 to lng
set v to |λ|(v, item i of xs, i, xs)
end repeat
return v
end tell
end foldl

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
tell mReturn(f)
set lng to length of xs
set lst to {}
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, i, xs)
end repeat
return lst
end tell
end map

-- Lift 2nd class handler function into 1st class script wrapper
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
if class of f is script then
f
else
script
property |λ| : f
end script
end if
end mReturn
``````

Map fold and filter in AppleScript

The generality and reusability of map, fold and filter depend on our passing other functions as arguments to them.

Plain AppleScript handlers are not in themselves ‘first-class’ functions that can fully survive being passed as values like this, and at first this looks like an obstacle.

Fortunately, however:

1. Script objects with handlers are first class objects, and
2. we can wrap any handler in a script at run-time.

The oddly named mReturn function which I personally use to do this might be named more clearly as something like scriptWrapped(handler)

``````-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
if class of f is script then
f
else
script
property |λ| : f
end script
end if
end mReturn
``````
1. If the argument is already a script, it is returned unchanged
2. otherwise, it is assumed to be a handler, and the return value is a script which has that handler as a property
3. the original name of the handler is discarded, and something more anonymous, like lambda or |λ| is used instead.

This allows us to write things like `map(double, {1,2,3,4,5})` – defining map in a way which script-wraps any handler that is passed as its first argument:

``````-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
tell mReturn(f)
set lng to length of xs
set lst to {}
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, i, xs)
end repeat
return lst
end tell
end map
``````

In short, our map is just a pre-packaged, pre-tested and reusable loop, in which any handler passed as the argument f is applied in turn to each item in the list xs.

A few more details

Inside the map loop (and equally inside the filter and foldl loops), you may notice that the list transforming function (passed as f, and applied as |λ|, can have up to 3 arguments, rather than just one.

This echoes the pattern used by JavaScript’s Array.map method,

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map

in which the function passed to map can refer not only to the current element in the list, but also to the index of that element (zero-based in JS, but 1-based in AS), and additionally to the whole list.

Passing scripts rather than handlers to map, filter, fold

Defining these generic loop patterns in terms of an mReturn or scriptWrapped function allows us to pass either handlers or pre-existing scripts as arguments.

We could, for example write:

``````on run

set xs to {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

script
on |λ|(x)
x * 2
end |λ|
end script
map(result, xs)
--> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}

-- or

script double
on |λ|(x)
x * 2
end |λ|
end script
map(double, xs)
--> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}

end run

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
tell mReturn(f)
set lng to length of xs
set lst to {}
repeat with i from 1 to lng
set end of lst to |λ|(item i of xs, i, xs)
end repeat
return lst
end tell
end map

-- Lift 2nd class handler function into 1st class script wrapper
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
if class of f is script then
f
else
script
property |λ| : f
end script
end if
end mReturn
``````

Why and when ?

These patterns work well, in my experience, when I am writing code for my own use.

• Scripts can be quickly assembled from reusable generics (I maintain parallel libraries of c. 350 generic functions in AS and JS, and I’m beginning to assemble one in Swift),
• The way they click together is rather simple and predictable - and I find that I spend much less time in debugging as a result
• Using the same generic patterns across different languages eases a bit of the cognitive burden for me, and I also find that I use the functions as reference manuals. I may not always use the simplest ones (It can be faster, on occasions when that’s helpful, to just inline the code, but if I know how do do something in one language, and have forgotten the details in the other, I tend to just look up the familiar generic function, and glance at its implementation in the language that I’m writing in.

When not to use these patterns ?

When particular economics or practicalities make human time (hand-coding at a lower level, stepping through debuggers when a hand-written mutation has found some puzzling edge condition) less expensive than execution time, then you should always be able to manually soup up your inner loops.

Even then (and even I sometimes want a diagram, for example, to generate more swiftly), I find that it can work well to first build a pre-fab structure with (map and fold and other generics) and then replace some of the inner iterations with something hand written.

In practice tho, in my own context, I am increasingly finding (with OmniGraffle etc now starting to acquire their own JSContexts) that when maxed-out zippiness at any cost is what I want, I may end up doing it in JS).

(Not always, of course, where no JSContext is to hand, AS is, as we know, still much faster at batch-setting properties).

As for ‘FP’, well, others take different views but I happen to be a bit agnostic about that notion – is there really a clear dividing line ? All coding involves functions. Even sequences and semi-colons can be interpreted as functions …

But I do know that map filter and fold/reduce make my life a bit easier in any language

Saving Mail message attachments
Converting an NSAttributedString into an html string
(Mark Alldritt) #2

Fair enough, but it does not have to be so complex. AppleScript only requires that identifiers containing handlers not be local to a handler. Your use of script objects can removed in order to see this in action:

``````on map(f, theData)
global ff
set ff to f
set theResult to {}
repeat with anItem in theData
set end of theResult to ff(anItem)
end repeat
return theResult
end map

on double(x)
x * 2
end double

map(double, {1, 2, 3, 4, 5})
--> {2, 4, 6, 8, 10}
``````

This version does not handle recursion (i.e. `f` calling `map`). To deal with that we can introduce a local script object to provide a scope to hold a non-local variable (property) containing the handler (use of `ff` vs `|λ|` is simply a matter of style and preference):

``````on map(f, theData)
script s
property ff : f
end script
set theResult to {}
repeat with anItem in theData
set end of theResult to s's ff(anItem)
end repeat
return theResult
end map

on double(x)
x * 2
end double

map(double, {1, 2, 3, 4, 5})
--> {2, 4, 6, 8, 10}
``````

So this is what ComplexPoint’s code is actually accomplishing (all I’ve really done is remove the `tell` part). His code is more highly abstracted and may possibly execute a little faster in some instances.

#3

There are certainly different routes to it – Matt Neuburg’s book showcases a slightly filter for example.

There is always a danger, however, of confusing ‘complexity’ with unfamiliarity.

One you start start to make global name-bindings, the real complexity of the system – the network of potential interactions between different parts of the code, and the number of states that the system can get into, actually becomes much more complex.

The ‘simplicity’ of the system is not the same as the familiarity of the syntax.

On the other hand, it is of course, a perfectly fair point that achieving an inherently simpler set of interactions – building only by composition, and with mainly pre-fabricated parts for example – may come at the cost of overcoming some unfamiliarity.

#4

``````on map(f, theData)
script s
property ff : f
end script
set theResult to {}
repeat with anItem in theData
set end of theResult to s's ff(anItem)
end repeat
return theResult
end map
``````

While syntactically well-formed, and feasible in some limited cases, doesn’t actually work for the more general pattern of flexibly passing either handlers or scripts.

Your redefinition would yield only a run-time error with my earlier examples of passing scripts - it fails here, for example:

``````on run

set xs to {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

script
on |λ|(x)
x * 2
end |λ|
end script
map(result, xs)
--> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}

-- or

script double
on |λ|(x)
x * 2
end |λ|
end script
map(double, xs)
--> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}

end run

on map(f, theData)
script s
property ff : f
end script
set theResult to {}
repeat with anItem in theData
set end of theResult to s's ff(anItem)
end repeat
return theResult
end map
``````

(and the pattern of passing scripts, rather than handlers, becomes useful when we want to capture values from closures)

Using concatMap with closures in AppleScript
(Phil Stokes) #5

That’s something that I can certainly find a use for. I welcome ways to reduce the litter of repeat blocks that I end up with in my scripts (and code: I haven’t really explored these kind of functions in Swift either).

#6

I think the fold function (in its left and right-handed variants – foldl and foldr) might repay a bit of experimentation and familiarisation.

It’s really the deepest and most flexible of the pre-packaged, pre-tested and reusable loops.

You can do pretty much anything with it, including, if you wanted, defining map and filter in terms of it.

The a is the accumulator value, initialised as startValue, and the x is the current item in the list.

In each step of an iteration, the function passed to a fold does some work, and returns an updated value for the accumulator.

``````-- MAP IN TERMS OF FOLDL ------------------------------------------------------------------

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
script
property g : mReturn(f)'s |λ|
on |λ|(a, x)
a & g(x)
end |λ|
end script
foldl(result, {}, xs)
end map

-- FILTER IN TERMS OF FOLDL ------------------------------------------------------------------

-- filter :: (a -> Bool) -> [a] -> [a]
on filter(p, xs)
script
property f : mReturn(p)'s |λ|
on |λ|(a, x)
if f(x) then
a & x
else
a
end if
end |λ|
end script
foldl(result, {}, xs)
end filter

on even(x)
x mod 2 = 0
end even

on double(x)
x * 2
end double

-- TEST ------------------------------------------------------------------
on run
set xs to {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

map(double, xs)
--> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}

filter(even, xs)
--> {2, 4, 6, 8, 10}
end run

-- GENERIC ------------------------------------------------------------------

-- foldl :: (a -> b -> a) -> a -> [b] -> a
on foldl(f, startValue, xs)
tell mReturn(f)
set v to startValue
set lng to length of xs
repeat with i from 1 to lng
set v to |λ|(v, item i of xs, i, xs)
end repeat
return v
end tell
end foldl

-- foldr :: (a -> b -> b) -> b -> [a] -> b
on foldr(f, startValue, xs)
tell mReturn(f)
set v to startValue
set lng to length of xs
repeat with i from lng to 1 by -1
set v to |λ|(item i of xs, v, i, xs)
end repeat
return v
end tell
end foldr

-- Lift 2nd class handler function into 1st class script wrapper
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
if class of f is script then
f
else
script
property |λ| : f
end script
end if
end mReturn
``````

#7

Whether it is ‘better’ would depend on what you want to optimise for, and what is familiar to you.

Pasting in a pre-cooked and reusable concatMap is just an alternative to hand-crafting your own repeat loops and associated variables each time.

(Boundary conditions and variable mutations in loops can tend to be the location of a fair proportion of bugs and glitches, and copy-pasting reusable generic loops does seem, empirically, to reduce the bug count a bit, as well as saving some typing).

Unfamiliarity is of course, also a speed bump, so mileage on the value of this approach is likely to be variable.

Saving Mail message attachments
(Vince Angeloni) #8

Heh — yeah. I’d have to reread Matt Neuburg’s book just to understand what is going on in that script. Thanks for posting, though… gives me something to aspire to learn.

#9

map or filter may be good places to start.

(Matt Neuburg does present a filter in his book, though his is perhaps a little more complex, and, being recursively implemented, runs out of stack space and returns an error with longer lists (c. 600)).

(Mark Alldritt) #10

A post was merged into an existing topic: Saving Mail message attachments

(Daniel Rodrigue) #11

Is that kind of “way of writing applescript” should be explain to new user? This kind of annotation show either it come from quite “old AppleScripter” or someone who read carefully the documentation…
I like it!

#12

Je me souviens encore de l’année où McCarthy a publié son célèbre article qui présentait le Lisp et les expressions symboliques.

Cela m’a beaucoup impressionné.

(Je n’avais toujours pas appris à lire, évidemment, mais tout m’impressionnait à cette époque)

(CK) #13

To explain briefly, variable names in AppleScript can ordinarily be comprised of numbers, letters and underscores only, but must not start with a number.

So `foo_bar123`, `_foobar123`, `_123` and even `_` are all valid AppleScript variable names.

`123foobar` is not. Nor is anything that contains illegal characters, such as `foo&bar` or `foo bar` (contains a space).

AppleScript, however, provides a way to circumvent these restrictions in the form of the notation `|...|`, i.e. two vertical bars surrounding the name of the variable represented here by the `...`.

Between these two vertical bars, any character you wish (except a vertical bar) can be used to make up the variable name, and in any order.

So, `|123foobar|`, `|foo&bar|` and `|foo bar|` all become valid AppleScript variable names. Even `|...|` (with a literal `"..."`) is a valid variable name, as is `|à cheval donné on ne regarde pas les dents|` (that’s a single variable with a very long name).

When one first discovers this is possible, it’s tempting to use it to insert all sorts of fancy notation into one’s script. However, it soon becomes a chore when you realise that `|𝑥²|` takes ages to type; even with text expansions/snippets, it slows a fast typer (typist?) down substantially. Also, when your script starts filling up with symbols like `|\$_#|` and `|%USER%|`, it very quickly makes the script less and less readable, even to yourself, and there’s the increased possibility that you forget what each variable was for when its name is more abstract than descriptive.

Without meaning to speak on @ComplexPoint’s behalf, I have actually only ever seen him utilise this double-bar notation specifically for `|λ|`, which I assume is a nod to lambda functions that some other programming languages feature, where one can define a function without an identifier, i.e. an anonymous function.

I use one or two additional double-bar variable names, but for very specific, identifiable purposes that I don’t vary, such as `|?|` for a predicate that evaluates to `true` or `false`, and one or two mathematical notational conveniences. Whilst I had many more in the beginning, the cost vs benefit of using them quickly whittled them down to an intrinsic few (whilst others got replaced by `_1`, `_2`, etc., which, for me, denote arguments passed to a `run` handler where descriptive labels aren’t reliable; and `_` and `__`, which are now reserved for paths to parent or top-level script objects—all of these simply mimicking bash scripting terminology).