Using map, filter and reduce or fold in Applescript


#1

(Branching from a discussion with Phil http://forum.latenightsw.com/t/converting-an-nsattributedstring-into-an-html-string/1048/14)

Map filter and ‘reduce’ or ‘fold’ are useful when we need to assemble scripts quickly for ourselves, reducing the risks of haste and accident that loops are heir to.

They are essentially pasteable pre-tested loops of generic patterns:

  • map for obtaining transformed lists of unchanged length
  • fold (or ‘reduce’ in JS) for obtaining single values from a list
  • filter for defined pruning

First some examples, and then a note on doing it in AppleScript.

Examples:

on double(x)
    x * 2
end double

on half(x)
    x / 2
end half

on even(x)
    x mod 2 = 0
end even

on add(a, b)
    a + b
end add

on mult(a, b)
    a * b
end mult

on sum(xs)
    foldl(add, 0, xs)
end sum

on product(xs)
    foldl(mult, 1, xs)
end product


-- TEST ------------------------------------------------------------------
on run
    map(double, {1, 2, 3, 4, 5})
    --> {2, 4, 6, 8, 10}
    
    map(half, {1, 2, 3, 4, 5})
    --> {0.5, 1.0, 1.5, 2.0, 2.5}
    
    filter(even, {1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
    --> {2, 4, 6, 8, 10}
    
    sum({1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
    --> 55
    
    product({1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
    --> 3628800
end run



-- GENERICS --------------------------------------------------------------

-- filter :: (a -> Bool) -> [a] -> [a]
on filter(f, xs)
    tell mReturn(f)
        set lst to {}
        set lng to length of xs
        repeat with i from 1 to lng
            set v to item i of xs
            if |λ|(v, i, xs) then set end of lst to v
        end repeat
        return lst
    end tell
end filter

-- foldl :: (a -> b -> a) -> a -> [b] -> a
on foldl(f, startValue, xs)
    tell mReturn(f)
        set v to startValue
        set lng to length of xs
        repeat with i from 1 to lng
            set v to |λ|(v, item i of xs, i, xs)
        end repeat
        return v
    end tell
end foldl

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
    tell mReturn(f)
        set lng to length of xs
        set lst to {}
        repeat with i from 1 to lng
            set end of lst to |λ|(item i of xs, i, xs)
        end repeat
        return lst
    end tell
end map

-- Lift 2nd class handler function into 1st class script wrapper
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
    if class of f is script then
        f
    else
        script
            property |λ| : f
        end script
    end if
end mReturn

Map fold and filter in AppleScript

The generality and reusability of map, fold and filter depend on our passing other functions as arguments to them.

Plain AppleScript handlers are not in themselves ‘first-class’ functions that can fully survive being passed as values like this, and at first this looks like an obstacle.

Fortunately, however:

  1. Script objects with handlers are first class objects, and
  2. we can wrap any handler in a script at run-time.

The oddly named mReturn function which I personally use to do this might be named more clearly as something like scriptWrapped(handler)

-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
    if class of f is script then
        f
    else
        script
            property |λ| : f
        end script
    end if
end mReturn
  1. If the argument is already a script, it is returned unchanged
  2. otherwise, it is assumed to be a handler, and the return value is a script which has that handler as a property
  3. the original name of the handler is discarded, and something more anonymous, like lambda or |λ| is used instead.

This allows us to write things like map(double, {1,2,3,4,5}) – defining map in a way which script-wraps any handler that is passed as its first argument:

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
    tell mReturn(f)
        set lng to length of xs
        set lst to {}
        repeat with i from 1 to lng
            set end of lst to |λ|(item i of xs, i, xs)
        end repeat
        return lst
    end tell
end map

In short, our map is just a pre-packaged, pre-tested and reusable loop, in which any handler passed as the argument f is applied in turn to each item in the list xs.

A few more details

Additional arguments for |λ|

Inside the map loop (and equally inside the filter and foldl loops), you may notice that the list transforming function (passed as f, and applied as |λ|, can have up to 3 arguments, rather than just one.

This echoes the pattern used by JavaScript’s Array.map method,

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map

in which the function passed to map can refer not only to the current element in the list, but also to the index of that element (zero-based in JS, but 1-based in AS), and additionally to the whole list.

Passing scripts rather than handlers to map, filter, fold

Defining these generic loop patterns in terms of an mReturn or scriptWrapped function allows us to pass either handlers or pre-existing scripts as arguments.

We could, for example write:

on run
    
    set xs to {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
    
    script
        on |λ|(x)
            x * 2
        end |λ|
    end script
    map(result, xs)
    --> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
    
    -- or 
    
    script double
        on |λ|(x)
            x * 2
        end |λ|
    end script
    map(double, xs)
    --> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
    
end run


-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
    tell mReturn(f)
        set lng to length of xs
        set lst to {}
        repeat with i from 1 to lng
            set end of lst to |λ|(item i of xs, i, xs)
        end repeat
        return lst
    end tell
end map

-- Lift 2nd class handler function into 1st class script wrapper
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
    if class of f is script then
        f
    else
        script
            property |λ| : f
        end script
    end if
end mReturn

Why and when ?

These patterns work well, in my experience, when I am writing code for my own use.

  • Scripts can be quickly assembled from reusable generics (I maintain parallel libraries of c. 350 generic functions in AS and JS, and I’m beginning to assemble one in Swift),
  • The way they click together is rather simple and predictable - and I find that I spend much less time in debugging as a result
  • Using the same generic patterns across different languages eases a bit of the cognitive burden for me, and I also find that I use the functions as reference manuals. I may not always use the simplest ones (It can be faster, on occasions when that’s helpful, to just inline the code, but if I know how do do something in one language, and have forgotten the details in the other, I tend to just look up the familiar generic function, and glance at its implementation in the language that I’m writing in.

When not to use these patterns ?

When particular economics or practicalities make human time (hand-coding at a lower level, stepping through debuggers when a hand-written mutation has found some puzzling edge condition) less expensive than execution time, then you should always be able to manually soup up your inner loops.

Even then (and even I sometimes want a diagram, for example, to generate more swiftly), I find that it can work well to first build a pre-fab structure with (map and fold and other generics) and then replace some of the inner iterations with something hand written.

In practice tho, in my own context, I am increasingly finding (with OmniGraffle etc now starting to acquire their own JSContexts) that when maxed-out zippiness at any cost is what I want, I may end up doing it in JS).

(Not always, of course, where no JSContext is to hand, AS is, as we know, still much faster at batch-setting properties).

As for ‘FP’, well, others take different views but I happen to be a bit agnostic about that notion – is there really a clear dividing line ? All coding involves functions. Even sequences and semi-colons can be interpreted as functions …

But I do know that map filter and fold/reduce make my life a bit easier in any language :slight_smile:


Saving Mail message attachments
Converting an NSAttributedString into an html string
(Mark Alldritt) #2

Fair enough, but it does not have to be so complex. AppleScript only requires that identifiers containing handlers not be local to a handler. Your use of script objects can removed in order to see this in action:

on map(f, theData)
	global ff
	set ff to f
	set theResult to {}
	repeat with anItem in theData
		set end of theResult to ff(anItem)
	end repeat
	return theResult
end map

on double(x)
	x * 2
end double

map(double, {1, 2, 3, 4, 5})
--> {2, 4, 6, 8, 10}

This version does not handle recursion (i.e. f calling map). To deal with that we can introduce a local script object to provide a scope to hold a non-local variable (property) containing the handler (use of ff vs |λ| is simply a matter of style and preference):

on map(f, theData)
	script s
		property ff : f
	end script
	set theResult to {}
	repeat with anItem in theData
		set end of theResult to s's ff(anItem)
	end repeat
	return theResult
end map

on double(x)
	x * 2
end double

map(double, {1, 2, 3, 4, 5})
--> {2, 4, 6, 8, 10}

So this is what ComplexPoint’s code is actually accomplishing (all I’ve really done is remove the tell part). His code is more highly abstracted and may possibly execute a little faster in some instances.


#3

There are certainly different routes to it – Matt Neuburg’s book showcases a slightly filter for example.

There is always a danger, however, of confusing ‘complexity’ with unfamiliarity.

One you start start to make global name-bindings, the real complexity of the system – the network of potential interactions between different parts of the code, and the number of states that the system can get into, actually becomes much more complex.

The ‘simplicity’ of the system is not the same as the familiarity of the syntax.

On the other hand, it is of course, a perfectly fair point that achieving an inherently simpler set of interactions – building only by composition, and with mainly pre-fabricated parts for example – may come at the cost of overcoming some unfamiliarity.


#4

Incidentally, your example here:

on map(f, theData)
    script s
        property ff : f
    end script
    set theResult to {}
    repeat with anItem in theData
        set end of theResult to s's ff(anItem)
    end repeat
    return theResult
end map

While syntactically well-formed, and feasible in some limited cases, doesn’t actually work for the more general pattern of flexibly passing either handlers or scripts.

Your redefinition would yield only a run-time error with my earlier examples of passing scripts - it fails here, for example:

on run
    
    set xs to {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
    
    script
        on |λ|(x)
            x * 2
        end |λ|
    end script
    map(result, xs)
    --> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
    
    -- or 
    
    script double
        on |λ|(x)
            x * 2
        end |λ|
    end script
    map(double, xs)
    --> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
    
end run

on map(f, theData)
    script s
        property ff : f
    end script
    set theResult to {}
    repeat with anItem in theData
        set end of theResult to s's ff(anItem)
    end repeat
    return theResult
end map

(and the pattern of passing scripts, rather than handlers, becomes useful when we want to capture values from closures)


Using concatMap with closures in AppleScript
(Phil Stokes) #5

That’s something that I can certainly find a use for. I welcome ways to reduce the litter of repeat blocks that I end up with in my scripts (and code: I haven’t really explored these kind of functions in Swift either).


#6

I think the fold function (in its left and right-handed variants – foldl and foldr) might repay a bit of experimentation and familiarisation.

It’s really the deepest and most flexible of the pre-packaged, pre-tested and reusable loops.

You can do pretty much anything with it, including, if you wanted, defining map and filter in terms of it.

The a is the accumulator value, initialised as startValue, and the x is the current item in the list.

In each step of an iteration, the function passed to a fold does some work, and returns an updated value for the accumulator.

-- MAP IN TERMS OF FOLDL ------------------------------------------------------------------

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
    script
        property g : mReturn(f)'s |λ|
        on |λ|(a, x)
            a & g(x)
        end |λ|
    end script
    foldl(result, {}, xs)
end map

-- FILTER IN TERMS OF FOLDL ------------------------------------------------------------------

-- filter :: (a -> Bool) -> [a] -> [a]
on filter(p, xs)
    script
        property f : mReturn(p)'s |λ|
        on |λ|(a, x)
            if f(x) then
                a & x
            else
                a
            end if
        end |λ|
    end script
    foldl(result, {}, xs)
end filter

on even(x)
    x mod 2 = 0
end even

on double(x)
    x * 2
end double

-- TEST ------------------------------------------------------------------
on run
    set xs to {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
    
    map(double, xs)
    --> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
    
    filter(even, xs)
    --> {2, 4, 6, 8, 10}
end run

-- GENERIC ------------------------------------------------------------------

-- foldl :: (a -> b -> a) -> a -> [b] -> a
on foldl(f, startValue, xs)
    tell mReturn(f)
        set v to startValue
        set lng to length of xs
        repeat with i from 1 to lng
            set v to |λ|(v, item i of xs, i, xs)
        end repeat
        return v
    end tell
end foldl

-- foldr :: (a -> b -> b) -> b -> [a] -> b
on foldr(f, startValue, xs)
    tell mReturn(f)
        set v to startValue
        set lng to length of xs
        repeat with i from lng to 1 by -1
            set v to |λ|(item i of xs, v, i, xs)
        end repeat
        return v
    end tell
end foldr

-- Lift 2nd class handler function into 1st class script wrapper
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
    if class of f is script then
        f
    else
        script
            property |λ| : f
        end script
    end if
end mReturn