My goal is to extract an array of three numbers from a string, in the following script.
use framework "Foundation"
use scripting additions
set stringContainingfNumbers to "39 apples found. Showing 1 - 20 sorted by name"
set anNSStringContainingfNumbers to current application's NSString's stringWithString:stringContainingNumbers
set anNSArrayContainingNumbers to ((anNSStringContainingNumbers's componentsSeparatedByString:" ")'s valueForKey:"intValue")
The script partially succeeded but yielded a series of unwanted zeros.
=> (NSArray) {39, 0, 0, 0, 1, 0, 20, 0, 0, 0}
I tried to filter the array with a predicate to remove the zeros, using the following lines, but it raised an error.
set thePred to {current application's NSPredicate's predicateWithFormat:"intValue BETWEEN {1, 9}"}
set result to (anNSArrayContainingNumbers's filteredArrayUsingPredicate:thePred)
=>…unrecognized selector sent to instance …
What might be a correct method to extract numbers from a string?
One approach might be to lex the string, grouping runs of numeric characters together:
-------------- LIST OF NUMBERS FOUND IN STRING -------------
-- numbersInString :: String -> [Number]
on numbersInString(s)
script numericOrOther
on |λ|(a, b)
isNumeric(a) = isNumeric(b)
end |λ|
end script
script numbersOnly
on |λ|(cs)
set w to concat(cs)
if isNaN(w) then
{}
else
{w as number}
end if
end |λ|
end script
concatMap(numbersOnly, ¬
groupBy(numericOrOther, characters of s))
end numbersInString
---------------------------- TEST --------------------------
on run
set sample to "39 apples found. Showing 1 - 20 sorted by name"
numbersInString(sample)
--> {39, 1, 20}
end run
-------------------------- GENERIC -------------------------
-- https://github.com/RobTrew/prelude-jxa
-- concat :: [[a]] -> [a]
-- concat :: [String] -> String
on concat(xs)
set lng to length of xs
if 0 < lng and string is class of (item 1 of xs) then
set acc to ""
else
set acc to {}
end if
repeat with i from 1 to lng
set acc to acc & item i of xs
end repeat
acc
end concat
-- concatMap :: (a -> [b]) -> [a] -> [b]
on concatMap(f, xs)
set lng to length of xs
set acc to {}
tell mReturn(f)
repeat with i from 1 to lng
set acc to acc & (|λ|(item i of xs, i, xs))
end repeat
end tell
return acc
end concatMap
-- foldl :: (a -> b -> a) -> a -> [b] -> a
on foldl(f, startValue, xs)
tell mReturn(f)
set v to startValue
set lng to length of xs
repeat with i from 1 to lng
set v to |λ|(v, item i of xs, i, xs)
end repeat
return v
end tell
end foldl
-- Typical usage: groupBy(on(eq, f), xs)
-- groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
on groupBy(f, xs)
set mf to mReturn(f)
script enGroup
on |λ|(a, x)
if length of (active of a) > 0 then
set h to item 1 of active of a
else
set h to missing value
end if
if h is not missing value and mf's |λ|(h, x) then
{active:(active of a) & {x}, sofar:sofar of a}
else
{active:{x}, sofar:(sofar of a) & {active of a}}
end if
end |λ|
end script
if length of xs > 0 then
set dct to foldl(enGroup, {active:{item 1 of xs}, sofar:{}}, rest of xs)
if length of (active of dct) > 0 then
sofar of dct & {active of dct}
else
sofar of dct
end if
else
{}
end if
end groupBy
-- isNumeric :: Char -> Bool
on isNumeric(c)
set n to (id of c)
(48 ≤ n and 57 ≥ n) or ("-." contains c)
end isNumeric
-- isNaN :: String -> Bool
on isNaN(s)
try
s as number
false
on error
true
end try
end isNaN
-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
-- 2nd class handler function lifted into 1st class script wrapper.
if script is class of f then
f
else
script
property |λ| : f
end script
end if
end mReturn
You could use a regular expression search, or you could use a string scanner:
use framework "Foundation"
use scripting additions
set stringContainingNumbers to "39 apples found. Showing 1 - 20 sorted by name"
set theScanner to current application's NSScanner's scannerWithString:stringContainingNumbers
set digits to current application's NSCharacterSet's decimalDigitCharacterSet()
set theNums to {}
repeat
theScanner's scanUpToCharactersFromSet:digits intoString:(missing value)
set {theResult, theValue} to theScanner's scanInteger:(reference)
if theResult is false then exit repeat -- no more
set end of theNums to theValue as integer
end repeat
return theNums
This is quite similar to scriptingmd’s own code:
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions
set stringContainingNumbers to "39 apples found. Showing 1 - 20 sorted by name"
set NSStringContainingNumbers to current application's NSString's stringWithString:(stringContainingNumbers)
set nonDigits to current application's NSCharacterSet's decimalDigitCharacterSet()'s invertedSet()
set numsEtc to (NSStringContainingNumbers's componentsSeparatedByCharactersInSet:(nonDigits))'s mutableCopy()
tell numsEtc to removeObject:("")
set theNums to (numsEtc's valueForKey:("integerValue")) as list
set stringContainingNumbers to "39 apples found. ¬
Showing 1 - 20 sorted by name. ¬
Big numbers like 1,234. ¬
Also decimals like 1.234"
set wordsFromString to words of stringContainingNumbers
set numbersInString to {}
repeat with thisWord in wordsFromString
try
set the end of numbersInString to thisWord as number
end try
end repeat
return numbersInString
I appreciate all of the responses showing multiple ways to extract numbers from a string, whether by
• Lexing the string and grouping runs of numeric characters together
• Scanning the string for integers with NSScanner
• Splitting the string on the inverted set of decimal digits and removing the empty strings
• Taking the string’s words and keeping those that can be coerced to a number.
When I reviewed my original post as to the reasons for its predicate method failing, I found three errors:
1• Erroneously inserting an extra character “f” into some but not every string variable between the words “Containing” and “Numbers”, which I have removed
2• Erroneously bracketing with curly braces {} instead of using syntactically correct parentheses () when setting thePred variable, which I have replaced
3• Overly restricting the range of the predicate’s BETWEEN comparison, which I have changed from {1, 9} to {1, 100}
With those corrections, the following predicate script now finds the numbers in the string.
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions
set stringContainingNumbers to "39 apples found. Showing 1 - 20 sorted by name"
set anNSStringContainingNumbers to current application's NSString's stringWithString:stringContainingNumbers
set anNSArrayContainingNumbers to ((anNSStringContainingNumbers's componentsSeparatedByString:" ")'s valueForKey:"intValue")
set thePred to (current application's NSPredicate's predicateWithFormat:"intValue BETWEEN {1, 100}")
set result to (anNSArrayContainingNumbers's filteredArrayUsingPredicate:thePred) as list
In the interest of understanding this process, I welcome any further insights into, or critiques of, my script.
It only finds the integer values in a string: the fractional part of any decimal value is discarded rather than rounded. It also doesn’t work if a number is written as 1,234. But since you’re limiting the values to 100, that seems to be fine for your purposes.
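To illustrate those two limitations, here is a minimal sketch of what intValue returns for a few strings. The names samples and parsed are only for illustration and are not part of your script.

```applescript
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

-- Hypothetical sample strings: a grouped number and two decimals
set samples to {"1,234", "1.234", "9.876"}
set parsed to {}
repeat with aSample in samples
	set anNSString to (current application's NSString's stringWithString:(contents of aSample))
	-- intValue reads leading digits and stops at the first non-digit character
	set end of parsed to (anNSString's intValue()) as integer
end repeat
parsed
--> {1, 1, 9}
```

So "1,234" comes back as 1, and 9.876 is truncated to 9 rather than rounded to 10.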
Extracting numbers from words in the string is the only method to reliably get the correct result from this input:
set stringContainingNumbers to "39 apples found.
Showing 1 - 20 sorted by name.
Big numbers like 1234.
Or Written this way 1,234
Also decimals like 1.234
Or 9.876"
-->>{39, 1, 20, 1234, 1234, 1.234, 9.876}
I think you’re solving a more general problem — there was no indication that the OP was also after reals. And as a general solution, it has a fairly serious failing: it doesn’t handle negative values. It will also fail in locales that use different decimal and grouping separators.
Yes. Honestly, I think this is a case where one size can’t fit all (and risks being needlessly complicated trying). But I suspect a regular-expressions-based solution would be most efficient.
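Since a regular expression was suggested but never shown, here is a minimal sketch using NSRegularExpression. It assumes that only unsigned integers are wanted, so the pattern "[0-9]+" is my own choice rather than anything posted above, and it would need extending to cover signs, decimals or grouping separators.

```applescript
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set stringContainingNumbers to "39 apples found. Showing 1 - 20 sorted by name"
set theString to current application's NSString's stringWithString:stringContainingNumbers
-- Assumed pattern: one or more ASCII digits (unsigned integers only)
set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:"[0-9]+" options:0 |error|:(missing value)
-- Search the whole string; each match is an NSTextCheckingResult
set theMatches to (theRegex's matchesInString:theString options:0 range:{0, theString's |length|()}) as list
set theNums to {}
repeat with aMatch in theMatches
	-- Pull out the matched substring and coerce it to an AppleScript integer
	set end of theNums to ((theString's substringWithRange:(aMatch's range())) as text) as integer
end repeat
theNums
--> {39, 1, 20}
```

Because the pattern matches only digits, "1,234" would come back as 1 and 234, and "1.234" likewise as two separate integers.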