Find a Regular Expression and Return Ranges

@ShaneStanley

Hey Shane,

Would you mind demonstrating how to return the found ranges for the regular expression instead of the strings?

Also – could you point me to the properties of NSRegularExpression in the developer docs? I was trying to see what I could extract and what the nomenclature was but got totally lost in the mire…

-Chris

--------------------------------------------------------
use framework "Foundation"
use scripting additions
--------------------------------------------------------

set myData to "
my data 01 - my_data_label_01
my data 02 - my_data_label_02
my data 03 - my_data_label_03
"

my regexFindWithCapture:"my data" fromString:myData resultTemplate:"$0"

# my regexFindWithCapture:"(?m)^(.)(.) (.)(.)(.)" fromString:myData resultTemplate:"$0"
# 
# my regexFindWithCapture:"(?m)^.+" fromString:myData resultTemplate:"$0"

--------------------------------------------------------
--» HANDLERS
--------------------------------------------------------
on regexFindWithCapture:thePattern fromString:theString resultTemplate:templateStr
   set theString to current application's NSString's stringWithString:theString
   set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
   set theFinds to theRegEx's matchesInString:theString options:0 range:{0, theString's |length|()}
   set theResult to current application's NSMutableArray's array()
   
   repeat with aFind in theFinds
      set foundString to (theRegEx's replacementStringForResult:aFind inString:theString |offset|:0 template:templateStr)
      (theResult's addObject:foundString)
   end repeat
   
   return theResult as list
   
end regexFindWithCapture:fromString:resultTemplate:
--------------------------------------------------------

There are three methods you can use. The most common is the one in your script:

   set theFinds to theRegEx's matchesInString:theString options:0 range:{0, theString's |length|()}

That returns an array of NSTextCheckingResult objects. Their relevant methods are: -range, -numberOfRanges, and -rangeAtIndex:.
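
As a minimal sketch of those three methods, the snippet below matches a pattern with two capture groups against one line of the sample data and queries the first result. The pattern and the variable names are illustrative assumptions, not part of the original script.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

set theString to current application's NSString's stringWithString:"my data 01 - my_data_label_01"
set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:"(my data) (\\d+)" options:0 |error|:(missing value)
set theFinds to theRegEx's matchesInString:theString options:0 range:{0, theString's |length|()}
set aFind to theFinds's firstObject() -- an NSTextCheckingResult

set rangeCount to aFind's numberOfRanges() -- whole match plus one range per capture group
set fullRange to aFind's range() -- the whole match, same as rangeAtIndex:0
set groupTwoRange to aFind's rangeAtIndex:2 -- the digits captured by the second group
```

Each range comes back as a record with a zero-based location and a length. Here that should mean 3 ranges, a whole match at location 0 with length 10, and the digits at location 8 with length 2.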

So, ignoring capture groups, you can get all the ranges like this:

   set theRanges to (theRegEx's matchesInString:theString options:0 range:{0, theString's |length|()})'s valueForKey:"range"
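
One caveat worth sketching: valueForKey: goes through key-value coding, so the array it returns holds NSValue objects that wrap each range rather than the ranges themselves. Assuming you want plain AppleScript records, one way is to unwrap each value with rangeValue(). The sample string and variable names here are made up for illustration.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

set theString to current application's NSString's stringWithString:"my data 01, my data 02"
set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:"my data \\d+" options:0 |error|:(missing value)
set theValues to (theRegEx's matchesInString:theString options:0 range:{0, theString's |length|()})'s valueForKey:"range"

set theRanges to {}
repeat with aValue in theValues
   set end of theRanges to (aValue's rangeValue()) -- unwrap the NSValue into a {location, length} record
end repeat
```

With this input theRanges should end up holding two records, one per match.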

For capture groups, you have to loop through, something like:

   set theRanges to {}
   set theFinds to theRegEx's matchesInString:theString options:0 range:{0, theString's |length|()}
   repeat with aFind in theFinds
      set end of theRanges to (aFind's rangeAtIndex:1) -- index of capture group you want
   end repeat

If you’re only after a single or first match, you can use rangeOfFirstMatchInString:options:range: for the full range, or firstMatchInString:options:range: to get a single NSTextCheckingResult.
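
A rough sketch of the two first-match methods side by side, again with an assumed pattern and made-up variable names:

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

set theString to current application's NSString's stringWithString:"my data 01 - my_data_label_01"
set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:"(my data) (\\d+)" options:0 |error|:(missing value)

-- Only the range of the first whole match; capture groups are not available this way.
set firstRange to theRegEx's rangeOfFirstMatchInString:theString options:0 range:{0, theString's |length|()}

-- The first match as an NSTextCheckingResult, or missing value when nothing matches.
set firstFind to theRegEx's firstMatchInString:theString options:0 range:{0, theString's |length|()}
if firstFind is missing value then
   set groupOneRange to missing value
else
   set groupOneRange to firstFind's rangeAtIndex:1 -- first capture group
end if
```

When nothing matches, rangeOfFirstMatchInString:options:range: returns a range whose location is NSNotFound and whose length is 0, so testing the length for 0 is a simple way to detect a miss.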


Hey Shane,

Many thanks!

Okay, I’ve got the capture-group method working (I think).

This is useful when finding and replacing text in a Script Debugger script document when you don’t want to work with the entire script text.

For instance – I’ve got a script that updates just the modification-date in my script header.

It would be easier to just find / replace in the entire script text, replace the script text, and compile, but that doesn’t feel as organic – and I want to be able to eyeball the replaced text before compiling.

-Chris

--------------------------------------------------------
# Auth: Christopher Stone { Heavy Lifting by Shane Stanley }
# dCre: 2021/09/07 01:34
# dMod: 2021/09/07 01:48
# Appl: AppleScriptObjC
# Task: Find Using RegEx with Capture and Returning Range(s).
# Libs: None
# Osax: None
# Tags: @Applescript, @Script, @ASObjC, @Find, @RegEx, @Range
# Test: macOS 10.14.6
# Vers: 1.00
--------------------------------------------------------
use framework "Foundation"
use scripting additions
--------------------------------------------------------

set myData to "

My Data 01 - my_data_label_01
My Data 02 - my_data_label_02

"

my regexFindWithCaptureReturningRange:"(?i)(my data)( \\d+)" fromString:myData captureGroup:0

--------------------------------------------------------
on regexFindWithCaptureReturningRange:thePattern fromString:dataStr captureGroup:theCaptureGroup
   set dataStr to current application's NSString's stringWithString:dataStr
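   set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
   if theRegEx is missing value then error "Invalid regular expression: " & thePattern -- a bad pattern returns missing value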
   set foundStrRangeList to {}
   set foundItemArray to theRegEx's matchesInString:dataStr options:0 range:{0, dataStr's |length|()}
   repeat with foundItem in foundItemArray
      set end of foundStrRangeList to (foundItem's rangeAtIndex:theCaptureGroup) -- index of capture group you want
   end repeat
   return foundStrRangeList as list
end regexFindWithCaptureReturningRange:fromString:captureGroup:
--------------------------------------------------------
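
To actually use a returned range on AppleScript text, the zero-based location has to be turned into one-based offsets. Here is a minimal sketch of that conversion, cross-checked against NSString's substringWithRange:. The variable names are made up for illustration, and it assumes the text contains no characters outside the Basic Multilingual Plane (emoji and the like), since NSString counts UTF-16 units while AppleScript counts characters.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

set dataStr to "My Data 01 - my_data_label_01"
set nsStr to current application's NSString's stringWithString:dataStr
set theRegEx to current application's NSRegularExpression's regularExpressionWithPattern:"(?i)(my data)( \\d+)" options:0 |error|:(missing value)
set theRange to theRegEx's rangeOfFirstMatchInString:nsStr options:0 range:{0, nsStr's |length|()}

-- NSRange locations are zero-based; AppleScript text offsets are one-based.
set startOffset to (theRange's location) + 1
set endOffset to (theRange's location) + (theRange's |length|)
set foundText to text startOffset thru endOffset of dataStr

-- Cross-check: let Cocoa extract the same span directly from the range.
set cocoaText to (nsStr's substringWithRange:theRange) as text
```

Both foundText and cocoaText should come out as "My Data 01" here. The offsets could then be used to select or replace just that span instead of the whole script text.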