With the following script, the length of the string is wrong in one specific case:
when you drag and drop a file from the Finder to paste its path, and the path contains diacritics.
In this case the length is increased by the number of diacritics.
The weird thing is that if you type the same characters, the length is right.
For example, with a folder named “Modèles personnalisés”, the length returned by the script is the actual length + 2:
```applescript
use framework "Foundation"
tell document 2 of application id "asDB"
	set {theLocation, theLength} to (character range of selection)
	set theLocation to (theLocation - 1) -- location is zero based in objC
	set scriptText to current application's NSString's stringWithString:(source text as string)
	set {location:lineLocation, |length|:lineLength} to (scriptText's lineRangeForRange:{theLocation, theLength}) -- the entire line
	set selection to {lineLocation + 1, lineLength}
	return lineLength
end tell
```
For consistency, you could retrieve the string’s precomposedStringWithCanonicalMapping() property, which normalises the string so that a base character followed by a combining diacritic is composed into a single precomposed character wherever Unicode defines one. For è and é that means one 16-bit code unit each instead of two.
```applescript
use framework "Foundation"
set str to current application's NSString's stringWithString:"Modèles personnalisés"
log str's |length|() --> 23
log str's precomposedStringWithCanonicalMapping()'s |length|() --> 21
```
NB. The snippet above, if copied and pasted, will actually log 21 for both values, as the forum does its own job of normalising the characters in Modèles personnalisés.
I created the string for testing by first decomposing it using decomposedStringWithCanonicalMapping().
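The same effect can be reproduced outside AppleScript. Below is a minimal Python sketch, assuming that NSString's length corresponds to the number of UTF-16 code units; the helper name `utf16_length` is made up for illustration. It builds the decomposed form explicitly from escapes, so the result does not depend on how the text was copied and pasted.

```python
import unicodedata


def utf16_length(text: str) -> int:
    """Count UTF-16 code units (assumed equivalent to NSString's length)."""
    return len(text.encode("utf-16-le")) // 2


# Precomposed è (U+00E8) and é (U+00E9), written as escapes on purpose.
name = "Mod\u00e8les personnalis\u00e9s"

# NFD is the counterpart of decomposedStringWithCanonicalMapping,
# NFC the counterpart of precomposedStringWithCanonicalMapping.
decomposed = unicodedata.normalize("NFD", name)
precomposed = unicodedata.normalize("NFC", decomposed)

print(utf16_length(decomposed))   # 23: each accent is a separate combining mark
print(utf16_length(precomposed))  # 21: each accented letter is one code unit
```

The two extra units in the decomposed form are the combining grave and acute accents, which matches the "actual length + 2" seen with paths dragged from the Finder.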
There was a bug in versions before 7.0.4 where the location and length were returned as Cocoa values, which are based on the number of 16-bit (UTF-16) values. As of 7.0.4, the values are returned as AppleScript counts, using AppleScript’s definition of characters. This makes more sense for traditional scripts.
At the same time a new document property was introduced, selection ASObjC range. This returns the Cocoa values (including zero-based indexing locations). You should use this property in ASObjC scripts. And if you use selection ASObjC range as record, you get a record you can pass directly in ASObjC code. (Check out the explanation in the scripting dictionary.)
So your code would become:
```applescript
tell document 2 of application id "com.latenightsw.ScriptDebugger7"
	set theRange to selection ASObjC range as record
	set scriptText to current application's NSString's stringWithString:(source text as string)
	set newRange to (scriptText's lineRangeForRange:theRange) -- the entire line
	set selection ASObjC range to newRange
	return |length| of newRange
end tell
```
Just be aware that it will only work where characters can all fit in 16-bit Unichars. That’s probably going to cover most common accented characters, but not things like some emoticons.
Yes, I thought there would be cases where it could be a problem, even though I hadn’t thought about emoticons (I never use them at work).
In fact, I was thinking of Hebrew and Arabic characters, where vowels are diacritics.
And also some Slavic or Nordic specialities.
But I think they all fit in 16-bit containers…
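A quick way to check that guess is to count the UTF-16 code units each character needs. This is only a sketch in Python with a handful of sample characters picked for illustration (the helper `utf16_units` and the sample list are not from Script Debugger or Cocoa), so it shows the principle rather than proving it for every script.

```python
def utf16_units(char: str) -> int:
    """Number of 16-bit code units needed to encode one code point in UTF-16."""
    return len(char.encode("utf-16-le")) // 2


# Illustrative samples only; the labels are informal descriptions.
samples = {
    "Hebrew vowel point patah": "\u05b7",
    "Arabic vowel mark fatha": "\u064e",
    "Polish l with stroke": "\u0142",
    "Nordic a with ring": "\u00e5",
    "Emoji grinning face": "\U0001f600",
}

for label, char in samples.items():
    print(f"{label}: U+{ord(char):04X} needs {utf16_units(char)} code unit(s)")
```

The Hebrew, Arabic, Slavic and Nordic samples each need one code unit, while the emoji lies outside the 16-bit range and needs two (a surrogate pair).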