Degree Symbol = o and O


(Jeff Horton) #1

I had to check for a degree symbol in text and replace it with HTML using AppleScript.

When looping through the characters and checking for an actual degree symbol, any O or o (just your normal o as in food) matched: the comparison sees the upper- and lower-case o as a º

Very odd. Not a big deal because I can trap it differently using the ASCII number, which is 188, but it's just strange that the degree symbol = o

Here’s a script you can test. You can see I am obviously checking for the “º” symbol, but the If/Then returns the 11th character, which is an o.

set x to "here are some words"

repeat with c from 1 to length of x
	if character c of x is "º" then
		return (c & " character - " & character c of x as string)
	end if
end repeat


(Nigel Garvey) #2

Hi.

When copy/pasted into Script Debugger or Script Editor, the character in your post is a masculine ordinal indicator (character id 186), not a degree sign (character id 176). It seems that AppleScript regards the masculine and feminine ordinal indicators as being equal to the lower-case “o” and “a”, which is essentially what they are linguistically, so the behaviour’s correct in that sense. You’d have to test for the characters by id:

set x to "here are some words"

repeat with c from 1 to length of x
	if (id of character c of x is 186) then -- Masculine ordinal indicator?
		return (c & " character - " & character c of x)
	end if
end repeat
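
The same by-id test can be turned round to do the original job of swapping a real degree sign for HTML. This is only a minimal sketch: the handler name replaceDegreeSigns and the choice of the "&deg;" entity are my own assumptions, not anything from the original script. It compares ids rather than characters, so an ordinary "o" or a masculine ordinal indicator should be left alone.

```applescript
-- Hypothetical helper (name and entity are assumptions): replace every
-- degree sign (character id 176) in the text with the HTML entity "&deg;".
on replaceDegreeSigns(sourceText)
	set outputText to ""
	repeat with i from 1 to (count characters of sourceText)
		set thisChar to character i of sourceText
		-- Compare by id, not by character, to avoid look-alike matches.
		if (id of thisChar) is 176 then
			set outputText to outputText & "&deg;"
		else
			set outputText to outputText & thisChar
		end if
	end repeat
	return outputText
end replaceDegreeSigns

replaceDegreeSigns("Bake at 350" & (character id 176) & "F for one hour")
```

With that input the handler should return "Bake at 350&deg;F for one hour", with every "o" untouched. Building the string a character at a time is fine for short text; for long text a text item delimiters approach would probably be quicker.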

(Shane Stanley) #3

But it’s a bit surprising, no? It seems to me that there should at least be something like ignoring normalization.


(Nigel Garvey) #4

I was surprised myself, but broadly sympathetic, being a former musician and assuming Apple was following some Unicode rule. Trying to find that rule this morning, I see that Unicode characters can have so many properties it’s surprising that string comparisons don’t grind to a halt after a word or two.

Or considering what's in the strings or cutting the clever stuff. :wink:
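
One candidate for the rule, and this is only my guess since nothing here confirms what AppleScript does internally, is Unicode compatibility decomposition: the masculine ordinal indicator (U+00BA) has a compatibility mapping to a plain "o", while the degree sign (U+00B0) has none. A minimal sketch to check that with AppleScriptObjC, assuming OS X 10.10 or later; the handler name compatibilityForm is made up for illustration:

```applescript
use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

-- Hypothetical helper: return the compatibility-decomposed (NFKD) form of some text.
on compatibilityForm(someText)
	set nsText to current application's NSString's stringWithString:someText
	return (nsText's decomposedStringWithCompatibilityMapping()) as text
end compatibilityForm

set ordinalForm to compatibilityForm(character id 186) -- masculine ordinal indicator
set degreeForm to compatibilityForm(character id 176) -- degree sign

-- Report the ids so that look-alike characters can't mislead the eye.
{id of ordinalForm, id of degreeForm}
```

This should give {111, 176}: the ordinal indicator decomposes to an ordinary "o" (id 111) and the degree sign stays as it is. That would fit the comparison results above, but it doesn't prove that it's the mechanism AppleScript actually uses.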