Identify ALL CAPS

Before I go reinventing the wheel: I have some data in text format where each category label is on a line by itself in ALL CAPS.

I’m wondering if there’s a simple way to tell if a given string is all caps, or has any characters that are not upper-case A-Z.

I’m thinking a handler might do it using ASCII numbers, stepping through the string until it reaches the first non-uppercase character.

I’ve tried “considering case” for similar things in the past but that doesn’t seem very reliable. Any suggestions?

This is what I have, but I’m worried it may slow things down:




IsItUpperCase("AZ")
--true
IsItUpperCase("AbZ")
--false

on IsItUpperCase(anyString)
   
   set saveTID to AppleScript's text item delimiters
   set AppleScript's text item delimiters to {""}
   
   set testString to text items of anyString
   
   repeat with thisChar in testString
      set asciiNum to ASCII number of thisChar
      if ((asciiNum < 65 or asciiNum > 90) and (asciiNum is not 32)) then
         set AppleScript's text item delimiters to saveTID
         return false
      end if
   end repeat
   set AppleScript's text item delimiters to saveTID
   return true
end IsItUpperCase

I included space characters because some of the categories have those too.

Hi Ed,

If it’s ok for you to use AppleScriptObjC in your script, you can try:

use framework "Foundation"
use scripting additions

-- Each assignment overrides the one before it; comment out lines to test a different string.
set theString to "THIS IS A CAPITAL STRING"
set theString to "this is a lowercase string"
set theString to "This is a Mixed String"

set thePredicate to current application's NSPredicate's predicateWithFormat:"SELF MATCHES[d] %@" argumentArray:{"([[:upper:]]|\\h)*"}
return thePredicate's evaluateWithObject:theString

ASCII number is long deprecated (and rightly so), and painfully slow. Just changing it to use id will speed your code up immensely. And using characters instead of text items will remove the need for all delimiter code. But it’s still totally Anglo-centric code.

Here are a couple of alternatives:

use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

on isItUpperCase(theString)
	set aString to current application's NSString's stringWithString:theString
	return (aString's isEqualToString:(aString's uppercaseString())) as boolean
end isItUpperCase

isItUpperCase("ONE tWO THREE")

Or:

use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

on isItUpperCase(theString)
	set aString to current application's NSString's stringWithString:theString
	return (aString's rangeOfCharacterFromSet:(current application's NSCharacterSet's lowercaseLetterCharacterSet()))'s |length| = 0
end isItUpperCase

isItUpperCase("ONE tWO THREE")

Thanks! I’m totally in business now!

Since English is not my mother tongue, I sometimes misunderstand what is said.
I thought you were asking whether a string contains only capital letters and spaces …
Am I mistaken?


You are correct, that is what I was asking for.

It just happens that with this particular dataset, I was able to work with these solutions that only look at upper and lower case characters and ignore others.

If I get a true result, I just test for the few characters that don’t belong (" ", "-", ".", ",").

If you want to restrict it to a specific set of characters, you can do this:

use AppleScript version "2.5" -- macOS 10.11 or later
use framework "Foundation"
use scripting additions

on isItUpperCase(theString)
	set aString to current application's NSString's stringWithString:theString
	set forbiddenChars to (current application's NSCharacterSet's characterSetWithCharactersInString:"ABCDEFGHIJKLMNOPQRSTUVWXYZ ")'s invertedSet() -- change string to suit
	return (aString's rangeOfCharacterFromSet:forbiddenChars)'s |length| = 0
end isItUpperCase

isItUpperCase("ONE TWO THREE")

Thanks guys.

I’m always glad to learn and to improve my English and my AppleScriptObjC skills!
😉


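Ed, looks like you may already have a solution, but this is also easily done with a case-sensitive RegEx:
\b[A-Z]+\b

That will match every word in the source text that is all caps. (The trailing \b matters: without it, the pattern also matches the leading capital of a word like "Hello".)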
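
As a rough sketch of how a pattern like this might be applied to Ed's data, here is the idea in JavaScript, which should also run under osascript -l JavaScript. The helper names allCapsWords and isAllCapsLine are made up for illustration. The word pattern carries a trailing \b so that it does not match just the initial capital of a word like "Apple". The anchored variant assumes a category line consists of upper-case A-Z words separated by single spaces, which may not hold for every dataset.

```javascript
// Hypothetical helpers, not from the thread.

// Every run of A-Z that forms a complete word.
const allCapsWords = text => text.match(/\b[A-Z]+\b/g) || [];

// True when the whole line is upper-case A-Z words separated by single spaces.
const isAllCapsLine = line => /^[A-Z]+(?: [A-Z]+)*$/.test(line);

// Pick the category labels out of some sample text.
const sample = "DAIRY PRODUCTS\nMilk and Cheese\nFRUIT";
const labels = sample.split("\n").filter(isAllCapsLine);
// labels is ["DAIRY PRODUCTS", "FRUIT"]
```

Anchoring with ^ and $ tests the whole line at once, which is closer to the original question than collecting individual words.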


You were on the mark with using text item delimiters. You just need the right delimiters:

on isAllCaps(input as text)
	local input
	
	set lowercase to "abcdefghijklmnopqrstuvwxyz"
	set uppercase to "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	
	set my text item delimiters to me & characters in lowercase
	
	considering case but ignoring punctuation, white space and diacriticals
		text items of input as text = the input
	end considering
end isAllCaps

Positive test case:

return isAllCaps("HELLO, WORLD!") --> true

Negative test case:

return isAllCaps("Hello, World!") --> false

2nd negative test case:

return isAllCaps("helloworld") --> false

Lots of variants available here.

In the rather unlikely case that run-time performance was genuinely the highest value, the fastest approach would of course be:

osascript -l JavaScript

with something like:

(() => {
    'use strict';

    // isAllUpper :: String -> Bool
    const isAllUpper = s =>
        s === s.toLocaleUpperCase();

    // --> [true, false, false]
    return ["HELLO, WORLD!", "Hello, World!", "helloworld"].map(
        isAllUpper
    );
})();

Otherwise, assuming that maximal compression of execution time is rarely worth its real costs, perhaps:

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- isAllUpper :: String -> Bool
on isAllUpper(s)
    considering case
        s = toUpper(s)
    end considering
end isAllUpper

-- TESTING
on run
    map(isAllUpper, {"HELLO, WORLD!", "Hello, World!", "helloworld"})
    
    --> {true, false, false}
end run

-- toUpper :: String -> String
on toUpper(str)
    set ca to current application
    ((ca's NSString's stringWithString:(str))'s ¬
        uppercaseStringWithLocale:(ca's NSLocale's currentLocale())) as text
end toUpper

-- map :: (a -> b) -> [a] -> [b]
on map(f, xs)
    -- The list obtained by applying f
    -- to each element of xs.
    tell mReturn(f)
        set lng to length of xs
        set lst to {}
        repeat with i from 1 to lng
            set end of lst to |λ|(item i of xs, i, xs)
        end repeat
        return lst
    end tell
end map

-- mReturn :: First-class m => (a -> b) -> m (a -> b)
on mReturn(f)
    -- 2nd class handler function lifted into 1st class script wrapper. 
    if script is class of f then
        f
    else
        script
            property |λ| : f
        end script
    end if
end mReturn
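
One caveat with the comparison-based variants above: a string with no letters at all (an empty line, or "123") also equals its upper-cased form, so it is reported as all caps. If that matters for your data, a minimal JavaScript sketch could also require that lower-casing changes the string, which needs at least one upper-case letter to be present. The name hasOnlyUpperLetters is made up for illustration.

```javascript
// Hypothetical refinement, not from the thread:
// upper-casing leaves the string unchanged AND lower-casing changes it,
// so there is no lower-case letter and at least one upper-case letter.
const hasOnlyUpperLetters = s =>
    s === s.toLocaleUpperCase() && s !== s.toLocaleLowerCase();

const results = ["HELLO, WORLD!", "Hello, World!", "123", ""].map(
    s => hasOnlyUpperLetters(s)
);
// results is [true, false, false, false]
```

Punctuation and spaces are still ignored, as in the original comparison, so "HELLO, WORLD!" continues to pass.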