But this AppleScript always returns 0 words (testing under Sequoia):
on open f
    if (count of f) is greater than 1 then
        activate
        display dialog "One PDF only, please." buttons {"OK"}
        error number -128
    end if
    set pdfPath to POSIX path of f as string
    if pdfPath does not end with ".pdf" then
        activate
        display dialog "I only work with PDF files." buttons {"OK"}
        error number -128
    end if
    set wordCount to do shell script "ps2ascii" & space & quoted form of pdfPath & space & "| wc -w"
    display dialog pdfPath & return & return & "contains" & return & return & wordCount & space & "words." buttons {"OK"}
end open
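A likely cause, offered as an assumption rather than a confirmed diagnosis: `do shell script` runs `/bin/sh` with a minimal `PATH` (`/usr/bin:/bin:/usr/sbin:/sbin`), and `ps2ascii` ships with Ghostscript rather than with macOS, so it usually sits in a Homebrew or MacPorts directory the shell cannot see. A pipeline reports the exit status of its last command, so the failed lookup raises no error and `wc -w` simply counts zero words of empty input. A minimal sketch to check what the shell actually sees:

```applescript
-- Diagnostic sketch: inspect the environment that do shell script provides.
set shellPath to do shell script "echo $PATH"
-- command -v prints the tool's location; the echo runs only if the lookup fails.
set toolPath to do shell script "command -v ps2ascii || echo 'ps2ascii not found'"
display dialog "PATH: " & shellPath & return & return & "ps2ascii: " & toolPath buttons {"OK"}
```

If the dialog reports `ps2ascii not found`, extending `PATH` inside the shell command should let the pipeline run. As far as I know, `ps2ascii` is itself a shell script that calls `gs` by name, so giving only the full path to `ps2ascii` may not be enough.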
And to answer my own question in case it helps anyone else:
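on open f
    if (count of f) is greater than 1 then
        activate
        display dialog "One PDF only, please." buttons {"OK"}
        error number -128
    end if
    set pdfPath to POSIX path of f as string
    if pdfPath does not end with ".pdf" then
        activate
        display dialog "I only work with PDF files." buttons {"OK"}
        error number -128
    end if
    -- do shell script starts /bin/sh with a minimal PATH, so add the usual Homebrew
    -- locations (an assumption about where Ghostscript's ps2ascii and gs are installed).
    -- Pipe ps2ascii's output into wc -w so that the extracted text, not the raw PDF file, is counted.
    set shellCmd to "PATH=$PATH:/opt/homebrew/bin:/usr/local/bin; ps2ascii " & quoted form of pdfPath & " | wc -w"
    set wordCount to do shell script shellCmd
    -- wc pads its output with spaces; "first word of" strips the padding.
    display dialog pdfPath & return & return & "contains" & return & return & first word of wordCount & space & "words." buttons {"OK"}
end open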
FWIW, with recent versions of macOS, a shortcut might be considered. The following is easily tailored to meet specific needs, and this could (probably) include performing OCR if the PDF contains an image with text.
@peavine - That is extremely elegant. I haven’t written any Shortcuts, but I see how easy they are. I tend to prefer the flexibility of AppleScript, and have one question:
Does anyone know the command-line equivalent of the step that gets the text from a PDF file? I can’t find any hint online of what the built-in tool might be. Everything seems to require brew or a download.
emendelson. I don’t know of any command-line tool that will get text from a PDF, but I’m not very knowledgeable in this area. This is easily done with ASObjC, though.
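For anyone who wants to try the ASObjC route, here is a minimal sketch that reads the text through PDFKit's `PDFDocument` class and counts words in AppleScript itself, so nothing needs to be installed. The handler name `countWordsInPDF` is hypothetical, and the script assumes a text-based PDF (a scanned, image-only PDF will probably yield no text without OCR).

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use framework "Quartz" -- PDFKit is part of the Quartz umbrella framework
use scripting additions

on open f
    -- f is the list of dropped items; this sketch handles only the first one.
    set pdfPath to POSIX path of (item 1 of f)
    set wordCount to countWordsInPDF(pdfPath)
    display dialog pdfPath & return & return & "contains" & return & return & wordCount & space & "words." buttons {"OK"}
end open

-- Hypothetical helper: returns the number of words in the PDF's text layer.
on countWordsInPDF(pdfPath)
    set theURL to current application's NSURL's fileURLWithPath:pdfPath
    set theDoc to current application's PDFDocument's alloc()'s initWithURL:theURL
    -- initWithURL: returns missing value when the file cannot be read as a PDF.
    if theDoc is missing value then error "Could not open the file as a PDF."
    set theText to theDoc's |string|()
    if theText is missing value then return 0
    -- AppleScript's own word boundaries are used here.
    return count of words of (theText as text)
end countWordsInPDF
```

Saved as an application, this accepts a dropped PDF just like the scripts above. AppleScript's idea of a word differs from that of `wc -w`, which splits only on whitespace, so the two counts will not always match exactly.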