Read and write RTF files

foundation
how-to
asobjc
appkit

(Mark Alldritt) #1

You can read RTF files into attributed strings, and vice versa, in a couple of ways. Note that although NSAttributedString belongs to the Foundation framework, the methods for dealing with RTF data are defined in the AppKit framework, so you need a use statement for AppKit as well.

First, the code for reading. Here we read the file as raw data, and create an attributed string from that.

use AppleScript version "2.4"
use scripting additions
use framework "Foundation"
use framework "AppKit" -- needed for the RTF methods

-- classes, constants, and enums used
property NSData : a reference to current application's NSData
property NSAttributedString : a reference to current application's NSAttributedString
property NSDictionary : a reference to current application's NSDictionary
property NSString : a reference to current application's NSString
property NSRTFTextDocumentType : a reference to current application's NSRTFTextDocumentType

set posixPath to POSIX path of (choose file with prompt "Choose an RTF file" of type {"rtf"})
-- read file as RTF data
set theData to NSData's dataWithContentsOfFile:posixPath
-- create attributed string from the data
set {theStyledString, docAttributes} to NSAttributedString's alloc()'s initWithRTF:theData documentAttributes:(reference)
if theStyledString is missing value then error "Could not read RTF file"

The variable docAttributes contains extra information about the document that can be used to create a new file, but it’s optional.

Let’s change the string by inserting a line at the beginning:

-- make copy you can modify
set theStyledString to theStyledString's mutableCopy()
-- insert text at beginning
theStyledString's replaceCharactersInRange:{0, 0} withString:("Extra text" & linefeed)

To save the modified attributed string you need to create RTF data from it, and then write the data to a file as you would any other data:

--  first turn attributed string into RTF data
set theData to theStyledString's RTFFromRange:{0, theStyledString's |length|()} documentAttributes:docAttributes
-- build path for new file
set posixPath to NSString's stringWithString:posixPath
set newPath to (posixPath's stringByDeletingPathExtension()'s stringByAppendingString:"-copy")'s stringByAppendingPathExtension:(posixPath's pathExtension())
-- write the data to the file
theData's writeToFile:newPath atomically:true

If you are creating a file from scratch, you won’t have the docAttributes value. You can create it like this:

set docAttributes to {DocumentType:NSRTFTextDocumentType}
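For the from-scratch case, here is a minimal sketch of the whole round trip: build a styled string, turn it into RTF data, and write it out. The font, the sample text, and the output file name are arbitrary choices for illustration, not part of the original example.

```applescript
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"
use framework "AppKit" -- the RTF methods and NSFont are defined here

-- build a styled string from scratch (font name and size are assumptions)
set theFont to current application's NSFont's fontWithName:"Helvetica" |size|:14.0
set theAtts to current application's NSDictionary's dictionaryWithObject:theFont forKey:(current application's NSFontAttributeName)
set theStyledString to current application's NSAttributedString's alloc()'s initWithString:("Hello, RTF" & linefeed) attributes:theAtts

-- nothing was read from a file, so supply the document type ourselves
set docAttributes to {DocumentType:(current application's NSRTFTextDocumentType)}
set theData to theStyledString's RTFFromRange:{0, theStyledString's |length|()} documentAttributes:docAttributes

-- hypothetical output location on the desktop
set newPath to POSIX path of (path to desktop) & "From Scratch.rtf"
set theResult to theData's writeToFile:newPath atomically:true
if not (theResult as boolean) then error "Could not write RTF file"
```

Checking the result of writeToFile:atomically: is optional, but it turns a silent failure (a read-only folder, for instance) into a visible error.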

(Jim Underwood) #2

Mark (@alldritt), many thanks for these how-to articles you have been posting. I find them very helpful. :+1:


(Jim Underwood) #3

Mark, thanks again for this great example.

It works fine as is, but when I try to change the text (the attributed string) using RegEx, I get an error. What am I doing wrong?

### THIS WORKS ###
-- insert text at beginning
theAttString's replaceCharactersInRange:{0, 0} withString:("Extra text" & linefeed)

### THIS FAILS ###
--- RegEx Example of Change ---
set theAttString to (theAttString's stringByReplacingOccurrencesOfString:"Line (\\d+)" withString:"Bullet $1" options:(current application's NSRegularExpressionSearch) range:{0, theAttString's |length|()})

-->-[NSConcreteMutableAttributedString stringByReplacingOccurrencesOfString:withString:options:range:]: unrecognized selector sent to instance 0x618000a219a0


(Jim Underwood) #4

@alldritt and @ShaneStanley,

I made some progress. The RegEx change is working, but now I don’t know how to convert the nsPlainText back to RTF for the Clipboard. Can you please help with this?

Here’s my complete test script, all based on scripts you guys have been so kind to share:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "AppKit"
use scripting additions

-- classes, constants, and enums used
property NSData : a reference to current application's NSData
property NSAttributedString : a reference to current application's NSAttributedString
property NSDictionary : a reference to current application's NSDictionary
property NSString : a reference to current application's NSString
property NSRTFTextDocumentType : a reference to current application's NSRTFTextDocumentType
property NSPasteboardTypeRTF : a reference to current application's NSPasteboardTypeRTF
property nsCurApp : a reference to current application


set reFind to "Line (\\d+)"
set reReplace to "Bullet $1"


-----------------------------
-- GET RTF FROM CLIPBOARD --
-----------------------------

set pb to current application's NSPasteboard's generalPasteboard() -- get pasteboard
set theData to pb's dataForType:(current application's NSPasteboardTypeRTF) -- get rtf data off pasteboard
if theData = missing value then error "No rtf data found on clipboard"

-- make into attributed string
--- set theAttString to current application's NSAttributedString's alloc()'s initWithRTF:theData documentAttributes:(missing value)

### IF You Want to Get RTF from File Instead ###
(*
set posixPath to POSIX path of (choose file with prompt "Choose an RTF file" of type {"rtf"})
-- read file as RTF data
set theData to NSData's dataWithContentsOfFile:posixPath
*)


-- create attributed string from the data
set {theAttString, docAttributes} to NSAttributedString's alloc()'s initWithRTF:theData documentAttributes:(reference)


(*
--- Decode Rich Text ---
  set nsRichText to current application's NSMutableAttributedString's alloc()'s initWithRTFD:nsRichTextEncoded documentAttributes:(missing value)
*)



-- make copy you can modify
set theAttString to theAttString's mutableCopy()

### THIS WORKS ###
-- insert text at beginning
theAttString's replaceCharactersInRange:{0, 0} withString:("Extra text" & linefeed)

### THIS FAILS ###
(*
--- RegEx Example of Change ---
set theAttString to (theAttString's stringByReplacingOccurrencesOfString:reFind withString:reReplace options:(current application's NSRegularExpressionSearch) range:{0, theAttString's |length|()})

-->-[NSConcreteMutableAttributedString stringByReplacingOccurrencesOfString:withString:options:range:]: unrecognized selector sent to instance 0x618000a219a0

*)
### THIS WORKS ###
--- GET PLAIN TEXT from Rich Text ---
set nsPlainText to theAttString's |string|()
set nsPlainText to (nsPlainText's stringByReplacingOccurrencesOfString:reFind withString:reReplace options:(current application's NSRegularExpressionSearch) range:{0, nsPlainText's |length|()})


### AFTER CHANGES, Re-CREATE the RTF DATA ###
## But HOW??? ##
--- ALL of the below get an error ---

----------------------------------
-- convert back to RTFD data
----------------------------------

(*
set nsRichTextEncoded to nsRichText's RTFDFromRange:{0, nsRichText's |length|()} documentAttributes:(missing value)

*)


set rtfData to theAttString's RTFFromRange:{0, nsPlainText's |length|()} documentAttributes:(missing value)

--  first turn attributed string into RTF data
--set rtfData to nsPlainText's RTFFromRange:{0, nsPlainText's |length|()} documentAttributes:docAttributes
--set rtfData to theAttString's RTFFromRange:{0, theAttString's |length|()} documentAttributes:{DocumentType:NSRTFTextDocumentType}



pb's clearContents()
pb's setData:rtfData forType:NSPasteboardTypeRTF

return


(Shane Stanley) #5

As you’ve found, you can’t call NSString methods on NSAttributedStrings. Converting a plain string back to RTF is also likely to be problematic, because the plain string no longer carries any formatting. What you have to do is use the NSString for searching, and then do the replacing on the NSAttributedString. And the key to doing that is using ranges.

So in the simplest case, where you know there will only be one found instance and the replacement has no back references, you can do this:

set nsPlainText to theAttString's |string|()
set theRange to (nsPlainText's rangeOfString:reFind options:(current application's NSRegularExpressionSearch))
theAttString's replaceCharactersInRange:theRange withString:reReplace

If there might be multiple instances, you can do this:

set {theRegex, theError} to current application's NSRegularExpression's regularExpressionWithPattern:reFind options:0 |error|:(reference)
if theRegex = missing value then error theError's localizedDescription() as text
set nsPlainText to theAttString's |string|()
set theMatches to (theRegex's matchesInString:nsPlainText options:0 range:{0, nsPlainText's |length|()}) as list
set theMatches to reverse of theMatches -- so you work from back to front to keep ranges accurate
repeat with aMatch in theMatches
	(theAttString's replaceCharactersInRange:(aMatch's range()) withString:reReplace)
end repeat

But to cover all bases including capture groups, you need to go a step further:

set {theRegex, theError} to current application's NSRegularExpression's regularExpressionWithPattern:reFind options:0 |error|:(reference)
if theRegex = missing value then error theError's localizedDescription() as text
set nsPlainText to theAttString's |string|()
set theMatches to (theRegex's matchesInString:nsPlainText options:0 range:{0, nsPlainText's |length|()}) as list
set theMatches to reverse of theMatches -- so you work from back to front to keep ranges accurate
repeat with aMatch in theMatches
	set newString to (theRegex's replacementStringForResult:aMatch inString:nsPlainText |offset|:0 template:reReplace)
	(theAttString's replaceCharactersInRange:(aMatch's range()) withString:newString)
end repeat
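If you do this often, the last snippet can be wrapped in a handler. This is only a sketch: the handler name replaceInAttString and its parameter order are made up for illustration, and it assumes it is passed a mutable attributed string. It also takes a copy of the plain text, on the assumption that the string of a mutable attributed string may be its live backing store and could change as the replacements are made.

```applescript
use AppleScript version "2.4"
use scripting additions
use framework "Foundation"
use framework "AppKit"

-- hypothetical handler: regex replace inside a mutable attributed string, keeping its formatting
on replaceInAttString(theAttString, reFind, reReplace)
	set {theRegex, theError} to current application's NSRegularExpression's regularExpressionWithPattern:reFind options:0 |error|:(reference)
	if theRegex = missing value then error (theError's localizedDescription() as text)
	-- snapshot of the plain text, so it cannot change while the attributed string is edited
	set nsPlainText to theAttString's |string|()'s |copy|()
	set theMatches to (theRegex's matchesInString:nsPlainText options:0 range:{0, nsPlainText's |length|()}) as list
	repeat with aMatch in (reverse of theMatches) -- back to front keeps the earlier ranges valid
		set newString to (theRegex's replacementStringForResult:aMatch inString:nsPlainText |offset|:0 template:reReplace)
		(theAttString's replaceCharactersInRange:(aMatch's range()) withString:newString)
	end repeat
	return (count of theMatches)
end replaceInAttString

-- usage with a small test string
set theAttString to current application's NSMutableAttributedString's alloc()'s initWithString:("Line 1" & linefeed & "Line 2")
set matchCount to replaceInAttString(theAttString, "Line (\\d+)", "Bullet $1")
return {matchCount, theAttString's |string|() as text}
```

Returning the match count is a design choice; it lets the caller report how many replacements were made without running the regex again.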

(Jim Underwood) #6

Many thanks again, Shane. That was perfect! :+1:

I’ll clean up and test my script, and then post the final version here.


(Jim Underwood) #7

Shane, thanks again for all your great help. Could not have done this without you, or without Mark’s (@alldritt) great script.

Here’s my “final” script, which provides options for input/output to/from both Clipboard and File (set within the script).

I have no doubt that my script can be further optimized and improved. So, as always, please feel free to post any comments, issues, suggestions, and/or an improved script.

Mark, my apologies if you feel like this has hijacked your thread. Please feel free to move all posts concerning it to a new topic if you feel that is best.


property ptyScriptName : "Change Rich Text RTF Retain Format"
property ptyScriptVer : "1.3" --  ADD Options for Input/Output
property ptyScriptDate : "2018-04-29"
property ptyScriptAuthor : "JMichaelTX" -- heavy lifting by @ShaneStanley & Mark @alldritt
(*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PURPOSE:
  • Read RTF Object from Clipboard OR File, Change text using RegEx (but retain format),
  
RETURNS:  Output to Clipboard OR File with RTF

REQUIRED:
  1.  macOS 10.11.6+
  2.  Mac Applications
      • TextEdit (if file output is used)
      
TAGS:  @SW.KM @Lang.ASObjC @Lang.AS @CAT.Actions @CAT.Decode @Auth.Shane @CAT.RTF @Auth.JMichaelTX

REF:  The following were used in some way in the writing of this script.
  I wrote this script, and all errors are mine.  It was based in large part on:

  1.  2017-10-03, ShaneStanley, Late Night Software Ltd.
      How Do I base64 Decode and Encode Multiple Lines?
      http://forum.latenightsw.com/t/how-do-i-base64-decode-and-encode-multiple-lines/759/11
  
  2.  2018-03-22, alldritt, Late Night Software Ltd.
      Read and write RTF files
      http://forum.latenightsw.com/t/read-and-write-rtf-files/1200

  3.  2018-04-28, ShaneStanley, Late Night Software Ltd.
      Read and write RTF files
      http://forum.latenightsw.com/t/read-and-write-rtf-files/1200/5


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*)
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "AppKit"
use scripting additions

property LF : linefeed

-- classes, constants, and enums used
property NSData : a reference to current application's NSData
property NSAttributedString : a reference to current application's NSAttributedString
property NSDictionary : a reference to current application's NSDictionary
property NSString : a reference to current application's NSString
property NSRTFTextDocumentType : a reference to current application's NSRTFTextDocumentType
property NSPasteboardTypeRTF : a reference to current application's NSPasteboardTypeRTF
property nsCurApp : a reference to current application
----------------------------------------------------------------------------

try --~~~~~~~~~~~~~~~~~~~~~~ TRY ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  
  ### SCRIPT USER INPUT DATA ###
  
  set rtfInputSource to "File" -- "Clipboard" OR "File"
  set rtfOutputDest to "File" -- "Clipboard" OR "File"
  
  --- Default Name if File Output is Chosen ---
  --  (Input File Name will be used if Input is File)
  set rtfOutputFileName to "RTF Output File.rtf"
  
  --- SET RegEx Find and Replace Patterns ---
  set reFind to "Line (\\d+)"
  set reReplace to "Bullet $1"
  
  ### end user input data ###
  
  if (rtfInputSource = "Clipboard") then
    -----------------------------
    -- GET RTF FROM CLIPBOARD --
    -----------------------------
    
    set pb to current application's NSPasteboard's generalPasteboard() -- get pasteboard
    set rtfNSData to pb's dataForType:(current application's NSPasteboardTypeRTF) -- get rtf data off pasteboard
    if rtfNSData = missing value then error "No rtf data found on clipboard"
    
  else
    -----------------------------
    -- GET RTF FROM FILE --
    -----------------------------
    
    set rtfInputFilePath to POSIX path of (choose file with prompt "Choose an RTF file" of type {"rtf"})
    --- Set Output Name to Input Name ---
    tell (info for rtfInputFilePath) to set rtfOutputFileName to its name
    
    -- READ FILE AS RTF DATA --
    set rtfNSData to NSData's dataWithContentsOfFile:rtfInputFilePath
  end if
  
  -- CREATE ATTRIBUTED STRING FROM THE DATA --
  set {rtfAttString, docAttributes} to NSAttributedString's alloc()'s initWithRTF:rtfNSData documentAttributes:(reference)
  
  -- MAKE COPY YOU CAN MODIFY --
  set rtfAttString to rtfAttString's mutableCopy()
  
  --- INSERT TEXT AT BEGINNING (example) ---
  ###    rtfAttString's replaceCharactersInRange:{0, 0} withString:("Extra text" & linefeed)
  
  --- GET PLAIN TEXT from Rich Text ---
  set nsPlainText to rtfAttString's |string|()
  
  --- CREATE NS REGEX OBJECT ---
  set {nsRegEx, theError} to current application's NSRegularExpression's regularExpressionWithPattern:reFind options:0 |error|:(reference)
  if nsRegEx = missing value then error theError's localizedDescription() as text
  
  --- GET LIST OF MATCHES IN PLAIN TEXT ---
  set nsReMatchList to (nsRegEx's matchesInString:nsPlainText options:0 range:{0, nsPlainText's |length|()}) as list
  set nsReMatchList to reverse of nsReMatchList -- so you work from back to front to keep ranges accurate
  
  --- UPDATE RTF ATTRIBUTED STRING FOR EACH MATCH ---
  set numMatches to count of nsReMatchList
  
  repeat with nsReMatch in nsReMatchList
    set nsChangedStr to (nsRegEx's replacementStringForResult:nsReMatch inString:nsPlainText |offset|:0 template:reReplace)
    (rtfAttString's replaceCharactersInRange:(nsReMatch's range()) withString:nsChangedStr)
  end repeat
  
  ----------------------------------
  -- CONVERT BACK TO RTF DATA --
  ----------------------------------
  
  set rtfData to rtfAttString's RTFFromRange:{0, rtfAttString's |length|()} documentAttributes:docAttributes
  
  if (rtfOutputDest = "Clipboard") then
    --- SET Clipboard (PasteBoard) ---
    
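    set pb to current application's NSPasteboard's generalPasteboard() -- pb is otherwise undefined when the input came from a file
    pb's clearContents()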
    pb's setData:rtfData forType:NSPasteboardTypeRTF
    
  else --- OUTPUT RESULTS TO RTF FILE ---
    
    set rtfOutputPath to POSIX path of (choose file name with prompt "Choose RTF Output File Name" default name rtfOutputFileName)
    rtfData's writeToFile:rtfOutputPath atomically:true
    
    tell application "TextEdit"
      activate
      open rtfOutputPath
    end tell
    
  end if
  
  --- RETURN Summary of Results ---
  
  if (rtfInputSource = "File") then set rtfInputSource to rtfInputSource & ": " & rtfInputFilePath
  if (rtfOutputDest = "File") then set rtfOutputDest to rtfOutputDest & ": " & rtfOutputPath
  
  set scriptResults to "Number of RegEx Matches Found & Changed: " & numMatches & LF & ¬
    "RegEx Find:    " & reFind & LF & ¬
    "RegEx Replace: " & reReplace & LF & ¬
    "Source: " & rtfInputSource & LF & ¬
    "Output: " & rtfOutputDest
  
  --~~~~~~~ END TRY ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
on error errMsg number errNum
  
  if errNum = -128 then ## User Canceled
    set errMsg to "[USER_CANCELED]"
  end if
  
  set scriptResults to "[ERROR]" & return & errMsg & return & return ¬
    & "SCRIPT: " & ptyScriptName & "   Ver: " & ptyScriptVer & return ¬
    & "Error Number: " & errNum
  
end try --~~~~~~~~~~~~~~~~~~~~ END TRY/ERROR ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

return scriptResults

--~~~~~~~~~~~~~~~~~~~ END OF MAIN SCRIPT ~~~~~~~~~~~~~~~~~~~~~~


Example source file

Test Source for RTF Script.rtf.zip (514 Bytes)



(Shane Stanley) #8

I’m probably just being anal, but the mixing of current application as sometimes inline and sometimes in properties grates a bit. I suggest you compile, then choose Edit -> AppleScriptObjC -> Migrate to Properties.


(Nigel Garvey) #9

Thanks for pointing that out! I’d never have thought to look when using a method belonging to a Foundation class!

The script comment’s a bit misleading, though ….


(Shane Stanley) #10

It’s not common, but it does happen. My favorite example is NSArray’s shuffledArray property — it’s actually defined in GameplayKit.framework.

Fixed.


(Jim Underwood) #11

Thanks for pointing that out. It occurred due to merging of multiple scripts.

One question about the best practice of putting ASObjC enums etc. in properties: since script properties are visible only within the script itself, and not to handlers in script libraries, I wonder what’s the best way to handle this?


(Shane Stanley) #12

You basically need to define the properties in the scripts that use them, whether they be libraries or clients of libraries. So if you’re using an ASObjC property in both, you define it in both. Is that what you’re asking?
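As a rough sketch of what that looks like in practice, here is a small library that declares the properties its own handlers use. The library name "RTF Lib", its location, and the handler plainTextOfRTFFile are all hypothetical; a client script would declare its own properties separately, as shown in the comments at the end.

```applescript
-- hypothetical library, saved as ~/Library/Script Libraries/RTF Lib.scpt
use AppleScript version "2.4"
use framework "Foundation"
use framework "AppKit" -- needed for the RTF methods

-- the library declares the properties that its own handlers rely on
property NSData : a reference to current application's NSData
property NSAttributedString : a reference to current application's NSAttributedString

on plainTextOfRTFFile(posixPath)
	set theData to NSData's dataWithContentsOfFile:posixPath
	if theData is missing value then error "Could not read file"
	set theAttString to NSAttributedString's alloc()'s initWithRTF:theData documentAttributes:(missing value)
	if theAttString is missing value then error "Not valid RTF"
	return theAttString's |string|() as text
end plainTextOfRTFFile

-- A client script cannot see the properties above, so it declares whatever it uses itself:
--   use script "RTF Lib"
--   use framework "Foundation"
--   property NSString : a reference to current application's NSString
--   set theText to script "RTF Lib"'s plainTextOfRTFFile("/path/to/some.rtf")
```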


(Jim Underwood) #13

Yes, thank you. Is there any downside, any performance hit, to defining a bunch of ASObjC properties in a script library where maybe they aren’t needed that much?


(Shane Stanley) #14

None that I’m aware of. If anything, using properties can be a (tiny) performance boost.