Text encoding when opening plain text files

asobjc

(Andreas Kiel) #1

I’m trying to build an app that processes subtitle files.
It is based on an old ASS application. While converting it, I would naturally love to make some things nicer or add functionality.

One thing is an option to choose how the text encoding is interpreted (e.g. UTF-8, Mac OS Roman, etc.).
Is there a simple way to integrate that?

Any ideas are welcome.


(Shane Stanley) #2

The basic approach is like this:

use framework "Foundation"
-- posixPath is assumed to already hold the POSIX path of the file to read
set {theString, theError} to current application's NSString's stringWithContentsOfFile:posixPath encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
if theString = missing value then error theError's localizedDescription() as text

(Andreas Kiel) #3

Thanks Shane,

I’ve done that before, but occasionally ran into issues. So I was thinking of something like TextEdit’s encoding options.
But it’s not that important now.

Again thanks for your fast reply!


(Shane Stanley) #4

I’m not sure what you mean — TextEdit mostly reads as an NSAttributedString.

(I’m also changing this to AppleScript category, because there’s nothing inherently Xcode-app-based in it.)


(Andreas Kiel) #5

I probably asked the wrong way. I was not talking about RTF(D).
I was asking about the “Plain Text Encoding” options.
By default it is set to “Automatic”, which seems to be what you suggested.
If that doesn’t work, you can select from a popup with more encoding options.
To achieve that, I would need to create a popup whose selected index picks an encoding from a list of encoding options (or something like that).


(Shane Stanley) #6

If you want automatic, you can use a method that will try to guess:

use framework "Foundation"
use scripting additions

set posixPath to POSIX path of (choose file)
-- Read the raw bytes, then let NSString guess the encoding (lossy conversion not allowed)
set theData to current application's NSData's alloc()'s initWithContentsOfFile:posixPath
set theOptions to current application's NSDictionary's dictionaryWithObject:false forKey:(current application's NSStringEncodingDetectionAllowLossyKey)
set {theEncoding, theString} to current application's NSString's stringEncodingForData:theData encodingOptions:theOptions convertedString:(reference) usedLossyConversion:(missing value)

Otherwise you can try UTF-8 and, if that fails, fall back to Windows Latin 1 (CP1252) or something similar.
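
A minimal sketch of that fallback, in case it helps. The handler name readTextWithFallback is made up for illustration, and the choice of Windows Latin 1 as the second encoding is just an assumption; swap in whatever suits your files.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Hypothetical helper: try UTF-8 first, then fall back to Windows Latin 1 (CP1252)
on readTextWithFallback(posixPath)
	-- First attempt: strict UTF-8, which fails on invalid byte sequences
	set {theString, theError} to current application's NSString's stringWithContentsOfFile:posixPath encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
	if theString = missing value then
		-- Second attempt: Windows Latin 1 (assumed fallback encoding)
		set {theString, theError} to current application's NSString's stringWithContentsOfFile:posixPath encoding:(current application's NSWindowsCP1252StringEncoding) |error|:(reference)
	end if
	-- Both attempts failed (for example, the file could not be read at all)
	if theString = missing value then error (theError's localizedDescription() as text)
	return theString as text
end readTextWithFallback

-- Usage:
-- set theText to readTextWithFallback(POSIX path of (choose file))
```

Bear in mind that a single-byte encoding like Windows Latin 1 tends to accept almost any byte sequence, so the fallback succeeding does not prove the guess was right; it only means you get some text back instead of an error.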