Binary file handling

I’m still in the process of converting old ASS stuff to ASObjC.
One of the scripts is about reading a binary file.
Though the script itself is 20 years old, it somehow still works fine, but in Xcode I now often get a table overflow (crash) with the slightly ‘modernized’ version.

So the question is whether I should re-write it.
The files contain broadcast subtitles, so it was easy to read/process the file by text items, get the ‘ascii number’ (not id) of each, create a hex value from it when needed and decide what to do.
A new approach might be to get a hexdump and so avoid the extra ASCII-to-hex step.
But here I fail. I haven’t found a way to get a hexdump in a form I could use in a script.
“hexdump -v -C filePath” as a shell script has always been a good help for analyzing the file:

> 00000490  03 8a 0b 0b 74 65 78 74  20 32 8f 8f 8f 8f 8f 8f  |....text 2......|

but the offset column at the start and the ASCII part between the pipes make it useless for working with scripts (I think).
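One way to get only the bytes, sketched here on the assumption that the standard od tool is acceptable in place of hexdump (the helper name hex_bytes and the sample file are made up for illustration): od -An drops the offset column and -tx1 prints one byte per field without an ASCII column, so what remains is a plain, space-separated list of hex pairs.

```shell
#!/bin/sh
# hex_bytes FILE (hypothetical helper name, not from the original script)
# Prints every byte of FILE as two lowercase hex digits, separated by
# single spaces on one line, with no offset column and no ASCII column.
hex_bytes() {
    od -An -v -tx1 "$1" |     # -An: no offsets, -v: no "*" for repeats, -tx1: hex bytes
        tr '\n' ' ' |         # join od's output lines into one
        tr -s ' ' |           # squeeze runs of spaces to a single space
        sed 's/^ //; s/ $//'  # trim the leading and trailing space
}

# Demo with a made-up sample holding the first bytes of the dump line above.
sample=$(mktemp)
printf '\003\212\013\013text 2\217\217' > "$sample"
printf '%s\n' "$(hex_bytes "$sample")"   # prints: 03 8a 0b 0b 74 65 78 74 20 32 8f 8f
rm -f "$sample"
```

From AppleScript the same pipeline could be run through do shell script with the quoted form of the file's POSIX path, and the result split on spaces. If you would rather stay with hexdump, its -e format option (for example hexdump -v -e '1/1 "%02x "') should give a similar list.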

Thanks in advance for any thoughts.

It’s hard to give any advice without knowing what you’re actually wanting to do with the data.

It is a bit difficult to explain. The main thing is to make it human-readable and to edit/convert it to another format.
Maybe the attached pic helps a bit; it uses the snippet from above (roughly).

The format description can be found here:

The reason I came here was to ask how to format my debug output. I wrote a hexdump routine. It was more difficult than I expected, and I fixed many a bug in the routine. It formats nicely in regular AppleScript. You may need to expand the output.
hexDumpFormatOne – assumes the offset starts at one.
hexDumpFormatZero – assumes a zero-based, “C”-style offset.

I included my two hex conversion routines. If you are expecting to read a hex dump you need the offset in hex.

on run
  global debug
  set debug to 3
end run
-- ------------------------------------------------------ 
(* 
hexDumpFormatOne("build variable is  ", build)

  http://krypted.com/mac-os-x/to-hex-and-back/
               0    2    4    6    8    a    c    e     0 2 4 6 8 a c e
0000000:   3c 703e 5369 6d70 6c65 2070 7574 2c20   <p>Simple put, 
            *)
on hexDumpFormatOne(textMessage, hex)
	global debug
	set aNul to character id 1 -- padding byte (id 1, not a true NUL); prepended below so the first real byte lands at offset 1
	
	if debug ≥ 5 then log "in ~~~ hexDumpFormatOne ~~~"
	if debug ≥ 7 then log "input string is " & return & hex
	
	-- -r -p
	set displayValue to aNul & hex
	set toUnix to "/bin/echo -n " & (quoted form of displayValue) & " | xxd  "
	if debug ≥ 7 then log "toUnix is " & toUnix
	
	try
		set fromUnix to do shell script toUnix
		
		-- two hex digits
		set displayText to replaceCharacter(fromUnix, 10, "  ")
		if debug ≥ 7 then
			log return & displayText
			log "length of displayText is " & length of displayText
		end if
		-- one character
		set displayText to replaceCharacter(displayText, 51, " ")
		if debug ≥ 7 then
			log return & displayText
			log "almost there ..... length of displayText is " & length of displayText
		end if
		log "variable " & textMessage & " in hex is " & return & "         0    2    4    6    8    a    c    e     0 2 4 6 8 a c e" & return & displayText
	on error errMsg number n
		log "==> convert hex string to string failed. " & errMsg & " with number " & n
	end try
	if debug ≥ 7 then
		log "leaving ~.~ hexDumpFormatOne ~.~"
	end if
end hexDumpFormatOne

(* 
StefanK in https://macscripter.net/viewtopic.php?id=43852 
Replaces one or more characters based on the length of theCharacter. 

  Big Warning!!!
  ==============
    This on block is called by hexDumpFormatOne().  
    Therefore, you may not call hexDumpFormatOne() from this on block.
	If you do so, you get yourself into an endless loop. 
	Use hexDumpFormatZero() instead.
	
	script -k <output file name>
	osascript /Applications/applescriptFiles/workwithclipboardV13-HTML.app
	use Activity Monitor to stop osascript
	
*)

on replaceCharacter(theText, theOffset, theCharacter)
	global debug
	if debug ≥ 7 then log "in ~~~ replaceCharacter ~~~"
	if debug ≥ 7 then
		log "  theOffset is " & getIntegerAndHex(theOffset) & " with theCharacter >" & theCharacter & "<  length of theText is " & getIntegerAndHex(length of theText)
		log "theText is " & theText
	end if
	
	set theOutput to theText -- ready to return if need be.
	repeat 1 times
		-- sanity checks
		if theOffset ≤ 0 then
			display dialog "No character to replace at " & theOffset & " with character " & theCharacter & " in " & theText giving up after 10
			log "==> Adjust theOffset to be within the string."
			exit repeat -------------- return ---------->					
		end if
		if (theOffset - (length of theCharacter)) ≤ 0 then
			display dialog "Too near the front of the buffer.  " & theOffset & " with character " & theCharacter & " in " & theText giving up after 10
			log "==> Too near the front of the buffer. "
			exit repeat -------------- return ---------->
		end if
		if (theOffset + (length of theCharacter) - 1) > (length of theText) then
			display dialog "Too near the end of the buffer. " & theOffset & " with character " & theCharacter & " in " & theText giving up after 10
			log "==> Too near the end of the buffer. "
			log "  " & "theOffset is " & theOffset & " with theCharacter >" & theCharacter & "<  in " & theText
			log "length of buffer is " & getIntegerAndHex(length of theText)
			exit repeat -------------- return ---------->					
		end if
		
		if debug ≥ 7 then
			log "theOffset is " & getIntegerAndHex(theOffset)
			log "theCharacter is " & theCharacter
		end if
		
		try
			-- What if the replacement reaches the end of the buffer? Then there is no remainder text to append.
			if (theOffset + (length of theCharacter) - 1) ≥ (length of theText) then
				set theOutput to (text 1 thru (theOffset - 1) of theText) & theCharacter
			else
				set theOutput to (text 1 thru (theOffset - 1) of theText) & theCharacter & (text (theOffset + (length of theCharacter)) thru -1 of theText)
			end if
		on error errMsg number n
			log "==> No go. " & errMsg & " with number " & n
			exit repeat -------------- return ---------->
		end try
	end repeat
	return theOutput
end replaceCharacter

on hexDumpFormatZero(textMessage, hex)
	global debug
	if debug ≥ 5 then log "in ~~~ hexDumpFormatZero ~~~"
	if debug ≥ 5 then log "input string is " & hex
	-- -r -p
	set toUnix to "/bin/echo -n " & (quoted form of hex) & " | xxd  "
	if debug ≥ 5 then log "toUnix is " & toUnix
	try
		set displayText to do shell script toUnix
		
		log "variable " & textMessage & " in hex is " & return & "         0    2    4    6    8    a    c    e     0 2 4 6 8 a c e" & return & displayText
	on error errMsg number n
		log "==> convert hex string to string failed. " & errMsg & " with number " & n
	end try
end hexDumpFormatZero


-- ------------------------------------------------------
(* 
log " foundMarkerOffset is " & getIntegerAndHex(foundMarkerOffset) 
*)
on getIntegerAndHex(aNumber)
	global debug
	if debug ≥ 5 then log "in ~~~ getIntegerAndHex ~~~"
	
	return (aNumber as text) & " in Hex " & integerToHex(aNumber) -- coerce first; integer & text would build a list
end getIntegerAndHex

-- ------------------------------------------------------ 
(*
https://macscripter.net/viewtopic.php?id=43713
  *)
on integerToHex(nDec)
	global debug
	if debug ≥ 5 then log "in ~~~ integerToHex ~~~"
	try
		set nHex to do shell script "perl -e 'printf(\"%X\", " & nDec & ")'" --> "F0"
	on error errMsg number n
		log "==> convert integer to hex. " & errMsg & " with number " & n
		set nHex to ""
	end try
	return nHex
end integerToHex
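
As a possible alternative to the perl call in integerToHex, the shell's own printf can do the conversion in both directions, which helps when matching a position against the hex offsets in a dump. This is only a sketch; the helper names int_to_hex and hex_to_int are invented here and are not part of the script above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# int_to_hex N: print the decimal integer N as uppercase hex (240 -> F0).
int_to_hex() {
    printf '%X\n' "$1"
}

# hex_to_int H: print the hex string H (given without a 0x prefix) as decimal.
hex_to_int() {
    printf '%d\n' "0x$1"
}

int_to_hex 1168   # prints 490, the offset of the hexdump line quoted earlier
hex_to_int 490    # prints 1168
```

From AppleScript either line could be passed to do shell script in the same way the perl one-liner is.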