A better way to use SQLite?

sqlite
foundation
asobjc

(Shane Stanley) #1

I’ve been playing with SQLite lately. It’s quite usable from AppleScript via the command line tool sqlite3 and do shell script, but it seems to me that this approach has several drawbacks from AppleScript.

The biggest issue is that everything is reduced to text. So you can end up doing a lot of text manipulation just to get stuff in and out. This is especially the case where there are a lot of results. And while it’s not slow, it could be faster.

There is a third-party framework, FMDB, that puts an Objective-C wrapper around SQLite, so it can be called from AppleScriptObjC. Unfortunately some of it is lower level and more complicated than would be ideal for AppleScript.

So I’ve written a new framework, FMDBAS. This is the FMDB framework wrapped in a more AppleScript-friendly face (but maintaining access to the more advanced stuff for those who need it). Initial tests show it’s faster – in some cases considerably so. It’s main advantages are:

  • Easy use of SQL ? placeholders, so you can optionally pass parameters as lists containing differing classes. You can handle classes such as dates and images much more easily.

  • Query results are returned in either arrays/lists or dictionaries/records for each row, so there’s no need to massage a string to extract the results.

  • Query results are returned initially as Cocoa arrays. You can coerce them to AppleScript lists (a simple “as list”), but in the case of large results it’s often quicker to deal with the raw arrays. And because arrays are returned as a simple pointer, large result sets are returned much, much quicker.

  • Support for transactions. You can batch updates together for greater efficiency.

  • Lower overhead than shell, and combined with transactions this means there’s less need to build long statements to perform things like multiple insertions.

  • Better error handling.

There are a few cons:

  • You need to add the framework to your system. I may wrap it in a library at some stage.

  • You need at least a basic understanding of AppleScriptObjC syntax to use it.

  • It won’t work under versions earlier than 10.10 at this stage.

It’s still at the early testing stage, and I’m looking for interested volunteers. If that’s you, please email me.

As a taster, here are two scripts borrowed from Adam Bell’s macscripter.net Introduction to SQLite3 tutorial http://www.macscripter.net/viewtopic.php?id=24744, one to build a database and another to read it. (I’ve modified the first slightly to avoid problems running it repeatedly.) Here’s the first script using the do shell script method:

set loc to space & "~/desktop/TestDB.db" & space
set head to "sqlite3" & loc & quote
set tail to quote
-- "head" tells SQLite where to put our db if it doesn't exist, identifies it if it does.
-- "head" is the opening statement of every future command to our db.
-- "tail" ends every query started with "head".

-- Next, we set up a table and give the columns labels (there can be several).
-- Note the space between the semicolon (which ends every line) and the quote.
set tableName to "mods"
set newTbl to "drop table " & tableName & ";create table " & tableName & "(firstname, lastname, country); "

-- Now we set up the data to be entered in the table
set d1 to "insert into mods values('Ray', 'Barber', 'USA'); "
set d2 to "insert into mods values('jj', 'Sancho', 'Spain'); "
set d3 to "insert into mods values('Adam', 'Bell', 'Canada'); "
set d4 to "insert into mods values('Bruce', 'Phillips', 'USA'); "
set d5 to "insert into mods values('Kim', 'Hunter', 'USA'); "
set d6 to "insert into mods values('Kevin', 'Bradley', 'USA'); "

-- And finally, build the SQLite query and execute it
do shell script head & newTbl & d1 & d2 & d3 & d4 & d5 & d6 & tail

And here’s the FMDBAS equivalent:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "FMDBAS"
use scripting additions

-- create and open db
set thePath to current application's NSString's stringWithString:"~/desktop/TestDB.db"
set theDb to current application's FMDatabase's databaseWithPathAS:(thePath's stringByExpandingTildeInPath())
theDb's openWithFlagsAS:(current application's FMDBASOpenReadWriteCreate)

-- create table, clearing if necessary it
set tableName to "mods"
theDb's executeUpdateASBool:("drop table if exists " & tableName)
theDb's executeUpdateAS:("create table " & tableName & " (firstname, lastname, country)")

-- enter data
theDb's beginTransactionAS()
repeat with aList in {{"Ray", "Barber", "USA"}, {"jj", "Sancho", "Spain"}, {"Adam", "Bell", "Canada"}, {"Bruce", "Phillips", "USA"}, {"Kim", "Hunter", "USA"}, {"Kevin", "Bradley", "USA"}}
	(theDb's executeUpdateAS:"insert into mods values(?, ?, ?)" withArguments:aList)
end repeat
theDb's commitAS()
-- close database
theDb's closeAS()

Here’s the shell script reading example:

-- Combining the path into the head:
set head to "sqlite3 ~/desktop/TestDB.db " & quote
-- Now the query:
set Spain to "select firstname, lastname from mods where country = 'Spain'; "
-- Get the answer (which will appear in your script as a list, so we'll name the parts
set {firstname, lastName} to words of (do shell script head & Spain & quote)
--> {"jj", "Sancho"}
set tLast to "select lastname from mods where firstname = 'Bruce'; "
set surname to do shell script head & tLast & quote --> "Phillips "

And the FMDBAS version:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "FMDBAS"
use scripting additions

-- create and open db
set thePath to current application's NSString's stringWithString:"~/desktop/TestDB.db"
set theDb to current application's FMDatabase's databaseWithPathAS:(thePath's stringByExpandingTildeInPath())
theDb's openWithFlagsAS:(current application's FMDBASOpenReadOnly)

-- query it
set theResult to theDb's executeQueryAS:("select firstname, lastname from mods where country = 'Spain'")
-- *or*:
-- set otherResult to theDb's executeQueryAS:("select firstname, lastname from mods where country = ?") withArguments:{"Spain"}
set surname to theDb's executeQueryAS:("select lastname from mods where firstname = ?") withArguments:{"Bruce"}
theDb's closeAS()
return {theResult as list, surname as list}

These are tiny examples, but the first case is about 10x faster, and the second closer to 100x faster.


About the SQLite Lib category
(Jim Underwood) #2

Wow! That is amazing! I’m all in. (EMail sent). This could be a real game-changer.
Thanks for taking on this project, Shane.


(Ed Stockly) #3

This is very interesting, Shane.

A number of years ago I wrote a series of scripts using SQL. The first imported a pipe-delimited database with inconsistent fields for every TV show airing on a set of channels on a particular day. (Inconsistent means the field order changed depending on if the program was a movie, a sporting event or other).

The rest of the script would extract data for a specific channel, timeslot or show, to be used to update a listing.

I’ve been wanting to expand it for some time, and now I think I can start.


(Jim Underwood) #4

Shane, I received your Beta 1 files, and have installed, read, and completed successfully an execute of all of the test scripts you provided:

General sample.scpt
Macscripter read.scpt
Macscripter write.scpt
Safari history.scpt

The documentation seems clear enough to me, and I’m doing some rich-text reformatting of it in an Evernote Note (mostly for my benefit now and future ref), which I’ll share very soon.

Ah, what the heck. In the spirit of a Beta, here’s my alpha/beta EN Note: FMDBAS Framework Notes – Getting Started

  • You can view it as a web page without any Evernote software.
  • If you have Evernote, you can download into your account (but I wouldn’t recommend that yet as it is still a work in progress)
  • It is NOT finished yet. Have not done the Query section.
  • It has NOT been carefully reviewed yet, so I could have unintentionally screwed something up.
  • But if anyone want’s to offer constructive criticism, I’m all ears. :wink:

So far so good. I made the trial lap without any engine failures. :wink:
I think this is going to be great! Easy to use and very powerful.

I’ll start some detailed testing comparisons this weekend with some existing SQL scripts I have.


(Phil Stokes) #5

Thanks Shane. I don’t have time to look into this right now, but it’s definitely bookmarked for future use. There’s sql databases littered all over the mac that I need to access for security or forensics purposes. I normally do that with bash or python scripts, but having this option in AS could be handy.

One of my main uses for AppleScript is to provide simple click-and-go scripts for users to run diagnostic reports on their macs, so wrapping that f/w in a library would be useful, too.


(Shane Stanley) #6

Thanks for taking the time to do all that, @JMichaelTX .

One issue I’m looking for feedback from everyone on is the design. In particular, the decision to replace built-in methods that return boolean values for success or failure with methods that throw an AppleScript error on failure. I think this is more AppleScript style, although there’s a limitation that means all error with have the same number (-10000). Does this make sense to people?

Further, in the case of the update methods, I’ve added boolean versions as well. But I’m wondering if these are just an unnecessary complication. Thoughts?

Lastly, in the next version I’m going to add dedicated AS forms of the +databaseWith… methods. This is partly for consistency, but also for better error trapping. I’m also adding methods for loading a database into memory, and saving a database from memory to a file. My feeling is that that just about covers most bases. So is anything missing?


(Shane Stanley) #7

Note that you can also include a framework directly in a script bundle without a library. You just need to put it in a Frameworks folder inside the bundle’s Contents folder (not inside Resources).


(Phil Stokes) #8

Ah! That’s cool. I didn’t know that.


(Ed Stockly) #9

So far, this looks very promising. The ease with which I can generate appleScript lists and records from SQL especially.

You’re saying there may be a library in the future, I’m hoping that includes a full AE Dictionary? That would be great!

Not exactly sure how I would use this to import from a CSV or pipe delimited text file.


(Shane Stanley) #10

Honestly, unlikely. It’s an extra level of overhead, and this is meant to be as fast as possible – and SQL stuff is not exactly for dabblers. It would also be a lot of work, and I’m not sure there’s wide enough appeal to warrant it.

That said, there’s nothing to stop someone else doing just that.

You going to have to convert to a list of lists, and then use executeUpdatesAS:withArguments:. But being able to pass CSV directly is an interesting idea…


(Jim Underwood) #11

As long as you’re thinking about that, it would also be useful to:

  1. Pass TSV (Tab Separated Values)
  2. Pass a path to a file of either TSV or CSV, for either input or output.

Thanks.

BTW, I don’t need a scripting dictionary.


(Shane Stanley) #12

Yes, that would make sense.

That raises issue of what format the file is in, which complicates matters. I suppose I could insist on UTF-8.


(Jim Underwood) #13

I’m good with that. I would think that would cover most use cases.
I can see where this (read/write CSV/TSV files internally) could be extremely useful in migration (export/import) of large blocks of data between apps/databases.

Thanks for considering. BTW, if you need to, you can always delay this stuff to Ver 2. :smile:


(Shane Stanley) #14

I still have misgivings. I mean, it’s only saving two lines of code:

set {theString, theError} to current application's NSString's stringWithContentsOfFile:posixPath encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
if theString is not missing value then error theError's localizedDescription() as text

and the tiniest amount of overhead. And there’s some sense in making sure that part of the process is working before even opening the database.


(Ed Stockly) #15

Interesting, when you use “overhead” that seems to suggest that a dictionary would slow it down, even when the dictionary commands are not being used. Is that correct?


(Jim Underwood) #16

@ShaneStanley,

Based on Beta 3 (2017-07-10) running Script Debugger 6.0.5 (6A205) on macOS 10.11.6.

To keep things simple and consistent, why not name TSV methods same as CSV methods:

--- USE THESE METHOD NAMES ---
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" withTSV:tabbedText
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" withTSVFromPath:(POSIX path of tempPath) 

But then it occurred to me that maybe you don’t need separate methods for TSV and CSV data:

--- ACTUALLY, MAYBE YOU DON'T NEED BOTH METHODS.  THIS WORKS ---
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" ¬
  withCSV:tabText commaIs:tab

--- BUT INSTEAD OF ---
withCSV: commaIs:

--- USE ---
withText: delim:

--- SO YOU'D HAVE ---
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" ¬
  withText:tabText delim:tab

I’d also prefer a simpler method for the “from path”:

theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" ¬
  fromPath:(POSIX path of tempPath) delim:","

Your thoughts?


(Shane Stanley) #17

I did it the way I did because I see CSV as a recognized (pseudo) acronym, with TSV less so. But there’s something to be said for brevity, not to mention avoiding the whole issue of spelling separated :slight_smile:.

That’s not going to work. The TSV stuff just breaks fields on tabs – there’s no way of including a tab within a field, for example. But CSV allows escaping of the delimiter, as well as allowing it in quoted fields – the process is more complex. So there has to be some way of knowing what’s wanted.

But there’s certainly value in being able to specify things like pipes and other characters, or even strings, as field delimiters.


(Shane Stanley) #18

Updated

FWIW, I’m after any more feedback on terminology. I was initially thinking of releasing it as a framework first and as a library later, but I’m now going to the library stage straight away. It won’t have defined terminology, but it will allow more simplified handler names.

For example, instead of using:

set theDb to current application's FMDatabase's databaseWithPathAS:posixPath
theDb's openWithFlagsAS:(current application's FMDBASOpenReadWriteCreate)
set theList to (theDb's executeQueryAS:"select * from test") as list

It will be like:

set theDb to theLib's makeNewDbWith:posixPath -- actually accepts file, alias, NSURL, or HFS/POSIX path
theDb's openReadWriteCreate()
set theList to (theDb's doQuery:"select * from test") as list

That notwithstanding, if anything in the existing terminology jars, now’s a good time to yell.


(Jim Underwood) #19

I don’t understand. A tab is a character just like a comma, so why can’t it be treated the same in the algorithm?

IAC, you could still have one method. The “delim:” parameter would tell you how to handle the data. If it is a TAB, then use the TAB algorithm.

Here’s a script that compares:

  1. Tab Delimited data using the TSV method
  2. Tab Delimited data using the CSV method
  3. Comman Delimited data using the CSV method with “,”
  4. Semi-colon Delimited data using the CSV method with “;”

BTW, please note the change in parent folder for the DB.
I detest making my desktop cluttered with test files, or really any files.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use framework "FMDBAS"
use scripting additions

property parentPath : "~/Documents/Test/"
property dbName1 : "TestMultiDelim.db"

-- Tab-separated value
-- leading and trailing linebreaks are trimmed
-- returns and linefeeds are recognised as line/row breaks

--- TAB DELIM
set tabbedText to tab & "Two\",\" two" & tab & "Three" & tab & "Four \"four\" four" & linefeed & "Five\",\" five" & tab & "Six" & tab & "Seven" & tab & "Eight" & return & "Nine" & tab & "Ten" & tab & "Eleven" & tab & "Twelve" & linefeed & return

--- COMMA DELIM ---
set csvText to "COMMA/CSV,Two\",\" two,Three,\"Four \"\"four\"\" four\"" & linefeed & "Five\",\" five,Six,Seven,Eight" & return & "Nine,\"Ten , ten\",Eleven,Twelve" & linefeed & return

--- SEMI-COLON DELIM ---
set csvText2 to "SEMI-COLON/CSV;Two\",\" two;Three;\"Four \"\"four\"\" four\"" & linefeed & "Five\";\" five;Six;Seven;Eight" & return & "Nine;\"Ten ; ten\";Eleven;Twelve" & linefeed & return


-- create and open db
set thePath to (current application's NSString's stringWithString:(parentPath & dbName1))'s stringByExpandingTildeInPath()
set theDb to current application's FMDatabase's databaseWithPathAS:thePath
theDb's openWithFlagsAS:(current application's FMDBASOpenReadWriteCreate)

-- create table, update from tabbedText, read contents, then close
theDb's executeUpdateAS:("drop table if exists test")
theDb's executeUpdateAS:"create table test (a, b, c, d)"

--- INSERT TAB DELIM USING TSV METHOD ---
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" withTabSeparatedValues:("TAB/TSV" & tabbedText)

--- INSERT TAB DELIM USING CSV METHOD ---
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" withCSV:("TAB/CSV" & tabbedText) commaIs:tab

--- INSERT COMMA DELIM USING CSV METHOD ---
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" withCSV:csvText commaIs:","


--- INSERT SEMI-COLON DELIM USING CSV METHOD ---
theDb's executeUpdatesAS:"insert into test values (?, ?, ?, ?)" withCSV:csvText2 commaIs:";"


set result1 to (theDb's executeQueryAS:"select * from test") as list
theDb's closeAS()

set AppleScript's text item delimiters to linefeed

return result1

###Results

The CSV method seems to always strip one pair of quotes (if any) from each cell. Is this by design? Standard CSV rules?


(Shane Stanley) #20

Because there are specific rules for CSV. You’ve seen one example: if a CSV field is completely within quotes, those quotes are treated as escaping quotes only, and are discarded.

You could use a tab as a delimiter in CSV – but that’s quite different from typical tab-delimited text, which is usually made by slicing first into paragraphs, and then dividing each paragraph on each tab, with no escaping allowed.

(Assuming you have a spreadsheet app, try doing some CSV and tab-delimited exports to see the difference.)