I am looking for a way to automate a tedious process.
We have a number of evergreen web pages with numerous links. On a regular basis we need to check that every link is still taking users to the correct location.
Right now we do this by opening each page and clicking on every link.
What I’m hoping for is a script that will look at a page (either in Safari or, preferably, by reading the URL), extract all the links, and then find out where each link leads.
We’d then compare that result to the expected result and flag any that don’t match.
Right now it’s the first step, extracting the clickable links, that I need help with. (Seems like that’s the simplest).
use framework "Foundation"
use scripting additions
set thePage to "https://www.apple.com"
set theURL to current application's |NSURL|'s URLWithString:thePage
set theSource to current application's NSString's stringWithContentsOfURL:theURL encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
set dataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
set linkArray to dataDetector's matchesInString:theSource options:0 range:{location:0, |length|:theSource's |length|()}
return (linkArray's valueForKeyPath:"URL.absoluteString") as list
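For the later comparison step from the original question (checking where each link ends up), here is a rough sketch. It is not taken from any script in this thread. It assumes curl can be called through do shell script, and the handler name finalURLFor, the record labels linkURL, expectedURL and actualURL, and the example.com addresses are all made up for illustration.

```applescript
use AppleScript version "2.4"
use scripting additions

-- Hypothetical test data: each record pairs a link with the URL it is expected to end up at.
set expectedResults to {{linkURL:"https://example.com/old-page", expectedURL:"https://example.com/new-page"}}

set mismatches to {}
repeat with aCheck in expectedResults
	set resolvedURL to my finalURLFor(linkURL of aCheck)
	-- URL paths can be case-sensitive, so compare with case considered
	considering case
		set isMatch to (resolvedURL is (expectedURL of aCheck))
	end considering
	if not isMatch then
		set end of mismatches to {linkURL:(linkURL of aCheck), expectedURL:(expectedURL of aCheck), actualURL:resolvedURL}
	end if
end repeat
return mismatches

on finalURLFor(theLink)
	-- -s hides progress output, -L follows redirects, -o /dev/null discards the page body,
	-- -w '%{url_effective}' prints the last URL curl actually fetched
	try
		return do shell script "curl -sL -o /dev/null --max-time 20 -w '%{url_effective}' " & quoted form of theLink
	on error errMsg
		-- curl exits non-zero for unreachable hosts, timeouts, etc.
		return "error: " & errMsg
	end try
end finalURLFor
```

The result is a list with one record per link that did not end up where expected, so an empty list means everything matched. The list of links to check could come from the script above instead of being typed in by hand.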
@ionah
I added a searchTag parameter to your nice script, so it returns a list of every URL that contains the tag (in case it finds more than one).
use framework "Foundation"
use scripting additions
set thePage to "https://www.apple.com"
its searchFor:"drama" inURL:thePage
on searchFor:searchTag inURL:URLString
	set theURL to current application's |NSURL|'s URLWithString:URLString
	set theSource to current application's NSString's stringWithContentsOfURL:theURL encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	set dataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
	set linkArray to dataDetector's matchesInString:theSource options:0 range:{location:0, |length|:theSource's |length|()}
	set URLList to (linkArray's valueForKeyPath:"URL.absoluteString") as list
	set resultList to {}
	repeat with anItem in URLList
		if searchTag is in anItem then
			set end of resultList to (contents of anItem)
		end if
	end repeat
	return resultList
end searchFor:inURL:
Or you could use NSPredicate to filter the list and NSSet to clean duplicates:
use framework "Foundation"
use scripting additions
my linksFrom:"https://www.apple.com" withTag:"drama"
on linksFrom:thePage withTag:theTag
	set theURL to current application's |NSURL|'s URLWithString:thePage
	set theSource to current application's NSString's stringWithContentsOfURL:theURL encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	set dataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
	set linkArray to dataDetector's matchesInString:theSource options:0 range:{location:0, |length|:theSource's |length|()}
	set linkArray to (linkArray's valueForKeyPath:"URL.absoluteString")
	set thePred to current application's NSPredicate's predicateWithFormat:"self contains[cd] %@" argumentArray:{theTag}
	set linkArray to linkArray's filteredArrayUsingPredicate:thePred
	set linkArray to current application's NSSet's setWithArray:linkArray
	return linkArray's allObjects() as list
end linksFrom:withTag:
@ionah
That is nice, I have a lot of NSPredicate scripts on my old computer.
Forgot how useful it is.
I try to follow Shane Stanley’s ASObjC Style Guide.
Here your script is split into three helper handlers, with a fourth as the main one.
use framework "Foundation"
use scripting additions
its linksFrom:"https://www.apple.com" withTag:"drama"
on linksFrom:URLString withTag:theTag
	set theSource to its URLWithString:URLString
	set linkArray to detectorWithLink(theSource)
	return (its filterArray:linkArray predicateWithFormat:"self contains[cd] %@" withArguments:{theTag}) as list
end linksFrom:withTag:

on URLWithString:URLString
	set theURL to current application's |NSURL|'s URLWithString:URLString
	set {theContents, theError} to current application's NSString's stringWithContentsOfURL:theURL encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
	if theContents is missing value then error (theError's localizedDescription() as string)
	return theContents
end URLWithString:

on detectorWithLink(theSource)
	set dataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
	set theDetector to dataDetector's matchesInString:theSource options:0 range:{location:0, |length|:theSource's |length|()}
	set theDetector to (theDetector's valueForKeyPath:"URL.absoluteString")
	return theDetector
end detectorWithLink

on filterArray:anArray predicateWithFormat:formatString withArguments:argumentList
	set thePredicate to current application's NSPredicate's predicateWithFormat:formatString argumentArray:argumentList
	set anArray to anArray's filteredArrayUsingPredicate:thePredicate
	set anArray to current application's NSSet's setWithArray:anArray
	return anArray's allObjects()
end filterArray:predicateWithFormat:withArguments:
I want to do something that’s related. I want to scrape all JPEGs on a page, and copy each to a folder from which I can do some post processing. I’ll study these solutions (unless someone has something closer to my need).
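For the JPEG case, the same NSDataDetector and NSPredicate approach from the scripts above might be adapted roughly like this. Treat it as an untested sketch: the handler name jpegLinksFrom:, the folder name scraped-jpegs, and the use of curl for downloading are my assumptions, and as far as I know the data detector only picks up full URLs in the page source, so images referenced by relative paths would be missed.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical page and destination folder; change both before running.
set thePage to "https://www.apple.com"
set destFolder to (POSIX path of (path to desktop folder)) & "scraped-jpegs/"

set jpegLinks to my jpegLinksFrom:thePage
do shell script "mkdir -p " & quoted form of destFolder
repeat with aLink in jpegLinks
	-- curl -O names each file after the last path component of its URL,
	-- so two images with the same file name would overwrite each other
	do shell script "cd " & quoted form of destFolder & " && curl -sLO " & quoted form of (contents of aLink)
end repeat
return jpegLinks

on jpegLinksFrom:thePage
	set theURL to current application's |NSURL|'s URLWithString:thePage
	set {theSource, theError} to current application's NSString's stringWithContentsOfURL:theURL encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
	if theSource is missing value then error (theError's localizedDescription() as string)
	set dataDetector to current application's NSDataDetector's dataDetectorWithTypes:(current application's NSTextCheckingTypeLink) |error|:(missing value)
	set linkArray to dataDetector's matchesInString:theSource options:0 range:{location:0, |length|:theSource's |length|()}
	set linkArray to (linkArray's valueForKeyPath:"URL.absoluteString")
	-- keep only links that end in .jpg or .jpeg, ignoring case
	set thePred to current application's NSPredicate's predicateWithFormat:"self endswith[c] %@ OR self endswith[c] %@" argumentArray:{".jpg", ".jpeg"}
	set linkArray to linkArray's filteredArrayUsingPredicate:thePred
	-- NSSet removes duplicate links
	set linkArray to current application's NSSet's setWithArray:linkArray
	return linkArray's allObjects() as list
end jpegLinksFrom:
```

The script returns the list of JPEG links it tried to download, and the files land in the scraped-jpegs folder on the Desktop for post-processing.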