The proper way to script periodically pulling a page from an https site
- by DarthShader
I want to create a command-line script for Cygwin/Bash that logs into a site, navigates to a specific page and compares it with the results of the last run.
So far, I have it working with Lynx like so:
----snipped, just setting variables----
echo "# Command logfile created by Lynx 2.8.5rel.5 (29 Oct 2005)
----snipped the recorded keystrokes-------
key Right Arrow
key p
key Right Arrow
key ^U" >> "$tmp1" # "p", then Right Arrow, starts the page save
# "type" the filename, one key per character, inside the "where to save" dialog
for i in $(seq 0 $((${#tmp2} - 1)))
do
    echo "key ${tmp2:$i:1}" >> "$tmp1"
done
#hit enter and quit
echo "key ^J
key y
key q
key y
" >> $tmp1
lynx -accept_all_cookies -cmd_script="$tmp1" https://thewebpage.com/login
diff "$tmp2" "$oldComp"
mv "$tmp2" "$oldComp"
This definitely does not feel "right": the cmd_script replays relative keystrokes instead of naming the exact links and actions. If anything on the site changes, moves around, or a new link is added, I will have to re-record the actions.
Also, I can't check for errors, so I can't abort the script if something goes wrong (login failed, etc.).
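One direction that might address both points is to request the pages by URL and test the exit status of each step. The following is only a rough sketch, and it swaps Lynx for curl rather than fixing the Lynx script. The page URL, the form field names (`username`, `password`), and the "Logout" marker text are all hypothetical placeholders, and `$user` and `$pass` are assumed to be set in the snipped variable section. A real login form may also need hidden fields or tokens that this sketch does not handle.

```shell
#!/usr/bin/env bash
# Sketch only: anything marked "hypothetical" is a placeholder, not taken from the real site.

login_url="https://thewebpage.com/login"   # URL from the original script
page_url="https://thewebpage.com/report"   # hypothetical: the page to monitor
logged_in_marker="Logout"                  # hypothetical: text shown only when logged in

# usage: fetch URL OUTFILE [extra curl options...]
# Reuses the cookie jar in $jar; returns non-zero on network errors or HTTP status >= 400.
fetch() {
    local url=$1 out=$2
    shift 2
    curl --fail --silent --show-error --location \
         --cookie "$jar" --cookie-jar "$jar" \
         --output "$out" "$@" "$url"
}

# usage: page_contains FILE TEXT
# Succeeds only if FILE contains TEXT as a literal string.
page_contains() {
    grep -qF -- "$2" "$1"
}

# usage: fetch_monitored_page OUTFILE
# Runs in a subshell so the temporary cookie jar is removed on every exit path.
fetch_monitored_page() (
    jar=$(mktemp) || exit 1
    trap 'rm -f "$jar"' EXIT
    # Form field names are hypothetical; $user and $pass are assumed to be set earlier.
    fetch "$login_url" /dev/null \
        --data-urlencode "username=$user" \
        --data-urlencode "password=$pass" \
        || { echo "login request failed" >&2; exit 1; }
    fetch "$page_url" "$1" \
        || { echo "could not fetch $page_url" >&2; exit 1; }
    # Some sites answer 200 even for a rejected login, so check the content too.
    page_contains "$1" "$logged_in_marker" \
        || { echo "page fetched, but login marker not found" >&2; exit 1; }
)
```

With this in place, the Lynx call could be replaced by something like `fetch_monitored_page "$tmp2" || exit 1`, so the diff step only runs when the login and the fetch both appear to have worked.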
One alternative I have been looking at is Mechanize with Ruby (note that I have no experience with Ruby).
What would be the best way to improve or rewrite this?
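Whichever tool ends up fetching the page, the compare step can report errors as well, instead of overwriting the old copy unconditionally. Below is a minimal sketch of that idea. The function name `compare_and_rotate` and its return codes are made up for illustration. Unlike the original `diff $tmp2 $oldComp`, it passes the old file first so the unified diff reads as "old to new".

```shell
#!/usr/bin/env bash
# usage: compare_and_rotate NEW OLD
# Prints a unified diff (if any) and then stores NEW as OLD.
# Returns 0 = unchanged, 1 = changed or first run, 2 = error (OLD is left untouched).
compare_and_rotate() {
    local new=$1 old=$2 status=0

    # Refuse to replace a good snapshot with a missing or empty download.
    if [ ! -s "$new" ]; then
        echo "new snapshot is missing or empty: $new" >&2
        return 2
    fi

    # First run: nothing to compare against yet.
    if [ ! -e "$old" ]; then
        echo "no previous snapshot, storing the first one" >&2
        mv -- "$new" "$old" || return 2
        return 1
    fi

    # diff exits 0 when the files match, 1 when they differ, >1 on trouble.
    diff -u "$old" "$new" || status=$?
    if [ "$status" -gt 1 ]; then
        return 2
    fi

    mv -- "$new" "$old" || return 2
    return "$status"
}
```

A caller could then branch on the result, for example `compare_and_rotate "$tmp2" "$oldComp"` followed by a `case $?` that treats 2 as a failure and 1 as "the page changed".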