Initial design for the 3rd version
The current version of the Expiry Checker records expired page hits, including the page urls, owners and expiry dates, in a file in alphabetical order, with no duplicates. Keeping the file sorted and free of duplicates on every hit is computationally wasteful. In v3 I propose to just append the url to the expired-pages file each time an expired page is hit (so that the urls are stored in order of occurrence rather than in alphabetical order), without worrying about duplicates, and without bothering to store the page owners and expiry dates, since they can be looked up later on.

So the first script will be pretty trivial. It will be invoked by the 1x1 pixel hidden image in each web page, as is currently the case. It'll have to be called 'expirychecker.php' because that's what's been coded into the web pages. The script will need to lock the file while it writes to it, of course.
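The append-under-lock step can be sketched as follows. This is Python rather than PHP, purely to illustrate the logic, and it rests on assumptions that are not in the design: the function name `record_expired_hit` is hypothetical, the log is written as one url per line rather than as XML, and the lock is an advisory `flock`, which is only available on Unix-like systems.

```python
import fcntl


def record_expired_hit(url: str, path: str) -> None:
    """Append one url to the log file, holding an exclusive lock while writing.

    Assumption: one url per line. Duplicates are deliberately not checked,
    so the file keeps the urls in order of occurrence.
    """
    with open(path, "a", encoding="utf-8") as log:
        # Block until no other process holds the lock on this file.
        fcntl.flock(log, fcntl.LOCK_EX)
        try:
            log.write(url + "\n")
            # Push the line out of Python's buffer before releasing the lock.
            log.flush()
        finally:
            fcntl.flock(log, fcntl.LOCK_UN)
```

Because the write is a plain append, the cost per hit stays constant however large the file grows; the sorting and de-duplication are deferred to the weekly script.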
expirychecker.php4
    if the page has expired then
        append page url to expired-urls.xml
    endif

The second script is the one that'll do all the work, and the idea is for it to be invoked once every Monday morning via a cron job. This is what I have in mind...
expiredpagehandler.php4
    copy expired-urls.xml to expired-urls-last-week.xml
    initialise an (empty) expired-pages-list...
    ...where each expired-page is of the form: [url,owner,expirydate]
    for each raw url recorded in expired-urls-last-week.xml
        normalise the url
        append [url,-,-] to expired-pages-list, avoiding duplicates
    endfor
    for each expired-page in expired-pages-list
        read page owner and expirydate into [url,owner,expirydate]
        if expirydate is older than today's date then
            send email to owner
            if expirydate is over a month older than today's date then
                send email to webmaster
            endif
        endif
    endfor
    initialise the expired-pages.xml file
    for each node in the list
        append node data to expired-pages.xml
    endfor

The last four lines are there to enable operation of the Expiry Data Viewer script.
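The two decisions in the weekly script, removing duplicates and choosing who gets an email, can be sketched like this. It is Python rather than PHP, for illustration only, and several details are my assumptions rather than part of the design: the function names are hypothetical, "normalise" is taken to mean lower-casing the scheme and host and dropping any fragment, and "over a month" is taken to mean more than 30 days.

```python
from datetime import date, timedelta
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Assumed rule: lower-case scheme and host, drop the fragment."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


def unique_in_order(raw_urls):
    """Normalise each raw url and keep only its first occurrence."""
    seen = {}  # dicts preserve insertion order, so order of occurrence survives
    for raw in raw_urls:
        seen.setdefault(normalise(raw), None)
    return list(seen)


def recipients(expiry, today, grace=timedelta(days=30)):
    """Return who should be emailed about a page with this expiry date."""
    if expiry >= today:
        return []                      # not expired (yet): nobody is emailed
    if today - expiry > grace:
        return ["owner", "webmaster"]  # expired for over a month
    return ["owner"]                   # expired, but within the grace period


if __name__ == "__main__":
    print(unique_in_order(["http://Example.org/a", "http://example.org/a#top"]))
    print(recipients(date(2023, 12, 1), date(2024, 1, 15)))
```

Doing the de-duplication once a week, over the whole of last week's file, is what lets the per-hit script stay a simple append.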