Martin's Expiry Checker v3 Blog

M.E.Bush, 9-Aug-2006

Initial design for the 3rd version

The current version of the Expiry Checker records expired page hits - including the page URLs, owners and expiry dates - in a file kept in alphabetical order, with no duplicates. Maintaining that order and checking for duplicates on every hit is computationally wasteful. In v3 I propose to just append the URL to the expired-pages file each time an expired page is hit (so that the URLs are stored in order of occurrence rather than in alphabetical order), without worrying about duplicates, and without bothering to store the page owners and expiry dates, since they can be looked up later on.

So the first script will be pretty trivial. It will be invoked by the 1x1 pixel hidden image in each web page, as is currently the case. It'll have to be called 'expirychecker.php' because that's what's been coded into the web pages. The script will need to lock the file while it writes to it, of course.
expirychecker.php4

if the page has expired then
  lock expired-urls.xml
  append page url to expired-urls.xml
  unlock expired-urls.xml
endif
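The append-with-lock step could be sketched roughly as follows. This is illustrative Python rather than the PHP the real script would be written in. The function name `record_expired_hit` and the one-URL-per-line layout are assumptions of mine, not part of the design. It takes an exclusive advisory lock with `fcntl.flock` (Unix only), so that two simultaneous hits cannot interleave their writes.

```python
import fcntl


def record_expired_hit(url, path="expired-urls.xml"):
    """Append one raw URL to the log, holding an exclusive lock while writing."""
    # Append mode creates the file if it does not exist yet.
    with open(path, "a", encoding="utf-8") as log:
        fcntl.flock(log, fcntl.LOCK_EX)      # wait until no other writer holds the lock
        try:
            log.write(url + "\n")            # assumed layout: one raw URL per line
            log.flush()                      # push the data out before unlocking
        finally:
            fcntl.flock(log, fcntl.LOCK_UN)  # always release, even if the write fails
```

Duplicates are deliberately left in at this stage; removing them is the weekly script's job.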
The second script is the one that'll do all the work, and the idea is for it to be invoked once every Monday morning via a cron job. This is what I have in mind...
expiredpagehandler.php4

copy expired-urls.xml to expired-urls-last-week.xml
initialise an (empty) expired-pages-list...
...where each expired-page is of the form: [url,owner,expirydate]
for each raw url recorded in expired-urls-last-week.xml
  normalise the url
  append [url,-,-] to expired-pages-list, avoiding duplicates
endfor
for each expired-page in expired-pages-list
  read page owner and expirydate into [url,owner,expirydate]
  if expirydate is older than today's date then
    send email to owner
    if expirydate is over a month older than today's date then
      send email to webmaster
    endif
  endif
endfor
initialise the expired-pages.xml file
for each expired-page in expired-pages-list
  append expired-page data to expired-pages.xml
endfor
The last four lines are there to enable operation of the Expiry Data Viewer script.
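The two main loops of the weekly script might look something like the sketch below, again in illustrative Python rather than PHP. Several things here are assumptions, since the design does not pin them down: the normalisation rule (lower-case the scheme and host, drop the query string and fragment), reading "over a month" as more than 30 days, and the helper names `normalise`, `unique_urls` and `recipients`. The sketch returns who should be emailed rather than sending anything.

```python
from datetime import date, timedelta
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Assumed rule: lower-case scheme and host, drop query string and fragment."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), parts.path, "", ""))


def unique_urls(raw_urls):
    """Normalise each raw URL and keep only the first occurrence of each."""
    seen = set()
    result = []
    for raw in raw_urls:
        url = normalise(raw)
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


def recipients(expiry, today, month=timedelta(days=30)):
    """Return who should be emailed about a page with the given expiry date."""
    if expiry >= today:
        return []                       # page is no longer expired, so no email
    if today - expiry > month:
        return ["owner", "webmaster"]   # expired for over a month: escalate
    return ["owner"]


if __name__ == "__main__":
    hits = ["HTTP://Example.COM/a.html#top", "http://example.com/a.html"]
    print(unique_urls(hits))
    print(recipients(date(2006, 6, 1), date(2006, 8, 7)))
```

The `expiry >= today` check matters because a page logged as expired last week may have been renewed by its owner before Monday's run.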