#!/usr/local/bin/perl
use warnings;
use strict;

use App::SimpleScan;

my $app = new App::SimpleScan;
exit $app->go;

__END__

=head1 NAME

simple_scan - scan a set of Web pages for strings present/absent

=head1 ABSTRACT

App::SimpleScan - Mini-language for website testing

=head1 SYNOPSIS

  simple_scan [--generate] [--run] {file file file ...}

=head1 USAGE

  # Run the tests in the files supplied on the command line.
  # --run (or -run; we're flexible) is assumed if you give no switches.
  % simple_scan file1 file2 file3

  # Generate a set of tests and save them, then run them.
  % <complex pipe> | simple_scan --generate > pipe_scan.t

  # Run one simple test
  % echo "http://yahoo.com yahoo Y Look for yahoo.com"  | simple_scan -run

=head1 DESCRIPTION

C<simple_scan> reads either files supplied on the command line, or standard
input. It creates and runs, or prints, or even both, a L<Test::WWW::Simple>
test for the criteria supplied to it.

C<simple_scan>'s input should be in the following format:

  <URL> <pattern> <Y|N> <comment>

The I<URL> is any URL; I<pattern> is a Perl regular expression, delimited by
slashes; I<Y|N> is C<Y> if the pattern should match, or C<N> if the pattern 
should B<not> match; and I<comment> is any arbitrary text you like (as long 
as it's all on the same line as everything else). 

C<simple_scan>will do its best to try to interpret your pattern; if it can't
parse it as a regular expression, it will assume you meant to match against
a literal character string instead; so a pattern like

   /<b>this</b>/

Would be interpreted as the literal string "<b>this</b>".

=head1 COMMAND-LINE SWITCHES

We use L<Getopt::Long> to get the command-line options, so we're really very
flexible as to how they're entered. You can use either one dash (as in
C<-foo>) or two (as in C<--bar>). You only need to enter the minimum number
or characters to match a given switch.

=over 4

=item C<--run>

C<--run> tells C<simple_scan> to immediately run the tests it's created. Can
be abbreviated to C<-r>.

This option is mosst useful for one-shot tests that you're not planning to
run repeatedly.

=item C<--generate>

C<--generate> tells C<simple_scan> to print the test it's generated on the
standard output.

This option is useful to build up a test suite to be reused later.

=back

Both C<-r> and C<-g> can be specified at the same time to run a test and print 
it simultaneously; this is useful when you want to save a test to be run later 
as well as right now without having to regenerate the test.

=head1 PRAGMAS

Pragmas are ways to influence what C<simple_scan> does when generating tests.
They don't output anything themselves.

Pragmas are specified with C<%%> in column 1 and the pragma name immediately
following. Any arguments aer supplied after a colon, like this:

   %%foo: bar baz

This invokes the C<foo> pragma with the argument C<bar baz>.

=head2 Substitutions

Any pragma that's otherwise unrecognized by C<simple_scan> is treated as a 
substitution. Substitutions assume that you have a name and a set of strings
following it; these strings wil be substituted into the test specs occuring
between this (set) of substitutions and the next (set).

Here's an example. 

   %% user dconway chromatic petdance
   %% use_perl_id Ovid pemungkah
   http://search.cpan.org/~<user>
   http://use.perl.org/~<use_perl_id>/journal/
   http://search.yahoo.com/
   ...
   
This would fetch the CPAN index page for the users dconway, chromatic, and 
petdance, and the use.perl journals for users Ovid and pemungkah. Finally,
it would (just once) fetch the Yahoo! search page - because there are no
substitutions in that line, it would only be evaluated once.

The substitutions can occur anywhere in the line, including in the comment.

This pragma allows for very simple-minded internationalization. For instance,
let's assume that you want to substitute each of a list of two-character 
country codes into a string (most likely somewhere in the URL, but possibly 
in the comment too). 

C<simple_scan> will do this for you, creating a test for each country code
you specify. For instance:

   %%xx: es au my jp
   http://>xx<.mysite.com/     /blargh/  Y  look for blargh (>xx<)

This would generate 4 tests, for C<es.mysite.com>, C<au.mysite.com>, 
c<my.mysite.com>, and C<jp.mysite.com>, all looking to match C<blargh> 
somewhere on the page.

=head2 agent

The C<agent> pragma allows you to switch user agents during the test. 
C<Test::WWW::Simple>'s default is C<Windows IE 6>, but you can switch it
to any of the other user agents supported by C<WWW::Mechanize>.

   http://gemal.dk/browserspy/basic.html /Explorer/ Y Should be Explorer
   %%agent: Mac Safari
   http://gemal.dk/browserspy/basic.html /Safari/ Y Should be Safari

=head2 cache

The C<cache> pragma turns on URL caching; once enabled, the page returned
on the I<first> access to a URL is returned directly from a memory cache,
without its being reaccessed from the Web.

Using C<cache> can result in major speedups for tests which repeatedly
hit the same page.

=head2 nocache

The C<nocache> pragma turns I<off> URL caching; this is useful if you
have something like a REST interface that may return different values 
from repeated accesses to the same URL.

=head1 AUTHOR

Joe McMahon E<lt>mcmahon@cpan.orgE<gt>

=head1 COPYRIGHT AND LICENSE

Copyright (c) 2005 by Yahoo!

This script is free software; you can redistribute it or modify it under the
same terms as Perl itself, either Perl version 5.6.1 or, at your option, any
later version of Perl 5 you may have available.
