FindProxy
Overview
FindProxy examines the contents of files or web pages and extracts (and
optionally tests) likely-looking proxy entries.
By default, a single test is applied to each proxy found - whether it
can GET a reference web page intact. Originally designed to detect
censoring web proxies, the program uses a guaranteed censored web page
as this reference page by default. When this page is user-specified
(via '-r') and given the value 'none', the proxies are extracted in the
standard address:port format, but no tests are done. This will soon
become the default mode of operation. StatProxy (another tool in the
proxyTools distribution) has equivalent functionality now (tests 0 and
14).
By default, any proxy strings found whose ports are known to be blocked
by the firewall are ignored. The firewall may be specified
using the '-F' option, or automatically determined from the test
location IP address. User specified lists of ports to ignore may also
be given.
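The port filtering described above can be pictured with a minimal sketch. It is only an illustration, not the code in findProxy.pl: the helper name filter_blocked_ports and the example port numbers are assumptions, and in the real program the blocked ports would come from the chosen firewall entry or the user's own list.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of findProxy.pl): keep only the proxies
# whose port is NOT in the blocked list.
sub filter_blocked_ports {
    my ($proxies, $blocked_ports) = @_;
    my %blocked = map { $_ => 1 } @$blocked_ports;
    return grep {
        my ($port) = /:(\d+)$/;              # port is whatever follows the last colon
        defined $port && !$blocked{$port};   # entries without a port are dropped too
    } @$proxies;
}

# Example values only; these are not taken from firewalls.xml.
my @proxies = qw(194.170.168.236:8080 10.0.0.1:3128 10.0.0.2:80);
my @blocked = (3128, 1080);

print "$_\n" for filter_blocked_ports(\@proxies, \@blocked);
```

With the example values, the entry on port 3128 is dropped and the other two are printed.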
By default, tests are carried out from the user's computer. The user
may specify the tests to be carried out via any accessible CONNECT
proxy, and by choosing such a proxy carefully, the user may conceal his
real IP address.
By default, any initial proxy list web page is obtained by use of the
local ISP's proxy. If that blocks access to the list site, a CONNECT
proxy may be specified for this purpose too.
FindProxy uses Perl regular expression matching to detect the likely
proxy strings within the page/file content, allowing it to match many
kinds of content structure, while still allowing easy maintenance for
new page formats. Several dozens of different web pages, proxy list
file formats, mailing lists, bulletin boards etc. are currently
interpreted correctly.
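To give a feel for the regular expression approach, here is a small sketch that pulls address:port candidates out of mixed text. The pattern and the helper name extract_proxies are assumptions made for illustration; the patterns actually used in findProxy.pl handle many more page formats and are not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical helper (not the pattern findProxy.pl uses): find dotted-quad
# addresses followed by a port, and return them as address:port strings.
sub extract_proxies {
    my ($text) = @_;
    my (%seen, @found);
    while ($text =~ /\b((?:\d{1,3}\.){3}\d{1,3})\s*:\s*(\d{2,5})\b/g) {
        my ($addr, $port) = ($1, $2);
        next if grep { $_ > 255 } split /\./, $addr;   # reject impossible octets
        next if $port < 1 || $port > 65535;            # reject impossible ports
        my $proxy = "$addr:$port";
        push @found, $proxy unless $seen{$proxy}++;    # report each proxy once
    }
    return @found;
}

# Made-up sample content mixing an HTML table row and free text.
my $sample = <<'END';
<tr><td>194.170.168.236:8080</td><td>anonymous</td></tr>
Posted to the list: try 10.0.0.1 : 3128 or 999.1.1.1:80
END

print "$_\n" for extract_proxies($sample);
```

The sample yields 194.170.168.236:8080 and 10.0.0.1:3128; the 999.1.1.1 entry is rejected because 999 is not a valid octet.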
Unique amongst the proxyTools, this program requires a URL format for
the specification of the location of the target content. The URL may
specify a local file (file://, or file:) or a web location (http://,
news:, etc.). Examples are given below.
Like most of the proxyTools, a configuration file can be used so the
user need not repeatedly type command line options.
Installation and operation of findProxy
I'm assuming you're using MS Windows with ActiveState Perl; users of
other operating systems will probably have no trouble following this
anyway. I'm also assuming the configuration file is unmodified.
Unzip the proxyTools.zip package.
There are two ways to run this thing:
a) from the command line (a DOS shell) like:
perl findProxy.pl <options> <url>
or
b) from a shortcut/link. In this case, you'll probably want to edit
the options and url parameter into the script itself. Not recommended.
Note: if you add .pl to the PATHEXT environment variable under MS
Windows, and you have the standard '.pl' extension associated with
perl.exe, you won't need to type the 'perl ' part or the .pl
in the command line. So you can just type:
findProxy <options> <url>
Examples:
The <url> below may be a file in the current directory, like:
file:listOfProxies.txt
or it may be a web page, like:
http://www.angelfire.com/my/6waynes/checkedPublicProxies.html
or even ftp etc.
For clarification, the -p proxy is used to get the list in the
first place, so it's unnecessary if the url is a local 'file:'.
But this proxy is (by default) also used for the CONNECT connections,
so it's needed whenever the -C option is used (and, of course, must
be a CONNECT capable proxy). There is now an option to specify a
different proxy for this purpose.
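The rule just described (when the -p proxy actually comes into play) can be written down as a tiny sketch. The helper name uses_p_proxy is an assumption, and the sketch deliberately ignores the separate option for naming a different CONNECT proxy.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of findProxy.pl): does the -p proxy get used?
# Ignores the separate option for specifying a different CONNECT proxy.
sub uses_p_proxy {
    my ($url, $use_connect) = @_;
    return 1 if $use_connect;            # -C sends the tests via the -p proxy by default
    return $url =~ /^file:/i ? 0 : 1;    # only remote lists are fetched through it
}

for my $case (['file:listOfProxies.txt', 0],
              ['file:listOfProxies.txt', 1],
              ['http://www.angelfire.com/my/6waynes/checkedPublicProxies.html', 0]) {
    my ($url, $connect) = @$case;
    printf "%-65s -C=%d  uses -p: %s\n",
        $url, $connect, uses_p_proxy($url, $connect) ? 'yes' : 'no';
}
```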
Also note that any missing command line options will be defaulted
to some (usually) sensible value in the code - you can set these
once and for all by editing the configuration file.
Examples
perl findProxy.pl -h
will give a quick list of the options (and their defaults) available.
Even more details are found by using the built-in documentation:
perldoc findProxy.pl
______________________
perl findProxy.pl <url>
will test proxies at <url> for those which are directly usable from
the UAE (by configuring into your web browser). A UAE proxy is used,
if necessary, using the normal GET to test them. Output is noisy,
timeout is 60 secs. Note the code defaults to a UAE proxy, so users
outside the UAE must at least add a -p <proxy> they have access to
(for non-file: URLs).
______________________
perl findProxy.pl -F UAE-dialup <url>
As above. -F options are those available in the firewalls.xml database.
______________________
perl findProxy.pl -F UAE-dialup -p http://194.170.168.236:8080/ <url>
As above (this is the proxy it's defaulting to anyway).
KSA people should use -F KSA-ISU *and* a -p <proxy> they can
access.
______________________
perl findProxy.pl -F firewall-none -C -p http://194.170.168.236:8080/ <url>
will check a list of proxies from <url>, using a
CONNECT via the proxy at 194.170.168.236:8080 (a UAE proxy which
allows CONNECT). No proxies are excluded from the check. Oh ...
'bess' proxies are still excluded :-)
______________________
perl findProxy.pl -F UAE-dialup <url>
will check a list at <url> ignoring all proxies on blocked ports
for all UAE-dialup users.
______________________
perl findProxy.pl -i [portList] <url>
will check the list at <url> ignoring *only* the ports listed in
portList (as many as you like, separated by commas). If there is only
one port in <portList>, the square brackets may be omitted.
This means the program can be used from any country/corporation
once the list of blocked ports is known.
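As an illustration of the portList syntax, the sketch below turns a -i argument into a list of port numbers. The helper name parse_port_list is an assumption, and the real option parsing in findProxy.pl may differ, for example in how it treats malformed entries (here they are silently skipped).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of findProxy.pl): accept "[80,8080,3128]"
# or a bare "80" and return the valid port numbers.
sub parse_port_list {
    my ($arg) = @_;
    (my $list = $arg) =~ s/^\s*\[|\]\s*$//g;   # strip the optional square brackets
    $list =~ s/^\s+|\s+$//g;                   # trim leftover whitespace
    return grep { /^\d+$/ && $_ >= 1 && $_ <= 65535 }
           split /\s*,\s*/, $list;
}

print join(',', parse_port_list('[80,8080,3128]')), "\n";
print join(',', parse_port_list('3128')), "\n";
```

Both forms give the same kind of result: a plain list of ports that could then be used to drop matching proxies before testing.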
______________________
perl findProxy.pl -r none http://www.angelfire.com/my/6waynes/checkedPublicProxies.html
will extract the proxies found in the page specified and list them in
standard format (saving to a file by default). No tests will be done.
Let me know about any bugs. Feel free to hack the code around (and send
me any useful patches).
Have fun
wayne@nym.alias.net