XML tags used for configuration
This document is still under construction
Overview
The XML files below allow complete flexibility in configuring localProxy.
By appropriate selection of the tags used in the user configuration file,
any of the default tags specified in the default configuration files may
be overloaded. Every tag mentioned below may be changed by this means. New
tags may be added and will take effect (assuming the code recognizes them)
at the next build of the database. localProxy performs this build whenever
the back end is started with a configuration other than the last saved one. The
build is tailored to the services a particular user requires, to the hosts and
services available, and to the firewall rules and http proxy censorship the user
may be confronting.
The load process consists of:
- load all the configuration files into temporary perl hash 'trees'
- for each services item in the user configuration, create a service instance in
the running configuration 'tree' and overload the defaults in each service
instance with the user specified requirements. This does not create one
'services' subtree; it creates a subtree for each service instance. This
may change.
- create the localProxy internal service instances (proxy autoconfig,
ad-zapping proxy, control service, PROPFIND http proxy)
- merge the global data into the root of the running configuration
tree
- overload the global data with the user specified global data
- copy each enabled commStrat item to all service instances, overload
each with user commStrat specifications
- overload the default firewall objects with user specified data
- for each service and commStrat, select the layer 0 and layer 1 hosts
from the hosts database, taking into account the host capabilities needed
by each service, commStrat, firewall. Firewall rules (e.g. blocked ports),
name servers and subnet mask information are used at this stage
The running configuration is finally viewable by clicking the 'show running
configuration' button in the GUI.
User configurations (config-*.xml)
The file (in the general case) is divided into items, each one corresponding
to a default configuration below (and labelled accordingly).
The default setup is to have a 'non-censored, http proxy' and a 'Usenet
news' service specified in this file. Each service specified causes one instance
of the corresponding service class to be built and started. For example,
it is common to have two news services started here. Each one has (at least)
a different local port and a different target service (news server:port).
Extra services may be added at run time using the 'Add plugin service' facility.
The global data is merged to the root level of the running configuration
(there is no globals subtree).
All other tags in this file merely overload the corresponding tag in the
default database. For these, it is necessary to find the tree path to the
tag you wish to overload from the root down to the tag and then duplicate
it in XML. Not as difficult as it sounds - the running configuration tree
display ('show running configuration') displays the tree and the many configuration
examples show the rest.
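The overloading described above can be sketched as a recursive merge of the user tree over the default tree. This is only an illustration in Python, not the localProxy code (which builds perl hash trees); the function name overload, the tree layout and the values shown are assumptions made for the example.

```python
import copy

def overload(defaults, user):
    """Return a new tree in which every user-specified tag overrides the default."""
    merged = copy.deepcopy(defaults)
    for tag, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(tag), dict):
            # Both sides are subtrees: descend and overload tag by tag.
            merged[tag] = overload(merged[tag], value)
        else:
            # Leaf value, or a tag that does not exist in the defaults: take the user's.
            merged[tag] = copy.deepcopy(value)
    return merged

# Hypothetical trees; the nesting and numbers are illustrative only.
defaults = {
    "maxNrHostsPerLayer": 10,
    "services": {
        "news": {"timeOutOnNoResponse": 60,
                 "mayForceCloseAfterTimeOutOnNoResponse": 1},
    },
}
user = {"services": {"news": {"timeOutOnNoResponse": 20}}}

running = overload(defaults, user)
```

Only the tag named by the user's tree path changes; sibling tags keep their default values and the default tree itself is left untouched.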
Globals
maxNrHostsPerLayer - an overall limit on the number of hosts
that LP will use in any layer for any commStrat for any service.
selectHostsByScore - all hosts are always selected for the respective
service.commStrat.layer by a default scoring algorithm. This tag allows
the user to specify the weights to be used in that scoring algorithm in
the subkeys. The subkeys may be:
reliabilityWeight - weight to be given to
the overall reliability/availability of the host (the probability that it
is up (uptimeRatio) multiplied by the probability that it responds correctly
to a request when up)
speedWeight - weight to be given to the known
speed of this host in the scoring algorithm
thresholdScore - if defined, LP will not
use any hosts which score lower than this value in the running configuration.
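As a rough illustration of how these subkeys could combine, here is a Python sketch of a weighted score with a threshold and the maxNrHostsPerLayer cap. The actual scoring formula used by LP is not given in this document, so the linear combination below, the 'speed' host field (a normalized figure, hypothetical) and the function names are all assumptions.

```python
def score_host(host, weights):
    """Weighted score: availability (uptimeRatio * reliability) plus speed."""
    availability = host.get("uptimeRatio", 0.0) * host.get("reliability", 0.0)
    return (weights.get("reliabilityWeight", 1.0) * availability
            + weights.get("speedWeight", 1.0) * host.get("speed", 0.0))

def select_hosts(hosts, weights, max_hosts):
    """Return the best host names, honouring thresholdScore and the layer limit."""
    threshold = weights.get("thresholdScore")
    scored = [(score_host(h, weights), name) for name, h in hosts.items()]
    if threshold is not None:
        # Hosts scoring below the threshold are never used.
        scored = [(s, n) for s, n in scored if s >= threshold]
    # Highest score first; the name breaks ties so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:max_hosts]]

# Hypothetical host data for the example.
hosts = {
    "a": {"uptimeRatio": 0.9, "reliability": 0.8, "speed": 0.5},
    "b": {"uptimeRatio": 0.5, "reliability": 0.5, "speed": 1.0},
    "c": {"uptimeRatio": 0.2, "reliability": 0.5, "speed": 0.1},
}
weights = {"reliabilityWeight": 2.0, "speedWeight": 1.0, "thresholdScore": 1.0}
```

With these numbers, select_hosts(hosts, weights, 2) keeps hosts a and b and drops c, which falls below the threshold.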
useFirewall - the default firewall used is the one with the
same name as the configuration selected by the user. This global tag allows
a different firewall to be used. This is necessary (for example) when a user
copies a default configuration to one with his own name for customization
(a recommended procedure).
additionalBlockedPorts - a global which is added to the list
of blocked TCP ports specified in the firewalls data structure. It's a global
because it's easier to specify that in a config file, rather than position
this in the firewalls data structure. It was originally introduced to allow
a quick config to disallow IRC (port 6667) direct connections from LP to
ensure anonymity (the IP address seen by the IRC servers would never be that
of the LP host).
lpControlCredentials - username:password which authenticates a user
connecting via the LP web interface.
authentication - this key may be initialized with all the authentication
credentials the user needs for proxies and firewalls. The subkeys used by
localProxy are:
realm_<realmName>
service_<service address:port>
In both cases, the value is a string containing the
credentials required to respond to a challenge. The format of the string
is "realmName, serviceAddr:Port, username, userPassword". The special realm
'firewall' may be specified by the user to ensure that authentication is
sent whenever a non-proxy type of response is received from the firewall
(LP uses '501' error responses, for example). The realm '501' may also be
specified. The recommended way to handle such situations is to use the GUI
to set the credentials for the exact realm and service given in such challenges.
requested - the challenges received, and waiting
for the user to supply username and password. These are not specified by
the user.
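The credential string format given above ("realmName, serviceAddr:Port, username, userPassword") can be parsed along the following lines. This is a sketch under assumptions, not the LP parser: the names Credential, parse_credential and lookup are hypothetical, the lookup order (realm key first, then service key) is a guess, and whitespace around each field is assumed to be insignificant.

```python
from typing import NamedTuple

class Credential(NamedTuple):
    realm: str
    host: str
    port: int
    username: str
    password: str

def parse_credential(value):
    """Split 'realmName, serviceAddr:Port, username, userPassword' into fields."""
    # maxsplit=3 keeps any commas inside the password intact.
    parts = [p.strip() for p in value.split(",", 3)]
    if len(parts) != 4:
        raise ValueError("expected 4 comma separated fields: %r" % value)
    realm, service, username, password = parts
    host, sep, port = service.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError("service must be address:port: %r" % service)
    return Credential(realm, host, int(port), username, password)

def lookup(auth, realm, service):
    """Find credentials under realm_<realmName>, else service_<address:port>."""
    value = auth.get("realm_" + realm) or auth.get("service_" + service)
    return parse_credential(value) if value else None

# Hypothetical entry in the authentication key.
auth = {"realm_firewall": "firewall, proxy.example.net:8080, alice, s3cret,with,commas"}
cred = lookup(auth, "firewall", "proxy.example.net:8080")
```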
CommStrats
isEnabled - this commStrat is enabled (will be considered and
used by LP when building the running configuration)
commStratFriendlyName - just a friendly name to refer to this
commStrat; used for display purposes only
forceServiceToLineOrientedMode - some services need to be handled
on a line by line basis only. When this is set, LP must wait on each socket
for a complete line of input before sending the data through to the outgoing
socket. If the incoming socket does not close, and the connecting client never
sends the expected line-end, LP will hang for a noticeable time, so non-line
oriented mode is always preferred.
targetServiceLayer - may be 0 or 1. The target service is the
remote service to be abstracted locally. This service may be a simple one
(ex. news) or a distributable one (ex. http proxy). Whether it is simple
or distributed, the corresponding host:port list appears in the targetServiceLayer
for this commStrat.
useOnlyWithServiceClass - this commStrat will only be considered
or used by LP when building the running configuration for this service class.
Ex. commStrat 2 probably makes no sense for use with a news service (and
the code would never find the right pattern to invoke its use anyway).
methods - commStrat 2 only. Each method is a different URL encoding
algorithm used by LP to get the URL through censoring http proxies. The
following values (separated by commas) are possible:
1 - escape certain components of the URL
2 - insert /./
3 - premature URL ending w/ randomness (unused)
4 - long random URL (unused)
5 - insert fake parameter
6 - script control (unused)
7 - change case in URL. Look for 'naughty' words in URL path,
particularly.
8 - Windows \ delimiter
9
10
11
12
13
14
15
a - change host fqdn to dotted quad IP address
b - add /test/.. to path
c - change method spelling
d - lowercase a method character
e - change host fqdn to 32 bit IP address
f - // at path start (often works, but fails too often - unused)
g - add preamble to enable use of things like a CGI proxy, or an akamai rotator.
These proxies do not need to rewrite links (LP does it for you). This may
become a new commStrat (3, probably) in future.
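To give a feel for what such URL encodings look like, here is a Python sketch of three of the methods listed above (2, b and e). The exact transformations LP applies are not described here, so the position at which '/./' and '/test/..' are inserted and all function names are assumptions. For method e the IP address is passed in explicitly to keep the example free of DNS lookups.

```python
import ipaddress
from urllib.parse import urlsplit, urlunsplit

def insert_dot_segment(url):
    """Method 2 (assumed form): put '/.' in front of the path."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit(parts._replace(path="/." + path))

def add_test_dotdot(url):
    """Method b (assumed form): put '/test/..' in front of the path."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit(parts._replace(path="/test/.." + path))

def host_to_32bit(url, ip):
    """Method e: replace the host with its IPv4 address written as one integer."""
    parts = urlsplit(url)
    number = int(ipaddress.IPv4Address(ip))
    # Any user:password@ prefix in the URL is dropped in this sketch.
    netloc = str(number) if parts.port is None else "%d:%d" % (number, parts.port)
    return urlunsplit(parts._replace(netloc=netloc))

# Map of method codes (as used in the methods tag) to the sketches above.
METHODS = {"2": insert_dot_segment, "b": add_test_dotdot}

def apply_methods(url, methods):
    """Apply a comma separated list of method codes, skipping unknown ones."""
    for code in (m.strip() for m in methods.split(",")):
        if code in METHODS:
            url = METHODS[code](url)
    return url
```

For example, insert_dot_segment("http://example.com/a/b.html") gives "http://example.com/./a/b.html", a URL that a server normally resolves to the same page but that a simple string match in a censoring proxy may miss.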
Hosts
isEnabled - the host is enabled for use by localProxy. Ignored
completely if this value is 0.
reliability - the fraction of the time it successfully fulfils
a request, once we connect
uptimeRatio - the fraction of time we can connect to it.
CONNECTCapable - I'm uncertain whether to keep this one.
The useful flags (below) are straight from statProxy.pl and indicate the
capability of this host to CONNECT to specific ports on remote machines.
At the moment, a kludge in mergeHosts.pl and IIRC in the LP code allows CONNECTCapable3128
to be seen as meaning that the host is capable of CONNECTing to any port.
As a result, some hosts are selected for service.commStrats where they will
not work (port 119, for example), and localProxy must adapt.
CONNECTCapablennn - this agent is able,
upon request, to make us a CONNECT tunnel to some remoteHost:nnn
GETCapable - the basic http proxy test. This proxy is able
to return a requested reference web page.
PROPFINDCapable - Web based email clients (Microsoft, at least)
use web DAV and need proxies which can handle this. PROPFIND is a convenient
capability to test for.
eDonkeyCapable - allow hosts which offer eDonkey services to be selected
by LP for the eDonkey services. Experimental service.
doesNotPassIPAddressThrough - the canonical test for an anonymous
proxy. StatProxy actually makes the proxy connect back to the statProxy host
to see what headers it is sending.
modifiedURLVulnerablex - this proxy does not censor
the URL when URL modification type x is applied. See the methods tag in commStrats
for a list of the values and meanings of x
nonCensoring - this proxy does not censor requested pages
authorizationRequired - this proxy requires authorization.
If so, the value of this tag is username:password. It will be taken exactly
as entered here, base64 (MIME) encoded and transmitted to the proxy for basic authentication.
referencePageTime - the time taken to get a reference web
page. At the moment this is www.sex.com for non-censoring proxies and www.panix.com
for censoring proxies. I should make them all use the same page, of course.
onlyAllowsTcpAccessFrom - the proxy only allows access from
the CIDR notation subnets (or certain temporary strings) specified in the
value. This is commonly known as an ACL (Access Control List). At the moment,
I accept string 'all' to mean all, and anything else (non CIDR) to mean nowhere.
You will see 'notUAE' and 'notPanix' in the values for this tag - I'm using
it to record the tests I have done. The effect of such strings is that the ACL
is assumed by LP to be for everywhere.
insideFirewall - used to shortcut the check to see if this host
is accessible from the host running localProxy. If the port needed to
access the host is determined to be blocked, this value is tested to see if
the host is, in fact, inside the firewall named by the user configuration
(and therefore accessible anyway). IP proxies are usually in this category.
See subnetsInside and otherAccessibleSubnets in the
firewalls config for descriptions of final, but more expensive accessibility
tests.
fqdn - the 'fully qualified domain name' is the alphabetical
address given to a host in a network. Example: proxy1.emirates.net.ae.
This tag is here for human readability only.
userPass - localProxy will use this to respond to authentication challenges
from this host. Format is "username:password".
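For the "username:password" values used by userPass and authorizationRequired, basic authentication amounts to base64 encoding the string and sending it in a request header. The Python sketch below shows the encoding only; the function name is hypothetical, and the use of the Proxy-Authorization header (the standard one for proxies) is an assumption about what LP sends.

```python
import base64

def basic_auth_header(user_pass):
    """Build a basic authentication header line from 'username:password'."""
    # The string is taken exactly as entered, then base64 encoded.
    token = base64.b64encode(user_pass.encode("utf-8")).decode("ascii")
    return "Proxy-Authorization: Basic " + token

header = basic_auth_header("user:pass")
```

Note that base64 is an encoding, not encryption: anyone who can read the header can recover the username and password.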
Services
class - the class of service. Each service class may be instantiated
multiple times at the user's request (in the user configuration file). Examples of service
classes are:
httpProxy - a non-censoring http proxy service
news - a Usenet news service
mayForceCloseAfterTimeOutOnNoResponse - LP may forcibly close
a socket used for this service if a response is not received within a specifiable
time after an outgoing request. These are the sockets that keep your web
browser busy after a page has apparently loaded. The browser or client
will often retry the connection and succeed after the hung socket is closed.
timeOutOnNoResponse - the corresponding timeout value (seconds)
requiredLayernHostSpecs - LP selects hosts to add to
layer n for each service and commStrat based upon the specifications given
here. This key provides a subkey for each commStrat, giving the required
host capabilities:
n - the commStrat number for which
this subkey applies. The value of this subkey may be a comma separated list
of the proxy property keys (to be compared against the host capabilities in
hosts.xml). A host will be rejected if the property specified here is not
both defined and non-zero for the host. Note also that a property written
here by preceding a normal property tag with a '-' is taken as a negative
specification, meaning a host is rejected if the property is not both defined
and zero. In both specifications, hosts with undefined tags are currently
rejected - there are some exceptions hard-coded which I am working to remove,
but I wish to retain the ability for users to provide a proxy with no test
results known.
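The matching rule described above can be sketched in Python as follows. It follows the text literally (undefined tags always reject the host) and ignores the hard-coded exceptions mentioned; the function names are hypothetical, and treating "0" and the empty string as zero is an assumption about how the XML values are read.

```python
def is_nonzero(value):
    """Treat 0, '0' and the empty string as zero; anything else as non-zero."""
    return str(value).strip() not in ("0", "")

def host_matches(host, spec):
    """Check a host against a comma separated list of required properties."""
    for term in (t.strip() for t in spec.split(",")):
        if not term:
            continue
        negative = term.startswith("-")
        key = term[1:] if negative else term
        if key not in host:
            # Hosts with undefined tags are rejected for both kinds of term.
            return False
        if negative == is_nonzero(host[key]):
            # A positive term needs a non-zero value, a negative term needs zero.
            return False
    return True

# Hypothetical host record using tags from the Hosts section.
host = {"GETCapable": 1, "nonCensoring": "1", "authorizationRequired": 0}
```

Here host_matches(host, "GETCapable,nonCensoring,-authorizationRequired") is true, while a specification naming CONNECTCapable119 rejects the host because that tag is undefined for it.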
Firewalls
otherAccessibleSubnets - some users are in a subnet of
ISP A (inside the firewall of ISP A) and yet are able to access a subnet
of ISP B. These other accessible subnets are listed in this tag.
nameServer - a comma separated list of nameservers usable
by people inside this firewall.
firewallSpecFriendlyName - a friendly name for display purposes
only
url - the web site main page for this ISP
outgoing - this key contains subkeys related to the blocking
of outgoing connections
blockedIPAddresses - known IP address blocks
(ex. Net2Phone servers in the UAE)
blockedTCPPorts - the list of known blocked
TCP ports
openTCPPorts - the list of known open TCP
ports (unused by LP)
blockedUDPPorts - the list of known blocked
UDP ports
censoredProtocols - known censored protocols
(ex. transparent proxy blocking of http)
subnetsInside - the subnets (in CIDR notation) which are inside
this firewall. These may be private addresses, as well. LP uses these to
determine whether a given host is accessible even if the TCP port
required is blocked. This is necessary in order to include the local ISP's proxies
in LP builds (which you may want to do because the local ISP proxies may
be vulnerable to commStrat 1 (CONNECT), or commStrat 2 attacks).
holes - CIDR notation often implies a long continuous range
of IP addresses; in practice there are often holes in the range. For completeness,
these are recorded here, but LP does not take any notice of this yet. Normally
not important.
onlyAllowsTcpAccessFrom - the entire firewall only allows access
from the CIDR notation subnets (or certain temporary strings) specified in
the value. This is referred to (incorrectly) in LP as an ACL (Access Control
List). At the moment, I accept string 'all' to mean all, and anything else
(non CIDR) to mean nowhere. You will see 'notUAE' and 'notPanix' in the values
for this tag - I'm using it to record the tests I have done. The effect of such
strings is that the ACL is assumed by LP to apply everywhere outside.
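The accessibility tests that use these firewall tags can be sketched in Python as below: an unblocked port is accepted at once, then the cheap insideFirewall shortcut from the Hosts section is tried, and finally the more expensive subnet comparison. The order of the tests, the shape of the data (lists of ports and of CIDR strings, insideFirewall holding a firewall name) and the function names are assumptions, not the LP code, and the onlyAllowsTcpAccessFrom ACL is left out.

```python
import ipaddress

def in_any_subnet(ip, subnets):
    """True if the address falls inside any of the CIDR notation subnets."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(s.strip(), strict=False)
               for s in subnets)

def is_accessible(host, firewall, firewall_name):
    """Decide whether a host:port can be reached from behind this firewall."""
    blocked = firewall.get("outgoing", {}).get("blockedTCPPorts", [])
    if host["port"] not in blocked:
        return True
    # Cheap shortcut: the host is already recorded as inside this firewall.
    if host.get("insideFirewall") == firewall_name:
        return True
    # Final, more expensive test against the known subnets.
    subnets = (firewall.get("subnetsInside", [])
               + firewall.get("otherAccessibleSubnets", []))
    return in_any_subnet(host["ip"], subnets)

# Hypothetical firewall description; addresses are documentation ranges.
firewall = {
    "outgoing": {"blockedTCPPorts": [8080]},
    "subnetsInside": ["10.0.0.0/8"],
    "otherAccessibleSubnets": ["192.0.2.0/24"],
}
```

A host at 10.1.2.3 port 8080 is accepted because it lies in subnetsInside, whereas 203.0.113.5 port 8080 is rejected unless its insideFirewall value names this firewall.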