XML tags used for configuration

This document is still under construction

Overview

The XML files below allow complete flexibility in configuring localProxy. By appropriate selection of the tags used in the user configuration file, any of the default tags specified in the default configuration files may be overloaded. Every tag mentioned below may be changed by this means. New tags may be added and will take effect (assuming the code recognizes them) at the next build of the database. This build is done by localProxy whenever the back end is started (with a configuration other than last, saved). It is custom made for a particular user's requirements as well as for the hosts/services available, the firewall rules and http proxy censorship the user may be confronting, and the services the user requires.

The load process consists of:
The running configuration is finally viewable by clicking the 'show running configuration' button in the GUI.

User configurations (config-*.xml)

The file (in the general case) is divided into items, each one corresponding to a default configuration below (and labelled accordingly).

The default setup is to have a 'non-censored, http proxy' and a 'Usenet news' service specified in this file. Each service specified causes one instance of the corresponding service class to be built and started. For example, it is common to have two news services started here. Each one has (at least) a different local port and a different target service (news server:port). Extra services may be added at run time using the 'Add plugin service' facility.

The global data is merged to the root level of the running configuration (there is no globals subtree).

All other tags in this file merely overload the corresponding tag in the default database. For these, it is necessary to find the tree path to the tag you wish to overload from the root down to the tag and then duplicate it in XML. Not as difficult as it sounds - the running configuration tree display ('show running configuration') displays the tree and the many configuration examples show the rest.
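
The overloading rule above can be sketched as a recursive merge of two trees. This is only an illustration, not the localProxy code: the trees are modelled as nested Python dicts, and the path services/news/timeOutOnNoResponse is a hypothetical example (the real path must be read from the running configuration display).

```python
from copy import deepcopy

def overload(defaults, user):
    """Return a new tree where each tag in `user` replaces the tag at the same path in `defaults`."""
    merged = deepcopy(defaults)
    for tag, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(tag), dict):
            # Both sides are subtrees: descend and merge tag by tag.
            merged[tag] = overload(merged[tag], value)
        else:
            # Leaf value, or a tag the defaults do not have: the user value wins.
            merged[tag] = deepcopy(value)
    return merged

# Hypothetical default tree and user overload (paths are assumptions).
defaults = {"services": {"news": {"class": "news", "timeOutOnNoResponse": "60"}}}
user = {"services": {"news": {"timeOutOnNoResponse": "120"}}}

running = overload(defaults, user)
print(running)
```

Only the tag named by the user changes; sibling tags under the same path keep their default values, which is why the full path from the root has to be duplicated in the user file.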

Globals

maxNrHostsPerLayer - an overall limit on the number of hosts that LP will use in any layer for any commStrat for any service.
selectHostsByScore - all hosts are always selected for the respective service.commStrat.layer by a default scoring algorithm. This tag allows the user to specify the weights to be used in that scoring algorithm in the subkeys. The subkeys may be:
    reliabilityWeight - weight to be given to the overall reliability/availability of the host (the probability that it is up (uptimeRatio) multiplied by the probability that it responds correctly to a request when up)
    speedWeight - weight to be given to the known speed of this host in the scoring algorithm
    thresholdScore - if defined, LP will not use any hosts which score lower than this value in the running configuration.
useFirewall - the default firewall used is the one with the same name as the configuration selected by the user. This global tag allows a different firewall to be used. This is necessary (for example) when a user copies a default configuration to one with his own name for customization (a recommended procedure).
additionalBlockedPorts - a global which is added to the list of blocked TCP ports specified in the firewalls data structure. It's a global because it's easier to specify that in a config file, rather than position this in the firewalls data structure. It was originally introduced to allow a quick config to disallow IRC (port 6667) direct connections from LP to ensure anonymity (the IP address seen by the IRC servers would never be that of the LP host).
lpControlCredentials - username:password which authenticates a user connecting via the LP web interface.
authentication - this key may be initialized with all the authentication credentials the user needs for proxies and firewalls. The subkeys used by localProxy are:
    realm_<realmName>
    service_<service address:port>
    In both cases, the value is a string containing the credentials required to respond to a challenge. The format of the string is "realmName, serviceAddr:Port, username, userPassword". The special realm 'firewall' may be specified by the user to ensure that authentication is sent whenever a non-proxy type of response is received from the firewall (LP uses '501' error responses, for example). The realm '501' may also be specified. The recommended way to handle such situations is to use the GUI to set the credentials for the exact realm and service given in such challenges.
    requested - the challenges received, and waiting for the user to supply username and password. These are not specified by the user.
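
To make the selectHostsByScore subkeys above concrete, here is a minimal sketch of how weights, a threshold and maxNrHostsPerLayer could interact. The actual scoring formula used by LP is not given in this document, so the weighted sum below is an assumption, as is using referencePageTime as the measure of speed. The function names are hypothetical.

```python
def host_score(host, reliability_weight, speed_weight):
    """Assumed formula: weighted sum of availability and a speed term."""
    # Availability as described for reliabilityWeight: P(up) * P(correct response when up).
    availability = host["uptimeRatio"] * host["reliability"]
    # Assumption: faster reference page time gives a speed term closer to 1.
    speed = 1.0 / (1.0 + host["referencePageTime"])
    return reliability_weight * availability + speed_weight * speed

def select_hosts(hosts, reliability_weight, speed_weight,
                 threshold_score=None, max_hosts=None):
    """Rank hosts by score, drop those under the threshold, cap the list length."""
    scored = [(host_score(h, reliability_weight, speed_weight), name)
              for name, h in hosts.items()]
    if threshold_score is not None:
        scored = [(s, n) for s, n in scored if s >= threshold_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    names = [name for _, name in scored]
    return names[:max_hosts]  # a slice up to None keeps every name

# Hypothetical host records.
hosts = {
    "a": {"uptimeRatio": 0.9, "reliability": 0.8, "referencePageTime": 1.0},
    "b": {"uptimeRatio": 0.5, "reliability": 0.5, "referencePageTime": 4.0},
    "c": {"uptimeRatio": 1.0, "reliability": 1.0, "referencePageTime": 0.0},
}
print(select_hosts(hosts, 1.0, 1.0, threshold_score=0.5))
```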

CommStrats

isEnabled - this commStrat is enabled (will be considered and used by LP when building the running configuration)
commStratFriendlyName - just a friendly name to refer to this commStrat; used for display purposes only
forceServiceToLineOrientedMode - some services need to be handled on a line by line basis only. When this is set, LP must wait on each socket for a complete line of input before sending the data through to the outgoing socket. If the incoming socket does not close, and the connecting client never sends the expected line-end, LP will hang for a noticeable time, so non-line oriented mode is always preferred.
targetServiceLayer - may be 0 or 1. The target service is the remote service to be abstracted locally. This service may be a simple one (ex. news) or a distributable one (ex. http proxy). Whether it is simple or distributed, the corresponding host:port list appears in the targetServiceLayer for this commStrat.
useOnlyWithServiceClass - this commStrat will only be considered or used by LP when building the running configuration for this service class. Ex. commStrat 2 probably makes no sense for use with a news service (and the code would never find the right pattern to invoke its use anyway).
methods - commStrat 2 only. Each method is a different URL encoding algorithm used by LP to get the URL through censoring http proxies. The following values (separated by commas) are possible:
1 - escape certain components of the URL
2 - insert /./
3 - premature URL ending w/ randomness (unused)
4 - long random URL (unused)
5 - insert fake parameter
6 - script control (unused)
7 - change case in URL. Look for 'naughty' words in URL path, particularly.
8 - Windows \ delimiter
9
10
11
12
13
14
15
a - change host fqdn to dotted quad IP address
b - add /test/.. to path
c - change method spelling
d - lowercase a method character
e - change host fqdn to 32 bit IP address
f - // at path start (often works, but fails too often - unused)
g - add preamble to enable use of things like a CGI proxy, or an akamai rotator. These proxies do not need to rewrite links (LP does it for you). This may become a new commStrat (3, probably) in future.
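
A few of the URL modification methods listed above can be illustrated as follows. This is a sketch only: the document does not say where in the path LP inserts the extra segments, so placing them at the start of the path is an assumption, and the function names are hypothetical. Method e is shown with an already resolved dotted quad so that no DNS lookup is needed.

```python
import ipaddress
from urllib.parse import urlsplit, urlunsplit

def insert_dot_segment(url):
    """Method 2: insert /./ (assumed here to go at the start of the path)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit(parts._replace(path="/." + path))

def add_test_dotdot(url):
    """Method b: add /test/.. to the path (assumed to go at the start)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit(parts._replace(path="/test/.." + path))

def host_as_32bit_integer(url, dotted_quad):
    """Method e: replace the host name with its IPv4 address written as one 32 bit integer."""
    parts = urlsplit(url)
    host = str(int(ipaddress.IPv4Address(dotted_quad)))
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))

print(insert_dot_segment("http://www.example.com/index.html"))
print(add_test_dotdot("http://www.example.com/index.html"))
print(host_as_32bit_integer("http://www.example.com/index.html", "192.0.2.1"))
```

Each rewritten URL still names the same resource to a normal web server, which is what lets it pass a filter that matches only the literal URL text.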

Hosts

isEnabled - the host is enabled for use by localProxy. Ignored completely if this value is 0.
reliability - the fraction of the time it successfully fulfils a request, once we connect
uptimeRatio - the fraction of time we can connect to it.
CONNECTCapable - I'm uncertain whether to keep this one. The useful flags (below) are straight from statProxy.pl and indicate the capability of this host to CONNECT to specific ports on remote machines. At the moment, a kludge in mergeHosts.pl and IIRC in the LP code allows CONNECTCapable3128 to be seen as meaning that the host is capable of CONNECTing to any port. This causes (for example port 119) some hosts to be selected for service.commStrats where they will not work, and localProxy must adapt.
CONNECTCapablennn - this agent is able, upon request, to make us a CONNECT tunnel to some remoteHost:nnn
GETCapable - the basic http proxy test. This proxy is able to return a requested reference web page.
PROPFINDCapable - Web based email clients (Microsoft, at least) use web DAV and need proxies which can handle this. PROPFIND is a convenient capability to test for.
eDonkeyCapable - allow hosts which offer eDonkey services to be selected by LP for the eDonkey services. Experimental service.
doesNotPassIPAddressThrough - the canonical test for an anonymous proxy. StatProxy actually makes the proxy connect back to the statProxy host to see what headers it is sending.
modifiedURLVulnerablex - this proxy does not censor the URL when URL modification type x is applied. See the methods tag in commStrats for a list of the values and meanings of x
nonCensoring - this proxy does not censor requested pages
authorizationRequired - this proxy requires authorization. If so, the value of this tag is username:password. It will be taken exactly as entered here, MIME encoded and transmitted to the proxy for basic authentication
referencePageTime - the time taken to get a reference web page. At the moment this is www.sex.com for non-censoring proxies and www.panix.com for censoring proxies. I should make them all use the same page, of course.
onlyAllowsTcpAccessFrom - the proxy only allows access from the CIDR notation subnets (or certain temporary strings) specified in the value. This is commonly known as an ACL (Access Control List). At the moment, I accept string 'all' to mean all, and anything else (non CIDR) to mean nowhere. You will see 'notUAE' and 'notPanix' in the values for this tag - I'm using it to record the tests I have done. The effect of such strings is that the ACL is assumed by LP to be for everywhere.
insideFirewall - used to shortcut the check to see if this host is accessible from the host running localProxy. If the port needed to access the host is determined to be blocked, this value is tested to see if the host is, in fact, inside the firewall named by the user configuration (and therefore accessible anyway). IP proxies are usually in this category. See subnetsInside and otherAccessibleSubnets in the firewalls config for descriptions of final, but more expensive accessibility tests.
fqdn - the 'fully qualified domain name' is the alphabetical address given to a host in a network. Example: proxy1.emirates.net.ae. This tag is here for human readability only.
userPass - localProxy will use this to respond to authentication challenges from this host. Format is "username:password".
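
For the authorizationRequired tag above, "MIME encoded" is read here as the Base64 encoding used by HTTP basic authentication. The sketch below shows that encoding; the header name Proxy-Authorization is the standard one for proxies, but whether LP emits exactly this header is an assumption, and the function name is hypothetical.

```python
import base64

def proxy_authorization_header(credentials):
    """Build a basic authentication header from a 'username:password' string."""
    # The string is taken exactly as entered, then Base64 encoded.
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Proxy-Authorization", f"Basic {token}"

name, value = proxy_authorization_header("user:pass")
print(f"{name}: {value}")
```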

Services

class - the class of service. Each service class may be instantiated multiple times at the user request (in his config file). Examples of service classes are:
    httpProxy - a non-censoring http proxy service
    news - a Usenet news service
mayForceCloseAfterTimeOutOnNoResponse - LP may forcibly close a socket used for this service if a response is not received within a specifiable time after an outgoing request. These are the ones you notice which normally keep your web browser trying after a page has apparently loaded. Browser/client will often retry the connection and succeed after the hung socket is closed.
timeOutOnNoResponse - the corresponding timeout value (seconds)
requiredLayernHostSpecs - LP selects hosts to add to layer n for each service and commStrat based upon the specifications given here. This key provides a subkey for each commStrat, giving the required host capabilities:
    n - the commStrat number for which this subkey applies. The value of this subkey may be a comma separated list of the proxy property keys (to be compared against the host capabilities in hosts.xml). A host will be rejected if the property specified here is not both defined and non-zero for the host. Note also that a property written here by preceding a normal property tag with a '-' is taken as a negative specification, meaning a host is rejected if the property is not both defined and zero. In both specifications, hosts with undefined tags are currently rejected - there are some exceptions hard-coded which I am working to remove, but I wish to retain the ability for users to provide a proxy with no test results known.
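
The matching rule described for requiredLayernHostSpecs can be sketched as below. It follows the text literally (positive properties must be defined and non-zero, '-' properties must be defined and zero, undefined tags reject the host) and ignores the hard-coded exceptions mentioned above. The function names and the sample host record are hypothetical.

```python
def _is_zero(value):
    """True if the tag value is numerically zero; non-numeric values count as non-zero."""
    try:
        return float(value) == 0
    except (TypeError, ValueError):
        return False

def host_meets_specs(host, spec):
    """Check one host (dict of tag -> value) against a comma separated spec string."""
    for item in (s.strip() for s in spec.split(",")):
        if not item:
            continue
        negative = item.startswith("-")
        tag = item[1:] if negative else item
        if tag not in host:
            return False            # undefined tags always reject the host
        zero = _is_zero(host[tag])
        if negative and not zero:
            return False            # '-tag' requires defined and zero
        if not negative and zero:
            return False            # 'tag' requires defined and non-zero
    return True

# Hypothetical host capabilities as they might appear in hosts.xml.
host = {"GETCapable": "1", "nonCensoring": "0", "CONNECTCapable119": "1"}
print(host_meets_specs(host, "GETCapable,-nonCensoring"))
```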

Firewalls

otherAccessibleSubnets - some users are in the subnet of ISP A (inside the firewall of ISP A) and yet are able to access a subnet of ISP B. These other accessible subnets are listed in this tag.
nameServer - a comma separated list of nameservers usable by people inside this firewall.
firewallSpecFriendlyName - a friendly name for display purposes only
url - the web site main page for this ISP
outgoing - this key contains subkeys related to the blocking of outgoing connections
    blockedIPAddresses - known IP address blocks (ex. Net2Phone servers in the UAE)
    blockedTCPPorts - the list of known blocked TCP ports
    openTCPPorts - the list of known open TCP ports (unused by LP)
    blockedUDPPorts - the list of known blocked UDP ports
    censoredProtocols - known censored protocols (ex. transparent proxy blocking of http)
subnetsInside - the subnets (in CIDR notation) which are inside this firewall. These may be private addresses, as well. LP uses these to determine whether a given host is accessible even if the required TCP port is blocked. This is necessary in order to include the local ISP's proxies in LP builds (which you may want to do because the local ISP proxies may be vulnerable to commStrat 1 (CONNECT) or commStrat 2 attacks).
holes - CIDR notation often implies a long continuous range of IP addresses; in practice there are often holes in the range. For completeness, these are recorded here, but LP does not take any notice of this yet. Normally not important.
onlyAllowsTcpAccessFrom - the entire firewall only allows access from the CIDR notation subnets (or certain temporary strings) specified in the value. This is referred to (incorrectly) in LP as an ACL (Access Control List). At the moment, I accept string 'all' to mean all, and anything else (non CIDR) to mean nowhere. You will see 'notUAE' and 'notPanix' in the values for this tag - I'm using it to record the tests I have done. The effect of such strings is that the ACL is assumed by LP to apply for everywhere outside.
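
The way blockedTCPPorts, subnetsInside and otherAccessibleSubnets combine can be sketched as follows. This is a simplified reading of the descriptions above, not the LP implementation: it ignores blockedIPAddresses, additionalBlockedPorts and the insideFirewall shortcut, and the dict layout and function name are assumptions.

```python
import ipaddress

def is_host_accessible(host_ip, port, firewall):
    """Decide whether host_ip:port is reachable from inside the given firewall."""
    if port not in set(firewall["outgoing"]["blockedTCPPorts"]):
        return True  # the port is not blocked, so no subnet test is needed
    # Port is blocked: the host is still reachable if it sits inside the
    # firewall or in one of the other accessible subnets.
    address = ipaddress.ip_address(host_ip)
    subnets = (firewall.get("subnetsInside", [])
               + firewall.get("otherAccessibleSubnets", []))
    return any(address in ipaddress.ip_network(cidr) for cidr in subnets)

# Hypothetical firewall description using documentation address ranges.
firewall = {
    "outgoing": {"blockedTCPPorts": [119, 6667]},
    "subnetsInside": ["10.0.0.0/8"],
    "otherAccessibleSubnets": ["192.0.2.0/24"],
}
print(is_host_accessible("10.1.2.3", 119, firewall))
```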