User:Tepples/hosts builder

This specification proposes a free software tool to build a hosts file.

Configuration
The tool reads its configuration from a file in a format called Innie, an INI-like format also used by the Action 53 ROM builder. (We reject Python's similar  module for lack of explicit support for duplicate sections and keys.)

The [options] section contains settings that control the entire output. All can be overridden on the command line.


 * Output path
 * INI: ; command line:  ,
 * Write the file to this path. The special file name - (a single hyphen) denotes standard output. Default is.


 * Hosts per line
 * INI: ; command line:  ,
 * Number of hostnames to associate with each IP address in each line of the output file. The default is 1, as some operating systems' hosts file parsers support only 1.


 * Default IP address
 * INI: ; command line:
 * Set the default IP address for blacklists. The default is, but some computer security tools reportedly need.
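The hosts-per-line setting can be illustrated with a minimal sketch. The function name and signature below are hypothetical, not part of the specification; the sketch only shows how one IP address and a list of hostnames might be split across output lines.

```python
def format_hosts_lines(ip, hostnames, hosts_per_line=1):
    """Yield hosts-file lines mapping `ip` to at most `hosts_per_line` names each.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if hosts_per_line < 1:
        raise ValueError("hosts_per_line must be at least 1")
    for start in range(0, len(hostnames), hosts_per_line):
        chunk = hostnames[start:start + hosts_per_line]
        yield ip + " " + " ".join(chunk)


lines = list(format_hosts_lines(
    "0.0.0.0", ["a.example", "b.example", "c.example"], hosts_per_line=2))
```

With the default of 1, every hostname gets its own line, which is the form that the most restrictive hosts file parsers accept.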

The [sources] section lists sources from which to build a blacklist. Each source must begin with source, which gives the source a title. Each must also include a format and a path in the local file system.


 * Update URL
 * A URL from which to update the source. The URL must use HTTPS, HTTP, FTP, or another scheme that Python's  module supports.


 * Update frequency
 * How often to copy the  into the  . Value is a positive whole number followed by ,  ,  ,  ,  , or  . Default is.
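A rough sketch of how an update frequency might be parsed and checked follows. Only the unit "days" is attested (in the example configuration's "7 days"); the other unit names in the table, and both function names, are assumptions made for illustration.

```python
import re
from datetime import timedelta

# Only "day" is attested by the specification's example; the rest are assumed.
_UNITS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def parse_expires(value):
    """Parse a string such as '7 days' into a timedelta (hypothetical helper)."""
    m = re.fullmatch(r"\s*([0-9]+)\s*([a-z]+)\s*", value.lower())
    if not m:
        raise ValueError("expected a whole number and a unit: %r" % value)
    count, unit = int(m.group(1)), m.group(2)
    if unit not in _UNITS and unit.endswith("s"):
        unit = unit[:-1]  # accept plural forms such as "days"
    if count < 1 or unit not in _UNITS:
        raise ValueError("unsupported update frequency: %r" % value)
    return count * _UNITS[unit]


def is_stale(last_updated, now, max_age):
    """True if the local copy is at least `max_age` old (times in epoch seconds)."""
    return now - last_updated >= max_age.total_seconds()
```

A caller could pass os.path.getmtime(path) as last_updated and time.time() as now, then download the update URL only when is_stale returns True.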

Example config file:

    # Example configuration file

    [options]
    output = hosts
    hosts-per-line = 5
    map-to=0.0.0.0
    # command line switches can override these options

    [sources]
    source=Local Test Server
    path =test_servers.txt
    format=hosts
    map-to=127.0.0.1

    source=Staging Test Servers
    path =test_servers.txt
    format=hosts
    action=none

    source=Popular Sites
    path =popular_sites.txt
    format=hostnames
    action=resolve

    source=Known Trackers
    path =trackers.txt
    format=hostnames

    source=MVPS
    path =mvps_hosts.txt
    url  =http://winhelp2002.mvps.org/hosts.txt
    expires=7 days
    format=hosts
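To show why duplicate keys matter, here is a minimal sketch of reading a configuration like the example above. It is not the actual Innie parser; the function names are hypothetical, and the syntax rules (one key=value per line, # comments, bracketed section names) are assumptions inferred from the example.

```python
def parse_innie_like(text):
    """Return [(section_name, [(key, value), ...])], keeping duplicates in order."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip(), []))
            continue
        key, sep, value = line.partition("=")  # split at the first "=" only
        if not sep or not sections:
            raise ValueError("expected key=value inside a section: %r" % raw)
        sections[-1][1].append((key.strip(), value.strip()))
    return sections


def group_sources(pairs):
    """Start a new source record at each 'source' key (assumed grouping rule)."""
    sources = []
    for key, value in pairs:
        if key == "source":
            sources.append({"title": value})
        elif not sources:
            raise ValueError("%r appears before the first source" % key)
        else:
            sources[-1][key] = value
    return sources


EXAMPLE = """\
[options]
output = hosts
hosts-per-line = 5

[sources]
source=Local Test Server
path =test_servers.txt
format=hosts
map-to=127.0.0.1

source=MVPS
path =mvps_hosts.txt
url  =http://winhelp2002.mvps.org/hosts.txt
expires=7 days
format=hosts
"""

sections = parse_innie_like(EXAMPLE)
sources = group_sources(sections[1][1])
```

Keeping each section as an ordered list of pairs, rather than a dictionary, is what lets the repeated source, path, and format keys survive parsing.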

File formats
The tool receives blacklist and whitelist sources in two formats:


 * Hosts file ( or  )
 * This associates hostnames with IP addresses. If the first word (sequence of nonblank characters) in a line of the file forms a valid IPv4 or IPv6 address, all following words on the same line are treated as hostnames.


 * Hostname file ( or  )
 * This is a simpler format. All words that are DNS names with at least two parts are considered hostnames.

In both hosts files and hostname files, a line whose first nonblank character is the pound sign (#) is a comment and thus ignored.
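The two formats could be read roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the regular expression is only one plausible reading of "DNS names with at least two parts" (letters, digits, and hyphens in each label, with an optional trailing dot).

```python
import ipaddress
import re

_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
# At least two dot-separated labels (assumed definition of a hostname)
_HOSTNAME = re.compile(r"%s(?:\.%s)+\.?" % (_LABEL, _LABEL))


def content_lines(text):
    """Yield stripped lines, skipping blanks and lines that start with '#'."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            yield stripped


def parse_hosts(text):
    """Yield (ip, hostname) pairs from hosts-format text."""
    for line in content_lines(text):
        words = line.split()
        try:
            ip = ipaddress.ip_address(words[0])
        except ValueError:
            continue  # first word is not an IPv4 or IPv6 address
        for name in words[1:]:
            yield str(ip), name


def parse_hostnames(text):
    """Yield every word that looks like a DNS name with two or more parts."""
    for line in content_lines(text):
        for word in line.split():
            if _HOSTNAME.fullmatch(word):
                yield word.rstrip(".").lower()
```

The sketch follows the text literally: in a hosts file every word after the address is taken as a hostname, so trailing comments on the same line are not stripped.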

Actions
The tool can do any of several things with the data loaded from a file:


 * Hosts
 * For hosts files, include these hostnames literally in the output. For hostname files, treat as blacklist. This is the default.


 * Blacklist (, , or  )
 * Map all hostnames in this source to the same host, such as . This is the default for hostname files.


 * Resolve ( or  )
 * Look up each hostname using a recursive resolver, such as Python's . This is good for whitelisting your most commonly visited websites.


 * None
 * Ignore this source.
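The four actions can be sketched as a single dispatch function. Everything here is an assumption layered on the descriptions above: the function names, the lowercase action strings, and the use of socket.getaddrinfo as the resolver are illustrative choices, not part of the specification. Entries are (ip, hostname) pairs, with ip set to None for entries read from a hostname file.

```python
import socket


def resolve_with_socket(hostname):
    """Return one address for `hostname` from the system resolver.

    socket.getaddrinfo is an assumed choice of resolver.
    """
    infos = socket.getaddrinfo(hostname, None)
    return infos[0][4][0]  # address part of the first result's sockaddr


def apply_action(action, entries, map_to, resolve=resolve_with_socket):
    """Return the (ip, hostname) pairs that `action` contributes to the output."""
    if action == "none":
        return []  # ignore this source
    if action == "hosts":
        # Keep literal addresses; hostname-file entries fall back to blacklisting.
        return [(ip if ip is not None else map_to, name) for ip, name in entries]
    if action == "blacklist":
        return [(map_to, name) for _ip, name in entries]
    if action == "resolve":
        return [(resolve(name), name) for _ip, name in entries]
    raise ValueError("unknown action: %r" % action)
```

Passing the resolver as a parameter keeps the function testable without network access; a caller can substitute a stub or a caching resolver.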