User:Eighty5cacao/misc/HTTPS Everywhere/feature


 * This page is unmaintained and probably mentions some features that are undesirable under our current philosophy and/or impossible to implement under WebExtensions.

This page is not intended for the developers to read directly. It is a scratchpad for me to collect my ideas if/when I file bug reports for these features.

Allow user rulesets to be flagged as "superseding" a built-in ruleset
...so as to reduce noise in an automated problem-reporting system for users who are testing changes to built-in rulesets (and would currently need to disable the built-in ruleset in question).

Perhaps a first draft of this could be, "If a user ruleset has a name conflict with a built-in ruleset, (optionally) prefer the user ruleset and log the situation as a warning rather than an error" (user-visible UI warnings MAY be provided, and any part of the behavior MAY be conditional upon a configuration setting and/or an attribute of the user ruleset's  element). (NB: this GitHub ticket by a regular developer seems to be describing this)
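The first-draft behavior above could be sketched roughly as follows. This is only an illustration in Python, not the extension's actual loader: the function name `merge_rulesets`, the `prefer_user` setting, and the representation of rulesets as plain dictionaries keyed by name are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("ruleset_loader")  # hypothetical logger name


def merge_rulesets(builtin, user, prefer_user=True):
    """Merge two {name: ruleset} mappings, resolving name conflicts.

    With prefer_user=True (the hypothetical configuration setting), a user
    ruleset that shares a name with a built-in ruleset replaces it and the
    situation is logged as a warning. Otherwise the conflict is logged as
    an error and the built-in ruleset is kept.
    """
    merged = dict(builtin)
    for name, ruleset in user.items():
        if name not in merged:
            merged[name] = ruleset
        elif prefer_user:
            logger.warning("User ruleset %r supersedes the built-in ruleset", name)
            merged[name] = ruleset
        else:
            logger.error("User ruleset %r conflicts with a built-in ruleset", name)
    return merged
```

For example, `merge_rulesets({"Example0": builtin_rs}, {"Example0": user_rs})` would return a mapping in which "Example0" refers to the user ruleset.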

A later implementation could cover the case where the user ruleset has a different name from a built-in ruleset, whose name is specified through an attribute like.

User exclusion sets
Suppose there were a built-in ruleset named "Example0" that covered the domain, and   on that domain suddenly broke (such as by redirecting to http). We should let users write something like:

TODO: Consider whether the UI design demands a  attribute (should the user exclusion set be given its own line in the ruleset list, or should it silently apply without a checkbox choice?). Also: whether s are needed, what syntax should be defined for "exclude from  ," and whether we should provide a mechanism to add coverage to an existing ruleset rather than merely exclusions (Tor Trac ticket 10033 and GitHub ticket 296 mention a work-in-progress Chrome implementation of the last).

Provide a means by which a rule in one ruleset can override a downgrade rule in another ruleset
The proposed attribute name is.

behaves like  except that the code should check whether there is an enabled rule that would rewrite the opposite way (as there would be if both rulesets were enabled in the example below) and ignore the   if so.

An example for a real site, specifically an xkcd comic that has unsecurable mixed scripts and fails to display any image when those scripts are blocked:

As a further example, a similar situation has already been found in the search feature on Stack Exchange sites (the in Stack-Exchange.xml ought to be a ). (Does the fighting between the current Stack-Exchange.xml and Stack-Exchange-mixedcontent.xml cause redirect loops if the latter is manually enabled? This needs testing.)

This feature should be used only for pages that have true (unsecurable) mixed content that breaks major functionality (including, but not limited to, layout breakage severe enough to make the site unusable by an experienced, normally-abled user).
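The check described in this section (ignore a downgrade when an enabled rule would rewrite the opposite way) might look something like the sketch below. It is written in Python purely for illustration. The `Rule` class, the `soft_downgrade` flag, and the `example.com` URLs are hypothetical stand-ins, not the proposed attribute name or any real ruleset.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str                  # regex matched against the full URL
    replacement: str              # rewrite target
    soft_downgrade: bool = False  # hypothetical flag for the proposed behavior


def apply_rule(rule, url):
    """Apply a single rule's rewrite to the URL."""
    return re.sub(rule.pattern, rule.replacement, url, count=1)


def rewrite(url, enabled_rules):
    """Return the rewritten URL, skipping soft downgrades that another
    enabled rule would immediately undo."""
    for rule in enabled_rules:
        if not re.search(rule.pattern, url):
            continue
        target = apply_rule(rule, url)
        if rule.soft_downgrade:
            # Would some other enabled rule rewrite the target straight back?
            opposed = any(
                other is not rule
                and re.search(other.pattern, target)
                and apply_rule(other, target) == url
                for other in enabled_rules
            )
            if opposed:
                continue  # ignore the downgrade
        return target
    return url
```

With only the downgrade rule enabled, the URL is downgraded; with an opposing upgrade rule also enabled, the URL is left alone rather than bouncing between the two.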

To consider: Instead of defining, is it better simply to give the existing   attribute such "soft" behavior (i.e., explicitly give normal  s precedence over    s)? IIRC, currently, no code in the HTTPS Everywhere browser extension actually reads the  attribute; it is read only by validation scripts as part of the build process.

New values for platform attribute

 * - Currently, rulesets for sites with incomplete certificate chains are simply 'd. This is a potential alternative for browsers that support fetching intermediate certificates in accord with Authority Information Access fields; currently it seems equivalent to   according to this comment.
 * - Really means false mixed POST; Firefox checks for these separately from mixed content, so a fix for bmo:878890 might not automatically address this; true mixed POSTs should(?) generally be handled by splitting coverage of the referring page to a default-off ruleset if major functionality is broken
 * - when enforcement of same-origin policies causes problems with XMLHttpRequest calls (see torbug:7851)
 * (or ?) - exact definition to be decided later; needed in order to distinguish "good" and "bad" MCB implementations
 * - needed if we ever want to allow clearnet domains to be rewritten to hidden services; mailing list discussion exists on whether this is worth doing at all, and newer tickets include GitHub #3798

Pseudoplatforms
These MUST NOT disable any rulesets without explicitly warning the user first. Instead, they SHOULD clarify the wording of the browser's TLS error pages, specifically to explain that the needed TLS feature may be broken by an intercepting proxy or webmaster misconfiguration. An initial implementation MAY treat these as no-ops. That is, in order to enable the corresponding rulesets by default, the browser addon MAY choose to pretend that all supported browsers match these platform values.

The behavior described above deviates from that for the existing  attribute; thus a new attribute needs to be defined, perhaps.
 * - The Let's Encrypt CA is often reported as problematic on Chrome for Windows XP, presumably due to lack of a required intermediate certificate in the Microsoft-supplied certificate database (TODO: or signature algorithm?). This is a subplatform due to the deprecation of Windows XP.
 * - for sites that require SNI in order for a matching certificate to be obtained, such as those that are hosted on WebFaction or that use Cloudflare's free service tier. Compare the  attribute in Chromium's HSTS preload list. (Dubious because: The non-SNI platform most likely to be encountered is Firefox with the Convergence addon [or its fork FreeSpeechMe, when configured to validate non-Namecoin sites?], but neither addon is still being maintained.)
 * - for sites that require TLS 1.3 or higher

Override DNS lookups
Suggested syntax:  (example is only illustrative; IP address no longer accurate)

To be used to work around broken load-balancing arrangements

There should also be positive  and   attributes, to force the use of the specified IP(s) for the specified hostname(s), even if the browser receives a single A record for some other IP.

(of course,  should be available too)

We should probably have different attribute names to specify hosts via either simple matches (like ) or regexes (like  ).

For a  element to be effective for a given host, that host MUST also be listed in the ruleset's  s.

An attempt to blacklist an IP address corresponding to the only available A or AAAA record for a domain SHOULD be treated as a no-op and MUST generate a log message (TODO: at what severity?).
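A minimal sketch of the no-op rule above, in Python for illustration only. The function name `filter_addresses` is hypothetical, the warning severity is a placeholder (the severity is still a TODO), and the sketch assumes the rule generalizes to any blacklist that would remove every resolved address.

```python
import ipaddress
import logging

logger = logging.getLogger("dns_override")  # hypothetical logger name


def filter_addresses(hostname, resolved, blacklisted):
    """Drop blacklisted IPs from the resolved A/AAAA records for hostname.

    If the blacklist would leave no usable address, treat it as a no-op
    and log a message instead of making the host unreachable.
    """
    # Parse addresses so that equivalent spellings compare equal.
    blocked = {ipaddress.ip_address(ip) for ip in blacklisted}
    remaining = [ip for ip in resolved if ipaddress.ip_address(ip) not in blocked]
    if resolved and not remaining:
        # Placeholder severity; the right level is undecided.
        logger.warning(
            "Blacklist for %s would remove every resolved address; ignoring it",
            hostname,
        )
        return list(resolved)
    return remaining
```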

TODO: Decide how multiple IPs or hostnames should be delimited (comma? pipe? ...)

A Firefox implementation might depend on bmo:652295, though that bug isn't quite about overriding the built-in DNS resolver...

Load balancing
That is, allow a rule to specify multiple rewrite destinations among which one will be chosen randomly, to be used in cases where equivalent content is available on multiple hostnames. Some examples for real sites (just to show the syntax, not to demonstrate best practices):

To be reevaluated: The rewriting of any given URL should be deterministic within a browser session and/or a given time interval; that is, the chosen rewrite should be memoized.

If it is considered undesirable to repeatedly consume entropy from the browser's PRNG, perhaps a suitable pseudorandom number might be some HMAC using the originally-requested URL as the message and a single CSPRNG output (generated once per session or time interval) as the key.
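The HMAC idea could be sketched as below, in Python for illustration. The names `SESSION_KEY` and `choose_destination` are hypothetical, and the choice of SHA-256 and of reducing the digest modulo the number of destinations are assumptions of this sketch rather than part of the proposal.

```python
import hashlib
import hmac
import secrets

# One CSPRNG output per session (or time interval), used as the HMAC key.
SESSION_KEY = secrets.token_bytes(32)


def choose_destination(url, destinations, key=SESSION_KEY):
    """Pick one rewrite destination deterministically for this key.

    The originally requested URL is the HMAC message, so the same URL maps
    to the same destination for as long as the key is unchanged, without
    drawing on the PRNG again or keeping a memoization table.
    """
    digest = hmac.new(key, url.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(destinations)
    return destinations[index]
```

The modulo step introduces a slight bias when the number of destinations is not a power of two, which is probably negligible for load balancing.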

Advanced string operations
such as letter case transformations and percent (un)encoding. Perhaps the  field could contain something like   to mean "lowercase version of the string matched by the first parens in the corresponding  "? This could be useful for dealing with redirection scripts:

TODO: explain other use cases
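To make the intended effect of these string operations concrete, here is a rough Python illustration. It does not implement any ruleset syntax; the function names and the `example.com` / `example.org` URLs are hypothetical, and the redirection-script URL format is assumed for the example.

```python
import re
from urllib.parse import unquote


def lowercase_host(url):
    """Rewrite to https and lowercase the string matched by the first parens."""
    return re.sub(
        r"^http://([^/]+)/",
        lambda m: "https://" + m.group(1).lower() + "/",
        url,
        count=1,
    )


def unwrap_redirect(url):
    """Percent-unencode the target embedded in a hypothetical redirection script URL."""
    m = re.match(r"^http://redirect\.example\.com/\?url=([^&]+)", url)
    return unquote(m.group(1)) if m else url
```

For instance, `lowercase_host("http://WWW.Example.COM/Path")` gives `https://www.example.com/Path`, and `unwrap_redirect` turns `http://redirect.example.com/?url=https%3A%2F%2Fexample.org%2Fpage` into `https://example.org/page`.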

What an automated problem-reporting system should cover
The existing proposal(s) seem(s) only to cover rulesets manually disabled by the user. The problem is a limitation of the current UI: If a casual user doesn't bother to click on the icon or the Tools menu entry, they may not be aware that a redirect loop exists. If it is the top-level document that has experienced a redirect loop, they may think there is no rule coverage for that URL. Consequently, they might not disable the ruleset in question. Thus, a problem-reporting system should also handle redirect loops.

It's probably also a good idea to report SSL/TLS protocol errors for sites with active rulesets (certificate-related or not); among other reasons, such errors may not be noticed if they are triggered by third-party content. (Perhaps we should twiddle the pref on Mozilla's TLS error reporter to point at an EFF/Tor Project-owned server...)

(For everything between here and the top of the section, "ruleset" means built-in rulesets only.)

If reporting that a user has manually disabled a (built-in) ruleset, allow optionally reporting whether there are any user rulesets that are active for the URLs for which the built-in rulesets were found disabled, but don't report on the contents of said user rulesets, of course.

TODO: Discussion exists at GitHub issue 1888 with a proposed implementation in pull request 2601, but that implementation of the reporting mechanism appears to need revision because it does not yet ignore user rulesets.

Warn the user more loudly when user rulesets fail to load
...or possibly also when they are being disabled by default.

Theoretically, anyone technically oriented enough to work with user rulesets should be smart enough to validate their XML and regexes by eye (or script). However, people like me sometimes make stupid typos and then (1) get too lazy to check the Error Console and/or (2) visit websites that spam the Error Console heavily for unrelated reasons.

Warn the user more loudly when redirect loops exist

 * Immediately after the redirect loop happens: infobar and/or change of menubar icon
 * Or replace the displayed count of active rulesets with an exclamation mark

Cert overrides
For logical consistency, the torbug:8958 proposal should probably be a new element name rather than an attribute of  elements; say,   (a real example adapted from bmo:644640). We should probably also define an " " mechanism to override errors other than mismatches. (Observe that pinning the cert fingerprint via  would be satisfactory for both expiration and chain problems; on the other hand, we MUST NOT define , as overriding the TLS stack's idea of the current time could cause it to send an OCSP request that is bound to fail because of the cert being expired.)

Should the  attribute be a simple match (as in   elements) or a regex (as in  )? Or should we provide options for both?

TODO: Make sure any specific proposal can handle load-balancing arrangements such as the one used by (Fry's Electronics). We probably need plural names:,. (In this specific case, a possible syntax would be .)

Friendly-name attribute
Implement some UI/preference for an alternate name attribute, for users who would prefer to avoid seeing TLDs in ruleset names and/or prefer to see official company names written in full; examples based on existing rulesets: