The sources of the filter language are in the filter/
directory. The file filter/config.Y contains the filter grammar and translates
the user's source into a tree of f_inst structures. These trees are
later interpreted by the code in filter/filter.c.
A filter is represented by a tree of f_inst structures, later translated
into lists called f_line. All the instructions are defined and documented
in the filter/f-inst.c definition file.
Filters use an f_val structure for their data. Each f_val
contains type and value (types are constants prefixed with T_).
Look into filter/data.h for more information and appropriate calls.
enum filter_return interpret (struct filter_state * fs, const struct f_line * line, uint argc, const struct f_val * argv, uint resc, struct f_val * resv)
filter state
-- undescribed --
-- undescribed --
-- undescribed --
-- undescribed --
-- undescribed --
Interpret the given tree of filter instructions. This is the core function of the filter system and does all the hard work.
Each instruction has four fields: code (the instruction code), aux (an extension to the instruction code, typically a type), and the arguments arg1 and arg2. Depending on the instruction, arguments are either integers or pointers to instruction trees. Common instructions such as +, which take two expressions as arguments, use the TWOARGS macro to get both of them evaluated.
enum filter_return f_run (const struct filter * filter, struct rte * rte, int flags) -- run a filter for a route
filter to run
route being filtered, must be write-able
flags
If rte->attrs is cached, the returned rte allocates a new rta on tmp_pool, otherwise the filters may modify it.
enum filter_return f_eval_rte (const struct f_line * expr, struct rte * rte, uint argc, const struct f_val * argv, uint resc, struct f_val * resv) -- run a filter line for an uncached route
filter line to run
route being filtered, may be modified
-- undescribed --
-- undescribed --
-- undescribed --
-- undescribed --
This specific filter entry point runs the given filter line (which must not have any arguments) on the given route.
The route MUST NOT have REF_COW set and its attributes MUST NOT be cached by rta_lookup().
int filter_same (const struct filter * new, const struct filter * old) -- compare two filters
first filter to be compared
second filter to be compared
Returns 1 if the filters are the same, otherwise 0. If there are underlying bugs, it will rather report 0 for identical filters than 1 for different ones.
void filter_commit (struct config * new, struct config * old) -- do filter comparisons on all the named functions and filters
-- undescribed --
-- undescribed --
struct f_tree * build_tree (struct f_tree * from)
degenerate tree (linked by tree->left) to be transformed into a form suitable for find_tree()
Transforms a degenerate tree into a balanced tree.
int same_tree (const struct f_tree * t1, const struct f_tree * t2)
first tree to be compared
second one
Compares two trees and returns 1 if they are the same.
We use a (compressed) trie to represent prefix sets. Every node in the trie represents one prefix (addr/plen), and plen also indicates the index of the bits in the address that are used to branch at the node. Note that such a prefix is not necessarily a member of the prefix set; it is just a canonical prefix associated with a node. Prefix lengths of nodes are aligned to multiples of TRIE_STEP (4) and there is 16-way branching in each node. Therefore, we say that a node is associated with a range of prefix lengths (plen .. plen + TRIE_STEP - 1).
The prefix set is not just a set of prefixes, it is defined by a set of prefix patterns. Each prefix pattern consists of ppaddr/pplen and two integers: low and high. The tested prefix paddr/plen matches that pattern if the first MIN(plen, pplen) bits of paddr and ppaddr are the same and low <= plen <= high.
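The matching rule above can be sketched in a few lines of C for IPv4. This is only an illustration of the rule as stated in the text, not the trie-based code: struct prefix_pattern, len_to_mask() and pattern_matches() are hypothetical names, and addresses are assumed to be plain 32-bit integers in host byte order.

```c
#include <stdint.h>

/* Hypothetical, simplified IPv4 prefix pattern (not a structure from the sources). */
struct prefix_pattern {
  uint32_t ppaddr;   /* pattern address, host byte order */
  unsigned pplen;    /* pattern prefix length, 0..32 */
  unsigned low;      /* lowest accepted prefix length */
  unsigned high;     /* highest accepted prefix length */
};

/* Netmask with the top len bits set. Length 0 is handled separately,
 * because shifting a 32-bit value by 32 bits is undefined in C. */
uint32_t len_to_mask(unsigned len)
{
  return len ? (uint32_t) (~(uint32_t) 0 << (32 - len)) : 0;
}

/* Returns 1 if paddr/plen matches the pattern, 0 otherwise. */
int pattern_matches(const struct prefix_pattern *pp, uint32_t paddr, unsigned plen)
{
  /* Compare only the first MIN(plen, pplen) bits of both addresses. */
  unsigned common = (plen < pp->pplen) ? plen : pp->pplen;
  uint32_t mask = len_to_mask(common);

  if ((paddr & mask) != (pp->ppaddr & mask))
    return 0;

  /* The tested length must fall within the accepted range. */
  return (pp->low <= plen) && (plen <= pp->high);
}
```

For example, with the pattern 10.0.0.0/8 and low = 8, high = 24, the prefix 10.1.0.0/16 matches, while 10.1.2.3/32 does not because its length exceeds high.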
There are two ways to represent accepted prefixes for a node. First, there is a bitmask local, which represents independently all 15 prefixes that extend the canonical prefix of the node and are within a range of prefix lengths associated with the node. E.g., for node 10.0.0.0/8 they are 10.0.0.0/8, 10.0.0.0/9, 10.128.0.0/9, .. 10.224.0.0/11. This order (first by length, then lexicographically) is used for indexing the bitmask local, starting at position 1. I.e., index is 2^(plen - base) + offset within the same length, see function trie_local_mask6() for details.
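The index formula can be checked with a small IPv4-only sketch. The function local_index() is a hypothetical helper written for this illustration (the text refers to trie_local_mask6() for the real code); it computes only the index, and makes no assumption about which bit of the local word that index corresponds to.

```c
#include <stdint.h>

#define TRIE_STEP 4

/* Index of the prefix addr/plen within the local bitmask of a node whose
 * aligned prefix length is base. Requires base <= plen < base + TRIE_STEP
 * and plen <= 32; addr is an IPv4 address in host byte order. */
unsigned local_index(uint32_t addr, unsigned base, unsigned plen)
{
  unsigned depth = plen - base;   /* 0 .. TRIE_STEP - 1 */

  /* Offset within the same length: the depth bits that follow the first
   * base bits of the address. For depth == 0 there is a single prefix. */
  uint32_t offset = depth ? ((uint32_t) (addr << base) >> (32 - depth)) : 0;

  return (1u << depth) + (unsigned) offset;
}
```

For the node 10.0.0.0/8 this gives index 1 for 10.0.0.0/8, 2 for 10.0.0.0/9, 3 for 10.128.0.0/9 and 15 for 10.224.0.0/11, which agrees with the ordering described above.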
Second, we use a bitmask accept to represent accepted prefix lengths at a node. A set bit means that all prefixes of the given length that are either subprefixes or superprefixes of the canonical prefix are accepted. There are 33 prefix lengths (0..32 for IPv4), but there is just one prefix of zero length in the whole trie, so we keep a zero flag in f_trie (indicating whether the trie accepts prefix 0.0.0.0/0) as a special case, and the accept bitmask represents accepted prefix lengths from 1 to 32.
One complication is the handling of prefix patterns with an unaligned prefix length. When such a pattern is to be added, we add a primary node above (with rounded-down prefix length nlen) and a set of secondary nodes below (with rounded-up prefix length slen). Accepted prefix lengths of the original prefix pattern are then represented in different places based on their lengths. For prefixes shorter than nlen, it is the accept bitmask of the primary node; for prefixes between nlen and slen - 1, it is the local bitmask of the primary node; and for prefixes longer than or equal to slen, it is the accept bitmasks of the secondary nodes.
There are two cases in prefix matching: a match when the length of the prefix is smaller than the length of the prefix pattern (plen < pplen), and otherwise. The second case is simple: we just walk through the trie and check at every visited node whether that node accepts our prefix length (plen). The first case is tricky: we do not want to examine every descendant of a final node, so (when we create the trie) we have to propagate that information from nodes to their ancestors.
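The split described above can be illustrated with a short sketch that, for an unaligned pattern length, reports where an accepted length would be recorded. The enum and the functions trie_round_down() and place_for_length() are hypothetical names introduced here; the sketch assumes that pplen is not a multiple of TRIE_STEP, so that slen = nlen + TRIE_STEP.

```c
#define TRIE_STEP 4

/* Where an accepted prefix length of an unaligned pattern is represented. */
enum length_place {
  PRIMARY_ACCEPT,     /* accept bitmask of the primary node */
  PRIMARY_LOCAL,      /* local bitmask of the primary node */
  SECONDARY_ACCEPT    /* accept bitmasks of the secondary nodes */
};

/* Round a prefix length down to a multiple of TRIE_STEP. */
unsigned trie_round_down(unsigned plen)
{
  return plen & ~(TRIE_STEP - 1u);
}

/* Classify the accepted length len of a pattern with unaligned length pplen. */
enum length_place place_for_length(unsigned pplen, unsigned len)
{
  unsigned nlen = trie_round_down(pplen);   /* primary node length */
  unsigned slen = nlen + TRIE_STEP;         /* secondary node length */

  if (len < nlen)
    return PRIMARY_ACCEPT;
  if (len < slen)
    return PRIMARY_LOCAL;
  return SECONDARY_ACCEPT;
}
```

For a pattern of length 10, nlen is 8 and slen is 12, so an accepted length of 6 goes to the primary accept bitmask, 9 to the primary local bitmask, and 12 or more to the secondary nodes.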
There are two kinds of propagation: propagation from a child's accept bitmask to the parent's accept bitmask, and propagation from a child's accept bitmask to the parent's local bitmask. The first kind is simple: since all superprefixes of a parent are also superprefixes (of the appropriate length) of a child, we can just add (by bitwise or) the child's accept mask, masked by the parent's prefix length mask, to the parent's accept mask. This handles prefixes shorter than the node plen.
The second kind of propagation is necessary to handle superprefixes of a child that are represented by the parent's local mask, i.e., those in the range of prefix lengths associated with the parent. For each prefix length from that range that is accepted by the child's accept mask, we need to set the appropriate bit in the parent's local mask. See function trie_amask_to_local() for details.
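The first kind of propagation amounts to a single masked bitwise or. The sketch below assumes a hypothetical bit layout in which bit (len - 1), counted from the least significant bit, stands for prefix length len (1..32), and it assumes that the parent's prefix length mask covers lengths 1 up to the parent's plen. Neither the layout nor the names lengths_up_to() and propagate_accept() are taken from the sources.

```c
#include <stdint.h>

/* Mask selecting prefix lengths 1..plen in the assumed layout, where
 * bit (len - 1) from the least significant end represents length len. */
uint32_t lengths_up_to(unsigned plen)
{
  return (plen >= 32) ? ~(uint32_t) 0 : (((uint32_t) 1 << plen) - 1);
}

/* Merge into the parent's accept mask those lengths accepted by the child
 * that do not exceed the parent's own prefix length. */
uint32_t propagate_accept(uint32_t parent_accept, uint32_t child_accept,
                          unsigned parent_plen)
{
  return parent_accept | (child_accept & lengths_up_to(parent_plen));
}
```

If a child accepts lengths 6, 10 and 20 and the parent has plen 8, only length 6 is added to the parent's accept mask; lengths inside or below the parent's range are handled elsewhere.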
There are five cases when we walk through a trie:
- we are in NULL
- we are out of path (prefixes are inconsistent)
- we are in the wanted (final) node (node length == plen)
- we are beyond the end of path (node length > plen)
- we are still on path and keep walking (node length < plen)
The walking code in trie_match_net() is structured according to these cases.
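To make the cases concrete, here is a minimal sketch that classifies a single step of such a walk for IPv4. It is not the code of trie_match_net(); struct walk_node, enum walk_case and classify_step() are hypothetical, and the node is reduced to its canonical prefix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced node: only the canonical prefix is kept. */
struct walk_node {
  uint32_t addr;    /* canonical prefix address, host byte order */
  unsigned plen;    /* canonical prefix length, 0..32 */
};

enum walk_case {
  WALK_NULL,        /* we are in NULL */
  WALK_OFF_PATH,    /* prefixes are inconsistent */
  WALK_FINAL,       /* node length == plen */
  WALK_BEYOND,      /* node length > plen */
  WALK_CONTINUE     /* node length < plen, keep walking */
};

/* Netmask with the top len bits set; length 0 avoids an undefined shift. */
uint32_t len_to_mask(unsigned len)
{
  return len ? (uint32_t) (~(uint32_t) 0 << (32 - len)) : 0;
}

/* Decide which case applies when the walk for paddr/plen reaches node n. */
enum walk_case classify_step(const struct walk_node *n, uint32_t paddr, unsigned plen)
{
  if (n == NULL)
    return WALK_NULL;

  /* The node and the tested prefix must agree on their common leading bits. */
  unsigned common = (n->plen < plen) ? n->plen : plen;
  if ((n->addr ^ paddr) & len_to_mask(common))
    return WALK_OFF_PATH;

  if (n->plen == plen)
    return WALK_FINAL;
  if (n->plen > plen)
    return WALK_BEYOND;
  return WALK_CONTINUE;
}
```

With a node for 10.0.0.0/8, testing 10.1.0.0/16 continues the walk, 11.0.0.0/16 is off the path, 10.0.0.0/8 is the final node, and 8.0.0.0/6 is beyond the end of the path.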
Iteration over prefixes in a trie can be done using the TRIE_WALK() macro, or directly using the trie_walk_init() and trie_walk_next() functions. The second approach allows suspending the iteration and resuming it later. Prefixes are enumerated in the usual lexicographic order and may be restricted to a subset of the trie (all subnets of a specified prefix).
Note that the trie walk does not reliably enumerate 'implicit' prefixes defined by the low and high fields in prefix patterns. It is supposed to be used on tries constructed from 'explicit' prefixes (low == plen == high in the call to trie_add_prefix()).
The trie walk has three basic state variables stored in the struct f_trie_walk_state -- the current node in stack[stack_pos], accept_length for iteration over inter-node prefixes (non-branching prefixes on compressed path between the current node and its parent node, stored in the bitmap accept of the current node) and local_pos for iteration over intra-node prefixes (stored in the bitmap local).
The trie also supports longest-prefix-match queries by trie_match_longest_ip4(), which can be extended to iteration over all covering prefixes for a given prefix (from longest to shortest) using the TRIE_WALK_TO_ROOT_IP4() macro. There are also IPv6 versions (for practical reasons, these functions and macros are separate for IPv4 and IPv6). The same limitation on enumeration of 'implicit' prefixes applies as with the TRIE_WALK() macro.
struct f_trie * f_new_trie (linpool * lp, uint data_size) -- allocates and returns a new empty trie
linear pool to allocate items from
user data attached to node
void * trie_add_prefix (struct f_trie * t, const net_addr * net, uint l, uint h)
trie to add to
IP network prefix
prefix lower bound
prefix upper bound
Adds prefix (prefix pattern) net to trie t. l and h are lower and upper bounds on accepted prefix lengths, both inclusive. 0 <= l, h <= 32 (128 for IPv6).
Returns a pointer to the allocated node. The function can return a pointer to an existing node if the prefix address and length are the same. If the prefix is 0/0 (or ::/0), a pointer to the root node is returned. Returns NULL when called with a mismatched IPv4/IPv6 net type.
int trie_match_net (const struct f_trie * t, const net_addr * n)
trie
net address
Tries to find a prefix pattern in the trie that the prefix n matches. Returns 1 if there is such a prefix pattern in the trie.
int trie_match_longest_ip4 (const struct f_trie * t, const net_addr_ip4 * net, net_addr_ip4 * dst, ip4_addr * found0)
trie
net address
return value
optional returned bitmask of found nodes
Perform longest prefix match for the address net and return the resulting prefix in the buffer dst. The bitmask found0 is used to report lengths of prefixes on the path from the root to the resulting prefix. E.g., if there is also a shorter matching /20 prefix, then the 20th bit is set in found0. This can be used to enumerate all matching prefixes for the network net using the function trie_match_next_longest_ip4() or the macro TRIE_WALK_TO_ROOT_IP4().
This function assumes an IPv4 trie; there is also an IPv6 variant. The net argument is typed as net_addr_ip4, but would accept any IPv4-based net_addr, like net4_prefix(). In any case, the returned dst is always net_addr_ip4.
1 if a matching prefix was found, 0 if not.
int trie_match_longest_ip6 (const struct f_trie * t, const net_addr_ip6 * net, net_addr_ip6 * dst, ip6_addr * found0)
trie
net address
return value
optional returned bitmask of found nodes
Perform longest prefix match for the address net and return the resulting prefix in the buffer dst. The bitmask found0 is used to report lengths of prefixes on the path from the root to the resulting prefix. E.g., if there is also a shorter matching /20 prefix, then the 20th bit is set in found0. This can be used to enumerate all matching prefixes for the network net using the function trie_match_next_longest_ip6() or the macro TRIE_WALK_TO_ROOT_IP6().
This function assumes an IPv6 trie; there is also an IPv4 variant. The net argument is typed as net_addr_ip6, but would accept any IPv6-based net_addr, like net6_prefix(). In any case, the returned dst is always net_addr_ip6.
1 if a matching prefix was found, 0 if not.
void trie_walk_init (struct f_trie_walk_state * s, const struct f_trie * t, const net_addr * net)
walk state
trie
optional subnet for walk
Initialize the walk state for a subsequent walk through nodes of the trie t by trie_walk_next(). The argument net allows restricting the walk to a given subnet; otherwise a full walk over all nodes is used. This is done by finding the node at or below net and the starting position in it.
int trie_walk_next (struct f_trie_walk_state * s, net_addr * net)
walk state
return value
Find the next prefix in the trie walk and return it in the buffer net. Prefixes are walked in the usual lexicographic order and may be restricted to a subset of the trie during walk setup by trie_walk_init(). Note that the trie walk does not iterate reliably over 'implicit' prefixes defined by the low and high fields in prefix patterns. It is supposed to be used on tries constructed from 'explicit' prefixes (low == plen == high in the call to trie_add_prefix()).
1 if the next prefix was found, 0 for the end of walk.
int trie_same (const struct f_trie * t1, const struct f_trie * t2)
first trie to be compared
second one
Compares two tries and returns 1 if they are the same.
void trie_format (const struct f_trie * t, buffer * buf)
trie to be formatted
destination buffer
Prints the trie to the supplied buffer.