Links and edges in FTFs

  • Links are two-place relations. They link element x to element y.
  • Edges are one-place relations, i.e., they are properties of x.
  • There are links and edges for nodes and words.
  • To understand FTFs you have to comprehend how these work together.

Click a link!

The following FTF diagram contains a clickable map for links and edges. Select one of the links or edges to explain it.

[Figure: FTF diagram with a clickable map of its links and edges. In the diagram every edge (‘Root’, ‘Leaf’, ‘First child’, ‘Last child’, ‘First word’, ‘Last word’) and the ‘Next word’ link are set to <unknown>, each ‘Parent’ link is set to (immediate) parent, and the ‘Next child’ link is set to immediately after.]

FTF Links

There are three different two-place relations (called ‘links’) in FTFs, which are drawn either as lines joining one node to another or as directional arrows. The general principle underlying the set of options is that they should be as simple as possible but as complex as necessary.

You can achieve quite subtle queries using a combination of relations. For example, in a phrase structure grammar such as TOSCA/ICE, ‘crossing links are not allowed’, i.e., nodes are strongly ordered by the word order. You may use word ordering (e.g., ‘Next word = after’) in conjunction with ‘Next child’ relations such as ‘different branches’ and <unknown>, in order to specify that one branch of a tree, while being disconnected from the first, must nevertheless follow it in the sentence.
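
As a rough illustration of this combination, the following Python sketch checks that two nodes lie on different branches and that the second follows the first in word order. The `Node` class, the function names and the toy sentence are all hypothetical, and ‘different branches’ is read here as ‘neither node is above the other’. This is not how ICECUP itself is implemented.

```python
class Node:
    """A toy corpus tree node; leaves carry the index of the word they tag."""

    def __init__(self, label, children=(), word_index=None):
        self.label = label
        self.parent = None
        self.children = list(children)
        self.word_index = word_index
        for child in self.children:
            child.parent = self


def is_below(lower, upper):
    """True if `upper` is the parent or a more distant ancestor of `lower`."""
    current = lower.parent
    while current is not None:
        if current is upper:
            return True
        current = current.parent
    return False


def word_span(node):
    """Return the (first, last) word indices covered by `node`.

    Relies on 'no crossing links': children are ordered by word order.
    """
    if not node.children:
        return (node.word_index, node.word_index)
    return (word_span(node.children[0])[0], word_span(node.children[-1])[1])


def on_different_branches(a, b):
    """Assumed reading of 'different branches': neither node is above the other."""
    return a is not b and not is_below(a, b) and not is_below(b, a)


def follows_on_other_branch(first, second):
    """'different branches' combined with 'after' in word order."""
    return (on_different_branches(first, second)
            and word_span(second)[0] > word_span(first)[1])


# Toy tree for "she will go home": CL -> NP(she) VP(will go) AVP(home)
she = Node("PRON", word_index=0)
will = Node("AUX", word_index=1)
go = Node("V", word_index=2)
home = Node("ADV", word_index=3)
np = Node("NP", [she])
vp = Node("VP", [will, go])
avp = Node("AVP", [home])
clause = Node("CL", [np, vp, avp])
```

Here `follows_on_other_branch(she, go)` holds, because the two tag nodes sit under different phrases and ‘go’ comes later in the sentence, whereas `follows_on_other_branch(vp, go)` fails because ‘go’ lies below the VP.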

Parent:child (‘Parent’) links

There are only two ‘Parent’ links, and they are both ordered. This means that a child node in an FTF can never match a node ‘above’ its parent.

Having experimented with an <unknown> option, we do not believe that there is a linguistically useful query involving an unordered relationship between an FTF parent and child. Moreover, with such an option it is very easy to form a structurally nonsensical query - precisely what FTFs are meant to avoid.
  Name      Meaning
  parent    The child in the FTF must match a node immediately below the parent.
  ancestor  The child in the FTF must match a node below the parent in the tree.
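
The difference between the two settings can be sketched in a few lines of Python. The `Node` class and `parent_link_holds` function are hypothetical names introduced only for illustration. We assume that ‘ancestor’ also accepts the immediate parent, since that node is ‘below the parent in the tree’ too.

```python
class Node:
    """A toy corpus tree node that knows its parent."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def parent_link_holds(link, upper, lower):
    """Check a 'Parent' link between two corpus tree nodes.

    'parent'   -> `lower` is immediately below `upper`
    'ancestor' -> `lower` is anywhere below `upper`
    """
    if link == "parent":
        return lower.parent is upper
    if link == "ancestor":
        current = lower.parent
        while current is not None:
            if current is upper:
                return True
            current = current.parent
        return False
    raise ValueError(f"unknown Parent link: {link!r}")


# Toy tree: CL -> VP -> MVB
mvb = Node("MVB")
vp = Node("VP", [mvb])
clause = Node("CL", [vp])
```

Both settings are ordered: swapping the arguments, as in `parent_link_holds("ancestor", mvb, clause)`, never succeeds, which mirrors the rule that a child cannot match a node above its parent.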

Child:child (‘Next’) and word:word (‘Next word’) links

Apart from a small difference (the ‘different branches’ option is not relevant for the latter), the sets of options for ‘Next (child)’ and ‘Next word’ are identical.

The ‘different branches’ option is useful if either of the sibling nodes is not immediately connected to its parent. Note that the first four ‘arrow’ links all relate to the ordering of child nodes in the tree, i.e., they imply that the pair of nodes share the same parent, irrespective of other links.
  Name                                    Meaning
  immediately after                       The second element in the FTF must immediately follow the first in the tree.
  after                                   The second element in the FTF must follow the first, but not necessarily immediately.
  just before or just after               The second element in the FTF must immediately precede or immediately follow the first.
  before or after                         The second element in the FTF must either precede or follow the first.
  different branches (‘Next child’ only)  The second (node) element in the FTF must be on a different branch to the first (i.e., one cannot be the parent of the other).
  <unknown>                               No restriction is imposed.
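
The ordering options can be read as simple tests on positions. In the sketch below, `first` and `second` are assumed to be positions in one ordered sequence (sibling positions under a shared parent for ‘Next child’, or word positions in the sentence for ‘Next word’). The function name is hypothetical, and ‘different branches’ is left out because it depends on tree structure rather than on position.

```python
def order_link_holds(link, first, second):
    """Test a 'Next child' / 'Next word' ordering option on two positions."""
    distance = second - first
    if link == "immediately after":
        return distance == 1
    if link == "after":
        return distance >= 1
    if link == "just before or just after":
        return abs(distance) == 1
    if link == "before or after":
        return distance != 0
    if link == "<unknown>":
        return True  # no restriction is imposed
    raise ValueError(f"unknown ordering link: {link!r}")
```

For example, `order_link_holds("after", 1, 3)` is true, while `order_link_holds("immediately after", 1, 3)` is false because another element intervenes.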

FTF Edges

General node edges

There are four different unary links, or ‘edges’ in FTF nodes (‘Root’, ‘Leaf’, ‘First (child)’, and ‘Last (child)’), but one rule applies to all of them. They are drawn topographically, as if each were a link to a further node, absent in the FTF, that must be present in the tree. This becomes clearer when you look at the FTF as a whole (see below).

The general pattern is summarised below. When editing, the default for all links is <unknown>. (Note that the ‘Text Fragment’ query composes FTFs with different defaults; see below.)

  Name       Meaning
  No         There must be another node beyond the current one.
  Yes        There cannot be another node beyond the current node (therefore there is no link).
  <unknown>  No restriction is imposed.
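
One way to picture this pattern is as a three-valued test on whether the corpus tree has a further node in the direction of the edge. The following Python sketch uses hypothetical function names and a plain list of sibling labels. It illustrates the table above and is not a description of how ICECUP evaluates edges.

```python
def edge_holds(setting, has_node_beyond):
    """Apply the general edge rule to one node.

    `has_node_beyond` says whether the corpus tree has another node
    beyond the current one in the direction of the edge.
    """
    if setting == "Yes":
        return not has_node_beyond
    if setting == "No":
        return has_node_beyond
    if setting == "<unknown>":
        return True  # no restriction is imposed
    raise ValueError(f"unknown edge setting: {setting!r}")


def first_child_edge_holds(setting, index):
    """'First (child)' edge: is there a sibling before position `index`?"""
    return edge_holds(setting, has_node_beyond=index > 0)


def last_child_edge_holds(setting, siblings, index):
    """'Last (child)' edge: is there a sibling after position `index`?"""
    return edge_holds(setting, has_node_beyond=index < len(siblings) - 1)


siblings = ["NP", "VP", "AVP"]  # toy sibling sequence
```

With these siblings, the VP at position 1 satisfies ‘First child = No’ and ‘Last child = No’, since there are nodes on both sides of it.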

Leaves

‘Leaf’ edges are a bit of a special case, because the ‘leaf’ setting implies something further. If a node is a leaf, it must be directly connected to the word that it annotates (in other words, it must be a ‘tag node’). Conversely, if it is not a leaf, or it is not known whether it is a leaf or not, it will at least be eventually connected to a word. The implied relation is shown by a dotted line linking the node and text unit element in the FTF. There is thus no need for a separate word:node link.

  Name       Meaning
  No         There must be another node beyond this one. Thus the word is only eventually connected.
  Yes        There cannot be another node beyond this. Therefore the word is directly connected.
  <unknown>  No restriction is imposed. We do not know if the word is immediately connected or not.
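
The implied word connection can be sketched as follows. The `Node` class and both functions are hypothetical. We assume that only leaf (‘tag’) nodes carry a word directly, and that any other node reaches its words through the nodes below it.

```python
class Node:
    """A toy corpus tree node; only leaf ('tag') nodes carry a word."""

    def __init__(self, label, children=(), word=None):
        self.label = label
        self.children = list(children)
        self.word = word


def leaf_edge_holds(setting, node):
    """Test the 'Leaf' edge of an FTF node against a corpus node."""
    is_leaf = not node.children
    if setting == "Yes":
        return is_leaf       # the word is directly connected
    if setting == "No":
        return not is_leaf   # the word is only eventually connected
    if setting == "<unknown>":
        return True          # either kind of connection is acceptable
    raise ValueError(f"unknown edge setting: {setting!r}")


def words_under(node):
    """Collect the words that a node is directly or eventually connected to."""
    if not node.children:
        return [node.word]
    words = []
    for child in node.children:
        words.extend(words_under(child))
    return words


op = Node("OP", word="will")
mvb = Node("MVB", word="go")
vp = Node("VP", [op, mvb])
```

In this toy tree `leaf_edge_holds("Yes", op)` succeeds, while the VP only satisfies ‘Leaf = No’ or <unknown>, because it reaches ‘will’ and ‘go’ through its children.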

Word edges

The same general rule for node edges also applies to the ‘word edges’ (‘First word’ and ‘Last word’). These are drawn as triangles because there is no explicit linking structure in the depiction of sentences.

  Name       Meaning
  No         There must be another word beyond this one.
  Yes        There cannot be another word beyond this.
  <unknown>  No restriction is imposed.

In the next section, we make some introductory comments on how to use edges and links in combination.

Using Links and Edges

So far we have discussed links and edges individually, and rather abstractly. In order to understand how links and edges work, you should experiment.

As we mentioned above, the ‘cool spot’ in the centre of the link or end of the edge line controls the link. Press down with either the left or right mouse button to rotate through the set of links. Alternatively, you can use a popup menu. Select the node that governs the edge or link and press down with the right button in the grey area outside the node. This then lists the alternatives (to use the keyboard, hit ‘Alt+E, 4’ to bring up the menu).

The following notes are meant to help you get started and explain certain aspects of the representation. There is a detailed description here on how FTFs match corpus trees and the impact of different collections of links. Further information is available in the online help manual, which is also part of the complete download package. However, you will need to experiment yourself.

Unspecified is the default

The default status for edges is <unknown>, indicated by the white bars below.

If you do a ‘New FTF’ and then press the ‘Insert child after’ key twice you will get an FTF structure like this (you can add the node labels yourself - hint: use the ‘Edit node’ command).

A simple example created with ‘New FTF’ and a few edit operations.

To see how FTFs like this match against the corpus, press here.

(Over-)specifying the edges

You can edit edges in the tree, as we have done in the case below. What is the difference between the following FTF and the previous one? Well, this FTF will only match parts of trees in the corpus that:

  1. have nodes before and after the verb phrase [VP] (see the links at the far left of the figure),
  2. realise the verb phrase with only an auxiliary operator [OP, AUX] and a main verb [MVB, V], and
  3. have these two nodes as leaf nodes (this should be guaranteed by the grammar in this case).

The FTF will reject cases that matched the FTF above if any of these additional restrictions does not hold.

The same example with node edges marked.

This example also illustrates how the edge colouring system operates. The requirements that there is a parent for the VP node, and that there are nodes before and after the VP node, are marked by the presence of black links ‘to these nodes’. Likewise the absence of preceding or following nodes within the VP is indicated by the absence of black links.
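
The restrictions imposed by the marked edges can be written out as explicit tests. The sketch below is a hand-written matcher for this one FTF only. The `Node` class and the function name are hypothetical, and each node carries a single label (‘VP’, ‘OP’, ‘MVB’) rather than a full function and category pair.

```python
class Node:
    """A toy corpus tree node that knows its parent."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def matches_overspecified_vp(node):
    """Test a corpus node against the edge-marked VP example."""
    # Root = No: the VP must have a parent.
    if node.label != "VP" or node.parent is None:
        return False
    siblings = node.parent.children
    index = next(i for i, s in enumerate(siblings) if s is node)
    # First child = No and Last child = No: nodes before and after the VP.
    if index == 0 or index == len(siblings) - 1:
        return False
    # Only an operator immediately followed by a main verb, nothing else.
    if [child.label for child in node.children] != ["OP", "MVB"]:
        return False
    # Leaf = Yes on both children.
    return all(not child.children for child in node.children)


# Toy tree: CL -> NP VP(OP MVB) NP
vp = Node("VP", [Node("OP"), Node("MVB")])
clause = Node("CL", [Node("NP"), vp, Node("NP")])
```

A VP that begins its clause, or one that contains a third child, would have matched the unmarked FTF but fails these tests.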

Fuzzy Text Fragments

If you create an FTF using the ‘Text Fragment’ query, two restrictions apply. First, the nodes you introduce must ‘tag’ (i.e., be directly connected to) the words, and be ordered by the word sequence (this is indicated by the ‘Next word’ arrow on the right-hand side).

Second, the nodes must be independent of the grammar, i.e., not connected together by any restricting ‘Next child’ or ‘Parent’ relation. Since all nodes have the root node as an ancestor, the shared parent node in the FTF (on the right) is linked as an ‘ancestor’ and marked as ‘Root’.

A tag search looking for similar structures.

Naturally, this is a weaker search than the first one (it is less specific). Apart from picking out any auxiliary-verb pair, regardless of grammar, it cannot relate to any other parts of the tree. The ‘Text Fragment’ command performs searches that are typical of a tagged, unparsed corpus. Finally, note that the focus (indicated by the yellow border) is spread across the tag nodes.
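
Because such a search ignores the tree, it can be approximated over a flat list of tagged words. The sketch below is a hypothetical illustration: the function name, the (word, tag) representation and the toy sentence are our own, and we assume that the tags must match adjacent words.

```python
def find_tag_sequence(tagged_words, tags):
    """Return the start positions where `tags` match consecutive words.

    `tagged_words` is a list of (word, tag) pairs in sentence order.
    """
    hits = []
    for start in range(len(tagged_words) - len(tags) + 1):
        window = tagged_words[start:start + len(tags)]
        if all(tag == wanted for (_, tag), wanted in zip(window, tags)):
            hits.append(start)
    return hits


sentence = [("she", "PRON"), ("will", "AUX"), ("go", "V"), ("home", "ADV")]
print(find_tag_sequence(sentence, ["AUX", "V"]))  # [1]
```

Nothing in this search refers to a phrase or clause, which is why it finds every auxiliary-verb pair regardless of where the pair sits in the grammar.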

To see how FTFs like this match against the corpus, press here.

FTF home pages by Sean Wallis and Gerry Nelson.
Comments/questions to s.wallis@ucl.ac.uk.

This page last modified 16 October, 2014 by Survey Web Administrator.