How FTFs match trees
In order to understand FTFs you need to understand how their components work together. How does a program like ICECUP decide that an FTF matches (part of) a tree in the corpus?
Recall that an FTF is declarative: all aspects of the FTF must be true together, and the order in which they are evaluated is not important. (We will not concern ourselves here with how this is implemented.)
All the examples are based on ICECUP, although, as we comment elsewhere, other programs could use the same principle. So that everyone can follow them, the examples of matching cases are taken from the freely available sample corpus download (which is itself a sample of ICE-GB).
We will start with some simple examples: FTFs consisting of a single node, and FTFs consisting of a single word.
Our first examples are single node FTFs.
You can generate a single node FTF in ICECUP with the (inexact) Nodal query: type an expression, e.g., OD,CL, and press the Edit button.
The FTF you get should look like the example below.
Next, find an example of a matching case.
If you then press F4 or the Start! button, you will get a complete list of examples of (in this case) direct object clauses. Double-click on an example to open a tree window.
The resulting tree should look something like the one below (this is S1A-010 #149). The matching case is the node in brown, and the part of the text dominated by the node is shaded. This is because the node also has the focus of the FTF.
The FTF contains a number of unspecified edges apart from the OD,CL designation. But because these are unspecified, they do not limit the position of the matching case. So this query will match direct objects in the last position in the branch, or in other positions, e.g., as in I think you will agree because... they were dumbfounded [S1A-094 #52], where because... they were dumbfounded is analysed as an adverbial clause.
A second point to notice is that the FTF explicitly contains a word element, which is unspecified. If we did specify the word, the FTF would only match examples that contained that word. Note that the position of the word would not be specified: it could appear anywhere within the set of words covered by the node.
Finally, note that the FTF can match more than once within a single corpus tree.
In the case of a single word FTF, that is, an FTF that searches for a single text unit element, some aspects of the otherwise unspecified node must still be specified.
Using the Text Fragment command, type the word work and press the Edit button.
The result should be the FTF shown. If you then perform the query, you will get examples like this one (W1B-001 #179).
The empty node still has the focus but is specified as a leaf. There is no white stub between node and word, and there is a black dotted line, meaning that word and node are immediately connected. The node must tag the word. No other edges have been specified.
The previous FTF finds examples of work as a noun or (more rarely) as a verb, as in We will have to work very hard. [W2C-009 #44]. If you want to find examples of work as a verb, you can also use the Text Fragment command.
In the Text Fragment window, type the word work. Then, without pressing <SPACE>, press the Node button. Position the input caret (blinking cursor) between the angled brackets and type V. The query should look like this: work+<V>. Then press the Edit button.
If you have done this successfully, the FTF will look like the one shown here. In fact, the only difference from the previous example is that the node, still in the tag position, has the category V, meaning verb. An example match is shown below.
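The difference between the two word queries can be sketched as follows. This is a hypothetical illustration rather than ICECUP's own parser or matcher: the Leaf type and the parse_query and find_leaves helpers are invented, the parser handles only the two query forms shown above (work and work+<V>), and the tags are simplified assumptions, not the corpus annotation. The second sentence is invented.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Leaf(NamedTuple):
    """A word together with the category of the node that tags it."""
    word: str
    category: str


# Accepts "word" or "word+<CATEGORY>"; nothing else (an assumption).
QUERY = re.compile(r"^(?P<word>[^+<>\s]+)(?:\+<(?P<category>[^<>]*)>)?$")


def parse_query(text: str) -> Tuple[str, Optional[str]]:
    """Split a query into (word, category); category is None if unspecified."""
    m = QUERY.match(text)
    if m is None:
        raise ValueError(f"unsupported query: {text!r}")
    return m.group("word"), m.group("category") or None


def find_leaves(leaves: List[Leaf], query: str) -> List[int]:
    """Return the positions of leaves whose word (and tag, if given) match."""
    word, category = parse_query(query)
    return [
        i for i, leaf in enumerate(leaves)
        if leaf.word == word and (category is None or leaf.category == category)
    ]


# "We will have to work very hard" (tags simplified for illustration)
verb_sentence = [
    Leaf("We", "PRON"), Leaf("will", "AUX"), Leaf("have", "AUX"),
    Leaf("to", "PRTCL"), Leaf("work", "V"), Leaf("very", "ADV"),
    Leaf("hard", "ADV"),
]
# Invented sentence: "The work continues"
noun_sentence = [Leaf("The", "ART"), Leaf("work", "N"), Leaf("continues", "V")]

print(find_leaves(noun_sentence, "work"))       # [1]
print(find_leaves(noun_sentence, "work+<V>"))   # []
print(find_leaves(verb_sentence, "work+<V>"))   # [4]
```

The plain word query accepts work under any tag, whereas adding the category V to the tag node rules out the noun.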
You may have noticed another difference. ICECUP 3.0 performs this query as a background search. This is because, although it has an index for words like work and another for elements such as verbs, it does not have one for work as a verb. ICECUP has to work out whether the FTF matches by looking at the trees in the corpus, one by one. The examples that follow all require this kind of search.
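To see why two separate indexes cannot answer this query on their own, consider the following sketch. It is an assumption-laden toy, not ICECUP's index format or search algorithm: the three-unit corpus, the dictionary-based indexes and the background_search function are all invented for illustration.

```python
from typing import Dict, List, Set, Tuple

# Toy corpus (invented): text unit id -> list of (word, category) leaves.
corpus: Dict[str, List[Tuple[str, str]]] = {
    "u1": [("we", "PRON"), ("work", "V"), ("hard", "ADV")],
    "u2": [("the", "ART"), ("work", "N"), ("continues", "V")],
    "u3": [("they", "PRON"), ("left", "V")],
}

# One index per word and one per category, each listing the units it occurs in.
word_index: Dict[str, Set[str]] = {}
category_index: Dict[str, Set[str]] = {}
for unit_id, leaves in corpus.items():
    for word, category in leaves:
        word_index.setdefault(word, set()).add(unit_id)
        category_index.setdefault(category, set()).add(unit_id)

# Intersecting the two indexes finds units that contain the word "work" and
# also contain some verb, but not necessarily "work" tagged as a verb.
candidates = word_index["work"] & category_index["V"]
print(sorted(candidates))  # ['u1', 'u2']


def background_search(word: str, category: str) -> List[str]:
    """Inspect every unit in turn and keep those where the tag node matches."""
    return [
        unit_id for unit_id, leaves in corpus.items()
        if any(w == word and c == category for w, c in leaves)
    ]


print(background_search("work", "V"))  # ['u1']
```

Unit u2 appears in both indexes, yet work is a noun there. Only inspecting the units themselves shows that u1 is the sole genuine match.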
This page last modified 1 December, 2016 by Survey Web Administrator.