A network of *T*^{N-1} nodes and *T*^{N} arcs is constructed (*N*=1 is
a special case and has the same topology as *N*=2; see Figure
1). Each node represents a juncture type, and when
*N*>2 the nodes represent a juncture in the context of previous
junctures. The POS sequence probabilities do not take account of
context, and so for a given juncture type are the same no matter where
the node occurs in the network. For example, if *N*=3, we will have
2 break nodes, one for when the previous juncture was a break and
one for when the previous juncture was a non-break. These nodes have
the same observation probabilities. Figure 1 shows
networks for *N*=1, *N*=2 and *N*=3.
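The node and arc counts above can be checked with a small enumeration. This is a minimal sketch, not the authors' implementation: it assumes each node is labelled by the tuple of the last *N*-1 juncture types (the final element being the node's own type), and the function name `build_network` is hypothetical.

```python
from itertools import product


def build_network(juncture_types, n):
    """Enumerate nodes and arcs of an n-gram juncture network.

    Assumption: a node is the tuple of the last n-1 juncture types,
    whose final element is the juncture type the node itself represents.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # N=1 is treated as the special case that shares the N=2 topology.
    order = max(n, 2)
    nodes = list(product(juncture_types, repeat=order - 1))
    # From every node there is one outgoing arc per juncture type:
    # drop the oldest juncture from the context and append the new one.
    arcs = [(src, src[1:] + (nxt,)) for src in nodes for nxt in juncture_types]
    return nodes, arcs


if __name__ == "__main__":
    nodes, arcs = build_network(["break", "non-break"], 3)
    print(len(nodes), len(arcs))
    # Nodes that represent a break, one per possible previous juncture.
    print([node for node in nodes if node[-1] == "break"])
```

With two juncture types and *N*=3 this yields four nodes and eight arcs, and the two break nodes differ only in the preceding juncture, which is why they can share the same observation probabilities.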

Under this formulation we have the likelihood
*P*(*C*_{i} | *j*_{i}) (the
POS sequence model) representing the relationship between tags and
juncture types, and
*P*(*j*_{i} | *j*_{i-1}, ..., *j*_{i-N+1}) (the n-gram
phrase break model) which represents the *a priori* probability of a
sequence of juncture types occurring. The n-gram model gives a basic
regularity to the phrase break placement, enforcing the notion that
phrase breaks are not simply a consequence of local word information.
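To illustrate how the two models combine, the sketch below scores a candidate juncture sequence by summing, at each position, the log of the POS sequence likelihood and the log of the n-gram juncture probability. All of it is assumption for illustration: the probability tables are made-up toy numbers, the POS contexts are reduced to the labels `"a"` and `"b"`, the function name `sequence_log_score` is hypothetical, and padding the initial history with non-breaks is one possible convention rather than something stated in the text.

```python
import math

# Hypothetical toy POS sequence model: P(C_i | j_i).
POS_MODEL = {
    "break": {"a": 0.2, "b": 0.8},
    "non-break": {"a": 0.7, "b": 0.3},
}

# Hypothetical toy bigram (N=2) juncture model: P(j_i | j_{i-1}).
NGRAM_MODEL = {
    ("break",): {"break": 0.1, "non-break": 0.9},
    ("non-break",): {"break": 0.3, "non-break": 0.7},
}


def sequence_log_score(junctures, contexts, pos_model, ngram_model, n,
                       start="non-break"):
    """Log score of a juncture sequence under the two models.

    Assumption: the history before the first juncture is padded
    with `start` so that every position has n-1 predecessors.
    """
    history = (start,) * (n - 1)
    total = 0.0
    for juncture, context in zip(junctures, contexts):
        total += math.log(pos_model[juncture][context])   # P(C_i | j_i)
        total += math.log(ngram_model[history][juncture])  # P(j_i | history)
        # Slide the history window forward by one juncture.
        history = (history + (juncture,))[1:]
    return total


if __name__ == "__main__":
    candidates = [
        ["non-break", "break"],
        ["break", "non-break"],
    ]
    for cand in candidates:
        print(cand, sequence_log_score(cand, ["a", "b"], POS_MODEL, NGRAM_MODEL, n=2))
```

Comparing the scores of competing sequences shows the effect described above: a break that the local POS evidence favours can still be penalised if the n-gram model considers it unlikely after the preceding junctures.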

The probability we are interested in is that of juncture type *j*_{i} given the previous
sequence of junctures and the POS sequence at that point,
*P*(*j*_{i} | *C*_{i}, *j*_{i-1}, ..., *j*_{i-N+1}). This
probability can be rewritten as follows:

and using Bayes' rule

We make the assumption that the probabilities of all states of a
particular juncture type are equal (e.g.
*P*(*C*_{i} | *break*, *non*-*break*)
= *P*(*C*_{i} | *break*, *break*)), so

and from equation 5, the probability of a juncture type given the preceding types and POS sequence becomes
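One way to write out the steps described above is sketched here. It follows from Bayes' rule and the stated assumption that the POS sequence probabilities do not depend on the juncture context; it is not necessarily the exact form or numbering of the original equations. The shorthand $h_i = (j_{i-1}, \ldots, j_{i-N+1})$ for the juncture history is introduced only for brevity.

```latex
\begin{align*}
P(j_i \mid C_i, h_i)
  &= \frac{P(C_i \mid j_i, h_i)\, P(j_i \mid h_i)}{P(C_i \mid h_i)}
     && \text{(Bayes' rule)} \\
  &= \frac{P(C_i \mid j_i)\, P(j_i \mid h_i)}{P(C_i \mid h_i)}
     && \text{(observations independent of context)} \\
  &\propto P(C_i \mid j_i)\, P(j_i \mid j_{i-1}, \ldots, j_{i-N+1})
\end{align*}
```

The denominator $P(C_i \mid h_i) = \sum_{j} P(C_i \mid j)\, P(j \mid h_i)$ does not depend on the value of $j_i$, so for choosing the most likely juncture type it acts only as a normalising constant, leaving the product of the POS sequence model and the n-gram phrase break model.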