ctxt #

Universal #

In French, adjectives agree with nouns in gender and number. Gender on adjectives functions as an agreement morpheme, marking their relationship with the noun.

Although French adjectives generally agree in gender, many share the same form for masculine and feminine. This raises questions about whether such adjectives should be assigned a Gender feature. This phenomenon, already frequent in written French, is even more prevalent in spoken French.

The ctxt suffix (e.g., Gender[ctxt], Number[ctxt]) is used when a morphological feature cannot be read directly off the surface form and must instead be inferred from syntactic or semantic context. This situation arises in both written and spoken French, but is especially frequent in oral corpora, where final inflectional endings are often not pronounced.
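To make the distinction concrete, here is a minimal Python sketch that separates context-inferred features from surface features. It assumes the features are serialized in the CoNLL-U FEATS style with a bracketed layer (for example `Gender[ctxt]=Fem|Number=Sing`); the function name `split_feats` is hypothetical and not part of any annotation tool.

```python
CTXT_LAYER = "[ctxt]"  # assumed serialization of the ctxt suffix


def split_feats(feats: str):
    """Split a FEATS string into (surface, contextual) dictionaries.

    Assumption: features are separated by "|" and written Name=Value,
    with context-inferred ones written Name[ctxt]=Value.
    """
    surface, contextual = {}, {}
    if feats in ("", "_"):  # "_" marks an empty FEATS column in CoNLL-U
        return surface, contextual
    for item in feats.split("|"):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"malformed feature: {item!r}")
        if name.endswith(CTXT_LAYER):
            # Store under the bare feature name, e.g. "Gender"
            contextual[name[: -len(CTXT_LAYER)]] = value
        else:
            surface[name] = value
    return surface, contextual


surface, contextual = split_feats("Gender[ctxt]=Fem|Number=Sing")
print(surface)     # {'Number': 'Sing'}
print(contextual)  # {'Gender': 'Fem'}
```

Keeping the two groups apart makes it easy to count how often a feature value comes from context rather than from the form itself.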

Written corpora #

The ctxt feature was introduced for French to mark cases where the Gender or Number of an adjective, participle, pronoun, or common noun can only be inferred from context. In written French, some adjectives keep the same form regardless of gender or number: for example, rouge is identical in the masculine and feminine, and heureux is identical in the masculine singular and plural.

Oral corpora #

This phenomenon becomes even more pronounced in spoken French. In oral corpora, we annotate what is actually pronounced, not the standard written transcription. In many cases, the -s marking plural nouns or adjectives is not pronounced, which makes the Number feature dependent on context.

Liaison #

Liaison adds further complexity. Adjectives ending in a consonant are sometimes pronounced differently before a vowel-initial word, because the otherwise silent final consonant is sounded. We identified the following main rules for when the Number[ctxt] and Gender[ctxt] features apply.

Number[ctxt]

When an adjective precedes a vowel-initial noun, only adjectives whose masculine form ends in -s or -x take the feature Number[ctxt]: their liaison consonant is the same in the singular and the plural (gros arbre, gros arbres), so number is not phonologically recoverable through liaison.

% Plural adjective immediately followed by a vowel-initial noun
pattern { X [upos=ADJ, Number=Plur]; X < Y; Y [upos=NOUN, form=re"[aeiouéè].*"] }
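The Number[ctxt] rule can also be sketched in Python. This is an illustration under stated assumptions, not the annotation procedure itself: the function names are hypothetical, "masculine form" is interpreted as the masculine singular written form, and "vowel-initial" is approximated by the same vowel letters as in the pattern (so h-initial nouns are not handled).

```python
from typing import Optional

# Vowel letters taken from the pattern's character class (an approximation).
VOWEL_INITIALS = "aeiouéè"


def is_vowel_initial(form: str) -> bool:
    """True if the written form starts with one of the listed vowel letters."""
    return bool(form) and form[0].lower() in VOWEL_INITIALS


def number_ctxt_by_liaison(masc_sing_form: str, next_noun: str) -> Optional[bool]:
    """Apply the liaison rule to an adjective directly before a noun.

    Returns None when the noun is not vowel-initial (the rule says nothing),
    True when the adjective's masculine singular form ends in -s or -x,
    and False otherwise (liaison makes number recoverable).
    """
    if not is_vowel_initial(next_noun):
        return None
    return masc_sing_form.lower().endswith(("s", "x"))


print(number_ctxt_by_liaison("gros", "arbres"))  # True
print(number_ctxt_by_liaison("petit", "amis"))   # False
print(number_ctxt_by_liaison("gros", "chats"))   # None
```

Returning None outside the liaison context keeps this rule separate from the general case in oral corpora, where an unpronounced plural -s already makes Number depend on context.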

Gender[ctxt]

Adjectives ending in -t, -s, -x, or -n are often pronounced identically in masculine and feminine forms when followed by a vowel-initial noun. In such cases, they are marked with Gender[ctxt].

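pattern {
	% Adjective carrying Gender[ctxt] (any value) and Number[ctxt]=Sing
	X [upos=ADJ, Gender__ctxt, Number__ctxt=Sing];
	% Vowel-initial noun (liaison context)
	Y [upos=NOUN, form=re"[aeiouéè].*"];
	% X immediately precedes Y
	X < Y
}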
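As a rough Python sketch of the Gender[ctxt] rule, the check below flags adjectives that are candidates for the feature. It rests on assumptions: the function names are hypothetical, the ending is tested on the masculine written form, and vowel-initial nouns are approximated by the vowel letters used in the pattern. Because the rule says such adjectives are "often" homophonous, a True result marks a candidate for review, not a final annotation.

```python
# Endings listed in the rule above; vowel letters as in the pattern.
GENDER_AMBIGUOUS_ENDINGS = ("t", "s", "x", "n")
VOWEL_INITIALS = "aeiouéè"


def is_vowel_initial(form: str) -> bool:
    """True if the written form starts with one of the listed vowel letters."""
    return bool(form) and form[0].lower() in VOWEL_INITIALS


def gender_ctxt_candidate(masc_form: str, next_noun: str) -> bool:
    """True if the adjective ends in -t, -s, -x or -n and directly
    precedes a vowel-initial noun, i.e. a Gender[ctxt] candidate."""
    return (
        is_vowel_initial(next_noun)
        and masc_form.lower().endswith(GENDER_AMBIGUOUS_ENDINGS)
    )


print(gender_ctxt_candidate("petit", "ami"))  # True
print(gender_ctxt_candidate("bon", "ami"))    # True
print(gender_ctxt_candidate("grand", "ami"))  # False
print(gender_ctxt_candidate("petit", "chat")) # False
```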