Punctuation
#
Universal
#
Definition from the UD website
Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.
Punctuation is not taken to include logograms such as $, %, and §, which are instead tagged as SYM. (Hint: if it corresponds to a word that you pronounce, such as dollar or percent, it is SYM and not PUNCT.)
Spoken corpora contain symbols representing pauses, laughter and other sounds; we treat them as punctuation, too. In these cases it is not even required that all characters of the token be non-alphabetical. One can represent a pause using a special character such as #, or using a more descriptive code such as [:pause].
Examples
- Period: .
- Comma: ,
- Parentheses: ()
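The PUNCT/SYM distinction described above can be sketched as a small heuristic. This is only an illustration, not part of the UD guidelines or any UD tool: the function name `classify`, the `LOGOGRAMS` set, and the `SPOKEN_MARKERS` set are hypothetical, and the marker inventory in particular would have to be adapted to each spoken corpus. The logogram check has to run before the generic punctuation check, because Unicode itself classes % and § as punctuation.

```python
import unicodedata
from typing import Optional

# Logograms named in the definition: they are pronounced as words, so SYM.
LOGOGRAMS = {"$", "%", "§"}

# Hypothetical corpus-specific codes for pauses and other sounds (assumption).
SPOKEN_MARKERS = {"#", "[:pause]"}


def classify(token: str) -> Optional[str]:
    """Return "PUNCT", "SYM", or None when the token is neither."""
    # Spoken-corpus markers count as punctuation even if they contain letters.
    if token in SPOKEN_MARKERS:
        return "PUNCT"
    # Logograms come first: Unicode classes % and § as punctuation,
    # but the definition tags them as SYM.
    if token in LOGOGRAMS:
        return "SYM"
    # Otherwise, PUNCT if every character is in a Unicode punctuation category.
    if token and all(unicodedata.category(ch).startswith("P") for ch in token):
        return "PUNCT"
    return None


if __name__ == "__main__":
    for tok in [".", ",", "()", "$", "%", "[:pause]", "dollar"]:
        print(repr(tok), classify(tok))
```

A token that gets `None` here is simply outside the scope of this sketch and would receive some other part-of-speech tag.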
French
#
TODO
Overview
#
Specific Pattern
#
Haitian Creole
#
TODO
Overview
#
Specific Pattern
#