NooJ: A Linguistic Development Environment

Arabic module

Mesfar Slim, Université de Franche-Comté
The Arabic module includes:
-- one full dictionary named EL-DICAR (ELectronic DICtionary for ARabic) including more than 52000 lexical entries:
1/ 19504 nouns (N)
2/ 10375 verbs (V)
3/ 5816 adjectives (ADJ)
4/ 1236 particles (PREP, ADV, REL, DEM)
5/ 3686 location names (N+LOC)
6/ 11860 first names (N+Prenom)
-- one sample dictionary (_Example.dic) with its corresponding inflectional grammar (_Example.nof)
-- a file _properties.def that includes a listing of the dictionary codes, inflections, morphology and derivations
-- three morphological grammars:
1/ REQUIRED : Graph_Morpho.nom : a tokenization grammar to identify and annotate morphemes in agglutinated forms (this grammar has to be checked in Info > Preferences)
2/ OPTIONAL : Graph_Morpho_AlifToHamza.nom : a spelling correction grammar that converts Alif to Hamza (if used, this grammar should be assigned a low priority level)
3/ OPTIONAL : Graph_Morpho_HamzaToAlif.nom : a spelling correction grammar that converts Hamza to Alif (if used, this grammar should be assigned a low priority level)
-- one syntactic grammar: Graph_Timex.nog : a local grammar to recognize temporal expressions (date, hour, age, period, ...)
-- two texts :
1/ declaration_rights.not : The Universal Declaration of the Rights of Man and of the Citizen
2/ example.not : a collection of samples from newspapers that include temporal expressions
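
The two optional spelling correction grammars deal with a common variation in written Arabic: an Alif written with or without its Hamza. The Python sketch below only illustrates the underlying character mapping; it is not the NooJ grammar itself and does not model NooJ priority levels. The function names are hypothetical, and restricting the Alif-to-Hamza direction to word-initial position is an assumption made for this example.

```python
from typing import List

ALIF = "\u0627"              # ARABIC LETTER ALEF (bare Alif)
ALIF_HAMZA_ABOVE = "\u0623"  # ARABIC LETTER ALEF WITH HAMZA ABOVE
ALIF_HAMZA_BELOW = "\u0625"  # ARABIC LETTER ALEF WITH HAMZA BELOW

# Translation table that maps both Hamza-carrying Alif forms to a bare Alif.
_HAMZA_TO_ALIF = str.maketrans({
    ALIF_HAMZA_ABOVE: ALIF,
    ALIF_HAMZA_BELOW: ALIF,
})


def hamza_to_alif(word: str) -> str:
    """Replace every Hamza-carrying Alif in the word with a bare Alif."""
    return word.translate(_HAMZA_TO_ALIF)


def alif_to_hamza_candidates(word: str) -> List[str]:
    """Return candidate spellings for a word that starts with a bare Alif.

    Assumption for this sketch: only the word-initial Alif is considered.
    Each candidate would still have to be looked up in a dictionary.
    """
    if not word.startswith(ALIF):
        return []
    rest = word[1:]
    return [ALIF_HAMZA_ABOVE + rest, ALIF_HAMZA_BELOW + rest]
```

For instance, hamza_to_alif applied to the first name written with an initial Alif with Hamza above (U+0623 U+062D U+0645 U+062F) returns the same word spelled with a bare Alif, while alif_to_hamza_candidates applied to that bare-Alif spelling proposes both Hamza variants for lookup.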

