SIMPLE LE4-8346

WP01

 

 

 

SIMPLE - LEXICON DOCUMENTATION

 

* * *

Document first version date: 26/04/00
Document date: 28/04/00
DocumentID: WP1
Version: 02
Doc. type: QAP*
Document status: to be validated
Validation type:
Comments:

        Name                      Organisation    Purpose
From    Leiden team LEI           INL             documentation
To      Coordinators, Reviewer                    Documentation deliverable D.03.3.2

Lexicon Documentation DUTCH

 

0 Introduction

This documentation concerns the SIMPLE part (semantic layer) of the Dutch PAROLE lexicon. For extensive documentation on the Dutch PAROLE lexicon, we refer to the INL's website: www.inl.nl

Contents of this report:

1. General design information

    1.1 Lexicon population

    1.2 Current lexicon contents

    1.3 Sample of 100 entries

    1.4 Tools

    1.5 Impact of SIMPLE/PAROLE

    1.6 Remaining work

2. Semantic encoding

    2.1. Criteria for Syntax-Semantic linking

    2.2. Criteria for assigning Domain features

    2.3. Criteria for assigning Semantic class and template type

    2.4. Classes derived from encoding

    2.5. Representation of Predicative information

    2.6. Problems encountered

3. Statistics

4. Bibliography

5. List of Appendices

Appendices 1 – 4.

 

1. General design information

1.1. Lexicon population

As of 28 April 2000, the Dutch SIMPLE lexicon contains 10,472 semantic units (Usems): 7,326 noun Usems, 2,114 verb Usems and 1,032 adjective Usems. The 10,472 Usems are distributed over 3,710 lemmata (head words): 2,797 nouns, 559 verbs and 354 adjectives. Of these, 681 lemmata cover one or more Base Concepts (see below).

For each part of speech, the starting point for the list of lemmata to be provided with Usems was the set of English base concepts (BC) selected by the Linguistic Specification Group from the EuroWordnet lexicon (see the general SIMPLE documentation). For each BC, a set of related Dutch equivalents (near-synonyms) was chosen, keeping the concept in mind (to avoid mere translation of the English word). For a number of BC's, no appropriate Dutch equivalent could be found. The whole set of Dutch equivalents was checked for occurrence in the Dutch PAROLE lexicon, so that the semantic descriptions could be connected to syntactic descriptions in the PAROLE lexicon. For nouns, occurrence in the Dutch EuroWordnet lexicon could also be checked (with the cooperation of Piek Vossen), which resulted in a fine-tuned list of Dutch equivalents. For verbs and adjectives, this comparison was not possible for practical reasons. Per BC, a 'prototypical' Dutch equivalent was selected. There were three reasons for selecting more than one prototypical equivalent per BC: (1) 'real' synonyms, among which a choice would be too arbitrary; (2) no single lemma covering the BC could be found; (3) for nouns: the preferred prototypical lemma was at the time not in the Dutch EuroWordnet lexicon. In total, 681 SIMPLE lemmata cover one or more BC's: 371 nouns, 172 verbs and 138 adjectives. See appendix 1 for the noun, verb and adjective lemmata covering one or more BC's.

The lemmalists were extended on the basis of an automatically lemmatized type-frequency list derived from the Dutch PAROLE Reference corpus (the corpus itself is not yet lemmatized; cf. 1.5). This list was compared with the PAROLE lemma list. The list of matching entries was ranked from high to low frequency. In the range of lower frequencies, priority was given to lemmata corresponding with BC's.

 

Coverage and completeness

BC-meanings were covered by Dutch Usems as precisely as possible, in view of the multilingual links foreseen.

Other meanings per lemma were selected by consulting several medium-sized dictionaries of Dutch and, if necessary, other reference works (among others WordNet). The criterion generally applied was that meanings shared by at least two of the dictionaries were selected as Usems for SIMPLE, on the assumption that these meanings can be considered 'standard Dutch'. However, meanings that were considered outdated or obsolete (relics from older dictionaries) were not included. Domain specific meanings were included because of their importance for language technology (the domain field in SIMPLE).

Because of the rather limited size of the PAROLE lexicon (ca. 20,000 lemmata) and the SIMPLE lexicon (3,710 lemmata), target Usems in the qualia roles (formal, agentive, constitutive and telic) were determined on the basis of their suitability as target Usem, rather than their occurrence in the PAROLE or SIMPLE lexicon. The tables in 2.3.3. show that at the lemma level, target Usems are covered by the PAROLE lexicon between 76% and 85%, and by the (much smaller) SIMPLE lexicon between 36% and 50%.

 

1.2. Current Lexicon Contents

The standard templates delivered by the Specification Group were loaded into a database. For each Usem, the lexicographers fill in a database template form. The template forms are automatically converted into the SIMPLE SGML format, by software developed for the purpose. For further details about tools, see section 1.4. A lexicographer's manual (written in Dutch), which was of course based on the guidelines but specified some aspects and some working procedures, was used in order to enhance quality and consistency among the lexicographers.

Apart from the obligatory template fields, the lexicographers compiled the recommended 'type hierarchy information' ('template_supertype', 'unification_path'), 'polysemous class' and 'qualia roles' (formal, agentive, constitutive and telic), because these fields establish relationships between different (groups of) lemmata irrespective of Part of Speech, and more specifically between specific meanings of lemmata. It is just this feature that is missing, or only partially or not systematically treated, in dictionaries. These relationships in particular will be used in one of our institutional projects in the longer term (see 1.5.).

Up to now, the obligatory fields and the recommended fields mentioned above have been compiled for all noun Usems. For the verb and adjective Usems, the recommended and obligatory fields have been compiled, but the obligatory fields 'predicative representation', 'selectional restrictions' and the link with syntax have not yet been finished (this is reported on in the last bi-monthly). This is mainly due to the complexity of these fields, the time we needed to become familiar with the matter and its representation in SGML, and the problems we met (cf. 2.1, 2.5). However, work is now proceeding without major problems, and completion is guaranteed because permanent staff with thorough knowledge of the matter are working on it. Additionally, we found a way to continue contracts with temporary staff (see last bi-monthly). Apart from the 7,326 noun Usems, ca. 450 verb Usems and ca. 150 adjective Usems now have these fields finished.

Appendices 1-4 show more detailed information on the current lexicon contents.

 

1.3. Sample of 100 entries

The Dutch sample of 100 entries has the following characteristics.

As a starting point for the composition of the entry list, we adopted the distribution figures applied in SIMPLE: 70% noun, 20% verb and 10% adjective Usems. The sample contains 71 noun entries with 263 Usems, 20 verb entries with 90 Usems and 10 adjective entries with 43 Usems. In addition to the 20 verb entries, 9 Usems of 6 verbs that have a master link with noun Usems in the sample have been included, which results in a total of 99 verb Usems. The sample contains a total of 405 Usems.

The noun and adjective entries are the most frequent ones from the corresponding lemmalists (cf. 1.1.). Because of the short preparation time for the review, the selection of the verbs is less elegant: we selected 2 x 10 verbs from 2 working files used for the compilation of the predicative fields (cf. 1.2). In order to demonstrate the non-master/master link between noun and verb Usems, the 9 Usems of the corresponding 6 verbs were added. These noun/verb Usem pairs concern: begin_1 ~ beginnen_4; gebruik_1 ~ gebruiken_1; leven_1, _2, _6, _7, _8 ~ leven_1, _6; onderzoek_1 ~ onderzoeken_1; werk_1, _4, _5, _6, _7 ~ werken_1, _2; wil_1, _2, _3 ~ willen_1, _3.

See appendices 2B-4B for the number of Usems per template type, per domain and semantic class for the Usems in this sample, separately for each Part of Speech.

 

1.4. Tools

Software tools used by INL for SIMPLE are:

 

Database load tool

This tool (written in Perl) converts the standard 'skeleton' templates provided by the Linguistic Specification Group into an SQL command file, mainly consisting of 'insert' statements. The tool searches for the relevant attribute-value pairs and builds the appropriate insert statement for them. The attribute-value pairs are easy to recognise, as they are mostly in the format attribute:value.

When the command file is executed against the SIMPLE database, the skeleton template is loaded.

As an example we take the template for artwork. For this example we assume that it only consists of:

Usem: 1

Template_Type: [Artwork]

Unification_path: [Concrete_entity | ArtifactAgentive | Telic]

Applying the tool results in the following insert statement:

insert into template (USEM, TEMPLATE_TYPE, UNIFICATION_PATH) values (1,'[Artwork]', '[Concrete_entity | ArtifactAgentive | Telic]');
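The conversion step described above can be sketched as follows. This is a minimal illustration in Python, not the INL Perl tool itself: the function name `template_to_insert`, the regular expression and the quoting rules are assumptions made for this example, and the output differs from the statement above only in spacing.

```python
import re

# Assumed pattern for an 'attribute: value' line of a skeleton template.
PAIR = re.compile(r"^\s*([A-Za-z_]+)\s*:\s*(.+?)\s*$")

def template_to_insert(lines, table="template"):
    """Build one SQL insert statement from attribute:value lines (hypothetical helper)."""
    columns, values = [], []
    for line in lines:
        match = PAIR.match(line)
        if not match:
            continue  # skip blank lines and anything that is not a pair
        attribute, value = match.groups()
        columns.append(attribute.upper())
        if value.isdigit():
            values.append(value)  # numeric values stay unquoted
        else:
            # quote strings and double any embedded single quote
            values.append("'" + value.replace("'", "''") + "'")
    return "insert into {} ({}) values ({});".format(
        table, ", ".join(columns), ", ".join(values))

# The reduced artwork template from the example above.
artwork = [
    "Usem: 1",
    "Template_Type: [Artwork]",
    "Unification_path: [Concrete_entity | ArtifactAgentive | Telic]",
]
statement = template_to_insert(artwork)
print(statement)
```

A real load tool would also have to handle the attribute-value pairs that do not follow the attribute:value format, which this sketch simply skips.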

 

Data entry tool

This tool is a data entry form similar to the standard templates, built with Uniface, a visual development environment available for a variety of platforms and database systems. Although we do not use the latest release, our release is still perfectly suited for the SIMPLE tasks.

The main functionality for SIMPLE was to enable the lexicographer to add, modify or delete information about Usems. Nearly all this functionality could be generated with Uniface, so we only had to write some additional code for e.g. checking purposes.

 

Report tools

Report tools (all written in Perl) are used to collect information from the SIMPLE database which is not easy to collect by means of an SQL query. A typical approach is to obtain a set of data from the database and then apply the appropriate report tool.

As an example we look at the report tool for obtaining the number of Usems per domain. One Usem can have more than one value for domain (separated by commas). We need the separate values, so we have to split the domain field. As this cannot be done by SQL, we use a report tool. For example, take the following data set obtained from the database (one row represents one Usem):

POLITICS_AND_GOVERNMENT, HISTORY, MONARCHY

FINANCE, ECONOMICS

FINANCE, SOCIOLOGY, GENERAL

POLITICS_AND_GOVERNMENT, GENERAL

The result after applying the tool:

DOMAIN                     FREQUENCY
ECONOMICS                  1
FINANCE                    2
GENERAL                    2
HISTORY                    1
MONARCHY                   1
POLITICS_AND_GOVERNMENT    2
SOCIOLOGY                  1
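The splitting and counting performed by such a report tool can be sketched as follows. This is an illustrative Python version, not the Perl tool used at INL; the function name `domain_frequencies` is an assumption made for this example, and the input rows are the four rows of the data set shown above.

```python
from collections import Counter

def domain_frequencies(rows):
    """Count Usems per domain value; each row holds the domain field of one Usem."""
    counts = Counter()
    for row in rows:
        for domain in row.split(","):
            domain = domain.strip()
            if domain:  # ignore empty pieces caused by stray commas
                counts[domain] += 1
    return sorted(counts.items())  # alphabetical by domain name

rows = [
    "POLITICS_AND_GOVERNMENT, HISTORY, MONARCHY",
    "FINANCE, ECONOMICS",
    "FINANCE, SOCIOLOGY, GENERAL",
    "POLITICS_AND_GOVERNMENT, GENERAL",
]
for domain, frequency in domain_frequencies(rows):
    print(f"{domain:<27}{frequency}")
```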

 

Conversion to SGML

The software for conversion to SGML consists of a large suite of Perl and C++ programs that convert the database contents to SGML and establish the link with the PAROLE lexicon. The conversion involves the correction of formal errors (caused by manual work in the database template forms) and the creation of additional SGML objects. The latter is automated; the former is automated as far as possible.

 

1.5. Impact of SIMPLE/PAROLE

The PAROLE lexicon is distributed by ELRA, and, for researchers in the Netherlands and Belgium only, by our institute. The PAROLE lexicon (without the SIMPLE part) is furthermore used in two (inter)national projects: the Dutch-Flemish project Corpus Gesproken Nederlands (Corpus of Spoken Dutch, comparable with BNC) and the Dutch project ToKeN2000, which aims at a sophisticated knowledge retrieval system, including modules concerning automatic language generation and spoken answers to questions by users. The latter project will also use the SIMPLE part of PAROLE.

The SIMPLE data will be used in the institutional project Integrated Language Database of 8th-21st Century Dutch, a long-term project approved by the Dutch and Flemish governments. This project aims at creating a database in which data from linguistically annotated texts, electronic dictionaries and linguistic files will be linked in a meaningful way, in order to function as an instrument for research into the Dutch language and culture throughout the centuries. The SIMPLE data establishing relationships between different (meanings of) lemmata are especially interesting for this project (cf. 1.2) and will be further developed in this framework. Furthermore, the PAROLE POS tagset will, with some extensions for the historical periods, probably be used for POS tagging of the texts in the database.

A current activity of our institute is to make the PAROLE corpus accessible over the Internet in a way similar to (but more modern than) the three INL corpora already operational (see www.inl.nl). Work is in progress on automatic lemmatization, POS tagging according to the PAROLE tagset and global syntactic tagging. A retrieval system is being developed which will give access to this corpus on linguistic parameters.

 

1.6. Remaining work

As explained in section 1.2., compilation of the fields 'predicative representation', 'selectional restrictions' and the link with syntax has not yet been finished for verbs and adjectives, but completion is guaranteed because permanent staff with thorough knowledge of the matter are working on it. Additionally, we found a way to continue contracts with temporary staff (see last bi-monthly).

Permanent staff will be concerned with checks on quality and consistency.

 

2. Semantic encoding

 

2.1. Criteria for Syntax-Semantic linking

General criteria for assigning readings to syntactic descriptions.

Syntax is linked to semantics by connecting the positions of the complementation frames of the PAROLE lexicon entries with the arguments of the SIMPLE predicates. The starting point for the link between Usem and Usyn is therefore the set of syntactic descriptions (if any) of a lemma in the PAROLE lexicon. These syntactic descriptions are corpus based. In order to establish the connection, we first had to decide on the semantic correspondence of each syntactic complementation frame of the lexicon entry with the specific argument structure of the related Usem. Secondly, we had to decide which of the complements of a frame could actually be semantically linked to an argument. These decisions were made by interpreting the relevant predicate and then comparing it with the example phrases belonging to the syntactic complementation frames in question, in order to see whether a correspondence could be established.

Criteria for defining language particular 'correspondences' for predicative SynUs.

Adjective: link between syntax and semantics

In the Dutch PAROLE lexicon, the basic fact of an adjective determining a noun is considered to belong to the grammar. In the syntactic frames, as a consequence, there is no position available for the noun in question. So, in our lexicon, adjectival complementation as found in: the man is angry with me, is described in the grammatically correct phrasing: The ‘with me angry’ man, where the entry ‘angry’ has a one-place frame with the prepositional complement PP(with) on the first and only position (P0).

For that reason, up to now, we described adjectival predication as one-place predicates:

    angry_1(<arg0>),

    where <arg0>=PP(with) with selectional restriction [Living Entity]

This implies that we do not have the possibility to describe what kind of noun (for example: only living entity) the adjective selects for. Moreover, subject clauses which have the same P0 position in the frame, like:

    it is good for me to see you (read: ‘to see you’ is good for me)

    it is easy for me to do that (read: ‘to do that’ is easy for me)

cannot be linked to a semantic argument either.

This fact causes a considerable loss of semantic information and we plan to change the one-place predicates into two-place predicates:

He is angry with me, the ‘with me angry’ man

    angry_1(<arg0>)(<arg1>),

    where <arg0>=NP-[Living_Entity] and <arg1>=PP(with) [Living_Entity]

As a consequence, we can only link <arg1> to a syntactic position and have to leave <arg0> unlinked, which is allowed in SIMPLE.

Prototypicality prevails over corpus based presence

As said above, the selection of syntactic complements in the PAROLE lexicon has been based on their frequency in our corpora. Sometimes, however, we considered a certain complementation pattern to be non-prototypical from a semantic point of view. In these cases, the predicate differs from the corpus based complementation patterns. As a consequence, the ID shows that the non-prototypical complementation pattern is possible, but the respective positions have not been linked to an argument. Cf. boek_4 (book_4) with gloss ‘deel van een meerdelig boekwerk’ (part of a multipartite work). We chose a predicate with only one (prototypical) argument: the name of the book, as in the book Genesis. The two possible syntactic complements PP(van), naming the author of a book, and PP(over), naming the topic the book is about, are not considered prototypical for that specific noun Usem and are therefore not linked to the only available argument.

Selectional restriction prevents linking

Sometimes different syntactic complements (e.g. an N or a PP) might be linked to one semantic argument, but this is not possible because of the selectional restriction on the argument, e.g.:

    nota_3(<arg0>)(<arg1>) de nota van de minister over het kunstbeleid (the note of the minister on art management)

where <arg0> = Role_ProtoAgent and <arg1> = Role_ProtoPatient

The PAROLE lexicon has three descriptions for nota:

(1) de nota Kunstbeleid (the note (called) Art management)
(2) de nota van de minister (the note of the minister)
(3) de nota over het kunstbeleid (the note on art management)

Only two syntactic frames (2 and 3) can be linked. The complement in (1) cannot be linked to <arg1> because of its role: Role_Adjunct instead of Role_ProtoPatient.
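The effect of this restriction can be illustrated with a small sketch. The class names, the function `link_complements` and the simplification that a complement is linked only to an argument carrying the same role are assumptions made for this example; they are not the SIMPLE SGML representation. The roles given to the complements in frames (2) and (3) are likewise assumed, since the text only states the role for frame (1).

```python
from dataclasses import dataclass

@dataclass
class Argument:
    name: str   # e.g. "arg0"
    role: str   # semantic role of the argument

@dataclass
class Complement:
    frame: int      # number of the PAROLE description
    category: str   # syntactic category of the complement
    role: str       # role the complement would express

def link_complements(arguments, complements):
    """Map each frame number to an argument name, or None if no role matches."""
    by_role = {arg.role: arg.name for arg in arguments}
    return {c.frame: by_role.get(c.role) for c in complements}

# nota_3(<arg0>)(<arg1>) as given in the text.
nota_3 = [
    Argument("arg0", "Role_ProtoAgent"),
    Argument("arg1", "Role_ProtoPatient"),
]
frames = [
    Complement(1, "N", "Role_Adjunct"),              # de nota Kunstbeleid
    Complement(2, "PP(van)", "Role_ProtoAgent"),     # assumed role
    Complement(3, "PP(over)", "Role_ProtoPatient"),  # assumed role
]
links = link_complements(nota_3, frames)
print(links)
```

In this sketch frame (1) stays unlinked because no argument of nota_3 carries Role_Adjunct, while frames (2) and (3) are linked to <arg0> and <arg1> respectively.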

 

2.2. Criteria for assigning Domain Features

The general SIMPLE criterion for the selection of a domain value from the SIMPLE domain list is the topic of texts in which a Usem usually appears, or is most likely to appear. The most specific domain in the hierarchy was selected. If no suitable specific domain was available (e.g. a particular branch of sport), the immediate node was selected. Usems of very common usage have the value 'general'. A Usem can have one or more domain values. If a Usem is both domain specific (i.e. likely to appear in domain specific texts) and of very common usage, both specific domain values and the value 'general' are assigned. See appendix 3A for frequencies of Usems per domain value, separately for each Part of Speech (3.A.1-3.A.3).

 

2.3. Criteria for assigning Semantic Class and Template Type

The principles for the selection of a semantic class from the SIMPLE hierarchy were essentially the same as for domain: whenever possible the most specific one. Only one value is either assigned or selected from the list. A semantic class value can be specified by a distinctive feature. See appendix 4A for frequencies per semantic class, separately for each Part of Speech (4.A.1-4.A.3).

The selection of a template type was based on knowledge of the ontology and on the linguistic tests and the information per template type provided by the Specification Group. Furthermore, the suitability of the qualia roles and their values was used as a check on the correct template type: the selection was probably wrong if the qualia roles were not suitable for the particular meaning of a Usem. Irrespective of its status as core or recommended, the template type which was considered most suitable and most specific was selected.

After the final and complete set of template types was provided by the Specification Group, the lexicographers checked the selection of template type for noun Usems corresponding with Base Concepts, as at the time of compilation the set of template types was much smaller.

 

2.3.1. Language specific typing

No language specific typing was applied.

 

2.3.2. Template subtyping for language specific encoding

No language specific template subtyping was applied.

 

2.3.3. Criteria for encoding Semantic Relations

Relations. In the standard templates, template-specific relations for the qualia roles are 'predefined', with an indication of their status (type-defining, optional). These were taken as the starting point. Whenever a predefined relation could not be filled in meaningfully, this is marked as such in the database and the relation is not implemented in the SGML file. Relations were added if they were judged essential for the particular Usem. These cases are marked as such in the database and implemented in the SGML file.

Choice of target Usems. As said above (1.1), due to the limited size of both the SIMPLE lexicon and the PAROLE lexicon, it was considered undesirable to select only target Usems that are in those lexicons. For this reason, our first criterion for the selection of target Usems was suitability for the specific relation between the Usem and its target Usem.

The current coverage of target Usems in the lexicons is as follows. Per Part of Speech, the target Usems were collected and compared with the complete entry list of the PAROLE lexicon and the SIMPLE lexicon, respectively. Note that 'target lemmata' are lemmatized target Usems; that is, a target lemma includes all target Usems with an identical word form.

NOUNS

Number of target Usems                      18,523
Number of target lemmata                    3,860
Target lemmata covered by PAROLE lexicon    2,939 (76%)
Target lemmata covered by SIMPLE lexicon    1,583 (41%)

VERBS

Number of target Usems                      4,211
Number of target lemmata                    1,099
Target lemmata covered by PAROLE lexicon    944 (85%)
Target lemmata covered by SIMPLE lexicon    556 (50%)

ADJECTIVES

Number of target Usems                      1,068
Number of target lemmata                    471
Target lemmata covered by PAROLE lexicon    378 (80%)
Target lemmata covered by SIMPLE lexicon    170 (36%)
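The coverage percentages follow directly from the lemma counts. The sketch below recomputes them in Python; the function name `coverage_percent` is an assumption made for this example. The published figures appear to be truncated to whole percentages rather than rounded: rounding would give 86% and 51% for the verbs, whereas truncation reproduces all six figures.

```python
def coverage_percent(covered, total):
    """Whole-number percentage of target lemmata covered, truncated (not rounded)."""
    return covered * 100 // total

# (target lemmata, covered by PAROLE, covered by SIMPLE), taken from the tables above
targets = {
    "NOUNS": (3860, 2939, 1583),
    "VERBS": (1099, 944, 556),
    "ADJECTIVES": (471, 378, 170),
}
coverage = {
    pos: (coverage_percent(parole, total), coverage_percent(simple, total))
    for pos, (total, parole, simple) in targets.items()
}
for pos, (parole_pct, simple_pct) in coverage.items():
    print(f"{pos:<12}PAROLE {parole_pct}%  SIMPLE {simple_pct}%")
```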

 

 

2.3.4. Criteria for encoding Derivation Relations (optional)

Not applicable. This field has not been compiled.

 

2.4. Classes derived from encoding

Polysemous relations given in the standard templates provided by the Specification Group were judged for their suitability for a particular Usem. If a relation was not suitable, it is marked as such in the database template field and not implemented in the SGML file.

Only polysemous relations that have a counterpart in another Usem of the lemma concerned are implemented in the SGML file.

Synonymy and hyponymy are not (yet) encoded in our lexicon.

 

2.5. Representation of Predicative information.

 

Type of link: Master/Non-master

The first step on the predication path was to decide about a Usem’s Type of Link with another Part of Speech item. Our starting point in this has been the basic SIMPLE assumption (Del. 2.1, p34/35): "We assume that verbs and adjectives always have a master relation with the predicate". For that reason even denominal verbs and adjectives are considered to be Master (where their true (etymological) derivation is to be described in the derivation field of the template). Nouns therefore were the only Part of Speech to be decided upon. As a consequence of what has been said, noun Usems are only Master if they cannot be semantically related to an adjectival or verbal predicate. All other noun Usems are non-Master, independently of the factual etymological relation between the noun and the verb or the noun and the adjective.

 

Some criteria for Non-Master usems

Formal resemblance prevails

Whenever a noun Usem could be semantically related to the predication of two different verbs, the verb closest in meaning was taken. In the case of:

    gevoel~ gevoelen|voelen

    feeling~feel|feel

we chose ‘voelen’.

Verb prevails over Adjective

Whenever a noun Usem could be semantically related to a verbal as well as to an adjectival predicate, the verb was given priority, cf:

    associatie ~ associeren/associatief

    association~ associate/associative

Two or more possible adjectives

Whenever a single noun Usem could be semantically related to two (or more) adjectives, the most meaningful one was chosen (the suffix –achtig (‘-like’) can be added to nearly every noun):

monster ~ monsterlijk/monsterachtig (we chose ‘monsterlijk’)

mos ~ mossig/mosachtig (we chose ‘mossig’)

paniek ~ panisch/paniekerig/paniekachtig (we chose ‘panisch’)

Relation of a noun to itself

Sometimes a noun might be semantically related to an adjective, but this adjective is in fact the noun itself in an adjectival function, cf:

    moslim_(noun), moslim_(adj) in: een moslim soldaat/aanval (the ‘real’ adjective: ‘moslims’ is not used)

    model_(noun), model_(adj) in: een model vader (an exemplary father)

We considered the nouns model and moslim to be Master.

Relation of a noun to another noun: Master

Whenever a noun Usem could only be semantically related to another noun, we considered that noun Usem to be a Master, cf:

    mensheid_2 ~ mens (manhood ~man)

    ministerie~minister (ministry~minister)
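The decision on the Type of Link described in this section can be summarised in a short sketch. The function `type_of_link` and its arguments are hypothetical names introduced for this illustration. The candidate lists are assumed to contain only true verbal or adjectival predicates, so a noun used in adjectival function (as with moslim and model) or a related noun (as with ministerie) does not count. Choosing among several candidate verbs or adjectives remains a lexicographer's judgement and is not modelled here.

```python
def type_of_link(pos, related_verbs=(), related_adjectives=()):
    """Return (type of link, Part of Speech of the related predicate or None)."""
    if pos in ("verb", "adjective"):
        return ("Master", None)           # verbs and adjectives are always Master
    if related_verbs:
        return ("Non-master", "verb")     # verb prevails over adjective
    if related_adjectives:
        return ("Non-master", "adjective")
    return ("Master", None)               # noun with no verbal or adjectival predicate

# Examples taken from the text above.
associatie = type_of_link("noun", ["associeren"], ["associatief"])
monster = type_of_link("noun", related_adjectives=["monsterlijk"])
ministerie = type_of_link("noun")  # only related to the noun 'minister'
print(associatie, monster, ministerie)
```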

Some criteria for predication

Prototypicality prevails over maximalization

The Guidelines propose to maximize predication. We restricted this principle somewhat by the criterion of prototypicality. For example, in many cases locative arguments are possible, and they are often part of the PAROLE descriptions of nouns. We have, however, only included them in the SIMPLE predication if they were prototypical for the Usem in question. Take for example ‘artikel’ (article), where we consider the locative argument to be prototypical, because ‘artikel_2’, the item published in a paper, is different from, e.g., ‘artikel_3’, an item which only shows up in dictionaries.

Semantic Role

As the semantic roles mentioned in the SIMPLE Guidelines are merely syntactically defined, we needed a more semantically driven notion in order to decide upon the semantic role of verbal arguments. For example, syntactic subjects vary considerably where their semantic role is concerned. They can have the semantic role ProtoAgent, ProtoPatient or Underspecified. So we introduced the notion of ‘Control’ to decide whether a syntactic subject had the role ProtoAgent (cf. Oppentocht 1999). Our working definition of this role is: "someone having control over the act expressed in the verbal or nominal predicate". According to the Guidelines, the direct object is always attributed the ProtoPatient role. We missed the possibility of distinguishing whether the object is ‘undergoing the event’ or ‘benefiting from the event’. Moreover, syntactic subjects may also have Role_ProtoPatient, cf.

    the action starts today, the boat is sinking, the storm died down, my cousin hates me

So in addition to the Guidelines definition: "for the direct object and strongly bound prepositional complements", we supplied "and for those subjects which experience or undergo the event expressed by the nominal or verbal predicate".

As a consequence Role_Underspecified is often attributed to subjects which neither control nor experience or undergo the action expressed in the verbal or nominal predicate: cf

    the storm caused great damage

    his anger showed how much he was affected by this

In addition to the closed list of Semantic Roles, we would be very happy with the semantic roles Beneficiary and Goal.

 

2.6. Problems encountered.

Problems are reported on in the respective sections.

 

3. Statistics

See appendices 1 and 2A-4A for information concerning the complete dataset. See appendices 2B-4B for information about the delivered 100 entries.

 

4. Bibliography

Oppentocht, L. (1999), Lexical Semantic Classification of Dutch verbs. Towards constructing NLP and human-friendly definitions. Ph.D. dissertation, University of Leiden, The Netherlands.

 

5. List of appendices

Appendix 1: List of lemmata covering one or more base concepts, for nouns (1.1), verbs (1.2) and adjectives (1.3), respectively.

 

Appendix 2.A: Number of Usems per template type in the complete dataset, for nouns (2.A.1), verbs (2.A.2) and adjectives (2.A.3), respectively.

Appendix 2.B: Number of Usems per template type in the sample of 100 entries, for nouns (2.B.1), verbs (2.B.2) and adjectives (2.B.3), respectively.

 

Appendix 3.A: Number of Usems per Domain in the complete dataset, for nouns (3.A.1), verbs (3.A.2) and adjectives (3.A.3), respectively.

Appendix 3.B: Number of Usems per Domain in the sample of 100 entries, for nouns (3.B.1), verbs (3.B.2) and adjectives (3.B.3), respectively.

 

Appendix 4.A: Number of Usems per Semantic Class in the complete dataset, for nouns (4.A.1), verbs (4.A.2) and adjectives (4.A.3), respectively.

Appendix 4.B: Number of Usems per Semantic Class in the sample of 100 entries, for nouns (4.B.1), verbs (4.B.2) and adjectives (4.B.3), respectively.

 

 

Appendix 1.1: List of NOUN lemmata covering one or more base concepts

aantal

aanval

aarde

actie

activiteit

afbeelding

afdeling

affiche

agglomeratie

akte

alternatief

apparaat

argumentatie

arts

associatie

auto

avond

baan

bedrag

bedrijf

behandeling

behoefte

bekende

beleid

belevingswereld

bemanning

bericht

bestraffing

bestuur

bevel

beweging

bewering

bezitting

bezoek

bezoeker

biografie

bloedvat

boek

boot

bouw

bouwgrond

bouwmateriaal

brandweer

brief

brochure

buis

buitenkant

bureau

cel

cijfer

club

commercie

commissie

communicatiemiddel

computerprogramma

constructie

cursus

daad

dag

dans

database

denkproces

deskundige

dessin

ding

discipline

district

doel

doelwit

domein

doorgang

drank

drankje

eenheid

eensgezindheid

eigendom

eigenschap

eind

einde

element

etmaal

exemplaar

expert

fabriek

familielid

feest

feestdag

figuur

fout

functie

functionaris

gang

gas

gebaar

gebeurtenis

gebied

gebouw

gedeelte

gedrag

gedragscode

gegeven

geheel

geld

gelijke

geloof

gelovige

geluid

gemeenschap

gemeente

geneesmiddel

gepeins

geslacht

getal

gevoel

gevolg

gewaarwording

gewoonte

gezelschap

god

godsdienst

groep

grond

grondslag

grootte

haar

handeling

hoeveelheid

hond

hoofd

hout

huid

huis

huishouden

hulp

hulpverlener

ideologie

inboedel

informatie

inhoud

instelling

instituut

instrument

jaar

jongen

kaart

kamer

kant

kantoor

karakteristiek

keer

kenmerk

kerk

keuze

kind

kleur

kuil

kunst

kunstenaar

kwestie

land

landbouwproduct

leider

letsel

letter

leven

lichaam

lichaamsdeel

licht

lichtbron

lied

lijn

maand

machine

macht

machthebber

man

manager

manier

materiaal

medewerker

medicijn

mensheid

menu

methode

militair

mislukking

mogelijkheid

moment

musicus

muziek

muziekstuk

naam

nacht

naslagwerk

natuur

natuurkunde

natuurwetenschap

niveau

nummer

olie

omtrek

onderdeel

onderneming

onderwerp

ontwikkeling

oorlog

oorlogvoering

oorzaak

operatie

oppervlak

organisatie

organisme

overeenkomst

overheid

paard

papier

parcours

partij

peil

periode

pijpleiding

plaat

plaats

plicht

positie

poster

preparaat

prestatie

probleem

procedure

proces

product

productie

programma

programmatuur

prospectus

provincie

rand

reactie

redenering

regel

regering

regio

reis

relatie

resultaat

richting

rol

route

ruimte

samenstelling

samenvoeging

schilderij

school

schrijver

segment

seks

serie

set

situatie

software

sokkel

soldaat

soort

sport

sportveld

spraak

staat

stad

stadium

stadsgewest

status

stel

stem

stemgeluid

steun

stijl

stoel

strategie

structuur

struik

substantie

systeem

taak

taal

tafel

tas

team

teken

tekst

telefoonnummer

terrein

thema

theorie

tijd

tijdstip

titel

toename

toestand

transformatie

transportmiddel

trend

trilling

uiteinde

uiterlijk

unie

unit

universum

vaardigheid

vaartuig

vacht

vakgebied

valuta

vel

veld

verandering

verband

verbond

vereniging

verklaring

verlies

vermogen

verplaatsing

verrichting

vertegenwoordiging

vertoning

vertrek

vervoermiddel

verzameling

vis

visitekaartje

vlakte

voedselvoorraad

vogel

volk

volksvertegenwoordiging

voorrecht

voorstelling

voorwerp

voorzitter

vorm

vorming

vriend

vriendin

vrouw

vrucht

wapen

water

wedstrijd

weer

weg

werk

werkdag

werknemer

werkplek

werkwijze

wezen

wijn

wijziging

wildgroei

wind

winkel

woning

woord

zaak

zak

ziekte

zijde

zin

zone

zorg

Appendix 1.2: List of VERB lemmata covering one or more base concepts

aankunnen

aantonen

afbreken

afnemen

bedekken

bedoelen

bedriegen

begrijpen

behandelen

beheersen

bekijken

beoordelen

bepalen

bereiken

berekenen

beschadigen

beschermen

beschrijven

besluiten

bespreken

bestaan

besteden

betalen

betreffen

bevestigen

bewegen

bezeren

bezitten

bezorgen

beëindigen

binnengaan

blijven

breken

brengen

concluderen

creëren

dalen

delen

denken

doden

doen

doodgaan

doorgaan

draaien

dragen

duwen

eindigen

ervaren

eten

fabriceren

gaan

gebeuren

gedragen

geven

halen

handelen

hanteren

hebben

helpen

herinneren

herscheppen

houden

identificeren

komen

kopen

krijgen

kwijtraken

laten

leggen

leiden

leven

leveren

lijken

lopen

maken

markeren

nadenken

nemen

ondergaan

onderscheiden

ontdekken

onthouden

ontzien

opdelen

opgeven

ophouden

opwinden

ordenen

overeenkomen

overeenstemmen

pakken

plannen

proberen

raken

rangschikken

samenwerken

scheiden

scheppen

schoonmaken

schrijven

slaan

sluiten

spelen

sterven

steunen

sturen

toelaten

toenemen

toepassen

toestaan

toevoegen

tonen

transformeren

treffen

uiten

uitleggen

uitspreken

uitvoeren

vallen

variëren

vastleggen

vastmaken

vechten

veranderen

verbeteren

verbinden

verbrokkelen

verdelen

vergroten

verhinderen

verklaren

verkopen

verlangen

verliezen

verminderen

veroorzaken

verplaatsen

verschaffen

verslechteren

versnipperen

versplinteren

verspreiden

vertegenwoordigen

vertellen

vertrekken

vervormen

verwachten

verwijderen

verzamelen

verzoeken

verzorgen

vinden

voelen

voltooien

voorbijgaan

voortzetten

vormen

vragen

vullen

wachten

wegdoen

weggaan

werken

weten

wijzigen

willen

worden

zakken

zeggen

zetten

zien

zijn

Appendix 1.3: List of ADJECTIVE lemmata covering one or more base concepts

aanwezig

aardig

afgelopen

algemeen

arm

behaaglijk

belangrijk

belangwekkend

binnenlands

breed

commercieel

compleet

correct

cruciaal

cultureel

dagelijks

democratisch

dichtbij

diep

direct

donker

dood

doorzichtig

duidelijk

duur

echt

economisch

eender

eenvoudig

eerlijk

effectief

enkel

essentieel

federaal

financieel

gebruikelijk

gehard

gelijk

gelukkig

gemakkelijk

gereed

geslaagd

gevaarlijk

gewoon

gezamenlijk

groot

hard

hartelijk

hecht

heftig

helder

huidig

huiselijk

huishoudelijk

identiek

individueel

industrieel

interessant

internationaal

jong

juist

juridisch

klaar

klein

koel

koninklijk

kort

koud

krachtig

laat

landelijk

lang

lokaal

makkelijk

manlijk

mannelijk

medisch

menselijk

microscopisch

militair

moeilijk

mogelijk

mooi

nationaal

nieuw

nobel

normaal

nucleair

nuttig

onafhankelijk

onjuist

onmogelijk

onwaarschijnlijk

oorspronkelijk

open

oud

perfect

plaatselijk

politiek

populair

prachtig

professioneel

publiek

rechtstreeks

regionaal

rijk

seksueel

serieus

significant

snel

sociaal

sterk

succesvol

toekomstig

traditioneel

veilig

verantwoordelijk

verkeerd

verleden

vermoedelijk

vers

verschillend

verschuldigd

volledig

volmaakt

vrij

vroeg

waar

waarschijnlijk

warm

werkelijk

wezenlijk

zacht

zeker

ziek

zwaar

zwak

zwart

Appendix 2.A.1: Number of Usems per template type in the complete dataset for NOUNS

 

Template_type Freq.

[3_D_Location] 45

[Abstract_Entity] 172

[Acquire_knowledge] 8

[Act] 16

[Agent_of_persistent_activity] 96

[Agent_of_temporary_activity] 126

[Agentive] 23

[Air_Animal] 14

[Amount] 157

[Animal] 11

[Area] 59

[Artifact] 226

[Artifact_Food] 22

[Artifactual_area] 55

[Artifactual_drink] 13

[Artifactual_material] 18

[Artwork] 24

[Aspectual] 25

[Body_part] 111

[Building] 101

[Cause_act] 3

[Cause_aspectual] 10

[Cause_change] 13

[Cause_change_location] 21

[Cause_change_of_state] 49

[Cause_change_of_value] 14

[Cause_constitutive_change] 17

[Cause_experience_event] 17

[Cause_motion] 9

[Cause_natural_transition] 5

[Cause_relational_change] 15

[Change] 15

[Change_of_location] 38

[Change_of_possession] 10

[Change_of_state] 18

[Change_of_value] 20

[Clothing] 24

[Cognitive_event] 89

[Cognitive_fact] 87

[Color] 3

[Commissive_speech_act] 1

[Concrete_entity] 84

[Constitutive] 43

[Constitutive_change] 7

[Constitutive_state] 17

[Container] 61

[Convention] 101

[Cooperative_activity] 119

[Cooperative_speech_act] 12

[Copy_creation] 5

[Creation] 10

[Declarative_speech_act] 15

[Directive_speech_act] 18

[Disease] 24

[Domain] 84

[Drink] 6

[Earth-Animal] 30

[Entity] 108

[Event] 104

[Exist] 8

[Experience_event] 70

[Expressive_speech_act] 7

[Flavouring] 2

[Flower] 4

[Food] 15

[Fruit] 7

[Furniture] 19

[Geopolitical_Location] 40

[Give_knowledge] 37

[Group] 140

[Human] 182

[Human_Group] 224

[Identificational_state] 18

[Ideo] 17

[Information] 243

[Institution] 202

[Instrument] 100

[Judgement] 4

[Kinship] 39

[Language] 4

[Living_entity] 7

[Location] 221

[Material] 6

[Mental_creation] 4

[Micro_organism] 2

[Modal_event] 23

[Money] 18

[Moral_standard] 11

[Move] 47

[Movement_of_thought] 13

[Natural_substance] 46

[Natural_transition] 9

[Non_relational_act] 5

[Number] 12

[Opening] 25

[Organic_object] 13

[Part] 404

[People] 7

[Perception] 16

[Phenomenon] 106

[Physical_creation] 8

[Physical_object] 12

[Physical_power] 9

[Physical_property] 63

[Plant] 33

[Profession] 183

[Proper_noun] 3

[Property] 112

[Psych_property] 50

[Psychological_event] 28

[Purpose_act] 270

[Quality] 37

[Relational_act] 196

[Relational_change] 13

[Relational_state] 94

[Reporting_event] 25

[Representation] 135

[Role] 39

[Semiotic_artifact] 190

[Shape] 41

[Sign] 44

[Social-status] 75

[Social_Property] 24

[Speech_act] 27

[State] 117

[Stative_location] 43

[Stative_possession] 17

[Stimulus] 24

[Substance] 40

[Substance_food] 3

[Symbolic_creation] 12

[Telic] 12

[Time] 142

[Transaction] 33

[Unit_of_measurement] 68

[Vegetal_entity] 10

[Vehicle] 52

[Water-Animal] 9

[Weather_verb] 8

139 rows selected

Appendix 2.A.2: Number of Usems per template type in the complete dataset for VERBS

Template_type Freq.

[Acquire_knowledge] 14

[Act] 15

[Agentive] 2

[Aspectual] 47

[Cause] 20

[Cause_act] 15

[Cause_aspectual] 20

[Cause_change] 18

[Cause_change_location] 54

[Cause_change_of_state] 86

[Cause_change_of_value] 12

[Cause_constitutive_change] 33

[Cause_experience_event] 22

[Cause_motion] 14

[Cause_natural_transition] 10

[Cause_relational_change] 24

[Change] 25

[Change_of_location] 54

[Change_of_possession] 30

[Change_of_state] 45

[Change_of_value] 23

[Cognitive_event] 104

[Commissive_speech_act] 8

[Constitutive_change] 7

[Constitutive_state] 12

[Cooperative_activity] 24

[Cooperative_speech_act] 5

[Copy_creation] 2

[Creation] 18

[Declarative_speech_act] 20

[Directive_speech_act] 36

[Event] 85

[Exist] 13

[Experience_event] 29

[Expressive_speech_act] 12

[Give_knowledge] 35

[Identificational_state] 36

[Judgement] 12

[Mental_creation] 2

[Modal_event] 33

[Move] 58

[Natural_transition] 8

[Non_relational_act] 22

[Perception] 23

[Phenomenon] 15

[Physical_creation] 24

[Psychological_event] 51

[Purpose_act] 220

[Relational_act] 232

[Relational_change] 22

[Relational_state] 116

[Reporting_event] 27

[Speech_act] 37

[State] 76

[Stative_location] 31

[Stative_possession] 23

[Stimulus] 11

[Symbolic_creation] 16

[Transaction] 26

59 rows selected

Appendix 2.A.3: Number of Usems per template type in the complete dataset for ADJECTIVES

Template_type Freq.

[emotive] 2

[emphasizer] 19

[extensional] 42

[intensional] 56

[intensity] 33

[manner] 24

[modal] 11

[object-related] 64

[phys_property] 246

[psych_property] 328

[relation] 59

[social_property] 71

[temporal] 16

[temporal_property] 61

14 rows selected

 

Appendix 2.B.1: Number of Usems per template type in the sample of 100 entries, for NOUNS

 

Template_type Freq.

--------------------------------------- --------

[3_D_LOCATION] 1

[ABSTRACT_ENTITY] 11

[AGENT_OF_PERSISTENT_ACTIVITY] 1

[AGENT_OF_TEMPORARY_ACTIVITY] 2

[AIR_ANIMAL] 1

[AMOUNT] 8

[AREA] 5

[ARTIFACTUAL_AREA] 3

[ARTIFACTUAL_DRINK] 1

[ARTIFACT] 5

[ASPECTUAL] 2

[BODY_PART] 8

[BUILDING] 3

[CAUSE_CHANGE_OF_STATE] 1

[CHANGE_OF_STATE] 1

[COGNITIVE_EVENT] 1

[COGNITIVE_FACT] 2

[CONCRETE_ENTITY] 2

[CONSTITUTIVE] 4

[CONSTITUTIVE_STATE] 1

[CONTAINER] 1

[CONVENTION] 2

[COOPERATIVE_ACTIVITY] 6

[DOMAIN] 2

[ENTITY] 4

[EVENT] 3

[EXIST] 1

[EXPERIENCE_EVENT] 1

[GEOPOLITICAL_LOCATION] 5

[GROUP] 2

[HUMAN] 13

[HUMAN_GROUP] 20

[INFORMATION] 7

[INSTITUTION] 12

[INSTRUMENT] 1

[KINSHIP] 4

[LOCATION] 5

[MODAL_EVENT] 1

[MONEY] 2

[MOVE] 1

[NATURAL_SUBSTANCE] 3

[NUMBER] 3

[PART] 13

[PHENOMENON] 3

[PHYSICAL_OBJECT] 1

[PHYSICAL_PROPERTY] 2

[PLANT] 1

[PROFESSION] 4

[PROPERTY] 5

[PSYCHOLOGICAL_EVENT] 1

[PSYCH_PROPERTY] 1

[PURPOSE_ACT] 5

[RELATIONAL_ACT] 3

[REPRESENTATION] 6

[ROLE] 1

[SEMIOTIC_ARTIFACT] 10

[SIGN] 3

[SOCIAL-STATUS] 1

[SOCIAL_PROPERTY] 1

[STATE] 5

[STATIVE_POSSESSION] 1

[STIMULUS] 1

[SUBSTANCE] 1

[TIME] 22

[UNIT_OF_MEASUREMENT] 10

Total: 263

 

Appendix 2.B.2: Number of Usems per template type in the sample of 100 entries, for VERBS

 

Template_type Freq.

--------------------------------------- --------

[ACQUIRE_KNOWLEDGE] 1

[ACT] 2

[ASPECTUAL] 4

[CAUSE] 1

[CAUSE_ASPECTUAL] 1

[CAUSE_CHANGE_LOCATION] 1

[CAUSE_CHANGE_OF_STATE] 5

[CAUSE_CONSTITUTIVE_CHANGE] 2

[CAUSE_RELATIONAL_CHANGE] 1

[CHANGE_OF_POSSESSION] 1

[CHANGE_OF_STATE] 2

[COGNITIVE_EVENT] 7

[COMMISSIVE_SPEECH_ACT] 1

[CONSTITUTIVE_STATE] 1

[COOPERATIVE_SPEECH_ACT] 1

[DECLARATIVE_SPEECH_ACT] 2

[DIRECTIVE_SPEECH_ACT] 1

[EVENT] 2

[EXIST] 4

[EXPERIENCE_EVENT] 1

[EXPRESSIVE_SPEECH_ACT] 1

[GIVE_KNOWLEDGE] 4

[IDENTIFICATIONAL_STATE] 2

[JUDGEMENT] 1

[MODAL_EVENT] 1

[MOVE] 2

[NON_RELATIONAL_ACT] 1

[PERCEPTION] 2

[PSYCHOLOGICAL_EVENT] 1

[PURPOSE_ACT] 14

[RELATIONAL_ACT] 15

[RELATIONAL_CHANGE] 1

[RELATIONAL_STATE] 2

[REPORTING_EVENT] 4

[SPEECH_ACT] 1

[STATE] 1

[STATIVE_LOCATION] 1

[STATIVE_POSSESSION] 1

[TRANSACTION] 3

Total: 99

 

Appendix 2.B.3: Number of Usems per template type in the sample of 100 entries, for ADJECTIVES

 

Template_type Freq.

--------------------------------------- --------

[EMPHASIZER] 6

[INTENSIONAL] 2

[OBJECT-RELATED] 2

[PHYS_PROPERTY] 9

[PSYCH_PROPERTY] 10

[RELATION] 2

[SOCIAL_PROPERTY] 1

[TEMPORAL_PROPERTY] 11

Total: 43

 

Appendix 3.A.1: Number of Usems per Domain in the complete dataset, for NOUNS

 

 

Domain Freq.

--------------------------------------- --------

ACCOUNTING 13

ACOUSTICS 30

ADMINISTRATIVE_LAW 2

ADVERTISING 33

AEROSPACE_ENGINEERING 26

AGRICULTURE 27

AGRICULTURE-FISHING-FORESTRY 6

AIRFORCE 23

AIR_CONDITIONING 2

AIR_TRANSPORT 26

ALCHEMY 2

AMERICAN_FOOTBALL 29

ANATOMY 89

ANESTHESIOLOGY 2

ANGLING 9

ANTIQUITY 11

ARABLE_FARMING 24

ARBORICULTURE 2

ARCHAEOLOGY 24

ARCHERY 4

ARCHITECTURE 40

ARMY 49

ARTS 71

ASTROLOGY 9

ASTRONOMY 34

ATHLETICS 21

AUDIOVISUAL 22

AUTOMATION 1

AUTOMOBILE_ENGINEERING 32

BABY_CARE 17

BACTERIOLOGY 5

BADMINTON 16

BAKERY 19

BALLET 18

BANKING 61

BASEBALL 17

BASKETBALL 25

BASKETRY 1

BEEKEEPING 6

BILLIARDS 13

BIOCHEMISTRY 4

BOOKBINDING 15

BOTANY 98

BOXING 11

BREWING 15

BUDDHISM 5

BUILDING 71

BUILDING_CRAFTS 26

BULLFIGHTING 1

BUSINESS 130

BUS_TRANSPORT 7

BUTCHERY 15

CANON_LAW 1

CARDIOLOGY 8

CARDS 25

CARTOGRAPHY 45

CAR_TRANSPORT 6

CATTLE_FARMING 26

CERAMICS 13

CEREAL_FARMING 10

CHEMISTRY 56

CHESS 29

CHRISTIANITY 42

CHURCH_OF_ENGLAND 1

CIRCUS 12

CITY_PLANNING 36

CIVIL_ENGINEERING 55

CIVIL_LAW 8

CLEANING 8

CLIMBING 5

CLOTHING_INDUSTRY 53

COKING_INDUSTRY 8

COMMERCE 134

COMMERCIAL_LAW 3

COMPUTING 26

CONSTITUTIONAL_LAW 4

CONSTRUCTION 46

COSMETICS 9

CRAFT_INDUSTRY 37

CREATIVE_WRITING 86

CRICKET 19

CRIME 84

CRIMINAL_LAW 14

CROQUET 11

CUISINE 50

CYCLING 28

CYTOLOGY 3

DANCE 38

DEATH 34

DEMOGRAPHY 17

DENTISTRY 6

DERMATOLOGY 8

DIPLOMACY 37

DISTILLING 13

DRINK 29

DRUGS 15

DYEING 3

EAR-NOSE-THROAT 8

EARTH_SCIENCES 12

ECOLOGY 5

ECONOMICS 103

EDUCATION 108

ELECTRICAL_ENGINEERING 34

ELECTRICAL_WORK 19

ELECTRICITY 13

ELECTRONIC_ENGINEERING 26

EMBRYOLOGY 6

EMPLOYMENT 84

ENOLOGY 13

ENTOMOLOGY 16

EQUESTRIAN_SPORT 17

ETHNOLOGY 21

FAMILY_PLANNING 6

FASHION 58

FENCING 3

FEUDALISM 20

FILM 55

FINANCE 109

FIRE 7

FIREFIGHTING 17

FISHING 24

FLOWER_GROWING 10

FOOD 35

FORESTRY 20

FORTIFICATION 6

FRESHWATER_FISHING 4

FRUIT_AND_VEGETABLES 28

FURNISHING 40

FURNITURE 31

GAMES 87

GARDENING 27

GAS 5

GENEALOGY 2

GENERAL 6315

GENETICS 11

GEOGRAPHY 77

GEOLOGY 22

GEOMETRY 36

GEOPOLITICS 5

GLASSMAKING 11

GLAZING 2

GOLF 11

GOVERNMENT-ADMINISTRATION 140

GRAPHIC_ARTS 76

GYMNASTICS 4

HAIR 10

HEALTH 31

HEALTH_AND_MEDICINE 61

HEATING 8

HEMATOLOGY 3

HERALDRY 10

HERPETOLOGY 3

HIGHER_EDUCATION 57

HINDUISM 2

HISTOLOGY 3

HISTORY 130

HOME_AND_GARDEN 17

HOME_LAUNDRY 5

HOROLOGY 8

HORSESHOEING 4

HORSE_RACING 7

HOTEL_BUSINESS 38

HOUSE_PAINTING 5

HUMAN_SCIENCES 7

HUNTING_AND_SHOOTING 20

HYDROGRAPHY 14

HYDROLOGY 19

HYGIENE 10

ICHTHYOLOGY 7

INLAND_WATERWAY_TRANSPORT 42

INSURANCE 18

INTELLIGENCE 2

INTERNATIONAL_AFFAIRS 116

INTERNATIONAL_LAW 8

ISLAM 1

JEWELRY 18

JUDAISM 7

KITCHEN_EQUIPMENT 20

KNITTING 5

LAW 199

LAW_ENFORCEMENT 137

LEISURE 84

LIBRARIANSHIP 14

LIFE_SCIENCES 32

LINGUISTICS 73

LITURGY 2

LIVESTOCK_FARMING 6

LOCKSMITHING 9

LOGIC 4

MAGIC_AND_WITCHCRAFT 12

MAIL 15

MAMMALOGY 40

MANAGEMENT 76

MANUFACTURING_INDUSTRY 68

MARITIME_LAW 2

MARKETING 23

MARRIAGE 31

MARTIAL_ARTS 6

MASONRY 9

MATHEMATICS 57

MECHANICAL_ENGINEERING 30

MEDIA 27

MEDICINE 80

MEETING 168

METALLURGY 11

METEOROLOGY 63

METROLOGY 15

MICROSCOPY 2

MILITARY 176

MILITARY_LAW 3

MINERALOGY 15

MINING-GENERAL 15

MONARCHY 23

MUSIC 168

MYCOLOGY 1

MYTHOLOGY 9

NAVY 44

NEUROANATOMY 1

NEUROLOGY 16

NEWSPAPER_PUBLISHING 58

NUCLEAR_ENGINEERING 4

NUCLEAR_PHYSICS 8

OBSTETRICS 7

OCEANOGRAPHY 18

OFFICE_EQUIPMENT 12

OIL_INDUSTRY 10

ONCOLOGY 3

OPERA 74

OPHTHALMOLOGY-OPTOMETRY 10

OPTICS 4

ORNITHOLOGY 23

ORTODOX_CHURCH 7

PACKAGING 12

PAINTMAKING 5

PALEOBIOLOGY 7

PALMISTRY 2

PAPERHANGING 4

PAPERMAKING 9

PARAPSYCHOLOGY 5

PEDIATRICS 1

PENAL_SYSTEM 40

PERFUMERY 2

PETS 14

PHARMACY 28

PHILATELY 4

PHILOSOPHY 45

PHONETICS 5

PHOTOGRAPHY 39

PHYSICAL_SCIENCES 22

PHYSICS 48

PHYSIOLOGY 22

PIG_FARMING 14

PLASTERING 2

PLUMBING 6

POETICS 6

POLITICS 208

POLITICS_AND_GOVERNMENT 115

POLO 15

POTTERY 12

POULTRY_FARMING 12

PRIMARY_AND_SECONDARY_EDUCATION 38

PRINTING 53

PROTESTANTISM 12

PSYCHIATRY 16

PSYCHOANALYSIS 10

PSYCHOLOGY 272

PUBLISHING 105

PYROTECHNICS 5

QUARRYING 3

RADIO-TELEVISION 94

RAIL_TRANSPORT 30

REAL_ESTATE 59

RELIGION 77

RESTAURATION 75

RETAIL 110

RHETORIC 15

ROAD_TRANSPORT 44

ROMAN_CATHOLICISM 36

ROOFING 6

ROWING 8

RUGBY 31

SAILING_YACHTING_AND_BOATING 39

SCIENCES 73

SCOUTING 6

SCULPTURE 34

SEA_FISHING 15

SEA_TRANSPORT 78

SEISMOLOGY 2

SERVICE_INDUSTRY 27

SEWING 10

SEX 23

SHAVING 2

SHEEP_FARMING 22

SHIP_BUILDING 38

SHOEMAKING 10

SHOWS 56

SKIING 4

SMOKING 3

SOAPMAKING 2

SOCCER 68

SOCIAL_ACTION 48

SOCIAL_SECURITY 29

SOCIOLOGY 133

SPORT 242

SPORTS_AND_LEISURE 26

STATISTICS 5

STEEL_INDUSTRY 9

SUBWAY_TRANSPORT 7

SURFACE_TREATMENT 2

SURFING 6

SURGERY 14

SURVEYING 5

SWIMMING 10

TANNING 2

TAXATION 33

TELECOMMUNICATIONS 33

TENNIS 15

TEXTILES 19

THEATER 81

THEOLOGY 29

TILING 7

TOBACCO_INDUSTRY 5

TOPOGRAPHY 59

TOWN_AND_COUNTRY_PLANNING 62

TRANSPORT 78

TRUCKING 4

TYPOGRAPHY 17

UPHOLSTERING 3

UTILITIES 15

VENERY 7

VERSIFICATION 4

VETERINARY_MEDICINE 8

VIROLOGY 1

VITICULTURE 4

VOLCANOLOGY 1

WASHING 7

WASTE_TREATMENT 15

WATER 8

WATER_SPORT 8

WHEELWRIGHTING 1

WOODWORKING 36

WOOL_INDUSTRY 6

WRESTLING 6

ZOOLOGY 65

Total: 16458

 

Appendix 3.A.2: Number of Usems per Domain in the complete dataset, for VERBS

 

Domain Freq.

--------------------------------------- --------

ACCOUNTING 6

ACOUSTICS 11

AEROSPACE_ENGINEERING 2

AGRICULTURE 4

AIRFORCE 1

AIR_TRANSPORT 5

AMERICAN_FOOTBALL 3

ARABLE_FARMING 9

ARCHERY 1

ARCHITECTURE 6

ARMY 2

ARTS 6

ASTROLOGY 1

ASTRONOMY 1

ATHLETICS 2

AUDIOVISUAL 2

AUTOMOBILE_ENGINEERING 3

BABY_CARE 1

BADMINTON 2

BAKERY 3

BALLET 3

BANKING 2

BASEBALL 2

BASKETBALL 3

BOOKBINDING 2

BOTANY 2

BOXING 2

BREWING 3

BUILDING 1

BUILDING_CRAFTS 5

BUSINESS 12

BUTCHERY 1

CARDS 11

CARTOGRAPHY 1

CATTLE_FARMING 3

CEREAL_FARMING 3

CHEMISTRY 8

CHRISTIANITY 2

CIRCUS 3

CIVIL_ENGINEERING 2

CLEANING 4

CLIMBING 3

CLOTHING_INDUSTRY 8

COMMERCE 22

CONSTRUCTION 10

COSMETICS 1

CRAFT_INDUSTRY 2

CREATIVE_WRITING 10

CRICKET 2

CRIME 19

CRIMINAL_LAW 2

CROQUET 2

CUISINE 11

CYCLING 5

DANCE 6

DEATH 15

DENTISTRY 1

DIPLOMACY 2

DISTILLING 3

DRINK 4

DRUGS 4

EAR-NOSE-THROAT 5

EARTH_SCIENCES 2

ECONOMICS 4

EDUCATION 24

ELECTRICAL_ENGINEERING 3

ELECTRICAL_WORK 2

EMBRYOLOGY 2

EMPLOYMENT 15

ENOLOGY 1

ENTOMOLOGY 2

EQUESTRIAN_SPORT 3

ETHNOLOGY 2

FAMILY_PLANNING 2

FASHION 9

FENCING 2

FILM 5

FINANCE 33

FIREFIGHTING 2

FISHING 3

FLOWER_GROWING 1

FOOD 7

FORESTRY 3

FORTIFICATION 1

FRUIT_AND_VEGETABLES 3

GAMES 18

GARDENING 7

GENERAL 2048

GEOGRAPHY 3

GEOMETRY 1

GOLF 2

GOVERNMENT-ADMINISTRATION 5

GRAPHIC_ARTS 13

HAIR 1

HEALTH 6

HEALTH_AND_MEDICINE 11

HIGHER_EDUCATION 2

HISTORY 10

HOME_AND_GARDEN 4

HOME_LAUNDRY 1

HOROLOGY 6

HORSE_RACING 1

HOTEL_BUSINESS 10

HOUSE_PAINTING 1

HUNTING_AND_SHOOTING 10

HYDROGRAPHY 2

HYDROLOGY 4

HYGIENE 1

ICHTHYOLOGY 1

INLAND_WATERWAY_TRANSPORT 10

INSURANCE 2

INTERNATIONAL_AFFAIRS 9

JEWELRY 1

KNITTING 2

LAW 41

LAW_ENFORCEMENT 47

LEISURE 14

LIBRARIANSHIP 1

LIFE_SCIENCES 4

LINGUISTICS 9

LIVESTOCK_FARMING 1

LOCKSMITHING 1

LOGIC 1

MAGIC_AND_WITCHCRAFT 1

MAIL 6

MANAGEMENT 5

MANUFACTURING_INDUSTRY 8

MARKETING 1

MARRIAGE 14

MARTIAL_ARTS 2

MASONRY 1

MATHEMATICS 6

MECHANICAL_ENGINEERING 2

MEDICINE 19

MEETING 32

METALLURGY 1

METEOROLOGY 8

METROLOGY 2

MILITARY 33

MINING-GENERAL 1

MONARCHY 1

MUSIC 32

MYTHOLOGY 1

NAVY 2

NEUROLOGY 2

NEWSPAPER_PUBLISHING 4

OBSTETRICS 4

OCEANOGRAPHY 2

OPERA 8

OPHTHALMOLOGY-OPTOMETRY 2

OPTICS 1

ORNITHOLOGY 4

PALMISTRY 1

PENAL_SYSTEM 4

PETS 4

PHILOSOPHY 2

PHONETICS 1

PHOTOGRAPHY 5

PHYSICAL_SCIENCES 3

PHYSICS 5

PHYSIOLOGY 2

PIG_FARMING 1

PLUMBING 2

POETICS 1

POLITICS 14

POLITICS_AND_GOVERNMENT 2

POLO 1

POULTRY_FARMING 1

PRIMARY_AND_SECONDARY_EDUCATION 2

PRINTING 5

PROTESTANTISM 3

PSYCHOANALYSIS 1

PSYCHOLOGY 71

PUBLISHING 3

QUARRYING 1

RADIO-TELEVISION 10

RAIL_TRANSPORT 2

REAL_ESTATE 2

RELIGION 20

RESTAURATION 12

RETAIL 17

ROAD_TRANSPORT 6

ROMAN_CATHOLICISM 2

ROOFING 1

RUGBY 3

SAILING_YACHTING_AND_BOATING 10

SCIENCES 15

SCULPTURE 8

SEA_FISHING 1

SEA_TRANSPORT 19

SEISMOLOGY 1

SEWING 3

SEX 9

SHAVING 1

SHEEP_FARMING 2

SHIP_BUILDING 7

SHOEMAKING 2

SHOWS 6

SOCCER 10

SOCIAL_ACTION 2

SOCIAL_SECURITY 3

SOCIOLOGY 10

SPORT 39

SPORTS_AND_LEISURE 3

STEEL_INDUSTRY 3

SUBWAY_TRANSPORT 1

SURFACE_TREATMENT 1

SURFING 1

SURVEYING 1

SWIMMING 1

TAXATION 8

TELECOMMUNICATIONS 5

TENNIS 2

TEXTILES 6

THEATER 11

THEOLOGY 2

TILING 1

TOPOGRAPHY 1

TOWN_AND_COUNTRY_PLANNING 2

TRANSPORT 22

TYPOGRAPHY 2

UTILITIES 4

VENERY 1

VERSIFICATION 1

VETERINARY_MEDICINE 2

VITICULTURE 3

WASHING 2

WASTE_TREATMENT 2

WATER_SPORT 1

WOODWORKING 6

WOOL_INDUSTRY 1

ZOOLOGY 8

Total: 3391

Appendix 3.A.3: Number of Usems per Domain in the complete dataset, for ADJECTIVES

 

Domain Freq.

--------------------------------------- --------

ACOUSTICS 8

AEROSPACE_ENGINEERING 1

AGRICULTURE 2

AIR_TRANSPORT 1

ANATOMY 4

ANTIQUITY 4

ARABLE_FARMING 4

ARCHAEOLOGY 2

ARCHITECTURE 3

ARTS 10

BABY_CARE 1

BAKERY 1

BANKING 1

BOTANY 3

BUILDING 1

BUSINESS 7

BUS_TRANSPORT 1

BUTCHERY 2

CARTOGRAPHY 3

CHEMISTRY 2

CHRISTIANITY 2

CITY_PLANNING 1

CIVIL_ENGINEERING 1

CLEANING 4

CLOTHING_INDUSTRY 10

COKING_INDUSTRY 1

COMMERCE 9

CONSTRUCTION 1

CREATIVE_WRITING 1

CRIME 5

CUISINE 10

DEATH 1

DIPLOMACY 2

DISTILLING 2

DRINK 4

EAR-NOSE-THROAT 1

ECONOMICS 11

EDUCATION 6

ELECTRICITY 1

EMPLOYMENT 4

ENOLOGY 3

ETHNOLOGY 8

FASHION 15

FILM 1

FINANCE 13

FISHING 1

FOOD 4

FRUIT_AND_VEGETABLES 2

FURNITURE 2

GARDENING 1

GENERAL 956

GENETICS 1

GEOGRAPHY 5

GEOLOGY 2

GEOMETRY 6

GEOPOLITICS 1

GOVERNMENT-ADMINISTRATION 1

GRAPHIC_ARTS 14

HEALTH 2

HEALTH_AND_MEDICINE 16

HEATING 4

HERALDRY 1

HIGHER_EDUCATION 3

HISTORY 17

HOME_AND_GARDEN 2

HOME_LAUNDRY 1

HOROLOGY 1

HOUSE_PAINTING 2

HUNTING_AND_SHOOTING 1

HYDROLOGY 2

HYGIENE 1

INLAND_WATERWAY_TRANSPORT 2

INTERNATIONAL_AFFAIRS 4

JEWELRY 5

JUDAISM 1

KNITTING 1

LAW 10

LAW_ENFORCEMENT 7

LEISURE 4

LIFE_SCIENCES 4

LINGUISTICS 9

MANAGEMENT 1

MANUFACTURING_INDUSTRY 1

MATHEMATICS 4

MEDICINE 3

MEETING 11

METALLURGY 3

METEOROLOGY 12

MILITARY 8

MILITARY_LAW 1

MONARCHY 2

MUSIC 10

NUCLEAR_PHYSICS 3

OBSTETRICS 1

OCEANOGRAPHY 1

OPHTHALMOLOGY-OPTOMETRY 3

OPTICS 1

PAINTMAKING 1

PALEOBIOLOGY 1

PHILOSOPHY 5

PHOTOGRAPHY 1

PHYSICAL_SCIENCES 1

PHYSICS 4

PHYSIOLOGY 4

POLITICS 10

POLITICS_AND_GOVERNMENT 22

PSYCHIATRY 3

PSYCHOANALYSIS 3

PSYCHOLOGY 55

PUBLISHING 1

RAIL_TRANSPORT 1

REAL_ESTATE 1

RELIGION 9

RESTAURATION 10

RETAIL 5

ROAD_TRANSPORT 1

SAILING_YACHTING_AND_BOATING 2

SCIENCES 2

SEA_FISHING 1

SEA_TRANSPORT 3

SEWING 1

SEX 5

SHAVING 1

SHOEMAKING 1

SOAPMAKING 1

SOCIAL_ACTION 2

SOCIAL_SECURITY 1

SOCIOLOGY 25

SPORT 4

STATISTICS 2

SURVEYING 1

TANNING 1

TEXTILES 4

THEOLOGY 1

TOPOGRAPHY 2

TOWN_AND_COUNTRY_PLANNING 4

TRANSPORT 7

UTILITIES 2

VIROLOGY 1

VITICULTURE 2

VOLCANOLOGY 1

WASHING 1

WATER 1

WOODWORKING 1

ZOOLOGY 6

Total: 1562

Appendix 3.B.1: Number of Usems per Domain in the sample of 100 entries, for NOUNS

 

 

Domain Freq.

--------------------------------------- --------

ACOUSTICS 1

AGRICULTURE 2

AIR_TRANSPORT 1

AMERICAN_FOOTBALL 1

ANATOMY 6

ANGLING 1

ANTIQUITY 1

ARCHAEOLOGY 1

ARCHITECTURE 1

ARTS 1

ASTRONOMY 3

BABY_CARE 1

BADMINTON 2

BAKERY 1

BANKING 4

BASEBALL 1

BASKETBALL 1

BOOKBINDING 3

BOTANY 2

BOXING 1

BUILDING_CRAFTS 1

BUSINESS 9

CARDS 2

CATTLE_FARMING 1

CHEMISTRY 2

CHRISTIANITY 2

CIVIL_LAW 1

CLEANING 1

COMMERCE 6

COMPUTING 2

CONSTRUCTION 5

COSMETICS 2

CRAFT_INDUSTRY 7

CREATIVE_WRITING 3

CRICKET 1

CRIME 4

CROQUET 1

CUISINE 4

DEATH 1

DEMOGRAPHY 1

DISTILLING 1

DRINK 2

EARTH_SCIENCES 1

ECONOMICS 5

EDUCATION 2

EMPLOYMENT 2

ENOLOGY 1

ENTOMOLOGY 2

ETHNOLOGY 1

FAMILY_PLANNING 1

FASHION 1

FEUDALISM 1

FINANCE 6

FIREFIGHTING 1

FISHING 2

FOOD 3

FORESTRY 1

GAMES 1

GENERAL 244

GEOGRAPHY 9

GEOLOGY 1

GEOPOLITICS 1

GOLF 1

GOVERNMENT-ADMINISTRATION 7

GRAPHIC_ARTS 2

HEALTH_AND_MEDICINE 1

HERPETOLOGY 1

HISTORY 3

HOME_AND_GARDEN 1

HOME_LAUNDRY 2

HOROLOGY 1

HUMAN_SCIENCES 2

HYDROGRAPHY 1

HYDROLOGY 1

ICHTHYOLOGY 1

INLAND_WATERWAY_TRANSPORT 2

INTERNATIONAL_AFFAIRS 2

JEWELRY 1

LAW 4

LAW_ENFORCEMENT 4

LEISURE 2

LIFE_SCIENCES 3

LINGUISTICS 4

LOGIC 1

MAMMALOGY 2

MANAGEMENT 3

MANUFACTURING_INDUSTRY 8

MARRIAGE 3

MATHEMATICS 6

MEETING 9

METEOROLOGY 1

MILITARY 2

MONARCHY 1

MUSIC 2

NEWSPAPER_PUBLISHING 2

OPERA 1

ORNITHOLOGY 1

PAPERMAKING 1

PEDIATRICS 1

PERFUMERY 1

PETS 2

PHILOSOPHY 2

PHOTOGRAPHY 2

PHYSICAL_SCIENCES 1

PHYSIOLOGY 1

POLITICS 8

POLITICS_AND_GOVERNMENT 11

POLO 1

PRIMARY_AND_SECONDARY_EDUCATION 1

PRINTING 5

PROTESTANTISM 1

PSYCHOANALYSIS 1

PSYCHOLOGY 7

PUBLISHING 3

RADIO-TELEVISION 1

REAL_ESTATE 1

RELIGION 1

RESTAURATION 2

RETAIL 6

ROAD_TRANSPORT 2

RUGBY 1

SAILING_YACHTING_AND_BOATING 1

SCIENCES 2

SEA_TRANSPORT 3

SERVICE_INDUSTRY 4

SEX 1

SHAVING 2

SHOWS 1

SOCCER 2

SOCIOLOGY 7

SPORT 9

SPORTS_AND_LEISURE 1

STATISTICS 1

TENNIS 2

THEATER 1

THEOLOGY 1

TILING 1

TOPOGRAPHY 2

TOWN_AND_COUNTRY_PLANNING 1

TRANSPORT 5

UTILITIES 1

WASHING 3

WATER 1

WATER_SPORT 1

WOODWORKING 1

WRESTLING 1

Total: 586

Appendix 3.B.2: Number of Usems per Domain in the sample of 100 entries, for VERBS

 

Domain Freq.

--------------------------------------- --------

ARTS 2

BREWING 1

BUILDING_CRAFTS 1

CIRCUS 1

COMMERCE 3

CONSTRUCTION 1

CRIME 2

CUISINE 1

DRINK 1

DRUGS 1

EARTH_SCIENCES 1

ELECTRICAL_ENGINEERING 1

ELECTRICAL_WORK 1

EMPLOYMENT 3

ETHNOLOGY 2

FAMILY_PLANNING 1

FASHION 1

FINANCE 2

GARDENING 1

GENERAL 98

GOVERNMENT-ADMINISTRATION 2

HEALTH 1

HEALTH_AND_MEDICINE 3

HOROLOGY 1

HOTEL_BUSINESS 2

INLAND_WATERWAY_TRANSPORT 1

INTERNATIONAL_AFFAIRS 1

LAW_ENFORCEMENT 4

LEISURE 2

LIFE_SCIENCES 2

MANAGEMENT 1

MEETING 1

METROLOGY 1

MILITARY 2

PHILOSOPHY 1

PHYSICAL_SCIENCES 1

POLITICS 2

PSYCHOLOGY 4

RELIGION 1

RESTAURATION 2

RETAIL 1

SCIENCES 3

SEA_TRANSPORT 2

SHOWS 1

SOCCER 1

SOCIAL_ACTION 1

SOCIOLOGY 2

SPORT 3

TAXATION 3

THEOLOGY 1

TRANSPORT 1

WOODWORKING 1

Total: 180

 

Appendix 3.B.3: Number of Usems per Domain in the sample of 100 entries, for ADJECTIVES

 

Domain Freq.

--------------------------------------- --------

ARABLE_FARMING 1

ARTS 1

CARTOGRAPHY 1

CLOTHING_INDUSTRY 1

CUISINE 1

GENERAL 41

GEOMETRY 1

GRAPHIC_ARTS 1

JEWELRY 1

LEISURE 1

LINGUISTICS 1

SEA_FISHING 1

SURVEYING 1

TOPOGRAPHY 1

TRANSPORT 2

Total: 56

Appendix 4.A.1: Number of Usems per Semantic Class in the complete dataset, for NOUNS

 

 

Semantic class Freq.

-------------------------------------------------------------------------

ABSTRACT 660

ACT 463

ACTIVITY 2

ADMINISTRATIVE 37

AFFECTION 13

AGENCY 202

AMOUNT 157

ANIMAL 10

ARTIFACT 439

ARTIFACT, EDIBLE 39

ATTRIBUTE 249

BIO 77

BIRD 13

BODY_PART 111

BUILDING 105

CHANGE 223

COGNITION 168

COGNITIVE_FACT 87

COLLECTIVE 350

COLOR 3

COMMUNICATION 177

COMPETITION 1

CONSUMPTION 2

CONTAINER 2

CREATION 42

CURRENCY 18

DAY 18

EMOTION 93

ETHNOS 8

EVENT 72

FEELING 1

FISH 5

FLOWER 4

FOOD 2

FORM 41

FRUIT, EDIBLE 7

FURNITURE 19

GARMENT 24

GEOGRAPHY 40

GROUP 24

HUMAN 293

HUMAN_COLLECTIVE 2

IDEO 17

ILLNESS 14

INANIMATE 10

INDIVIDUAL_NAMES 3

INSECT 6

INSTRUMENT 100

LETTER 12

LIVING_BEING 7

LOCATION 401

MAMMAL 30

MATTER 45

MEASURE_UNIT 80

MICROORGANISM 2

MONTH 14

MOTION 115

MOVE 1

MUSIC 1

NOTION 329

OBJECT 25

OCCUPATION 84

OCCUPATION_AGENT 355

OPERATION 2

PART 404

PERCEPTION 18

PERIOD 21

PHENOMENON 135

PLANT 18

POSSESSION 55

PROCESS 5

PSYCHOLOGICAL_FEATURE 50

QUANTITY 1

RELATION 1

SHRUB 6

SITU 26

STATE 32

STATIVE 331

SUBSTANCE 68

SUBSTANCE, EDIBLE 20

SYSTEM_OF_THOUGHT 13

TIME 4

TIME_PERIOD 89

TOPS 4

TREE 9

VEHICLE 52

WEATHER 8

Total: 7326

Appendix 4.A.2: Number of Usems per Semantic Class in the complete dataset, for VERBS

 

Semantic class Freq.

-------------------------------------------------------------------------

BODY 19

CHANGE 417

COGNITION 210

COMMUNICATION 202

COMPETITION 35

CONSUMPTION 17

CONTACT 112

CREATION 102

EMOTION 64

ILLNESS 1

MOTION 202

MOVE 1

NOTION 2

PERCEPTION 54

PHENOMENON 25

POSSESSION 89

SOCIAL 172

STATE 12

STATIVE 378

Total: 2114

Appendix 4.A.3: Number of Usems per Semantic Class in the complete dataset, for ADJECTIVES

 

Semantic class Freq.

-------------------------------------------

ABSTRACT 23

AGENCY 1

AMOUNT 10

ATTRIBUTE (TAKEN FROM NOUNS) 516

COGNITION (TAKEN FROM VERBS) 11

COGNITIVE_FACT 3

COLOUR 13

DIRECTION 2

EMOTION (TAKEN FROM VERBS) 2

ENTITY 6

EVENT 5

FACULTY 3

NUMBER 5

OCCUPATION 2

OPERATION 1

PERIOD 34

PROCESS 2

PSYCHOLOGICAL_FEATURE 345

STATE 25

SUBSTANCE 1

SYSTEM_OF_THOUGHT 2

TIME_PERIOD 20

Total: 1032

 

Appendix 4.B.1: Number of Usems per Semantic Class in the sample of 100 entries, for NOUNS

 

Semantic class Freq.

-------------------------------------------------------------------------

ABSTRACT 26

ACT 11

ACTIVITY 1

AGENCY 12

AMOUNT 8

ARTIFACT 15

ATTRIBUTE 8

BIO 11

BIRD 1

BODY_PART 8

BUILDING 3

CHANGE 4

COGNITION 1

COGNITIVE_FACT 2

COLLECTIVE 17

CURRENCY 2

DAY 5

EMOTION 2

EVENT 4

GEOGRAPHY 5

HUMAN 13

INDIVIDUAL_NAMES 1

INSTRUMENT 1

LETTER 2

LOCATION 14

MATTER 1

MEASURE_UNIT 13

MONTH 2

MOTION 1

NOTION 12

OBJECT 1

OCCUPATION 2

OCCUPATION_AGENT 6

PART 13

PERIOD 2

PHENOMENON 4

POSSESSION 1

PSYCHOLOGICAL_FEATURE 1

STATE 2

STATIVE 7

SUBSTANCE 3

SUBSTANCE, EDIBLE 1

TIME_PERIOD 13

TREE 1

Total: 263

Appendix 4.B.2: Number of Usems per Semantic Class in the sample of 100 entries, for VERBS

 

Semantic class Freq.

-------------------------------------------------------------------------

BODY 1

CHANGE 20

COGNITION 17

COMMUNICATION 13

COMPETITION 2

CONSUMPTION 5

CONTACT 4

CREATION 1

EMOTION 1

MOTION 4

PERCEPTION 2

POSSESSION 7

SOCIAL 6

STATE 3

STATIVE 13

Total: 99

Appendix 4.B.3: Number of Usems per Semantic Class in the sample of 100 entries, for ADJECTIVES

 

Semantic class Freq.

-------------------------------------------------------------------------

ATTRIBUTE (TAKEN FROM NOUNS) 24

PERIOD 4

PSYCHOLOGICAL_FEATURE 10

TIME_PERIOD 5

Total: 43