Interlanguage lexicology of Arab students of English: A computer learner corpus-based approach
Since machine-readable corpora first entered linguistics in the 1960s, a considerable body of linguistic research has shifted from syntax and phonology, until then the focus of the field, to a number of domains that had remained mostly neglected under traditional approaches. Fortunately, lexicology, the target of this research, was a major beneficiary of that dramatic shift.
Employing a computer learner corpus-based approach, this study addresses multidimensional lexical aspects of a machine-readable corpus of the writing of Arab students of English as a foreign language. Lexical investigation of this corpus, which was compiled solely to serve the objectives of this study, required a similarly sized authentic reference corpus, which was, in turn, methodically selected from the Louvain Corpus of Native English Essays (LOCNESS). Using computerized contrastive and analytical methods, this dissertation aims to explore: (1) learners' lexical complexity and richness, (2) how far the learner corpus deviates from the reference corpus in the features and percentages of the 200 most frequent tokens and of the hapax legomena, (3) how far the learner corpus is influenced by learners' L1, (4) the most salient lexical and stereotyped features of the learner corpus, (5) learners' lexical and collocational errors, and (6) whether learners' collocational knowledge is on a par with their lexical knowledge.
Findings show that: (1) the learner corpus is much less complex than the reference corpus in terms of lexical diversity and density, (2) learners' top 200 tokens are markedly characterized by vague vocabulary, overuse of the most frequently used words, and L1 transfer, (3) rhetorically, learners' writing is much closer to their L1 than to the L2, (4) no source of lexical errors is more confusing for learners than near-synonyms, (5) the considerable variation in the incorrect use of collocations is clearly attributable to the method of investigation, (6) a considerable body of collocational errors results from learners' limited word stock rather than from their ignorance of the collocability of the target lexical items, and (7) the collocations in learners' free writing are strongly governed by their L1 collocations, so that success in using the target collocations depends heavily on the degree of similarity between the two languages (positive transfer).
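The frequency-based measures named in aims (1) and (2), namely lexical diversity, the most frequent tokens, and hapax legomena, can be illustrated with a minimal sketch. It is not the procedure or software used in this study: the regex tokenizer, the function names `tokenize` and `lexical_profile`, and the sample sentence are all assumptions made for illustration, and the raw type-token ratio shown here is only one of several possible diversity measures.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract alphabetic word tokens.

    Assumption: a simple regex tokenizer; a real corpus study would
    likely use a dedicated concordancer or tokenizer.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_profile(text, top_n=200):
    """Return basic frequency statistics for one corpus (hypothetical helper)."""
    tokens = tokenize(text)
    freq = Counter(tokens)
    # Hapax legomena: word types that occur exactly once in the corpus.
    hapaxes = sorted(word for word, count in freq.items() if count == 1)
    return {
        "tokens": len(tokens),
        "types": len(freq),
        # Raw type-token ratio: distinct word types divided by running tokens.
        "ttr": len(freq) / len(tokens) if tokens else 0.0,
        "hapax_legomena": hapaxes,
        "top_tokens": freq.most_common(top_n),
    }


# Illustrative sample only; not data from the learner corpus or LOCNESS.
sample = "The cat sat on the mat and the dog sat too"
profile = lexical_profile(sample, top_n=2)
print(profile["tokens"], profile["types"], profile["top_tokens"])
```

Running the same function over a learner corpus and a reference corpus would give directly comparable figures. Because the raw type-token ratio falls as texts grow longer, such a comparison is only meaningful when the two corpora are of similar size.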