4.4. Data collection: building the corpora

The analysis in the present study is partly qualitative in nature; as a result, it is best to work with a small, specialized corpus. Initially, a sample of 20 research articles and 20 MA theses was collected for the purposes of this study. However, the analysis of the first ten papers in each category showed that some of them would not be suitable for comparison. It revealed that empirical and theoretical research papers, despite having the same apparent structure, have markedly different referential patterns. This difference was considerable enough to place the two kinds of papers in different genres and to distort any comparison between the corpora. Therefore, theoretical research papers were excluded from the sample and replaced with new empirical papers. The total corpus was built gradually, and we approached it from different perspectives at each step in the construction of the analytical tool.

Research articles and MA theses were collected based on the following selection criteria:

  • written since the year 2000, to make sure that neither language change nor major changes in editorial requirements affect the language used in the RAs;
  • presenting the results of some empirical research; and
  • from the field of applied linguistics. To make the corpus representative of this field, care was taken to include articles of varying length from a range of topics within applied linguistics, such as education, education technology, language technology, psycholinguistics, discourse analysis, and second language acquisition.

The MA theses were made available from the Hungarian Corpus of Learner English (Károly & Tankó, 2009). The corpus of theses in this book contains papers written in Applied Linguistics by English major students. Research approaches and methods varied greatly across the theses in the corpus: there were case studies, theoretical papers, and research papers reporting empirical studies. An initial analysis of a few MA theses found that the structures of these different kinds of papers differ greatly. To ensure comparability with the RAs and among different sets of MA theses, only those papers were chosen that report a piece of empirical research. The students’ level of English ranges from advanced to near-native. While the language of the papers might be slightly affected by the thesis supervisors’ guidance and corrections, the extent of this influence cannot be estimated from the papers alone.