9.6.1. Reliability and validity results

The reliability of the analytical process is partly ensured by the procedure itself. Because each text receives two analyses (a non-cohesive and a cohesive analysis, supplemented with a list of errors), every item in the text can be accounted for in one way or another. Still, there is no guarantee that everyone would categorize non-cohesive items or cohesive chains in the same way. While it is acknowledged that some errors are inherent in the analysis, previous reliability checks lead us to expect at least a 90% overlap between two analyses by the same coder or between the analyses of two coders. For this reliability analysis I asked a friend who is an advanced-level speaker of English, Judit Andruskó, to analyze an RA based on the guidelines I had sent her. To serve as a pedagogical tool as well as an analytical one, Referential Cohesion Analysis should be something that anyone with sufficient knowledge of English to understand the particular text can carry out; it should be simple enough for students to perform without extensive training. Judit received a five-page description of the instructions (see Appendix A) and was asked to report any difficulties or problems during the analysis. She reported that she had found the task quite enjoyable, though it took her a little more than an hour to analyze 15–20 sentences, depending on the length of the sentences and the number of items in them. With some practice the analysis becomes much easier, and it is possible to reach 50–60 sentences per hour (which still means that analyzing an MA thesis takes about 12 hours, hardly the efficiency one would hope for). She learned the method very quickly, and both her analysis and her feedback on the tool were informative.

The reliability test of the instrument was carried out using two analyses of RA5 (one by a trained co-coder and one by the author of this research). The article was 318 sentences long, and 70 cohesive chains were found in the text. The only difference concerned the position of two chains: the author located the presupposed items in the abstract, whereas the other coder placed them in the introduction. This is a much better result than at Stage 2 (where the inter-coder comparison of the analysis of RA3 yielded 26 identical and 13 different chains). Still, the actual ties within the chains of reference showed a number of differences: the co-coder’s analysis gave 286 cohesive ties, while the author found 320. Surprisingly, no tie appeared in one chain in one analysis and in a different chain in the other; the only difference was that certain items present in one analysis did not appear at all in the other. That is, the author’s analysis contained 34 more items than the co-coder’s, while the other 286 were identical, which is about 89.4% of the total items in the author’s analysis. While this falls just short of the 90% reliability aimed at, the result is still acceptable, especially considering the ways in which the two analyses differed.
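
The overlap figure reported above can be reproduced with a few lines of arithmetic. The following is only a minimal sketch: the function name and its parameters are illustrative assumptions, not part of the original analysis, and it presupposes (as was the case for RA5) that the two codings differ only in items present in one analysis and absent from the other.

```python
def tie_agreement(shared: int, total_a: int, total_b: int) -> dict:
    """Simple overlap between two codings of the same text.

    shared  -- ties identified identically in both analyses
    total_a -- ties found by coder A
    total_b -- ties found by coder B
    (Hypothetical helper; names are assumptions made for illustration.)
    """
    if shared > min(total_a, total_b):
        raise ValueError("shared ties cannot exceed either coder's total")
    larger = max(total_a, total_b)
    return {
        # items that appear in exactly one of the two analyses
        "only_in_one": (total_a - shared) + (total_b - shared),
        # identical ties as a share of the larger analysis
        "share_of_larger": shared / larger,
    }


# Figures reported for RA5: 320 ties (author), 286 ties (co-coder), 286 identical.
result = tie_agreement(shared=286, total_a=320, total_b=286)
print(result["only_in_one"])               # 34
print(f"{result['share_of_larger']:.1%}")  # 89.4%
```

Taking the larger analysis as the base is the more conservative choice, since it counts every item missing from the other coding against the agreement score.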

The four most frequent items that were missing from the co-coder’s analysis of RA5 are each, here, such, and then (from most to least frequent); among the pronouns, only their was missing twice. These differences are probably due either to a lack of experience in using the analytical tool or to items being overlooked (an occurrence that is unavoidable to some extent). As for the other missing items, mainly determiners, it may be assumed that they were left out because the coder lacked experience with the register used in research articles, especially in the field of linguistics or applied linguistics. The same will be true of some university students taking academic skills or advanced writing courses, which makes the following findings pedagogically relevant. In the previous reliability analysis, it was part-whole relationships that caused differences; here, however, pairs of items such as the following were not recognized by the advanced English speaker coder:

  • strings of phonemes – these nonwords
  • basic linguistic information – the stimuli
  • a subsequent study – the most significant finding
  • Rodekohr & Haynes (2001) investigated – this study
  • 581 school-age children – their NWR performance
 

All the items above (and the others not listed here) contain some academic vocabulary that may have led to uncertainty on the part of the coder and to the exclusion of the items from the analysis. What we need to consider here is how much of the academic vocabulary university students are familiar with and whether it causes problems in discourse production and processing.

As a first approach to the data, the total number of referring items was counted for both RAs and MA theses. As the texts in the corpora differed markedly in length, the articles ranging from 2,040 to 10,425 words and the theses from 10,140 to 20,883 words, comparability was ensured by using normalized data: the results for each text were divided by the number of words in the given text and multiplied by 1000 for easier counting. It would have been possible to use the number of sentences as a basis for normalizing the data, as cohesion concerns relationships across sentence boundaries. However, the differences in sentence length (see Table 20 below) and the sometimes arbitrary nature of identifying sentence boundaries might have blurred the results in comparison with the word-based method. The average sentence length did not differ greatly, though in general it holds for the three corpora that the lower the proficiency, the shorter the sentences.
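
The word-based normalization described above amounts to a single division and scaling step. The sketch below is illustrative only: the function names are assumptions, the item count in the first example is hypothetical, and the scaling factor is kept as a parameter so that other bases (for example, per 100 words) can be used. The second part recomputes average sentence length from the corpus totals reported in Table 20.

```python
def normalized_frequency(item_count: int, word_count: int, per: int = 1000) -> float:
    """Referring items per `per` words of running text."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return item_count / word_count * per


def average_sentence_length(word_count: int, sentence_count: int) -> float:
    """Mean number of words per sentence."""
    if sentence_count <= 0:
        raise ValueError("sentence_count must be positive")
    return word_count / sentence_count


# Hypothetical item count (250) for a text of 2,040 words,
# the length of the shortest article in the corpus.
print(round(normalized_frequency(250, 2040), 1))  # 122.5

# Corpus totals as reported in Table 20: (words, sentences).
corpora = {
    "Research Articles": (124_290, 4_802),
    "High-rated MA theses": (145_372, 5_899),
    "Low-rated MA theses": (157_249, 6_792),
}
for name, (words, sentences) in corpora.items():
    print(name, round(average_sentence_length(words, sentences)))
# Research Articles 26
# High-rated MA theses 25
# Low-rated MA theses 23
```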

Table 20 below shows the three sub-corpora we have been working with. The corpus of 20 RAs contains approximately the same number of words as the high- and low-rated MA thesis corpora of 10 texts each. The total number of items in the table refers to all those reference items that were listed among the types of reference (see Table 8), that is, both cohesive and non-cohesive items whose status we needed to consider in the course of the analysis. All three corpora were subjected to a lexical analysis carried out with Cobb’s (2002) Vocabulary Profiler. This analysis was used to confirm that, besides the different grades received at the university, the differences in language proficiency among the three groups of texts are reflected in their vocabulary choices. The percentages for the vocabulary profile indicate the ratio of K1 words (the most frequent 1000 words), K2 words (the most frequent 1001–2000 words), and AWL words (from the Academic Word List) used. The higher the K1 percentage, the simpler the vocabulary used in the text, which indicates lower proficiency; likewise, a higher ratio of AWL words indicates a greater reliance on academic vocabulary. The percentages in the table for each corpus indeed show that low-rated theses contain the highest ratio of K1 words and the lowest ratio of AWL words. In general, a high lexical density can be observed, which is in line with Halliday’s (2004) observation of the same phenomenon in academic discourse.
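
To make the idea of a vocabulary profile concrete, the following sketch computes the share of tokens that fall into each frequency band. It is not the Vocabulary Profiler itself: the function name is an assumption, the word sets are tiny hypothetical stand-ins for the real K1, K2 and AWL lists, and matching is done on surface forms only rather than on word families.

```python
import re
from collections import Counter


def vocabulary_profile(text: str, bands: dict) -> dict:
    """Share of tokens per band; bands are checked in the order given."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return {}
    counts = Counter()
    for token in tokens:
        for band_name, words in bands.items():
            if token in words:
                counts[band_name] += 1
                break
        else:
            # token was not found in any band
            counts["off-list"] += 1
    return {name: counts[name] / len(tokens) for name in [*bands, "off-list"]}


# Hypothetical toy lists, NOT the actual K1 / K2 / AWL word lists.
toy_bands = {
    "K1": {"the", "a", "to", "gave", "teacher", "student"},
    "K2": {"clever"},
    "AWL": {"data"},
}

profile = vocabulary_profile("The teacher gave the data to a clever student.", toy_bands)
for band, share in profile.items():
    print(f"{band}: {share:.1%}")
# K1: 77.8%
# K2: 11.1%
# AWL: 11.1%
# off-list: 0.0%
```

With full word lists in place of the toy sets, a higher K1 share and a lower AWL share would correspond to the simpler vocabulary described above.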
 

Table 20 Summary table of corpus data

| Corpus | Research Articles (20 texts) | High-Rated MA theses (10 texts) | Low-Rated MA theses (10 texts) |
|---|---|---|---|
| Total no. of words | 124,290 | 145,372 | 157,249 |
| Total no. of sentences in the corpus | 4,802 | 5,899 | 6,792 |
| Avg. sentence length (= words / sentence) | 26 | 25 | 23 |
| Total no. of items (normalized × 100) | 15,541 (12.7) | 25,512 (17.5) | 20,233 (12.8) |
| Vocabulary profile¹ (K1 / K2 / AWL) | 73.4% / 4.4% / 11.1% | 84.2% / 4.5% / 10.4% | 86% / 4.6% / 9.4% |
| Lexical density (content words / total) | 0.61 | 0.58 | 0.60 |
| % of non-cohesive items | 71% | 63% | 58% |
| % of cohesive items | 29% | 37% | 42% |
| Avg. total of cohesive ties (average / text) | 260 (43.1) | 539 (36) | 602 (38.1) |
| Avg. no. of cohesive chains (average / text) | 67.3 (11.2) | 190 (13.1) | 202 (12.7) |
| Avg. no. of long cohesive chains (average / text) | 25.9 (4.4) | 64 (4.4) | 69 (4.5) |
 

The second part of Table 20 gives general information about the cohesive ties and chains found in the corpora. The percentages of cohesive and non-cohesive items out of the total number of items searched were calculated, which produced rather unexpected results. The highest ratio of cohesive items characterized low-rated theses rather than RAs, with high-rated theses falling between the two. While this phenomenon is very interesting, a wide range of reasons might lie behind it. It might be hypothesized that a higher ratio of cohesive items is related to text length, but it could also be proficiency-related or even accidental. What suggests that it is more of an expert-novice difference is that a comparison of the specific types of cohesive referring items shows systematic differences in item frequencies. For example, the pronoun she as part of a cohesive tie appears on average once in RAs, 7.9 times in high-rated papers, and 21.7 times in low-rated papers; all the other pronouns display a similar frequency pattern.

¹ Cobb, T. Web Vocabprofile [accessed 24 January 2024 from http://www.lextutor.ca/vp/], an adaptation of Heatley, A., Nation, I. & Coxhead, A. (2002). RANGE and FREQUENCY programs. Available at http://www.victoria.ac.nz/lals/staff/paul-nation.aspx