The Acquisition of English Word-Final Consonants by Cantonese ESL Learners in Hong Kong
ALICE Y.W. CHAN
City University of Hong Kong
1. INTRODUCTION
The role that one’s native language plays in the acquisition of a second or foreign language has always been of interest to linguists. Earlier discussions of language transfer often attributed a learner’s difficulty in learning a second language to differences between his/her native language and the target language. The Contrastive Analysis Hypothesis (Lado 1957) argued that target language forms that were different from the equivalent forms in the native language (L1) would be difficult to learn. This hypothesis was, however, shown to be inadequate in predicting (the strong version of the hypothesis) or explaining (the weak version) the learning difficulties that a second language (L2) learner has, as there was evidence showing that differences between languages did not always lead to learning difficulties (Odlin 1989). In view of the inadequacy of the Contrastive Analysis Hypothesis and in order to revise it to incorporate certain principles of Universal Grammar, Eckman (1977) suggests the Markedness Differential Hypothesis. This hypothesis predicts not only the areas of difficulty for second language learners, but also the relative degree of difficulty on the basis of a systematic comparison between native and target languages and markedness relations stated in Universal Grammar. Important in this hypothesis is the notion of typological markedness, which says that a phenomenon A in some language is more marked than B if, cross-linguistically, the presence of A (the implicans; Eckman 1984) necessarily implies the presence of B (the implicatum; Eckman 1984), but the presence
of B does not necessarily imply the presence of A (Eckman 1981a, 1981b). Markedness, in this sense, refers to “the relative frequency or generality of a given structure across the world’s languages” (Eckman 1996:198) and is an “independently motivated, empirical construct” rather than a matter of judgment or conjecture (Eckman 1996:201). Accordingly, the Markedness Differential Hypothesis attempts to explain difficulties in L2 acquisition on the basis of cross-linguistic data. It predicts that: (i) those areas of the target language that differ from the native language and are more marked than the native language will be difficult; (ii) the relative degree of difficulty of the areas of the target language that are more marked than the native language will correspond to the relative degree of markedness; and (iii) those areas of the target language that are different from the native language but are not more marked than the native language will not be difficult. (Eckman 1977:321) Although the goals of the Markedness Differential Hypothesis and those of the Contrastive Analysis Hypothesis are essentially the same, the former is able to account for the relative degrees of difficulty of acquisition, for the areas of difference between the native language and the target language that will not cause difficulty, as well as for the fact that a learner can make progress in acquiring the target language (Eckman 1985:293). However, in resonance with the underlying assumptions of the Contrastive Analysis Hypothesis, differences between the native language and the target language are paramount in the Markedness Differential Hypothesis, in that learner difficulties are predicted on the basis of differences between the native language and the target language. Difficulties in an area where there is no difference between the native and target
languages, thus, fall outside the scope of the hypothesis. Several areas of second/third language acquisition have been examined to investigate the effectiveness of the Markedness Differential Hypothesis in predicting areas of difficulty and relative degree of difficulty. Studies that have examined second/third language phonology acquisition have focused on, among other things, the acquisition of voicing contrasts (Bhatia 1995; Eckman 1981a; Edge 1991; Major and Faudree 1996), consonants and/or consonant clusters (Benson 1986; Eckman 1987, 1991), and syllable structures (Anderson 1987; Stockman and Pluut 1992; Tarone 1987). The results of these studies generally support the Markedness Differential Hypothesis, in the sense that the presence of the more marked implicans in the second language learner’s interlanguage (Selinker 1972) implies the presence of the less marked implicatum. Moreover, learners who experience difficulty in the implicatum also experience difficulties in the implicans, but those who experience difficulty in the implicans do not necessarily experience difficulties in the implicatum. For example, Anderson (1987) found that the marked longer English consonant clusters are more difficult than the unmarked shorter ones, and that the marked final clusters are more difficult than the unmarked initial ones for learners whose native language differs from English in terms of permissible consonant sequences in word-initial and word-final positions. Eckman’s (1981b) data confirm the relative degree of difficulty between word-final voiced obstruents and word-final voiceless obstruents, finding that the former are more difficult than the latter. Supporting evidence for the Markedness Differential Hypothesis notwithstanding, there has been some criticism of the hypothesis ever since it was launched (Kellerman 1979; Zobl 1983). Research studies showing the inadequacy of the hypothesis are not lacking. In their study of the acquisition of French consonants
by Cantonese speakers, Cichocki et al. (1999) have observed several patterns that the Markedness Differential Hypothesis incorrectly predicts. Major and Kim (1999) also condemn the hypothesis’ disregard of the nature of the similarities or differences that exist between the target language and the native language in its prediction of relative degree of difficulty. The fact that relative ease or difficulty of acquisition is not specified longitudinally in terms of stages or rate of learning is also another area of criticism (Major and Kim 1999). A number of researchers whose work has been inspired in one way or another by the notion of universal markedness have either modified the theoretical constructs of the Markedness Differential Hypothesis or have suggested different extensions. Carlisle (1988) suggests the Intralingual Markedness Hypothesis, in order to incorporate markedness relationships within L2 (in addition to markedness relationships between L1 and L2) into Eckman’s theory. Eckman (1991) himself, in explaining word-final devoicing in the English of native Farsi speakers, proposes the Structural Conformity Hypothesis to discard the requirement for areas of difference (between L1 and L2) and simply claims that interlanguages obey primary language universals. Major and Kim (1999), on the other hand, put forward the Similarity Differential Rate Hypothesis to suggest “a compound influence of typological markedness and phonetic similarity/dissimilarity that works to the benefit or detriment of the L2 learner” (Leather 1999:31). Their proposal focuses on rate of acquisition rather than relative degree of difficulty as measured by ultimate achievement, claiming that dissimilar phenomena are acquired at faster rates than similar phenomena. They argue that markedness and similarity interact in interesting ways and that the former is a mediating factor affecting second language acquisition. In consonance with Major and Kim’s (1999) proposal, a number of second
language phonology acquisition models have demonstrated the significance of similarity/dissimilarity. Examples include the Perceptual Assimilation Model proposed by Best (1994), which argues that non-native contrasts are perceived in terms of their phonetic similarity to the phonological categories present in a listener’s native language (Harnsberger 2001); and the Speech Learning Model proposed by Flege, which claims that “the greater the perceived phonetic dissimilarity between an L2 sound and the closest L1 sound, the more likely it is that phonetic differences between the sounds will be discerned” (Flege 1995:239). Although the contribution of markedness universals has not been investigated in these models, it is nonetheless apparent that markedness relationships between the native language and the target language may not necessarily be the main determining factor for second language phonology acquisition. The concept of markedness itself has also come under severe attack. Because the Markedness Differential Hypothesis is based on a functional-typological approach to second language acquisition theory, markedness is defined on the basis of crosslinguistic data. Observed patterns that contradict markedness at the level of individual languages, however, have led researchers to view markedness from other perspectives. Hume (2004) argues that the notion of universal markedness is insufficient to explain language-specific properties. She suggests that markedness should be a probabilistic notion, with predictability positively correlated with unmarkedness. Within a language system, unmarked elements have a high degree of predictability, but if languages differ in terms of the elements that make up their systems and how the elements are used, predictability of the elements will also differ. The relationship between frequency and language acquisition has also provided evidence undermining the significance of universal markedness. Levelt et al. (2000)
and Roark and Demuth (2000) have found that the earlier acquired structures in each language are often much higher in frequency. However, where markedness and frequency make opposite predictions, both markedness and frequency play a role in determining language development (Stites et al. 2004). Thus, when two options for a given entity are present, both can be selected as unmarked (Rose 2003). The loss of perceptual discrimination abilities in infancy has also been found to be frequency-related, and models based on input frequencies are seen as a better account than markedness for such loss of discrimination (Anderson et al. 2003). Focusing on relative markedness as defined in terms of frequencies rather than implicational universals, Major and Kim (1999) also argue that the markedness relationship between voiced obstruents and voiceless obstruents does not necessarily apply to individual sounds, because some voiced obstruents (e.g., /b/) are found in certain languages (e.g., Arabic) while their voiceless counterparts (e.g., /p/) are not. All these discussions show that the notion of markedness needs to be revisited. The validity of the Markedness Differential Hypothesis and thus the appropriateness of its theoretical constructs are also yet to be determined.
2. THIS STUDY
The explanatory power of the Markedness Differential Hypothesis on the learning of English pronunciation by Cantonese learners has not been the focus of much second language acquisition research. Though there has been supporting evidence showing the compliance of the interlanguage phonology of Cantonese speakers with certain universal principles (Eckman 1984, 1987), such as the Resolvability Principle (Eckman 1991) and the typological universal concerning voicing contrasts in word-final obstruents (Eckman 1981b), many universal generalizations have not been
investigated. It is not clear, for instance, to what extent the Markedness Differential Hypothesis is valid for predicting and explaining the relative degree of difficulty for Cantonese speakers in pronouncing word-final obstruents and sonorant consonants. Eckman (1984) documents two implicational relations that are relevant to the present study:
(1) Universal implicational relations
a. Word-final voiced obstruents imply word-final voiceless obstruents.
b. Word-final voiceless obstruents imply word-final sonorant consonants.
These two implicational universals entail the following markedness hierarchy (where “>” means “is more marked than”):
(2) Markedness ranking in word-final position
voiced obstruents > voiceless obstruents > sonorant consonants
According to the Markedness Differential Hypothesis, then, for second language learners whose native language differs from the target language in the system of word-final consonants, sonorant consonants should be the easiest to learn and voiced obstruents the most difficult. While it is true that many Cantonese learners of English encounter difficulties with English word-final obstruents, it has also been observed that, despite their being universally less marked, word-final nasals following diphthongs and word-final /l/ also pose tremendous problems for Cantonese learners of English (Chan and Li 2000). In this context, a study was carried out to analyze the interlanguage data of Cantonese English as a second language (ESL) learners in Hong Kong, in an attempt to investigate the validity of the Markedness Differential Hypothesis for second language phonology acquisition by these learners. The relative degree of difficulty between the three categories of consonants, namely voiced
obstruents, voiceless obstruents, and sonorant consonants, is the centre of the study. If the results of the study show that learner difficulties conform to the markedness relationships documented, this will support the Markedness Differential Hypothesis. However, if it is shown that some Cantonese learners of English encounter difficulties in word-final consonants that do not parallel the markedness relationships, the Markedness Differential Hypothesis will be undermined.
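The test logic described above can be sketched in code. The following is a minimal Python illustration and is not part of the original study: the function name `conforms_to_ranking`, the decision to treat ties as conforming, and all the figures are assumptions introduced for illustration only.

```python
# Markedness ranking (2), listed from most marked to least marked.
RANKING = ["voiced obstruents", "voiceless obstruents", "sonorant consonants"]


def conforms_to_ranking(target_rates, ranking=RANKING):
    """Check whether accuracy never drops as markedness decreases.

    target_rates maps each category to its proportion of target-like
    productions (0.0 to 1.0). Ties are treated as conforming, which is
    an assumption of this sketch rather than a claim of the hypothesis.
    """
    rates = [target_rates[category] for category in ranking]
    # Each more marked category should be no easier than the next one.
    return all(more <= less for more, less in zip(rates, rates[1:]))


# Hypothetical figures for illustration only (not results from the study).
supports = {
    "voiced obstruents": 0.10,
    "voiceless obstruents": 0.60,
    "sonorant consonants": 0.90,
}
undermines = {
    "voiced obstruents": 0.10,
    "voiceless obstruents": 0.80,
    "sonorant consonants": 0.50,
}

print(conforms_to_ranking(supports))    # True
print(conforms_to_ranking(undermines))  # False
```

A pattern like the second one, where a less marked category is harder than a more marked one, is the kind of outcome that would count against the hypothesis.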
3. DIFFERENCES BETWEEN ENGLISH AND CANTONESE
English differs from Cantonese in both the inventory of permissible word-final consonants and the articulation of the segments. In terms of inventory, while all English consonants except /h j w/ can occur in syllable-final (coda) position,1 only the nasals /m n ŋ/ and the voiceless plosives /p t k/ can occur in syllable-final position in Cantonese. Other obstruents, such as voiced plosives, fricatives (voiced or voiceless), affricates (voiced or voiceless), and other sonorant consonants, such as the lateral /l/, are not allowed in syllable-final position (Chan and Li 2000). In terms of articulation, whereas English final plosives in isolated words are often released and those in connected speech are also sometimes released, final plosives in Cantonese are obligatorily unreleased regardless of speech rate. For the voiceless bilabial /p/, the lips remain closed; for the voiceless alveolar /t/, the tongue tip clings to the alveolar ridge; and for the voiceless velar /k/, the back of the tongue touches the velum and remains there without air being released (Chan and Li 2000). The articulation of the sonorant consonant /l/ also differs significantly in the two languages because of distributional differences (and corresponding allophonic variations). In Cantonese, /l/ always surfaces as a clear [l] with the raising of the front of the tongue (secondary articulation) in addition to the primary articulation that is characteristic of an alveolar lateral. In English, /l/ in syllable-final position often surfaces as a velarized, dark [ɫ] with the back of the tongue raised (Ladefoged 2006; see also Sproat and Fujimura 1993).
1 In Received Pronunciation (RP) English, the liquid /r/ does not occur in syllable-final position, although it is found syllable finally in many other varieties (e.g., North American English).
4. OBJECTIVES Given that the Markedness Differential Hypothesis predicts difficulty on the basis of differences between the target language and the native language, and that there exist significant differences between the consonantal systems of English and Cantonese, the basic requirements for testing the hypothesis are met. The objectives of the study are (i) to investigate the extent to which the Markedness Differential Hypothesis, as suggested by Eckman (1977), is valid for describing the acquisition of English word-final singleton consonants by Cantonese learners of English as a second language, and (ii) to look into the relevance of universal markedness (voiced obstruents > voiceless obstruents > sonorant consonants)2 to the interlanguages of Cantonese ESL learners.
5. METHODOLOGY The research methodology of the present study is modelled on that of similar studies, such as Eckman (1991).
2 Because non-rhotic accents are widespread in Hong Kong, word-final /r/ is not investigated in the present study. Thus, only the sonorant consonants /m n ŋ l/ in word-final position are investigated.
5.1. Participants
Twelve Hong Kong Cantonese ESL learners at intermediate and advanced levels of English proficiency participated in the study. The participants included six students from a local secondary school, all in Forms 4 or 5 (five females and one male), and six first- or second-year local university students, all English majors (three females and three males).3 The ages of the students ranged from 15 to 25 years at the time of the study, and they all started learning English as a second language at four or five years of age. The secondary students had not received any formal phonetics training before, but all the university students had taken at least one course (lasting 13 weeks) in English phonetics and phonology during their first year of university studies. They learned the accent of Received Pronunciation English.
Three native speakers of English (one female and two males) residing in Hong Kong served as a comparison group to provide baseline data. They were between 23 and 35 years of age at the time of the study. They had been in Hong Kong for different lengths of time, ranging from one year to 23 years. All the native speakers of English had received formal phonetics training comparable to that received by the university participants. They all had experience teaching English to ESL students in Hong Kong or elsewhere, and two of them had extensive experience teaching English pronunciation. They were chosen because they all speak English as their first language and their accents could be considered representative of Standard Englishes. One of the native speakers of English (female, 23 years old) was born in Hong Kong. She received her primary and secondary education largely at international schools and uses English as her first language for daily communication, study, and work. At the age of 16, she started teaching English to local students. Her accent is accepted as native by locals and expatriates in Hong Kong.
3 Form 4 and 5 students in Hong Kong are comparable to grade 10 and 11 students, respectively, in the U.S. and Canada. The participants’ proficiency levels were identified based on their class levels: Form 4 and 5 students were categorized as intermediate, and university English majors were classified as advanced. This classification is not without problems, because the English proficiency of different students at similar class levels may differ due to individual differences. However, as no comparison was made between the two groups, it is not known if or how such differences affected the results reported here.
5.2. Data Collection Procedures Each participant performed four speech tasks during a single 20-minute session in a quiet room. The instructions for each task were given in English, written on a piece of paper, and a research assistant explained the instructions in either Cantonese or English depending on the participants’ preference. The participants’ performance in the four tasks was recorded using a high-quality portable mini-disk recorder (SONY MZR910).
5.2.1. Task 1: Reading of word lists The participants read a randomized list of 167 monosyllabic and disyllabic words one by one. So that they would not be distracted or impeded by long and difficult words, only high-frequency monosyllabic and disyllabic words such as cup, meal, sing, and lemon were included. Care was taken to ensure that different preceding vowel environments were included. For example, the list included words with final nasals following diphthongs (such as nine and lime), as well as words with final nasals following pure
vowels, long or short (such as ten and deem), or high or low (such as teen and palm).4
5.2.2. Task 2: Picture description The participants looked at a series of 101 pictures depicting different objects, actions, or scenes and were asked to produce a word appropriate to the content of each of the pictures. Cues eliciting the appropriate response were given where necessary.5 The aim of this task was to elicit words with the target final consonants without the use of spelling cues such as those used in the word-list reading task, thus eliminating the possibility of visually prompting the use of the target consonants. Although a context such as a cueing sentence or phrase was provided for some of the pictures, the participants were asked to say just the target word in isolation, not the whole sentences or phrases.
4 Words with complex codas of the form rC, such as fork or shark, were also included in the study because none of the participants is a rhotic speaker and the orthography of forms with post-vocalic /r/ does not seem to have influenced the participants’ performance on the target consonants.
5 Examples of the cues given to the participants included:
i. a picture showing a girl eating an ice-cream to elicit the word eat;
ii. a picture showing a person jumping into the swimming pool, together with a cueing clause He is jumping into the swimming ____ to elicit the word pool.
These cues were given on the picture cards in order to facilitate the participants’ understanding, and thus description, of the pictures.
5.2.3. Task 3: Reading of passages
For the third task, the participants read three passages, each 250–350 words in length: a narrative passage, a descriptive passage, and a fable. Only simple passages were included, because academic articles or technical writings often consist of unfamiliar vocabulary items that would hinder students’ reading fluency. The passages were selected specifically for the study to elicit words containing the final consonants under investigation. The use of three different short passages instead of one long passage ensured that a variety of topics and words were included. Their length was so decided in order to sustain participants’ interest and attention.
5.2.4. Task 4: Conversational interview
Since spontaneous speech would produce speech samples more akin to performance in a real communicative situation, each participant was interviewed individually for the elicitation of spontaneous speech. The participants were given a choice of topics relating to personal experience and were asked to select one for a 15-minute discussion. Examples of the conversation topics included My favourite hobby, The movie star I like best, and My friends and family, among others. Topics related to personal experience were offered because such topics are more likely to elicit spontaneous speech than topics relating to politics or world affairs. The interviews were conducted in a conversational manner, with the interviewer asking cueing questions to help elicit responses from the participants in case they had difficulty continuing.
In the design of the test materials, care was taken to ensure a similar number of test items across the three categories and within each category. However, this was difficult to achieve, because English has more nasals than laterals (only /l/). As a result, more words with English nasals were elicited or cued than words with the English lateral.
5.3. Data analysis methods A total of 3658 tokens of voiced obstruents, 4645 tokens of voiceless obstruents, and 6056 tokens of sonorant consonants were analyzed and transcribed by two transcribers who had attended a series of coaching sessions conducted by the researcher to ensure accuracy and consistency. Both the transcribers were very proficient in English (having each obtained a First Class Honours degree in English), had received formal training in linguistics and phonetics, were well versed in phonetic transcription, and had taught English to local students.
5.3.1. Accuracy judgment
For a study like the present one, human transcription of the recordings is sufficient, because the features of the final consonants under investigation, such as the release (or non-release) of a word-final plosive, the voicing (or non-voicing) of a voiced consonant, and the presence (or absence) of a nasal, can be easily identified without the help of instrumental analysis. To ensure reliability, the study tracked both inter-rater and intra-rater judgments. For productions that were regarded as difficult to judge, the two transcribers listened to the recordings at least twice, on two different occasions. In examining the participants’ pronunciation of a certain segment, they took into account all the features associated with it, including the manner of articulation, the place of articulation, and the state of the glottis (Roach 2000). The precise ways the target words were produced by each speaker were also noted. These included, among others, the substitution sounds used to replace a particular target sound, the presence or absence of final voicing (for voiced sounds), and the presence or absence of final release (for plosives).
Although Hong Kong is cosmopolitan and different varieties of English are used by both native and non-native English speakers, the accent most widely taught at schools and taken as the norm is Received Pronunciation (RP) English. For this reason, RP was taken as the norm in the data transcription process. The two transcribers’ transcriptions were compared. Original inter-rater reliability was 90%, 90%, 91%, and 88% for the word list reading, picture list reading, passage reading, and conversation tasks respectively (89% overall),6 which were considered acceptable rates. Where discrepancies in transcription occurred, the researcher listened to the items and made a third judgment, and chose the majority option.
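The reliability measure used here (the number of identical transcriptions divided by the total number of transcriptions) and the majority rule for resolving discrepancies can be sketched as follows. This is a minimal illustration, assuming each transcription is reduced to a single label per token; the function names and the sample labels are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def percent_agreement(transcriber_a, transcriber_b):
    """Percentage of tokens transcribed identically by two transcribers."""
    if len(transcriber_a) != len(transcriber_b):
        raise ValueError("Both transcribers must cover the same tokens.")
    if not transcriber_a:
        raise ValueError("At least one token is required.")
    identical = sum(a == b for a, b in zip(transcriber_a, transcriber_b))
    return 100 * identical / len(transcriber_a)


def majority_label(label_a, label_b, label_c):
    """Return the label chosen by at least two of three judges.

    Returns None when all three judgments differ; how such a case was
    handled is not stated in the text, so None is an assumption here.
    """
    label, count = Counter([label_a, label_b, label_c]).most_common(1)[0]
    return label if count >= 2 else None


# Hypothetical labels for five tokens (illustration only).
first = ["unreleased", "released", "unreleased", "devoiced", "target"]
second = ["unreleased", "released", "released", "devoiced", "target"]

print(percent_agreement(first, second))  # 80.0
print(majority_label("unreleased", "released", "released"))  # released
```

Simple percentage agreement of this kind does not correct for agreement expected by chance; the sketch follows the definition given in the text rather than a chance-corrected statistic.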
6 Inter-rater reliability was computed by dividing the number of identical transcriptions made by the two transcribers by the total number of transcriptions made.
5.3.2. Data treatment
A frequency count was used to arrive at the participants’ performance on each target consonant and their overall performance on the three categories of consonants: sonorant consonants (subclassified into nasals and lateral), voiceless obstruents (subclassified into plosives, fricatives, and affricate), and voiced obstruents (subclassified into plosives, fricatives, and affricate). Separate frequency counts were carried out to analyze the participants’ performance in each task, and a summative frequency count was done to compute their overall performance in the four tasks. Productions that deviated from the target language norms, such as phone substitutions, insertions, or deletions, were counted as non-target productions, and those that were in line with target-language norms or were produced in comparable ways by native speakers were counted as target productions. The average percentage of target productions of each individual consonant (by each participant) was obtained by dividing the total number of target productions by the total number of tokens cued or attempted. The average percentage of target productions of each category of consonants was calculated in a similar fashion.
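The frequency count described above amounts to dividing target productions by tokens cued or attempted, per consonant or per category. A minimal Python sketch is given below, assuming each token has already been judged target or non-target; the function name `target_percentages` and the sample tokens are hypothetical and serve only to illustrate the arithmetic.

```python
from collections import defaultdict


def target_percentages(tokens):
    """Percentage of target productions for each group of tokens.

    tokens is an iterable of (group, is_target) pairs, where group may be
    an individual consonant or a category, and is_target is True when the
    production was judged target-like.
    """
    totals = defaultdict(int)
    targets = defaultdict(int)
    for group, is_target in tokens:
        totals[group] += 1
        if is_target:
            targets[group] += 1
    return {group: 100 * targets[group] / totals[group] for group in totals}


# Hypothetical judgments (illustration only, not data from the study).
sample = [
    ("nasal", True),
    ("nasal", True),
    ("nasal", False),
    ("lateral", False),
    ("lateral", True),
]

result = target_percentages(sample)
print(result["lateral"])          # 50.0
print(round(result["nasal"], 1))  # 66.7
```

Running the same function over the tokens of one task, or over all four tasks pooled, would give the per-task and summative counts respectively.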
6. RESULTS Because the main objective of the study is to examine the explanatory power of the Markedness Differential Hypothesis, the relative degree of difficulty between the three categories of consonants should be the focus of comparison. However, a preview of the results of the study (see below) reveals that certain subcategories of consonants (e.g., lateral) within a particular category (e.g., sonorant consonants) are significantly more problematic than other subcategories (e.g., nasals) within the same category. The following discussion of results will therefore focus on the subcategories within each category.
6.1. Participants’ performance on voiceless obstruents
The participants’ performance on word-final voiceless plosives is characterized by a strong tendency towards non-release. Over 54% of the total number of plosives cued are unreleased: 17%, 28%, 53%, and 70% in the word-list reading, picture-list reading, passage reading, and conversation tasks respectively; see Table 1. Thus, words such as trap and shout are pronounced [træp̚] and [ʃaʊt̚] respectively. Such performance is in consonance with earlier findings on the pronunciation of voiceless plosives by Cantonese speakers (Bolton and Kwok 1990; Chan and Li 2000).

Table 1: Percentages of non-release of voiceless plosives produced by the participants and the comparison group

| Participants | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| S1 | 21% | 37% | 45% | 66% | 53% |
| S2 | 31% | 44% | 92% | 93% | 81% |
| S3 | 4% | 33% | 60% | 53% | 48% |
| S4 | 21% | 59% | 67% | 84% | 67% |
| S5 | 21% | 30% | 59% | 89% | 64% |
| S6 | 17% | 26% | 27% | 54% | 36% |
| S7 | 24% | 48% | 71% | 85% | 68% |
| S8 | 0% | 4% | 42% | 70% | 46% |
| S9 | 10% | 0% | 14% | 50% | 28% |
| S10 | 0% | 0% | 42% | 63% | 43% |
| S11 | 21% | 7% | 61% | 70% | 57% |
| S12 | 24% | 48% | 55% | 58% | 52% |
| Average | 17% | 28% | 53% | 70% | 54% |

| Comparison group | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| C1 | 0% | 7% | 84% | 84% | 67% |
| C2 | 21% | 4% | 78% | 73% | 62% |
| C3 | 21% | 0% | 80% | 82% | 65% |
| Average | 14% | 4% | 80% | 80% | 65% |
As for fricatives and the affricate /tʃ/, substitution of a non-target sound for a target sound is noted, though infrequently for fricatives and very rarely for /tʃ/. Examples of substitution include the replacement of /θ/ (e.g., tooth) by [f]. The percentage of non-target productions made for fricatives is about 6% in the four tasks (6%, 5%, 3%, and 9% in the word-list reading, picture-list reading, passage reading, and conversation tasks respectively), whereas the percentage of non-target productions made for /tʃ/ is about 1% in the four tasks (0%, 1%, 2%, and 1% in the word-list reading, picture-list reading, passage reading, and conversation tasks respectively). See Table 2.

Table 2: Percentages of non-target productions of voiceless fricatives and affricates by the participants

| Voiceless fricatives | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| S1 | 6% | 0% | 0% | 1% | 1% |
| S2 | 0% | 0% | 0% | 0% | 0% |
| S3 | 0% | 9% | 7% | 0% | 4% |
| S4 | 0% | 0% | 0% | 0% | 0% |
| S5 | 6% | 0% | 0% | 0% | 1% |
| S6 | 11% | 0% | 5% | 57% | 28% |
| S7 | 0% | 0% | 2% | 9% | 4% |
| S8 | 6% | 9% | 2% | 2% | 3% |
| S9 | 6% | 18% | 7% | 12% | 10% |
| S10 | 6% | 0% | 11% | 0% | 6% |
| S11 | 22% | 0% | 5% | 9% | 9% |
| S12 | 6% | 18% | 2% | 3% | 4% |
| Average | 6% | 5% | 3% | 9% | 6% |

| Voiceless affricates | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| Participants | 0% for all participants | 0% for all participants except S9 (17%) | 0% for all participants except S1 (29%) | 0% for all participants except S2 (9%) | 0% for all participants except S1 (8%), S2 (3%), and S9 (4%) |
| Average | 0% | 1% | 2% | 1% | 1% |
Non-release of word-final voiceless plosives is also common among the comparison group, but is typically limited to the passage reading and interview tasks: 80% of the final voiceless plosives are unreleased in these two tasks. Non-release is found not only when a final plosive is followed by an initial consonant across word boundaries, but also when the plosive is phrase final or when it precedes a pause. Unlike the Cantonese participants, the native speakers of English rarely leave final plosives unreleased in isolated words in the word-list and picture-list reading tasks. Only about 9% were unreleased: 14% and 4% in the word-list and picture-list reading tasks respectively; see Table 1. Non-release of final voiceless plosives, being a phenomenon widely accepted by the native speaker community, is therefore not regarded as non-target-like for the participants.
6.2. Participants’ performance on voiced obstruents The Cantonese participants have a very strong tendency to devoice word-final voiced obstruents: nearly all the instances of voiced fricatives and affricate cued or attempted are devoiced by the participants without compensation strategies such as lengthening of preceding vowels; see Table 3. Non-release of final (voiced) plosives is predominant: 61% of the voiced plosives cued or attempted in the four tasks are unreleased by the participants; 33%, 37% , 64%, and 81% in the word-list reading, picture-list reading, passage reading, and conversation tasks, respectively; see Table 4. Because of this, the systematic contrast between voiced and voiceless final plosives is neutralized in many cases. For those voiced plosives that are indeed released, nearly all the instances are devoiced. Such results are in line with previous studies that investigated production of word-final consonants by learners of different native languages (e.g., Flege et al. 1992). Table 3: Percentages of devoicing of final obstruents by the participants and the comparison group
Percentages of Devoicing of Final Voiced Fricatives (Participants)

| | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| Individual participants | 100% for all participants | 100% for all participants except S11 (67%) | 100% for all participants | 100% for all participants except S7 (99%) | 100% for all participants except S7 and S11 (99%) |
| Average | 100% | 97% | 100% | 100% | 100% |

Percentages of Devoicing of Final Voiced Affricates (Participants)

| | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| Individual participants | 100% for all participants except S10 (11%) | 100% for all participants | No data | 100% for all participants except S1, S3, S4 and S6 (no data) | 100% for all participants except S10 (92%) |
| Average | 99% | 100% | No data | 100% | 99% |

Percentages of Devoicing of Released Plosives (Comparison group)

| | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| C1 | 93% | 100% | 17% | 37% | 45% |
| C2 | 47% | 33% | 41% | 42% | 41% |
| C3 | 60% | 0% | 19% | 23% | 24% |
| Average | 67% | 43% | 25% | 34% | 36% |

Percentages of Devoicing of Final Voiced Fricatives (Comparison group)

| | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| C1 | 52% | 33% | 3% | 4% | 9% |
| C2 | 0% | 0% | 7% | 6% | 6% |
| C3 | 10% | 0% | 0% | 0% | 1% |
| Average | 21% | 11% | 3% | 4% | 5% |

Percentages of Devoicing of Final Voiced Affricates (Comparison group)

| | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| C1 | 88% | 100% | No data | 100% | 92% |
| C2 | 88% | 50% | No data | 20% | 50% |
| C3 | 100% | 0% | No data | No data | 80% |
| Average | 92% | 50% | No data | 33% | 69% |
Table 4: Percentages of non-release of voiced plosives produced by the participants
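Percentages of Non-Release of Voiced Plosives (Participants)

| | Word list | Picture list | Passages | Conversation | Total |
|---|---|---|---|---|---|
| S1 | 50% | 58% | 74% | 85% | 69% |
| S2 | 60% | 46% | 100% | 100% | 86% |
| S3 | 23% | 33% | 63% | 42% | 48% |
| S4 | 13% | 46% | 67% | 79% | 57% |
| S5 | 33% | 25% | 76% | 81% | 64% |
| S6 | 62% | 55% | 67% | 90% | 72% |
| S7 | 36% | 27% | 77% | 92% | 69% |
| S8 | 0% | 0% | 15% | 73% | 27% |
| S9 | 27% | 33% | 43% | 62% | 44% |
| S10 | 0% | 8% | 61% | 92% | 46% |
| S11 | 27% | 25% | 55% | 76% | 56% |
| S12 | 73% | 92% | 90% | 89% | 87% |
| Average | 33% | 37% | 64% | 81% | 60% |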
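As a rough cross-check of Table 4, the per-task averages can be recomputed from the individual percentages. The sketch below is illustrative only: it takes an unweighted mean over the 12 participants, whereas the reported averages may have been pooled over tokens (the token counts are not given here), so small discrepancies are to be expected. The names `NON_RELEASE` and `unweighted_means` are assumptions made for this example.

```python
# Percentages of non-release of voiced plosives, copied from Table 4.
# Order of values: word list, picture list, passages, conversation, total.
TASKS = ["word list", "picture list", "passages", "conversation", "total"]

NON_RELEASE = {
    "S1": [50, 58, 74, 85, 69],
    "S2": [60, 46, 100, 100, 86],
    "S3": [23, 33, 63, 42, 48],
    "S4": [13, 46, 67, 79, 57],
    "S5": [33, 25, 76, 81, 64],
    "S6": [62, 55, 67, 90, 72],
    "S7": [36, 27, 77, 92, 69],
    "S8": [0, 0, 15, 73, 27],
    "S9": [27, 33, 43, 62, 44],
    "S10": [0, 8, 61, 92, 46],
    "S11": [27, 25, 55, 76, 56],
    "S12": [73, 92, 90, 89, 87],
}


def unweighted_means(table):
    """Mean of the per-participant percentages for each task (no token weighting)."""
    n = len(table)
    return [sum(row[i] for row in table.values()) / n for i in range(len(TASKS))]


if __name__ == "__main__":
    for task, mean in zip(TASKS, unweighted_means(NON_RELEASE)):
        print(f"{task:>12}: {mean:.1f}%")
```

Under this unweighted reading the means stay within about two percentage points of the averages reported in Table 4, and they rise steadily from the word-list task to the conversation task.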
Devoicing is also found among the native speakers of English, but it is often accompanied by lengthening of preceding vowels. For example, sad is pronounced [sæːd̥] with a lengthened [æː]. A total of 36% of final (released) plosives are devoiced by the comparison group: 67%, 43%, 25%, and 34% in the word-list reading, picture-list reading, passage reading, and conversation tasks, respectively; see Table 3. Voiced fricatives, especially /z v ð/, are seldom devoiced (5% overall; Table 3), but devoicing of the affricate /dʒ/ is quite common (69%; Table 3). Although devoicing is also found among the comparison group, a comparison between the participants’ performance and the native speakers’ performance suggests that devoicing of final obstruents without lengthening of preceding vowels is much more common among the participants and is thus regarded as non-target-like.
6.3. Participants’ performance on sonorant consonants In the present study, the lateral /l/ is found to be one of the most difficult segments for Cantonese participants despite the fact that other sonorant consonants, namely nasals, do not pose many problems. Relatively few non-target productions are made for the final nasals cued or attempted in the study: only about 2%, 6%, and 9% of /m n ŋ/, respectively, are modified (an average of 5%; see Table 5). Most of the non-target productions are substitutions of a non-target sound for a target sound (e.g., [n] for /m/ in dim). Omission is also occasionally found (e.g., sign pronounced [saɪ]). Vocalization and omission are the most common strategies employed to cope with /l/. About 90% of /l/ are modified, either by omission or by vocalization to a [u]-like vowel (Table 5). Omission is typically found when a preceding vowel is [+back], such as /ɔː/ (e.g., call), but vocalization is found in various contexts regardless of the frontness or backness of the preceding vowel. Thus, the word hill, which has a preceding front vowel, is often pronounced [hɪu], and the word ball, which has a preceding back vowel, is often pronounced [bɔːu]. It should be noted that when the [u]-like vowel is used to replace /l/, it is likely to surface as the sonorant [w] and be syllabified in the nucleus of the syllable as the second member of a [-w] diphthong (sound combinations such as /i:w/ and /a:w/ are sometimes regarded as diphthongs in Cantonese; Bauer and Benedict 1997). This is in accordance with recent spectrographic studies that show that Cantonese ESL learners often use a velar glide [w], rather than a [u]-like vowel, to substitute for /l/ (Hung 2000).
Table 5: Percentages of non-target productions made to the different sonorant consonants by the participants
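| Segment | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| /l/ | 98% | 97% | 100% | 98% | 94% | 90% | 100% | 91% | 93% | 59% | 97% | 75% | 90% |
| /m/ | 0% | 0% | 1% | 10% | 1% | 0% | 0% | 0% | 1% | 0% | 1% | 14% | 2% |
| /n/ | 10% | 14% | 12% | 10% | 1% | 1% | 9% | 4% | 6% | 2% | 3% | 5% | 6% |
| /ŋ/ | 29% | 1% | 13% | 13% | 3% | 22% | 1% | 2% | 9% | 0% | 7% | 1% | 9% |
| Nasals as a group | 10% | 6% | 9% | 10% | 1% | 4% | 5% | 1% | 6% | 1% | 5% | 6% | 5% |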
Vocalization of final /l/ is also found among the comparison group, but it is typically limited to words with a labial articulation such as careful or people, in line with Cruttenden (2001). Unlike Cantonese participants, native speakers of English do not exhibit vocalization of /l/ in other contexts such as ill or ball, and it is not found in the word-list reading task at all (Table 6).⁷ In view of the significant differences between the participants’ performance and the native speakers’ performance, as well as the contexts in which the phenomenon is found, vocalization of /l/ by the participants is regarded as non-target-like alongside other non-target productions of sonorant consonants such as omission and substitution.
Table 6: Percentages of vocalization of laterals produced by the comparison group
| Task | C1 | C2 | C3 | Average |
|---|---|---|---|---|
| Word list | 0% | 0% | 0% | 0% |
| Picture list | 17% | 17% | 0% | 11% |
| Passages | 24% | 0% | 29% | 18% |
| Conversation | 0% | 24% | 0% | 7% |
| Total | 14% | 9% | 10% | 11% |
⁷ Vocalization of /l/ is common in many dialects of English (e.g., Cockney English, Glasgow English, Scottish English).
6.4. The three categories in comparison The participants’ different performances on specific subsets of the same superset, that is, lateral versus nasals for the set of sonorant consonants, have significant effects on their overall performance on the superset. Since the participants demonstrate poorer performance on final /l/ than on final nasals, the actual number of tokens in which the final lateral is cued or attempted may have substantial effects on the overall results of the category of sonorant consonants. Had the number of words containing a final lateral been increased, the overall results of the category of sonorant consonants would have worsened. Conversely, had the number of words containing a final lateral been decreased, the overall results would have improved. The participants’ performance on a superset, thus, seems to be highly dependent on the relative frequency of occurrence of the subsets. Because of such inconsistent performance, comparisons between the three categories of voiceless obstruents, voiced obstruents, and sonorant consonants may be misleading. Nonetheless, it is obvious from the above discussion that the participants’ performance on the lateral /l/, a sonorant consonant, is much worse than their performance on voiceless obstruents, although their performance on voiced obstruents remains the poorest (Table 7).
Table 7: Percentages of non-target productions made to the three categories of consonants by the participants
Percentages of non-target productions made

| Participant | Voiceless obstruents (non-release of plosives not included) | Voiced obstruents | Sonorant consonants / Laterals only |
|---|---|---|---|
| S1 | 6% | 100% | 29% / 98% |
| S2 | 6% | 100% | 25% / 97% |
| S3 | 5% | 100% | 27% / 100% |
| S4 | 5% | 100% | 27% / 98% |
| S5 | 5% | 100% | 18% / 94% |
| S6 | 6% | 100% | 24% / 90% |
| S7 | 7% | 100% | 25% / 100% |
| S8 | 7% | 100% | 17% / 91% |
| S9 | 6% | 100% | 22% / 93% |
| S10 | 4% | 100% | 12% / 59% |
| S11 | 4% | 99% | 19% / 97% |
| S12 | 5% | 100% | 18% / 75% |
| Average | 6% | 100% | 22% / 90% |
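The dependence of a superset's score on the mix of its subsets, described in section 6.4, can be illustrated as a weighted average. The following is a minimal sketch, not the study's own calculation: it assumes that the overall sonorant rate is a token-weighted mean of the lateral and nasal rates, it uses the rounded averages from Tables 5 and 7 (90% for /l/, 5% for nasals, 22% overall), and the function names are hypothetical.

```python
def superset_rate(lateral_share, lateral_rate=0.90, nasal_rate=0.05):
    """Non-target rate for sonorants as a token-weighted mean of its two subsets."""
    return lateral_share * lateral_rate + (1 - lateral_share) * nasal_rate


def implied_lateral_share(overall, lateral_rate=0.90, nasal_rate=0.05):
    """Share of lateral tokens implied by an overall rate, under the same assumption."""
    return (overall - nasal_rate) / (lateral_rate - nasal_rate)


if __name__ == "__main__":
    share = implied_lateral_share(0.22)
    print(f"implied share of /l/ tokens: {share:.2f}")
    # Vary the share of lateral tokens to see how the superset rate shifts.
    for s in (0.10, share, 0.40):
        print(f"lateral share {s:.2f} -> sonorant non-target rate {superset_rate(s):.1%}")
```

With these rounded figures, about one sonorant token in five would be a lateral. Under the same assumption, doubling that share would raise the sonorant rate from 22% to roughly 39%, while halving it would lower the rate to roughly 14%, with no change in the learners' performance on any individual segment.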
7. DISCUSSION The previous section outlined the participants’ performance on the three categories (and subcategories) of consonants for four different tasks. In light of the results, this section considers the interlanguages of Cantonese ESL learners, the adequacy of the Markedness Differential Hypothesis and its theoretical constructs, and the validity of implicational universals.
7.1. Cantonese ESL learners’ acquisition of English word-final singleton consonants The results of the study show that the Cantonese ESL participants encounter some difficulties in acquiring English word-final voiced obstruents and the lateral /l/ regardless of their language-training backgrounds. Despite their having learned Received Pronunciation English for at least one semester, the university participants, like their secondary school counterparts with no phonetics training, show a high percentage of devoicing of voiced obstruents and vocalization or omission of /l/. Although the phenomena noted are also found in the comparison group, the percentages of such productions made by the Cantonese participants are much higher, and there is no evidence of alternative pronunciation strategies to compensate for the non-target productions. In view of the fact that Received Pronunciation, or a standard model for pronunciation, is what most Hong Kong speakers (both teachers and students) aspire to, we have reason to believe that devoicing of voiced obstruents and vocalization of /l/ are indications of participants’ acquisitional difficulties. Nasals and voiceless obstruents, on the other hand, do not pose many problems for Cantonese ESL learners. The participants’ performance on nasals, voiceless fricatives, and the voiceless affricate /tʃ/ is largely unproblematic. Their performance on voiceless plosives may be the result of mother-tongue interference and their lack of awareness of the typical features of English plosives, but given the equally widespread non-release of final plosives by the comparison group in similar contexts, there is no hard and fast evidence to suggest acquisitional difficulties in this respect.
7.2. The predictions of the Markedness Differential Hypothesis The participants’ performance patterns suggest that the relative degree of difficulty between the different categories of consonants does not invariably parallel the predictions of the Markedness Differential Hypothesis. While the relative degree of difficulty between word-final voiceless and voiced obstruents does receive significant support, the degree of difficulty between word-final sonorant consonants and word-final voiceless obstruents does not. Participants encounter more difficulties with final /l/ (a less marked item) than with voiceless obstruents (a more marked item), and they make many more non-target productions to the former than to the latter, to a degree which is not found with the comparison group.
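The comparison made in this section can be spelled out by checking each pairwise prediction against the average rates in Table 7. This is an illustrative sketch only: the markedness ordering encoded below (voiced obstruents more marked than voiceless obstruents, which are in turn more marked than sonorant consonants, including /l/) is taken from the discussion above, and the data structures and the function name are assumptions made for the example.

```python
# Average percentages of non-target productions, from Table 7.
# The voiceless figure excludes non-release of plosives, as in the table.
OBSERVED = {
    "voiced obstruents": 100,
    "voiceless obstruents": 6,
    "sonorant consonants": 22,
    "lateral /l/": 90,
}

# (more marked, less marked) pairs: the hypothesis predicts the first is harder.
PREDICTIONS = [
    ("voiced obstruents", "voiceless obstruents"),
    ("voiceless obstruents", "sonorant consonants"),
    ("voiceless obstruents", "lateral /l/"),
]


def check_predictions(observed, predictions):
    """Pair each prediction with True if the more marked item has the higher rate."""
    return [((m, u), observed[m] > observed[u]) for m, u in predictions]


if __name__ == "__main__":
    for (marked, unmarked), ok in check_predictions(OBSERVED, PREDICTIONS):
        verdict = "supported" if ok else "not supported"
        print(f"{marked} harder than {unmarked}: {verdict}")
```

On these averages only the voiced versus voiceless obstruent prediction holds; the two predictions involving sonorants fail, most sharply for the lateral.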
7.3. Markedness relationships between categories and within categories The use of implicational universals as the sole basis of markedness is problematic, especially when the internal make-up of a sound category is taken into consideration. Because different members of a sound category (e.g., sonorant consonants) can form subsets (e.g., lateral and nasals), implicational universals relating one subset to another are important for the determination of the relative markedness between different subsets. If the different subsets of a superset are not equally marked, the markedness relationships between different supersets may not follow (see section 6.4). Cross-linguistic studies of the phonological systems of the world’s languages, however, are not explicit about these subsets. The Markedness Differential Hypothesis, thus, makes no prediction regarding the relative degree of difficulty of the individual segments (or subsets) within a superset. For this reason, predictions made
with regard to the relative degree of difficulty of different supersets may not be borne out. The English lateral is a good example of this problem. The results of this study suggest that /l/ should not be treated as equally marked as English nasals, yet both subsets belong to the same superset of sonorant consonants. The possible effect of /l/ on the relative degree of difficulty between different supersets (i.e., voiced obstruents, voiceless obstruents, sonorant consonants), thus, cannot be explained by a theory that bases its arguments on existing implicational universals, such as the Markedness Differential Hypothesis.
7.4. Allophonic variations and frequency effects It appears from the data of this study that certain factors other than implicational universals should be given due attention when explanations for participants’ performance are invoked. One factor that requires attention is the difference between a phoneme and its allophones. As is well known, phonemes are abstract entities whose allophonic realizations may vary in different contexts. In generalizing universal statements regarding the presence or absence of sounds or sound sequences, linguists often use phonemes, rather than allophones, as the basis. Frequency counts are also made in terms of phonemes (Greenberg 1966). However, the importance of isolating allophones from phonemes has already been observed in the speech learning literature. Strange (1992), for example, has found that Japanese learners of English perceive and produce English liquids more accurately in word-final position than in word-initial position. In his Speech Learning Model, Flege (1995) hypothesizes that positional allophones in the second language are related to the closest positionally defined allophone in the first language. Flege and Wang also conclude that speech production skills must be “learned
on an allophone-by-allophone basis” (1989:303). Allophonic variation is thus an essential part of the description and analysis of a learner’s acquisition of a second language. Different allophones of a phoneme have different allophonic distributions, so an allophone may be more frequent (and more basic) than other less frequent (non-basic) ones that differ from the basic one by possession of a marked feature (Greenberg 1966). The velarized (dark) [ɫ] occurs less frequently than the clear [l] across languages (Maddieson 1984). Although sonorant consonants are less marked than obstruents cross-linguistically, there seems to be a conflict between markedness and frequency in this respect. The infrequent distribution of dark [ɫ], coupled with the secondary articulation which is required in the production of the allophone, may render the English word-final lateral a much more marked element across languages. Thus, this may obscure the relative markedness (and thus the relative degree of difficulty for second language learners) between English sonorant consonants and obstruents (voiced or voiceless) and result in an otherwise unexpected pattern of second language acquisition, such as the one reported in the present study. While there is no doubt that, all things being equal, a marked item should be more difficult to learn than an unmarked item, it is debatable whether implicational universals should be used to form the basis of markedness (Major 1996; Rutherford 1982), and more importantly, whether markedness alone should be used as a predictor of difficulty (Major 2001). As Hume (2004) argues, predictions based on markedness are only made on patterns that are supposed to be universal. The markedness relationship between English sonorant consonants (especially /l/) and obstruents is allophonic and language-specific. Predicting the relative degree of difficulty of second language sounds simply on the basis of universal generalizations about phonemes, as the Markedness Differential Hypothesis does, is far from adequate. Other factors such as allophonic variation, frequency effects, predictability, and the like, must also be taken into account.
8. CONCLUSION In this article, I have investigated the acquisition of English word-final singleton consonants by 12 Cantonese learners of English as a second language in Hong Kong. Learners encounter the most difficulties with voiced obstruents and the lateral /l/, while their performance on other sonorant consonants and on voiceless obstruents is good overall. The results of the study suggest that the Markedness Differential Hypothesis does not make the correct prediction regarding second language phonology acquisition by Cantonese ESL learners, and that implicational universals should not be used as the sole determining factor for markedness. This study has both theoretical and pedagogical implications. On the theoretical side, the data have provided a significant test case for the Markedness Differential Hypothesis and invite further thoughts on its theoretical underpinnings. On the pedagogical side, the findings may serve as input to the focus of pronunciation teaching. Given that the relative degree of difficulty of different subsets of the same superset is different, teaching professionals should devote more attention to the more difficult subset(s) and sequence their teaching materials appropriately. Since only one type of markedness relationship regarding word-final singleton consonants has been investigated, the relationships that exist between other categories of sounds or sound sequences have not yet been dealt with. Learners’ perceptual abilities have not been examined either. As is well known, second language learners often need to precisely perceive new phonemic contrasts before they can produce these same contrasts
accurately. The Speech Learning Model discussed earlier has also been devised on the premise that a learner’s production of a second-language sound is closely related to the way the sound is perceived. Given the focus of the present study, it is unclear how learners’ perceptual abilities might affect their production abilities and whether the Markedness Differential Hypothesis could account for this. Further research is needed to examine Cantonese learners’ acquisition of other phonological segments, such as vowels, as well as their perceptual abilities in differentiating different categories of sounds.
REFERENCES
Anderson, Janet I. 1987. The markedness differential hypothesis and syllable structure difficulty. In Interlanguage phonology: The acquisition of a second language sound system, ed. Georgette Ioup and Steven Weinberger, 279–291. Cambridge: Newbury House.
Anderson, Jennifer L., James L. Morgan, and Katherine S. White. 2003. A statistical basis for speech sound discrimination. Language and Speech 46:155–182.
Bauer, Robert S., and Paul K. Benedict. 1997. Modern Cantonese phonology. New York: Mouton de Gruyter.
Benson, Bronwen. 1986. The markedness differential hypothesis: Implications for Vietnamese speakers of English. In Markedness, ed. Fred R. Eckman, Edith A. Moravcsik, and Jessica R. Wirth, 271–289. New York: Plenum Press.
Best, Catherine T. 1994. The emergence of native-language phonological influences in infants: A perceptual assimilation model. In The development of speech perception: The transition from speech sounds to spoken words, ed. Judith C. Goodman and Howard C. Nusbaum, 167–224. Cambridge, MA: MIT Press.
Bhatia, Tej K. 1995. Acquisition of voicing and aspiration in second language development. In The teaching and acquisition of South Asian languages, ed. Vijay Gambhir, 183–196. Philadelphia: University of Pennsylvania.
Bolton, Kingsley, and Helen Kwok. 1990. The dynamics of the Hong Kong accent: Social identity and sociolinguistic description. Journal of Asian Pacific Communication 1:147–172.
Carlisle, Robert S. 1988. The effect of markedness on epenthesis in Spanish/English interlanguage phonology. Issues and Developments in English and Applied Linguistics 3:15–23.
Chan, Alice Y.W., and David C.S. Li. 2000. English and Cantonese phonology in contrast: Explaining Cantonese ESL learners’ English pronunciation problems. Language, Culture and Curriculum 13:67–85.
Cichocki, Wladyslaw, A.B. House, A.M. Kinloch, and A.C. Lister. 1999. Cantonese speakers and the acquisition of French consonants. Language Learning 49, supplement 1:95–121.
Cruttenden, Alan. 2001. Gimson’s pronunciation of English. 6th ed. London: Arnold.
Eckman, Fred R. 1977. Markedness and the contrastive analysis hypothesis. Language Learning 27:315–330.
Eckman, Fred R. 1981a. On predicting phonological difficulty in second language acquisition. Studies in Second Language Acquisition 4:18–30.
Eckman, Fred R. 1981b. On the naturalness of interlanguage phonological rules. Language Learning 31:195–216.
Eckman, Fred R. 1984. Universals, typologies and interlanguage. In Language universals and second language acquisition, ed. William E. Rutherford, 79–105. Amsterdam: John Benjamins.
Eckman, Fred R. 1985. Some theoretical and pedagogical implications of the markedness differential hypothesis. Studies in Second Language Acquisition 7:289–307.
Eckman, Fred R. 1987. The reduction of word-final consonant clusters in interlanguage. In Sound patterns in second language acquisition, ed. Allan James and Jonathan Leather, 143–162. Dordrecht: Foris.
Eckman, Fred R. 1991. The structural conformity hypothesis and the acquisition of consonant clusters in the interlanguage of ESL learners. Studies in Second Language Acquisition 13:23–41.
Eckman, Fred R. 1996. A functional-typological approach to second language acquisition theory. In Handbook of second language acquisition, ed. William C. Ritchie and Tej K. Bhatia, 195–211. San Diego: Academic Press.
Edge, Beverly A. 1991. The production of word-final voiced obstruents in English by L1 speakers of Japanese and Cantonese. Studies in Second Language Acquisition 13:377–393.
Flege, James E. 1995. Second language speech learning: Theory, findings and problems. In Speech perception and linguistic experience: Issues in cross-language research, ed. Winifred Strange, 233–277. Baltimore: York Press.
Flege, James E., Murray J. Munro, and Laurie Skelton. 1992. Production of the word-final English /t/-/d/ contrast by native speakers of English, Mandarin and Spanish. Journal of the Acoustical Society of America 92:128–143.
Flege, James E., and Chipin Wang. 1989. Native-language phonotactic constraints affect how well Chinese subjects perceive the word-final English /t/-/d/ contrast. Journal of Phonetics 17:299–315.
Greenberg, Joseph H. 1966. Language universals: With special reference to feature hierarchies. The Hague: Mouton.
Harnsberger, James D. 2001. On the relationship between identification and discrimination of non-native nasal consonants. The Journal of the Acoustical Society of America 110:489–503.
Hume, Elizabeth. 2004. Deconstructing markedness: A predictability-based approach. In Berkeley Linguistics Society: Proceedings of the Annual Meeting 2004, 182–198. Department of Linguistics, University of California, Berkeley. (Also available at http://www.ling.ohio-state.edu/~ehume/papers/Hume_markedness_BLS30.pdf).
Hung, Tony T.N. 2000. Towards a phonology of Hong Kong English. World Englishes 19:337–356.
Kellerman, Eric. 1979. The problem with difficulty. Interlanguage Studies Bulletin 4:27–48.
Ladefoged, Peter. 2006. A course in phonetics. 5th ed. Boston: Thomson Wadsworth.
Lado, Robert. 1957. Linguistics across cultures. Ann Arbor: University of Michigan Press.
Leather, Jonathan. 1999. Second-language speech research: An introduction. Language Learning 49, supplement 1:1–56.
Levelt, Clara C., Niels O. Schiller, and Willem J. Levelt. 2000. The acquisition of syllable types. Language Acquisition 8:237–264.
Maddieson, Ian. 1984. Patterns of sounds. Cambridge: Cambridge University Press.
Major, Roy C. 1996. Markedness in second language acquisition of consonant clusters. In Second language acquisition and linguistic variation, ed. Robert Bayley and Dennis R. Preston, 75–96. Amsterdam: John Benjamins.
Major, Roy C. 2001. Foreign accent: The ontogeny and phylogeny of second language phonology. Mahwah: Lawrence Erlbaum Associates.
Major, Roy C., and Michael C. Faudree. 1996. Markedness universals and the acquisition of voicing contrasts by Korean speakers of English. Studies in Second Language Acquisition 18:69–90.
Major, Roy C., and Eunyi Kim. 1999. The similarity differential rate hypothesis. Language Learning 49, supplement 1:151–183.
Odlin, Terence. 1989. Language transfer: Cross-linguistic influence in language learning. Cambridge: Cambridge University Press.
Roach, Peter. 2000. English phonetics and phonology: A practical course. 3rd ed. Cambridge: Cambridge University Press.
Roark, Brian, and Katherine Demuth. 2000. Prosodic constraints and the learner’s environment: A corpus study. In Proceedings of the 24th annual Boston University conference on language development, ed. S. Catherine Howell, Sarah A. Fish, and Thea Keith-Lucas, 597–608. Somerville, MA: Cascadilla Press.
Rose, Yvan. 2003. Place specification and segmental distribution in the acquisition of word-final consonant syllabification. Canadian Journal of Linguistics 48:409–435.
Rutherford, William E. 1982. Markedness in second language acquisition. Language Learning 32:85–108.
Selinker, Larry. 1972. Interlanguage. International Review of Applied Linguistics in Language Teaching 10:209–231.
Sproat, Richard, and Osamu Fujimura. 1993. Allophonic variation in English /l/ and its implications for phonetic implementation. Journal of Phonetics 21:291–311.
Stites, Jessica, Katherine Demuth, and Cecilia Kirk. 2004. Markedness vs. frequency effects in coda acquisition. In Proceedings of the 28th annual Boston University conference on language development, ed. Alejna Brugos, Linnea Micciulla, and Christine E. Smith, 565–576. Somerville, MA: Cascadilla Press.
Stockman, Ida J., and Erna Pluut. 1992. Segment composition as a factor in the syllabification errors of second-language speakers. Language Learning 42:21–45.
Strange, Winifred. 1992. Learning non-native phoneme contrasts: Interactions among subject, stimulus, and task variables. In Speech perception, production and linguistic structure, ed. Yoh'ichi Tohkura, Eric Vatikiotis-Bateson and Yoshinori Sagisaka, 197–219. Tokyo: Ohmsha.
Tarone, Elaine E. 1987. Some influences on the syllable structure of interlanguage phonology. In Interlanguage phonology: The acquisition of a second language sound system, ed. Georgette Ioup and Steven Weinberger, 232–247. Cambridge: Newbury House.
Zobl, Helmut. 1983. Markedness and the projection problem. Language Learning 33:293–313.