Postcolonial Literature and World Englishes: A Corpus-Based Approach of Modes of Representation of the Non-Standard in Writing

The present study investigates the representation of non-standardised varieties of English in literary prose texts. This is achieved by creating and annotating a corpus of literary texts from Scotland, West Africa, and Southeast Asia. The analysis addresses two major topics. Firstly, the extent of representation reveals clearly distinct feature profiles across regions, coupled with varying feature densities. Feature profiles are also relevant to individual characters, as certain traits such as social status, ethnicity, or age can be signalled by linguistic means. The second topic, accuracy of representation, compares the features observed in literary texts with descriptions of the actual varieties, and suggests that representations of varieties may differ from their real-life models in the sense that highly frequent features may be absent from texts, while less frequent but more emblematic ones, or even invented ones, may be used by authors to render a variety of English in their texts.


Introduction
The spread of English as a world language and the subsequent development of numerous varieties during the colonial and postcolonial periods have led to the emergence of local literatures that use local varieties of English. Although literary texts form a written medium, a trait which favours formal and thus standardised language, they contain various instances of non-standardised usage. The possible motivations for authors to make use of non-standardised local language are manifold; we posit three such options: (1) attempts to realistically render the oral speech of characters (see also Mair 1992: 106; Page 1988: 1-23), (2) the attribution of traits associated with non-standardised language forms to characters (see also Blake 1981: 12-13; Mair 1992: 107), (3) a show of support for local language forms in defiance of exo-normative standards (see also Mair 1992: 119-120). While the first motivation relates to aspects of authenticity, realism, and mimesis, the other two motivations are not only of a more symbolic nature, but also serve functional purposes, which are text-internal in (2) and external to the text in (3).
A linguistic analysis of the use of non-standardised language in literary texts has applications for literary scholarship, as the type of representation, ranging from mimetic to symbolic, and the possible motivations for its use, constitute textual devices that can be interpreted. The applications for the field of linguistics are less obvious, as the language attributed to characters in literary texts is generally not made up of transcripts of actually produced speech, but should rather be considered "invented speech" (Schneider 2002: 73). The consequence of this is that literary texts are not the best choice for the description and documentation of contemporary language varieties, for which transcripts of actual speech are better suited. In spite of their "invented" nature, literary texts can serve as a data source to answer other linguistic questions, given that invented utterances ascribed to fictional characters rely on real-life models (Fowler 1989: 114) that the author recollects and adjusts. In other words, authors do not make up non-standardised language out of thin air, but "have at their disposal all the codes and resources of language" (Simpson 1997: 164) from which they can draw. It follows that studying the use of a given language variety in literary texts can yield insights into how this variety is perceived. The present article addresses two major questions: (1) To what extent are non-standardised linguistic features represented in literary texts? The feature profiles resulting from this inquiry have further ramifications related to the types of linguistic features used, the frequency with which they are used, and which characters use them. (2) How accurate is the representation of a linguistic variety in literary texts in comparison to the actual variety? Answering this question provides insights not only into the mimetic or symbolic nature of representation, but also into its authentic/invented dimension. Authors do not necessarily aim at a realistic representation of non-standardised varieties. They may use variation (or variational features) for other purposes such as social characterisation, or support for local non-standard varieties of language.
In a research project called The representations of oral varieties of language in the literature of the English-speaking world, we address these questions by basing our methodology on a corpus-based approach. The text selection process is restricted to prose, so as to exclude the influence of rhyme and metre on syntax in poetic texts, or the instruction to use a particular accent in stage directions rather than representing the accent itself in dramatic texts. Texts are selected from anthologies and collections, so as to better represent the literary landscape. The selection is not limited to well-known or canonical texts; the only criterion is the presence of local linguistic features. The selected texts are then digitised and manually annotated for non-standardised linguistic features, as shown in Table 1.
The annotation is in the XML format and facilitates the storage of the following information for each observed feature: firstly, the feature category, which distinguishes the four major categories 'phonology', 'grammar', 'lexis', and 'code'. The category 'lexis' refers to semantic shifts or lexical innovations compared to standardised varieties. The category 'code' marks instances of code-mixing or code-switching to a language other than English. The category 'phonology' deserves a more detailed definition, as it corresponds to respellings that reflect a change in pronunciation, e.g. a change from <the> /ðǝ/ to <de> /dǝ/, which stands in contrast to respellings that have no bearing on actual pronunciation, e.g. a change from <listen> to <lissen> which still represents /lɪsǝn/. The latter type of respelling makes use of "sound-symbol correspondences which are conventional for the language, but are the 'wrong' ones for the particular 'word'" (Sebba 2007: 34). It has been variously labeled as "eye-dialect" (Preston 1982) or "graphemic substitution" (Androutsopoulos 2000: 522). These respellings are "phonologically unmotivated, giving the impression of non-standardness but not providing any linguistic detail" (Honeybone & Watson 2013: 313), which justifies the distinction between a category named 'phonology' as opposed to linguistically inconsequential respellings.
In addition to these main categories, relevant attributes are marked, which, depending on the feature category, can be the observed feature, the expected equivalent in a standardised variety, the meaning, and the language used; further, grammatical features possess an 'ewave' attribute which corresponds to the identification number for the feature in eWAVE, the Electronic World Atlas of Varieties of English (Kortmann & Lunkenheimer 2013). In addition to linguistic features, character and narrator passages are also annotated, which makes it possible to assess feature profiles for individual characters and narratorial modes. The research project from which the present study is derived places a focus on the literary representation of linguistic features in Outer Circle varieties (cf. Kachru 1985), i.e. societies in which English is not a native language for the majority of the population but plays an important societal role, typically as a colonial legacy. At the current stage, the project draws on two sub-corpora of texts from West Africa and Southeast Asia. A third sub-corpus serves as a yardstick of comparison, and is composed of texts from the Inner Circle, i.e. societies in which English serves as a native language. For novels, only excerpts are selected rather than the entire texts so as to include more authors and thus be more representative of a region's literary landscape. In contrast, short stories are brief enough to be included in their entirety. 2010: 270-271). The figures shown in the present study are produced with R (R Core Team 2015), using the packages "ggplot2" (Wickham 2009) and "plotrix" (Lemon 2006). The ultimate data collection goal of the project is to diversify the number of countries per region, obtain comparable word counts for each decade for a study of diachronic developments, and reach a size of 100,00 words for each sub-corpus.
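To make the annotation scheme described above more concrete, the following minimal Python sketch shows how such markup might be read and tallied per character. It is an illustration only, not the project's actual tooling: the element names (text, narrator, character, feature), the attribute names (name, category, observed, standard, language, meaning) and the sample sentence are assumptions invented for this example. Only the four category labels and the respelling of <the> as <de> are taken from the description above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample: element and attribute names are hypothetical, as is the
# sentence itself. Only the category labels come from the scheme described.
SAMPLE = """<text>
  <narrator>He pointed at the door.</narrator>
  <character name="A">Close <feature category="phonology" observed="de" standard="the">de</feature> door, <feature category="code" observed="Ayah" language="Malay" meaning="father">Ayah</feature>.</character>
</text>"""


def features_by_character(xml_string):
    """Count annotated features per character, broken down by category."""
    root = ET.fromstring(xml_string)
    profiles = {}
    for character in root.iter("character"):
        name = character.get("name", "unnamed")
        counter = profiles.setdefault(name, Counter())
        for feature in character.iter("feature"):
            counter[feature.get("category")] += 1
    return profiles


def word_count(element):
    """Count whitespace-separated words in an element, including nested text."""
    return len("".join(element.itertext()).split())


if __name__ == "__main__":
    for name, counter in features_by_character(SAMPLE).items():
        print(name, dict(counter))
```

Keeping speaker passages and features as nested elements means that a feature is attributed to a speaker simply by its position in the tree, which is one plausible way to obtain the per-character and per-narrator profiles mentioned above.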

Feature profiles and accuracy of representation
The data analysis in the present article is divided into two main topics. The first topic investigates whether meaningful linguistic feature profiles can be identified for regional variations of English. This examination is first applied from a broad cross-regional perspective, then on a narrower basis of profiles for individual fictional characters. The second topic is concerned with the accuracy of represented features, for which linguistic features observed in literary texts are compared to those listed in previous descriptive accounts of the real varieties.

Feature profiles
The annotation of non-standardised linguistic features in the literary texts makes it possible to objectively and quantitatively assess the types of linguistic features used, as well as their frequencies. The present section first looks at how the various sub-corpora, corresponding to various world regions, fare in such a comparison, after which the focus shifts to characters and how non-standardised language is used to shape their portrayal.

Regional profiles
The regional feature profiles, shown in Figure 1, suggest stark contrasts across regions: Scottish literary texts represent the local non-standardised variety of English primarily by means of phonological features, Southeast Asian texts use code-mixing as their most important feature, while West African texts emphasise grammatical features. While the three regions exhibit striking differences, a common underlying pattern can be observed: rather than using an equal or at least a well-balanced combination of different categories, the texts from each region tend to select one prominent category containing the majority of non-standardised features. This common pattern offers a first hint towards a symbolic rather than a mimetic approach to the representation of non-standardised varieties in literary texts. A working hypothesis for the choice of the prominent feature category in each region lies in the perceived distance from the relevant standardised variety. In other words, authors may choose those features which they deem to most saliently set their local varieties apart from standardised varieties. As such, accent is primarily represented in Scottish texts in order to most effectively distinguish the variety, while for Southeast Asian and West African texts, code-mixing and grammatical features respectively fulfil that role.
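As a rough illustration of what a proportional feature profile amounts to, the sketch below converts raw category counts into shares and picks out the most prominent category. The function names and the counts are invented for this example and do not reproduce the values behind Figure 1.

```python
def proportional_profile(counts):
    """Convert raw feature counts per category into shares of the total."""
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in counts}
    return {category: n / total for category, n in counts.items()}


def dominant_category(counts):
    """Return the category with the highest raw count."""
    return max(counts, key=counts.get)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show one category dominating.
    example = {"phonology": 6, "grammar": 2, "lexis": 1, "code": 1}
    print(proportional_profile(example))
    print(dominant_category(example))
```

Because the shares always sum to one, a profile of this kind shows which category an author favours but says nothing about how often non-standardised features occur overall.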
The proportional feature profiles reveal different foci for each region, but they do not take feature density into consideration. Feature density measures the number of non-standardised features in a given sample size, expressed as normalised frequencies. As such, it expresses whether such features are widespread or rare. Figure 2 shows feature densities based on a sample size of 10,000 words. The differences are obvious, as the feature density found in Scottish texts dwarfs the values observed for the other regions. Scottish texts appear to be almost eight times as dense as West African texts, which in turn are denser than Southeast Asian texts. The underrepresentation of phonological features in Southeast Asian and West African texts deserves an explanation. Regarding Nigerian texts, the relative scarcity of phonological features to represent Nigerian Pidgin English could be accounted for by the fact that it is probably in the phonological domain that Nigerian Pidgin English is least homogeneous and that sub-varieties, influenced by local languages, can be recognised (Elugbe 2008: 56): the northern variety is strongly influenced by Hausa, the Rivers variety is noticeably influenced by the Ijoid and other languages of the Rivers States, and the South Eastern variety has a heavy Igbo colouration. The three major Nigerian languages have very different phonological systems and these differences are apparent both in sub-varieties of Nigerian Pidgin and in varieties of Nigerian English (Hausa English, Igbo English and Yoruba English). For the written words of writers to be "heard" by a large audience, they need to represent a prototypical variety of Pidgin valid for all sub-varieties. In addition, representing phonetic and phonological features such as consonantal and vocalic variants, nasalisation or pitch variation would imply the creation of a "visible" dialect (as opposed to "eye dialect"), or the transcription of an oral variety into the written code, that would not only subsume differences between sub-varieties but would also be decipherable by a large readership.
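The notion of feature density used in this section, a feature count normalised to a fixed sample size, can be written out as a small Python sketch. The function name and the example figures are assumptions made for illustration and are not taken from the corpus.

```python
def feature_density(n_features, n_words, per=10_000):
    """Normalised frequency: number of features per `per` words of text."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return n_features * per / n_words


if __name__ == "__main__":
    # Hypothetical sample: 30 features in a 2,000-word excerpt.
    print(feature_density(30, 2_000))
    # A very short passage inflates the normalised value considerably:
    # 2 features in 4 words, scaled to the same 10,000-word base.
    print(feature_density(2, 4))
```

Normalisation makes samples of different lengths comparable, but as the second call suggests, values computed from very small samples should be read with caution.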

Character profiles
To reiterate the framework discussed in the introduction, we may describe character speech in fictional dialogue as invented but also as observed at least to some extent by adopting Schneider's classification of text types (2002: 72).
Invented utterances ascribed to characters of fiction rely on real-life models (Fowler 1989: 114) that the writer recollects and adjusts. The transposition of real-life features of orality in fiction, the choice of the features that are represented and the frequency with which they are represented reveal the author's commitment as to what he/she wishes to signal as prototypical. Neither the author's intentions nor the constraints imposed by a readership that expands beyond the cultural context of the fictional world should be underestimated. To reconstruct orality in fiction, authors need to alter reality. Far from being a mimetic reproduction of actual speech, orality in fiction acquires a symbolic dimension that occasionally turns sociolinguistic reality upside down to the point of overturning the language hierarchy of a diglossic context.
When "English is not English", when the English language is used to represent a local language in literary discourse, as is often the case in the West African corpus, Standard English acquires different values depending on the situation, interpersonal relationships, and the status of the language it represents, among other parameters. In the West African corpus, Standard English is used both in urban and rural settings to represent Igbo in exchanges between Igbo participants. It appears to be a convention for authors who write through the medium of English to reach a non-Igbo readership. This is valid for the corpus under study as well as for different authors, for example Chinua Achebe, John Munonye, Elechi Amadi, T.M. Aluko, Flora Nwapa, Buchi Emecheta among others. Igbo literature in English has developed since the publication of Chinua Achebe's Things Fall Apart in 1958 among writers for whom English is a language of literary expression. However, as a close reading of the texts indicates, the use of a language that does not deviate from standardised norms is meant to empower low social status fictional characters. Such is the case in "Lokotown" (1966) by Cyprian Ekwensi. In this urban story, Nwuke, a main-line engine driver, works for the Nigerian Railway Corporation whereas Konni, divorced and "easy going", has many boyfriends that she uses for money. Nwuke's son dies as he falls off his mother's back in a fight that Konni started with her, during which she tore off the piece of cloth that held the baby on the mother's back. In the exchange that follows Nwuke's son's death, Nwuke and Konni's conversation is represented in English. But here "English is clearly not English". Nwuke makes use of high prestige markers (see the presence of whence and shall in particular), which are not part of his repertoire, and serve to convey a menacing tone.
(1) "Listen, and let me tell you. You have come to Lokotown to seek your fortune, and to empty men's pockets. You are going back to the place whence you came. We do not want your type here. Lokotown is a clean place for decent families. If I find you here when I return from line, I shall kill you with my own hands." He [Nwuke] squeezed harder. "You hear me?" (Ekwensi 1966: 32)
Standard English is also used to make low social status Igbo speakers seem worthy of consideration or respect. Serious conversations, including those between relatives and acquaintances, are transcribed in Standard English, whereas Pidgin English is used to represent the social, cultural and economic decadence in an urban context, be it in interethnic exchanges or not. The Igbo language spoken by Igbo protagonists in intralingual situations can either be represented by Standard English or by Pidgin English as in (2) depending on the function the written representation fulfils in terms of characterisation of the protagonists and typification of interpersonal relationships.
(2) "Shut up! Sir, sir, sir? Am I the D.L.S.? Service! Bring water for this small boy. Or you nor go drink water?" He [Nwuke] smiled. "Water be small boy drink." (Ekwensi 1966: 29)
Standard English is once again used conventionally to represent Igbo in Buchi Emecheta's The Joys of Motherhood (1979). Nnu Ego's husband, Nnaife, a laundryman working for a white couple, Mr and Mrs Meers, loses his job when WWII breaks out. When Nnaife and Nnu Ego learn that Nnaife's master, Mr Meers, is going back to England, they do not know how they are going to survive. Nnu Ego and Nnaife discuss and quarrel in dialogues in Standard English, which conventionally represents their native tongue, Igbo. The reader learns that servants and employees, among whom are Nnaife and Nnu Ego, communicate in their native tongue, Igbo, among themselves whereas they use Pidgin English to communicate with their employers. This is made clear in a passage that depicts the sociolinguistic reality: "he [Nnaife, a servant] either did not like what the Madam was telling him or he did not understand, though there was no reason why he should not understand, since the Madam was speaking pidgin English" (Emecheta 1979: 83). Standard English does not encode power relations in passages in direct discourse.
(3) "But, Nnaife, that paper alone won't employ you, will it?" Nnu Ego asked. "You must have a master first. All I see all over the place are soldiers of different races - some white with round-shaped faces, others with eyes sunk into their heads. Are they to be the new masters? Why are they all here in Lagos?" "There is a war going on. I have told you before. The new master could be an army man. I only hope he turns up soon, as our money is running out." (Emecheta 1979: 85)
The function of Standard English in direct discourse among Igbo participants needs to be contrasted with that of Pidgin English in intralingual situations. Standard English is consistently used between Igbo characters with the same sociocultural background. Pidgin English is used as a lingua franca between Nnaife and Mrs Meers, characters who occupy opposite positions on the social ladder. Nnaife's social position as a laundryman, as well as his lack of pugnacity when faced with adversity, are echoed in the reduced Pidgin English he uses.
That is in conformity with the fictional reality that is depicted. However, the Pidgin used by Mrs Meers is not only a means to ensure communication; it also functions as a way to debunk Mrs Meers' linguistic proficiency: her Pidgin does not conform to the reduced canonical constructions of Pidgin. She uses complex structures that do not conform to Pidgin expressions, such as "week after this one".
(4) Nnaife had not recovered from the financial loss incurred during the Meers' last leave. So why were they going again? The Madam was still talking, noting the shock on his face.
"No be this week, but na week after this one," she added […] "Another leave?" he gulped, fear clinging to his throat.
"No, no leave. England de fight the Germans." She smiled again, as if that would explain everything.
Nnaife stopped his ironing, putting the still glowing coal-iron in its cradle and thinking. Well, if that was so, what had it got to do with them?
"But why Master?" he persisted. "Why 'im de go England? 'Im be no fightfight man. Why, Madam?" There were many things he wanted to ask, but his knowledge of English was limited […] (Emecheta 1979: 84)
In No Longer at Ease by Chinua Achebe (1963), Obi returns to Nigeria after four years of studies in England and lives in Lagos with his friend Joseph. He takes a job with the Scholarship Board and is almost immediately offered bribes that he does not accept. At the same time, Obi has a love affair with Clara Okeke who eventually reveals that she is an osu, an outcast, which means that Obi cannot marry her under the traditional ways of the Igbo people. Obi sinks into financial trouble, due to poor budgeting, the need to repay the loan he was given to study abroad, to pay for his siblings' education, and also to pay for the cost of Clara's abortion. After hearing of his mother's death, Obi sinks into severe depression and refuses to go home for the funeral. When he recovers, he begins to accept bribes, finally accepting that it is the way Nigerian society functions. The novel closes as Obi takes a bribe and tells himself that it is the last one he will take, but he is arrested.
Due to diamesic constraints regarding variation between the oral and the written codes when oral language is transcribed into written discourse (Gadet 2003), Standard English either represents Igbo, for example in a conversation between Obi and his friend Joseph or in the various dialogues that occur at Obi's father's, or the English language stands for itself, for example in receptions or when Obi is interviewed for a job position at the Public Service Commission.
(5) Obi hesitated. His first impulse was to say it was an idiotic question. (Achebe 1963: 32-33)
Readers are guided in their reception of language attitudes and of the use of different registers in English thanks to metalinguistic comments, as they appear in excerpt (4) "his knowledge of English was limited" or in (5) "said Joseph in Ibo". Language registers characterise people's attitudes, as is the case for example at the reception held in Obi's honour when he returns to Lagos. The simple but nevertheless correct quality of Obi's language is representative of his attitude then: he has not yet convinced himself to accept bribes at this stage.
The register he uses contrasts with the use of pompous English by officials, members of the Umuofia Progressive Union. When "English is not English", the reader is faced with an anti-mimetic approach to the representation of linguistic reality, which proves a powerful challenge to real-world situations in the service of literary ethics.
In Half of a Yellow Sun by Chimamanda Ngozi Adichie (2006), the story of the Nigerian Civil War is told alternately from the points of view of Ugwu, a village boy who becomes a houseboy at Professor Odenigbo's house; Olanna, a teacher, who is the Professor's girlfriend and later his wife; and Richard Churchill, an English writer and expatriate who is a strong supporter of Igbo Biafra. The novel starts and ends with Ugwu, a 13-year-old boy who moves in with Master Odenigbo, who entertains intellectuals, mostly university colleagues, to discuss the political situation in Nigeria. Thanks to Master Odenigbo and Olanna, Ugwu can return to school and his language skills progress, but his life is violently interrupted when he is enrolled in the Biafran Army. At the beginning of the novel, Ugwu starts working as a houseboy for Master Odenigbo after being introduced by his aunt, a cleaning lady at the University. Several animated conversations among intellectuals regarding ethnicity, language, and pan-Africanism take place at Odenigbo's place. Ugwu, at that stage, speaks neither English nor Pidgin, except for the archetypal sequence "Yes sah", which is diagnostic of his social status as a houseboy as well as of his rural origins.
(6) "Oh, yes, shelves. I suppose we could fit more shelves somewhere, perhaps in the corridor. I will speak to somebody at the Works Department." "Yes, sah." "Odenigbo. Call me Odenigbo." Ugwu stared at him doubtfully. "Sah?" "My name is not Sah. Call me Odenigbo." "Yes, sah." "Odenigbo will always be my name. Sir is arbitrary. You could be the sir tomorrow." "Yes, sah -Odenigbo." (Adichie 2006: 13)
The word sah in direct discourse performs the same function in The Joys of Motherhood by Buchi Emecheta to mark Nnaife's social status as well as that of other small boys. So does the sequence the Madam in third person narrative passages, as previously seen in excerpt (4).
In his autobiographical novel, Kossoh Town Boy (1960), Robert Wellesley Cole presents a sociological picture of Freetown in Sierra Leone in the twenties seen through the eyes of a growing boy. The boy, who is gradually moulded by his Krio parents and his teachers, develops qualities of leadership. It is a realistic depiction of a Krio middle-class family, their Christian ethic, their adoption of British codes of behaviour. The sense of Krio Africanness is felt in the strong presence of code-switching. In the passage below, Krio functions as a lingua franca in interethnic situations.
(7) "Masa, a de go kontri." (Please, master, I have come to say goodbye. I am going back to my country.) Papa wished him God-speed and inquired when he would be returning.
"Waka gud! Ustem yu de cam bak?" But Sori, who had been with us so long that he was part of the family, had suddenly grown tired of life in the city. He wanted to go back to his people, marry and farm his land. But he was not leaving his master in the lurch; instead, he said: "Dis na mi broda!" (I've brought you my brother), introducing another and somewhat younger man.
"But every time you bring a man for a job, you say he is your brother", countered my father, eyeing the newcomer carefully.
"Yes, master", Sori answered, speaking in Krio, which was the only language he knew apart from his own native Limba.
"Yes, master, we are all brothers. Same country. Same chief. But this man, he is my real brother. Same mother." The new man took over the majordomo-ship, and when years afterwards he too left he first brought a 'brother' to take over his place. (Cole 1960: 57-58)
In this passage, Sori, a butler working for the Coles, addresses his master in direct discourse in Krio, "which was the only language he knew apart from his own native Limba", as mentioned in excerpt (7). Krio functions as a language of social empowerment for a character who belongs to a low prestige ethnic group (at least in comparison with the Krios), localised in a Northern province of the country. This realistic representation of the sociolinguistic reality rapidly gives way to a fictional representation of Sori's language that is encoded in Standard English after a few exchanges, once the reader has integrated the sociocultural and sociolinguistic profiles of the participants. But there is more to it than just that. Unlike Krio members of the Cole family, Sori is made to switch to Standard English so as to encode his linguistic and cultural alterity as a Limba. As encountered previously in excerpts (1) and (3), Standard English represents an African language in the speech of a low social status character, which points to the fact that the transfer and representation of spoken discourse to fiction is problematic: devices of the oral channel function differently in written texts and vice versa (Tannen 1982).
Different degrees of saliency of certain features of Pidgin English can be identified. The choice of the features that are most represented is related to discourse-pragmatic functions of Pidgin English mainly in direct discourse.
Features that are known to be socially characteristic can be more frequently marked than actual sociolinguistic usage requires to create archetypal characters, in particular the archetypal servant, who refers to his employer as sah ("sir") or the Madam. The frequent use of such non-standardised terms of address results in a higher feature density for such characters. This tendency is observed across multiple texts. Figure 3 shows densities for all features per character in three West African texts. The format of a violin plot combines the feature density per individual character on the vertical axis, and the density of characters sharing comparable feature densities on the horizontal axis. The characters of lower social status, such as the servants Ugwu, Michael, Sam, and Nwuke's unnamed servant, as well as an unnamed waiter, display visibly higher frequencies of non-standardised features than other characters in their respective texts. Thus, their speech is quantitatively marked as sociolinguistically distinct from the bulk of the other characters. While West African texts tend to mark characters' social status or ethnicity through their linguistic profiles, Southeast Asian texts appear to apply similar techniques to signal a different set of sociolinguistic factors, mainly ethnicity or age group. In the novel The Return by Malaysian author K.S. Maniam (1981), the first person narrator called Ravi, a young member of Malaysia's ethnic Indian community, enrols in an English-medium school, where he is required to use Western grooming practices. Excerpt (8) is taken from a scene in which Ravi and his father buy toothpaste from a Chinese shopkeeper.
(8) We entered the only fashionable shop in Bedong, where prices were stiff but which stocked all kinds of goods. The Chinese in his blue drawers and white singlet shuffled up to us.
"What you want, Ayah?" he said politely to my father.
"My son, he goes to English school," my father said.
"Yes, yes. Very good. So going to be great scholar?" he said, running a finger through my hair.
"He wants medicine for the teeth," my father said. The man laughed and shook his head.
"You Indian got strong, white teeth. Ha! Ha! This joke!" (Maniam 1981: 33)
The conversation between the narrator's father and the shopkeeper is construed by the author to show a clear contrast between the characters' speech: whereas the father's speech consists of standardised language (save perhaps for the awkward term "medicine for the teeth" instead of toothpaste, which signals his lack of familiarity with the product), the Chinese shopkeeper's speech shows many instances of non-standardised grammar, for example omissions of copulas, auxiliaries, articles, and personal pronouns, as well as the use of the Malay word Ayah, literally meaning "father", but employed as a polite term of address for men older than oneself. Whether the juxtaposition of standardised and non-standardised language is meant to be mimetic or symbolic cannot be established with certainty, but regardless of this distinction, the effect of singling out the Chinese shopkeeper as different from the other characters is achieved through linguistic means. This gap is also observable in a quantitative manner, as seen in Figure 4, which indicates that the Chinese shopkeeper mainly uses grammatical features when other characters tend to employ lexical or code-switching means, and this with a higher frequency. (It should be noted that the "Pupil", the only character with a higher feature frequency, only utters four words in the selected passage, a minuscule sample size which greatly amplifies the character's two observed features by a factor of 25 in this normalised comparison.) In contrast, the narrator's father is entirely absent from Figure 4, as he produces no non-standardised features in the selected excerpt. In the novel The Adventures of Holden Heng by Singaporean author Robert Yeo (1986), the vast majority of characters are ethnically Chinese Singaporeans. Still, a strategy very similar to the one in excerpt (8) is used to single out a character, albeit for a different trait. In excerpt (9), the main character Holden Heng asks his father about the origin of his name.
(9) When he was old enough to realise the significance or insignificance of names, he had asked his father. The old man volunteered eagerly. He was twelve then.
"Well," his father said, "your name come from William Holden."
"William Holden is the actor, is it?" he asked.
"Yah, it is. From the picture Picnic -"
"Oh, I know, I know. He steals Kim Novak from her boyfriend, is it? My friends tell me. But why you call me Holden, not William? William is better."
"No use, no use. William so common, Holden so lomantic."
Holden's fluent recall was forced to pause at his father's mispronouncement. The pause threatened to become an interval. But his anger at his father's explanation was such that he had to resume his reminiscing. Resuming was his feeble way of getting at his father.
"Holden so lomantic," his father had repeated. "Holden Heng. Afterward, you can sign your name H. H. also very stylish. If your name Holden, sure good with woman. I got friend his name William and he not so good with woman. His wife leave him for another man, because he got wrong name. That's why I call you Holden, Holden." (Yeo 1986: 7-8)
Both characters make use of grammatical features typical of Singaporean English, for example the invariant question tag "is it", or the omission of auxiliaries, copulas, and the third person present tense -s morpheme. What clearly sets the two characters apart is the father's use of the /l/ for /r/ substitution in the word romantic, yielding "lomantic". Not only is this feature highly stereotypical of ethnic Chinese Englishes, but it is also highly salient given the rarity of phonological features in Southeast Asian texts (cf. Figures 1 and 2). Furthermore, the third-person extradiegetic narrator, i.e. a narrator external to the fictional world (cf. Prince 2003: 29), labels it a "mispronouncement", which also reflects Holden's impression of this feature, as he is the focal character in this passage. The father's linguistic profile is clearly distinct from that of all other characters, who are either pupils or students. Figure 5 indicates that in addition to being the sole user of phonological features as just mentioned, the father also stands out in terms of a clearly higher feature density as well as a focus on grammatical features, whereas most other characters clearly favour code-switching features.
Fig. 5: Normalised feature profiles by character in the excerpt from Yeo (1986)
To summarise the present section, the analysis of linguistic profiles has revealed that texts from various regions tend to place a focus on specific feature categories: phonology for Scottish texts, grammar for West African texts, and code-mixing/code-switching for Southeast Asian texts. These linguistic profiles consist of two factors: feature density and dominant feature type. Presumably, the prominent feature category is the one that most iconically sets the variety apart from its relevant standard, without necessarily being the category that actually diverges most. Further, the linguistic profiles of individual characters can be used to convey social indicators such as social status, ethnicity, and age. Additional characteristics gained from a qualitative analysis relate to conversational tone and interpersonal relations.
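The normalisation behind the character profiles discussed in this section can be illustrated with a minimal sketch. It assumes that feature counts are scaled to a basis of 100 words, which is inferred from the factor of 25 mentioned for a four-word sample rather than stated explicitly. The function name, the category labels, and the example character are hypothetical.

```python
from collections import Counter


def normalised_profile(feature_labels, word_count, per=100):
    """Return the number of features per `per` words for each category.

    `feature_labels` holds one category label per annotated feature.
    The basis of 100 words is an assumption made for this illustration.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    counts = Counter(feature_labels)
    return {category: n * per / word_count for category, n in counts.items()}


# Hypothetical character: two annotated features in a four-word utterance.
# The category labels are invented for the example.
pupil = normalised_profile(["grammar", "lexis"], word_count=4)
density = sum(pupil.values())  # overall feature density per 100 words
print(pupil, density)
```

With only four words, each observed feature is multiplied by 100 / 4 = 25, which is why very short utterances can yield inflated densities in a normalised comparison.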

Accuracy of representation
The frequency and types of linguistic features encountered already offer suggestions as to whether a representation is mimetic or symbolic. For example, the fact that phonological features are barely present in Southeast Asian texts clearly points towards a symbolic representation, as the notion that the varieties in the region have no discernible accent features should seem absurd. Accent features in Southeast Asian Englishes are well documented, e.g. Baskaran (2008) for Malaysian English, Tayao (2008) for Philippine English, or Wee (2008) for Singapore English. Furthermore, accent features have been found to be far more frequent than grammatical features for Singapore and Malaysian English (Percillier 2016: 120-122), even when taking register variation into consideration (Percillier 2016: 158-162).
Even so, a deeper understanding of the nature of representation can be gained by verifying whether the features that do occur in texts correspond to those observed in the actual varieties. To this end, grammatical features are annotated with an 'ewave' attribute which corresponds to their item number in the list of features used in eWAVE (Kortmann & Lunkenheimer 2013). Based on "judgements by top experts on each of the 76 varieties, Pidgins and Creoles on the frequency with which each of the 235 features can be encountered in the relevant variety, Pidgin, or Creole" (Kortmann & Lunkenheimer 2013), the eWAVE database applies a five-tiered rating system to describe the status of a feature in a given variety: A - "feature is pervasive or obligatory"; B - "feature is neither pervasive nor rare"; C - "feature exists, but is extremely rare"; D - "attested absence of feature"; X - "feature is not applicable (given the structural make-up of the variety / Pidgin / Creole)". As such, a mimetic representation in literary texts contained in the corpus is expected to follow a hierarchy by which A-rated features are more frequent than B-rated features, which in turn are more frequent than C-rated features. Features rated D or X should not occur at all. In contrast, a more symbolic representation may muddle this hierarchy, for example by ignoring A-rated features, or using them less frequently than B- or C-rated features. Also, D- and X-rated features may occur as "invented" features, as they are normally not part of the actual variety. The comparison is performed with the help of plots, shown in Figure 6, that mark the occurrence of a feature in a text by a black rectangle, and leave a blank whenever a feature is absent. The distinction between presence and absence of a feature is binary, meaning that a single instance of a feature in a corpus file is enough to warrant a black rectangle, whereas its total absence is required for a blank. As such, the size of the black rectangles is not related to feature frequency, but determined by the number of features listed in a given plot. It should be noted that this comparison is limited to grammatical features in this paper and to the scope of eWAVE. An extension of this comparison to include further linguistic features, e.g. phonological features, will require recourse to more detailed descriptive accounts focused on specific varieties.
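The binary presence/absence comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the plots: the text names, feature numbers, and ratings are invented toy data, and the mean number of texts per feature is used here as a simple stand-in for "how well represented" a rating class is.

```python
from collections import defaultdict


def presence(annotations):
    """Map each feature id to the set of texts in which it occurs at least once."""
    seen = defaultdict(set)
    for text, feature_ids in annotations.items():
        for fid in feature_ids:
            seen[fid].add(text)  # repeated instances in one text count once
    return dict(seen)


def mean_texts_per_rating(annotations, ratings):
    """Average number of texts per feature, grouped by rating (A, B, C, D, X).

    Features that are rated but never observed count as occurring in zero texts.
    """
    seen = presence(annotations)
    per_rating = defaultdict(list)
    for fid, rating in ratings.items():
        per_rating[rating].append(len(seen.get(fid, ())))
    return {r: sum(v) / len(v) for r, v in per_rating.items()}


def follows_hierarchy(means):
    """True if A >= B >= C and no D- or X-rated feature occurs in any text."""
    a, b, c = (means.get(r, 0.0) for r in "ABC")
    invented = means.get("D", 0.0) + means.get("X", 0.0)
    return a >= b >= c and invented == 0


# Toy data (hypothetical): feature ids observed per text, and their ratings.
annotations = {"text_1": [1, 1, 2, 3], "text_2": [1, 4], "text_3": [1, 2]}
ratings = {1: "A", 2: "B", 3: "C", 4: "D", 5: "A"}

means = mean_texts_per_rating(annotations, ratings)
print(means, follows_hierarchy(means))
```

In this toy example one A-rated feature is never used and a D-rated feature appears in one text, so the expected hierarchy is not met, which corresponds to the more symbolic pattern described in the text.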
Rather than reciting all possible comparisons of features in all varieties under investigation, two varieties that present valuable insights are discussed, namely Nigerian Pidgin and Scottish English. Nigerian Pidgin is a variety of Afro-Caribbean English Lexifier Creole which is spoken not only in Nigeria but also in parts of other West African countries, such as Equatorial Guinea, Cameroon and Ghana (Faraclas 2008: 340). Nigerian Pidgin is the first spoken language in Nigeria and the language with the largest number of speakers in the country (Faraclas 2008: 340). Nigerian Pidgin includes basilectal (pidginised or repidginised) varieties, mesolectal (creolised) varieties, and acrolectal varieties, which are decreolising under the influence of English (Faraclas 1996: 3). From a sociolinguistic point of view, these three sociolects form a continuum. This continuum is what is found in actual speech, and it is taken up in literary representation as an indication of orality (Moreno & Nunes 2009). No speaker of any particular variety, real or fictional, will use every possible feature of that variety in his or her speech, nor produce speech containing the same features as every other speaker of that variety (Minnick 2004: 21-41). In literary texts, as in real-life situations, the level of proficiency in English of fictional characters varies in accordance with their level of education. Pidgin English tends to be identified as an independent code by scholars, but it nevertheless belongs to a sociolinguistic continuum that ranges from Pidgin speakers who have no formal education to Standard English speakers who have completed secondary or university education (Alo & Mesthrie 2008: 323). Popular Nigerian English is akin to Pidgin English in that it deviates from Standard English and includes variants and innovations which are characteristic of lower sociolects (Alo & Mesthrie 2008: 332). In literary writing, authors explore possibilities combining criteria of social representation and international intelligibility.
We have identified different degrees of saliency of certain features of Pidgin English. The choice of the features that are most represented is related to discourse-pragmatic functions of Pidgin English, mainly in direct discourse.
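Features that are known to be socially characteristic can be marked more frequently than actual sociolinguistic usage requires in order to create archetypal characters, in particular the archetypal servant as shown in excerpts (4) and (6). The written representation of Pidgin English, that is, the choice of certain markers as opposed to others as well as the deletion of some markers, appears to be constrained by the necessity to be understood by a non-local readership, even though authors may have recourse to metalinguistic devices such as translation or explanatory comments. As can be seen in Figure 6, a large number of pervasive or obligatory features, A-features in Kortmann & Lunkenheimer (2013), are present in the most mimetic texts, for example in "Lokotown", which contains all but one of the A-features shown. It should be noted that Figure 6 displays only features that are observed in at least one text so as to remain readable. B-features, neither pervasive nor extremely rare, are less represented. Some D-features, which are attested as absent in the actual variety, are present in the texts. A number of X-features, which are not applicable according to Kortmann & Lunkenheimer's classification, are also present, which suggests that authors may draw from different varieties, Nigerian English and Pidgin English, in a continuum that goes from Standard International English to basilectal Pidgin (Kortmann & Lunkenheimer 2013). Non-standard varieties in literature can be taken as a selection of linguistic features that range from obligatory to absent in the real-life variety, which are intended to suggest the variety as a whole (Blake 1981: 12), that is, as a mere symbol that would suffice to represent the real variety (Moreno & Nunes 2009). Some core features of the variety are left out, whereas others may be purposefully invented.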
The presence or absence of Pidgin English linguistic features very much depends on the characterisation of the participants involved in the situation and on their pragmatic function. A parallel can be drawn with the use of loan words (code-mixing), which tend to be restricted to a limited range of topics.
Similarly, grammatical markers may be restricted to prototypical markers or markers that have a central pragmatic function (Schneider 2002: 88). We feel the need to adopt a prototypicality theory as opposed to an all-feature-based theory for the purpose of analysing and interpreting Pidgin English in literary productions (Freeman 2010, De Geest & Van Gorp 1999, Steen 1999).
The concept of norm we adopt is centred on what is permitted to ensure that communication is established. Production as well as reception constraints need to be considered in that they influence authors' choices: "the so-called 'prototype' need not exist in reality, since it is generally assumed to be a kind of hypothetical cognitive construction, a theoretical 'fiction'" (De Geest & Van Gorp 1999: 41). In that respect, the prototypes we are faced with in literary texts need to be understood as meaning construal attributes that are subjected to interpretation in socioculturally sensitive contexts. The reader's interpretation of sociolinguistic markedness is complex: it operates at a level between what is perceived as mimetic representation and what is literary stylistic innovation, which remains representative of sociolinguistic and sociocultural realia, this time in a non-mimetic mode but in a committed manner. The texts that succeed in making the reader hear and feel the variety of Pidgin English that they represent are, at least in some respects, unfaithful to the linguistic reality. What is important is that the sociolinguistic dimension of the text emerges in the communicative interaction between text and reader. By 'communicative interaction', we understand 'give-and-take between text and reader'. Excerpt (10) contains character speech in Pidgin English including central features such as levelling of present / past tenses, unmarked third person singular in the present tense, done used as a preverbal aspectual marker of the perfective, say used as a complementiser, and absence of the indefinite article. However, the presence of BE V-ING in a WH-relative clause in "the man who is doing this thing" as well as that of the third person singular marker in "Konni is like that" clearly indicate that the dialogue is not written in genuine Pidgin English but rather in a literary variety of language that blurs the line between a genuine representation of speech and written language. The reader is faced with a literary code, neither oral nor fully written, in what we will refer to as diamesic variation, which nevertheless ensures that the colloquial dimension of the conversation is kept.
(10) "The neighbours tell me say one motor come pack all her things. Not quite two days, I get notice of transfer to Northern Region. What can I do? I can't leave my daily bread. But I think and think. Then I begin to suspect: the man who is doing this thing must be in our Department. He done work everything, so that when I go on transfer, he can take my wife for himself. But he don't know Konni."
"He hire house for her, give her car, do everything I cannot do. But she cannot stay with him two days before they quarrel, and she live by herself. Konni is like that." (Ekwensi 1966: 43)
Pidgin English markers function as truth value markers but also as illocutionary markers that have dual expressiveness: for an international readership or heteronormative readers, and for a more local readership or homonormative readers. We here shift from the study of marked forms in and for the text to the study of marked forms in the role they play between text and reader (Müller-Wood 2013: 3) and need to address the question of reader response.
Bottom-up analysis needs to be coupled with top-down analysis that adds ideological considerations to structuralist taxonomy (Alber & Fludernik 2010). In other words, there is space for a holistic approach that shifts from descriptive to interpretative paradigms and that puts the fine-grained linguistic description to interpretative use. Whether it is due to the flesh-and-blood author or to the implied author, the presence in the text of variation markers, in particular when they are used stereotypically, answers the reader's need to know where the author stands in the world of values. Pidgin English markers can be seen as counterhegemonic discourse markers. The author often guides his/her readership in the literal and socio-discursive interpretation he/she makes of passages in Pidgin (or in Krio in Kossoh Town Boy) and of code-switching in general. In that sense, by using Pidgin English markers, the author constructs not only the text but also his/her readership. In this respect, we open our interpretation of the text to the real world, not only in terms of more or less mimetic reproduction or representation of Pidgin English but also, or even more so, in terms of the interaction between author and reader. To a great extent, then, the Pidgin English markers aid the readers in their own (internal) construction of the variety.
The question of "granularity" or relative representation of features needs to be addressed: how closely do we need to look into similarities and differences between real-life varieties and literary varieties to account for the function of markers, features, and variants? What is really at stake is the question of the indigenisation of Standard English in literary writing (Ubanako & Anderson 2014: 88-115) in order to reflect Africanness in given sociocultural and sociolinguistic realities. The range of Englishes present in African writing shows that writers impress their African identity on the English language creatively and for pragmatic purposes. They draw from a pool of linguistic features that result from the contact of real-life varieties and literary language and constitute a literary variety that fulfils diamesic purposes.
The interplay between the representation of a variety in the text and its interpretation can equally be observed in other contexts. A look at A-rated features in Scottish texts, as given in Figure 7, reveals that only one such feature is represented, and this in only half of the texts. This may not come as a surprise, given that Scottish texts have been shown to strongly favour phonological features (cf. Figure 1) and may thereby disregard many grammatical features. However, when comparing the accuracy of A-rated features with that of B-rated features, also shown in Figure 7, it becomes apparent that more B-features are represented although they are reported as less frequent in actual Scottish English. One feature in particular, labelled "Levelling of past tense/past participle verb forms: regularization of irregular verb paradigms", occurs in five of the eight texts, mostly as the past tense form told realised as telt. In spite of being only moderately frequent in Scottish English according to eWAVE, this feature occurs in Scottish texts while most A-features do not. For example, the feature "Like as a quotative particle" is rated as pervasive or obligatory for Scottish English, but is not attested in any of the Scottish texts added to the corpus so far. As the feature also exists in many other varieties of English, such as Colloquial American English, New Zealand English, Philippine English, Indian English, Welsh English, and Irish English (Kortmann & Lunkenheimer 2013), its ability to convey a Scottish linguistic identity is diluted by the fact that it is by no means unique to Scottish English, with the result that it is not perceived as emblematic in spite of its high frequency in the actual variety. In contrast, a less frequent B-feature can be considered to be more representative of the variety by the author and thus be used to signal a Scottish linguistic identity in the text. In such cases, it can be said that iconicity trumps frequency.

Conclusion
The analysis of the representation of non-standardised linguistic features in literary texts, as undertaken in the present study, was approached through two major aspects, namely feature profiles and accuracy of representation. The examination of feature profiles on a regional basis revealed the existence of clearly distinct profiles, with each region placing a noticeable focus on a different linguistic category. Despite these differences, a common underlying pattern of favouring one major category can be observed. In addition, feature density has also been observed to vary between regions, with Scottish texts displaying a much higher density than West African and Southeast Asian texts. This higher value could be explained in multiple ways: the generally higher number of phonemes in language that need to be represented when the focus lies on phonology, a peculiarity of Scottish literature being more mimetic in this regard, or a more general split between the literatures of the Inner Circle and Outer Circle. This phenomenon needs to be addressed once texts from other varieties are added to the Inner Circle component of the corpus. As regards character profiles, the use of non-standardised linguistic features has been shown to mark characters for sociolinguistic variables such as social status, ethnic group and age group, both in terms of feature density and feature categories.
The investigation of accuracy of representation, while limited to grammatical features, suggests that the texts containing linguistic elements from certain varieties, e.g. Nigerian Pidgin English, follow the hierarchy of frequency observed in the actual variety in the sense that features rated as pervasive or obligatory (A-features in eWAVE) occur in more texts than features rated as neither pervasive nor rare (B-features). However, certain features rated as pervasive or obligatory are omitted from texts, while features rated as non-existent (D- or X-features) are found. The last observation either points to the existence of potentially invented features in the texts, possibly to amplify the effect of non-standardised language on the reader, or is to be explained by the use of multiple varieties in the same texts. For texts using Scottish English, fewer A-features than B-features have been found to occur, as the latter, in spite of their lower frequency in the actual variety, may be considered more emblematic or representative of the variety by the author, or at least assumed to be perceived as such by the intended readership.
The analyses offered in the present study admittedly only scratch the surface of the possibilities offered by a linguistic approach to literary texts. Besides expanding on the answers already provided, which can be achieved by completing the corpus, delving deeper into specific features rather than major feature categories, and extending the study of representation accuracy beyond grammatical features, further lines of inquiry can be addressed, for example the question of diachronic development to verify whether any fluctuations exist in terms of representation types. Also, the influence of textual elements such as inter-character relationships, topic, and setting needs to be examined further.

Fig. 3: Violin plot of feature densities by character in select West African texts

Fig. 6: eWAVE features rated as A, B, D, X in Nigerian Pidgin English texts

Fig. 7: eWAVE features rated as A, B in Scottish texts

Table 1: Overview of annotation scheme
At the time of writing, the West African component contains 60,704 words from Nigeria and Sierra Leone, and the Southeast Asian component 75,588 words from Singapore, Malaysia, and the Philippines. The Inner Circle sub-corpus currently features Scottish texts only, totalling 21,269 words. Given the different sample sizes, any cross-regional comparison uses normalised frequencies (cf. Gries
from us in due course. Good morning." Joseph was not very happy when Obi told him the story of the interview. His opinion was that a man in need of a job could not afford to be angry. "Nonsense!" said Obi. "That's what I call colonial mentality." "Call it what you like," said Joseph in Ibo. "You know more book than I, but I am older and wiser. And I can tell you that a man does not challenge his chi to a wrestling match."
features, A-features in Kortmann & Lunkenheimer (2013), are present in the most mimetic texts, for example in "Lokotown", which contains all but one of the A-features shown. It should be noted that Figure 6 displays only features that are observed in at least one text so as to remain readable. B-features, neither pervasive nor extremely rare, are less represented. Some D-features, which are attested as absent in the actual variety, are present in the texts. And a number of X-features, which are not applicable according to Kortmann & Lunkenheimer's classification, are present, which suggests that authors may draw from different varieties, Nigerian English and Pidgin English, in a continuum that goes from Standard International English to basilectal Pidgin (Kortmann & Lunkenheimer 2013). Non-standard varieties in literature can be taken as a selection of linguistic features that range from obligatory to absent in the real-life variety, which are intended to suggest the variety as a whole