1. Introduction
1.1. usage-based second language acquisition
Recent years have witnessed growing attention to usage-based (henceforth UB) approaches to second language (L2) acquisition, which have highlighted the experientially adaptive nature of language knowledge (cf., e.g., Beckner, Ellis, Blythe, Holland, Bybee, & Ke, 2009; Ellis & Cadierno, 2009; Ellis & Larsen-Freeman, 2009; Ellis, O’Donnell, & Römer, 2013; Eskildsen, 2012; Robinson & Ellis, 2008). Footnote 1 In general, UB theorizing draws on fundamental insights from multiple fields, including research into complex adaptive systems, dynamical systems theory, and construction grammar (e.g., de Bot, Lowie, Thorne, & Verspoor, 2013; Larsen-Freeman, 1997; MacWhinney, 1999; Solé, Corominas-Murtra, Valverde, & Steels, 2010; for a recent overview, cf. Ellis et al., 2013). In UB approaches, linguistic knowledge is seen to arise from the automatic distributional analysis of perceived language inputs in social contexts through processes of statistical learning (cf. Ellis, 2002; MacWhinney, 2008; Rebuschat & Williams, 2012a; Romberg & Saffran, 2010; Tomasello, 2003). Footnote 2 Long established as a key mechanism in L1 learning, the human capacity to induce linguistic knowledge via statistical learning has been shown to be operative in L2 acquisition as well, leading to recent attempts at a theoretical unification of language learning models (cf. MacWhinney, 2011; Onnis & Thiessen, 2013; Rebuschat & Williams, 2012b).
In constructionist variants of UB theorizing, Footnote 3 the emerging linguistic knowledge is characterized in terms of networks of form–function alignments at different grain sizes, so-called constructions of varying degrees of complexity and abstractness (cf., e.g., Ambridge & Lieven, 2011; Diessel, 2004; Goldberg, 2006; Tomasello, 2003, for L1 acquisition; Ellis, 2013; Wiechmann & Kerz, 2014a; Wulff & Gries, 2011, for L2 acquisition). Following Goldberg (2003), we will use the term constructicon to refer to the entire structured repository of constructions. Language learning, then, is the emergence of constructions from the intricate interplay between “the memories of all the utterances in a learner’s entire history of language use and the frequency-biased abstraction of regularities within them” (Ellis & Larsen-Freeman, 2009, p. 92). The induction of more abstract constructions has been shown to generally rely heavily on high-frequency exemplars, i.e., items that account for a large proportion of the usages of a construction in the input.
In the course of development, the positions, or constructional slots, filled by these ‘path-breaking’, high-frequency items are gradually generalized in processes of schematization (e.g., Conway & Christiansen, 2001; Piaget, 1952), categorization (e.g., Rakinson & Oakes, 2003), and analogy (e.g., Gentner & Markman, 1997; Rattermann & Gentner, 1998), so as to give rise to more abstract knowledge structures (for a comprehensive discussion, cf. Tomasello, 2003, and references therein). This type of gradual, item-based construction learning, which proceeds from item-specific to more schematic patterns via iterative categorization of the input – from formulas through low-scope generalizations to fully abstract constructions – has been extensively demonstrated in language learning research (for L1 contexts, see, e.g., Abbot-Smith & Tomasello, 2006; Dabrowska & Lieven, 2005; Diessel, 2004; Goldberg, 2006; Kidd, Lieven, & Tomasello, 2010; Rowland, Pine, Lieven, & Theakston, 2003; Tomasello, 2003; for L2 contexts, see, e.g., Ellis, 2003; Ellis & Ferreira-Junior, 2009; Ellis & Larsen-Freeman, 2009; Eskildsen, 2012; Mellow, 2006). This aspect of gradual construction learning has been referred to as paradigmatic growth.
A complementary dimension of gradual construction learning – syntagmatic growth – concerns processes by which constructional units of different sizes are combined to form more complex units (cf. Alishahi & Stevenson, 2008; Bannard, Lieven, & Tomasello, 2009; Beekhuizen, Bod, Fazly, Stevenson, & Verhagen, 2014; Chang, 2008), which involve sub-processes of linear expansion and linear integration, i.e., processes of adding material to a construction as well as embedding a construction into larger structural units (cf. Arnon, 2011; Brandt, Diessel, & Tomasello, 2008; Diessel, 2004). Like its paradigmatic counterpart, syntagmatic growth is based on item-based learning, i.e., it proceeds through incremental developmental steps involving chunking (cf. Ellis, 2003; MacWhinney, 2011) and the integration of lexically specific constructional patterns into larger structures (Frank, Bod, & Christiansen, 2012). Syntagmatic growth has been extensively studied in early child L1 acquisition, leading to the identification of well-documented paths of development of linear complexity – from phrasal utterances to simple sentences to complex sentences with coordinated clauses to complex sentences with subordinated clauses (cf. Clark, 2009; Diessel, 2004; Tomasello, 2003). Although child language acquisition is unquestionably an important piece of the puzzle that is language learning, a full UB account of language learning clearly requires the analysis of adult language on grounds of another key assumption of UB approaches, namely the assumption that language learning is a lifelong, situated, and locally contingent process (cf. Eskildsen, 2012, for an overview).
UB approaches do not conceive of language learning as eventually resulting in the establishment of a static system. Rather, linguistic knowledge is seen as a dynamic system, which language users build up and fine-tune while adapting to the needs that arise in the communicative situations they are engaged in (cf., e.g., Bates & MacWhinney, 1989). As long as there is exposure to input – in particular input from previously unperceived language domains – an individual’s knowledge of a language is in constant flux. Correspondingly, the notion of ultimate attainment is an empty one in UB theory, and developmental change is viewed “not so much [as] the stage-like progression of new accomplishments [but] as the waxing and waning of patterns, some stable and adaptive and others fleeting and seen only under special conditions” (Thelen & Bates, 2003, cited in Larsen-Freeman, 2007, p. 783, our emphasis). Footnote 4
Building on ideas about the dynamic nature of language and life-long, locally situated language learning, previous UB-oriented research on L2 construction learning has studied “how L2 learners implicitly ‘tally’ (N. Ellis, 2002) and tune their constructional knowledge to construction-specific preferences” (N. Ellis, 2013, p. 367). This work has primarily investigated changes in the probabilistic biases and preferences in L2 production in regard to associations between lexical items and syntactic frames (N. Ellis, 2002; Gries & Wulff, 2005; cf. also N. Ellis, 2013, for an overview). Another line of research, relevant for this paper, has explored the development of formulaic language in L2 production (cf., for example, a recent study by O’Donnell, Römer, & Ellis, 2013). However, exposure to linguistic input from new language domains will also lead to a tallying and tuning of constructions that are already part of a language user’s L2 constructicon, resulting in the adaptation of constructions so as to meet the communicative needs of those language domains. In the present study, we are interested in those aspects of L2 construction learning that result from L2 learners’ exposure to a new written language domain and are concerned with adaptations of an already established construction that result in syntagmatic growth. To this end, we selected a construction that L2 learners are introduced to at a very early stage. As we will describe in more detail below, the construction is defined on the basis of a few essential constructional properties, which are learned through explicit instruction in a classroom setting and/or provided by standard reference grammars.
The constructional properties of the target construction that characterize adaptation to register-specific functions are not captured in reference grammars and pedagogical grammars, meaning that L2 learners will have to learn register-adequate usage inductively via implicit statistical learning from written input. Footnote 5 By comparing the statistical properties of learner productions with those that define the learning target, we investigate if advanced L2 learners show sensitivity to register-specific statistical regularities governing complex constructions in written language and to what extent they can be said to have successfully induced the right generalizations from such input. In other words, we seek to investigate to what extent advanced L2 learners have successfully extracted and applied register-specific probabilistic constructional properties.
1.2. advanced language learning and the importance of written language
As we will detail below, in the present study we investigate written samples produced by German advanced L2 learners with English academic writing as the target register. An important fact we would like to highlight at this point to substantiate the general rationale underlying the study and its relevance for theories of L2 acquisition is that the register- or language-domain-specific adaptations constitute a learning target in both L1 and L2 learning. Biber and colleagues have emphasized that:
[unlike L1 speakers of English], second language (L2) English students do not necessarily begin with control of conversational discourse grammar. However, to succeed in advanced university study, they share the same final target as native speakers: control of the grammatical style required for academic research writing. Thus, regardless of their starting point, all advanced university students need to acquire the grammatical style of academic writing to be successful. (Biber, Gray, & Poonpon, 2013, p. 196)
The acquisition of the grammatical style of academic writing clearly constitutes a non-trivial learning problem that involves the register-contingent adaptation of linguistic constructions leading to syntagmatic growth. Understanding advanced language learning requires taking into account three important facts. First, at advanced levels of language proficiency, learner errors are of a probabilistic – rather than categorical – nature. Second, there are pronounced functionally motivated differences in construction usage across language domains. Third, there are similarities and differences in the growth of structural complexity across modalities. We shall unpack these points in turn. In regard to the first point, it has been demonstrated that at advanced levels of L2 proficiency, learner difficulty is typically not characterized by downright errors, i.e., ungrammatical forms, but rather by a non-conformant, contextually non-target-like use of constructions (e.g., Granger, Gilquin, & Meunier, 2013; Gries & Wulff, 2013; Wiechmann & Kerz, 2014a, 2014b; Wulff & Gries, 2011). Due to the probabilistic nature of these challenges, language learning at advanced levels requires the employment of mechanisms of statistical or distributional learning highlighted in UB models. In regard to the second point, research on register variation has shown that the constructions of spoken and written discourse differ substantially (Biber, 1988, 2006; Biber & Gray, 2011; Biber & Vásquez, 2008), which is a direct consequence of the fact that registers are characterized by different communicative needs for which different sets of constructions are of communicative utility.
That is, many constructional types and subtypes do not (or hardly ever) occur in spoken language, simply because they are not useful in most situational contexts of spoken discourse (e.g., Biber, Gray, & Poonpon, 2011). This has at least two important implications. First, language users cannot learn the full spectrum of constructions from spoken input alone. Second, investigating spoken language production is not sufficient to adequately assess the full extent of a learner’s linguistic knowledge, because the absence of a construction in a learner’s spoken output is not evidence for the absence of knowledge of this construction. Due to the functional differentiation of linguistic constructions, any comprehensive account of both L1 and L2 language learning will thus have to investigate the adaptational processes triggered through experience with written language (cf., e.g., Verspoor, Schmid, & Xu, 2012; Wiechmann & Kerz, 2014a, 2014b). Finally, in regard to the last point concerning pathways of the development of structural complexity, L2 writing research has revealed that the trajectory of written language learning resembles that of spoken language learning only partially: as in the spoken modality, the development of structural complexity in written language proceeds from the production of sentence fragments through simple sentences and clausal coordination to clausal subordination (cf. Cooper, 1976; Ishikawa, 1995; Ortega, 2003; Wolfe-Quintero, Inagaki, & Kim, 1998; see also Parkinson & Musgrave, 2014, and references therein).
However, during later phases of development, notably during university education, the locus of complexity shifts from the clausal to the phrasal level (see, e.g., Biber et al., 2011; Ferris, 1994; Halliday, 1993). Highly proficient academic writing is characterized by increased levels of complexity within noun phrases (NPs) rather than by the extent of subordination. More specifically, the path of development from clausal to nominal complexity moves from:
finite dependent clauses functioning as constituents in other clauses, through intermediate stages of nonfinite dependent clauses and phrases functioning as constituents in other clauses, and finally to the last stage requiring dense use of phrasal (nonclausal) dependent structures that function as constituents in noun phrases. (Biber et al., 2011, pp. 29f.)
Biber et al. (2011) hypothesized that this further development of complexity, which is set in motion through increased experience with formal registers of written language in adulthood, is the same in L1 and L2 learning. This hypothesis has received empirical support in a recent study by Parkinson and Musgrave (2014). Our investigation of advanced learner knowledge and syntagmatic growth is motivated by these corpus findings. We would like to emphasize that, due to the high degree of interdependencies of the distributions of numerous linguistic features, it appears necessary on methodological grounds to investigate linguistic knowledge at the level of individual constructions.
1.3. probing into register-specific constructional knowledge: the existential there-construction
One construction that is ideally suited for the local assessment of knowledge states at advanced levels of proficiency is the English existential there-construction (ETC). It is typically described as an information packaging construction that is used to express propositions concerning (non-)existence (Biber, Johansson, Leech, Conrad, & Finegan, 1999; Huddleston & Pullum, 2002) and that ensures that a focal argument does not appear in subject position (cf. Duffield & Michaelis, 2011, for a recent discussion of the construction and its relationship to other presentational constructions). There are several reasons why this construction lends itself well to such an analysis. First, due to the fundamental meaning of existence, it is introduced in its basic form at the early stages of learning. Second, its usage in specialized language domains, e.g., academic writing, involves substantial adaptation in terms of the expansion of its basic form in a prominent nominal position. Third, as a clause-level construction, it can either be employed as a stand-alone pattern or be integrated into different types of superordinate constructional patterns, permitting us to broaden our inspection of structural complexity and to include properties of the larger planning unit, i.e., the sentence. ETCs are introduced by non-referential there, typically followed by copular be and an indefinite NP expressing an addressee-new postverbal argument. In its minimal form, the ETC can be represented schematically as follows:

In the light of the theoretical considerations regarding the continuous, locally contingent elaboration of constructional knowledge, the investigation of ETCs seems particularly fruitful in the context of the gradual adaptation of constructional knowledge that results from the immersion of language users into new language domains or registers, Footnote 6 and, in particular, the extent to which learners have successfully adapted their constructicons so as to meet the need for the information compression that is characteristic of academic writing. As Hiltunen and Tyrkkö (2011) pointed out with reference to Biber et al. (1999):
Because ETCs focus discursive attention on the logical subject, they are useful for increasing clarity in text types with a high information density such as contemporary scientific prose. Indeed, according to Biber et al. (1999: 948–949), in Present-day English, postmodification of the displaced subject is particularly prevalent in scientific writing, owing to the need to pack as much information into each sentence as possible.
While quite a number of studies have been carried out on the use of ETCs in English (see, e.g., Aniya, 1992; Bergen & Plauché, 2005; Biber et al., 1999; Breivik, 1983; Firbas, 1992; Hannay, 1985; Huddleston & Pullum, 2002; Jenset, 2010; Johansson, 1997; Lambrecht, 1994; Lumsden, 1988; Martínez-Insua, 2004; McNally, 1998; Milsark, 1979), there is a lack of research regarding their acquisition, for both L1 and L2 contexts. The ontogeny of the construction in L1 acquisition from deictic there-constructions has been described by Johnson (2001). Prior research on L2 use of ETCs has focused on the basic usage properties of these constructions, such as the confusion of non-referential grammatical subjects (existential there and dummy it), and/or investigated the general proportional misuse (over- or underuse) of ETCs by L2 learners with various L1 backgrounds (Hinkel, 2003; Miyake & Tsushima, 2012; Palacios-Martínez & Martínez-Insua, 2006; Tsushima & Miyake, 2013). Most studies report a tendency towards proportional overuse of ETCs, which is often explained in terms of teaching-induced effects (Palacios-Martínez & Martínez-Insua, 2006) or with respect to the relative structural simplicity of the construction resulting from its formulaic initial elements (There + BE) and the fact that its verb expresses a stative predicate (Hinkel, 2003).
In the light of considerations of syntagmatic growth, general statements suggesting that ETCs are ‘simple’ constructions seem problematic. That is, while it is true that the minimal realizations of ETCs are simple structures by virtually any measure of complexity, typical realizations of the construction in formal language domains like academic writing tend to exhibit substantially higher degrees of complexity, with multiple postmodifying elements. Furthermore, ETCs can be used either as stand-alone constructions or as parts of a larger syntactic context, in which case one relevant planning unit, the sentence, is a complex unit. Consider the examples in (1) and (2), taken from our data.
(1) There was also a main effect of speaker repetition.
(2) Therefore, the hypothesis is that there will be a three-way interaction between program, grade, and task, with the executive control demands of the task determining the outcome.
Such examples cast into doubt the meaningfulness of general claims about the simplicity of a clause level construction as a class, and it is this basic fact – that simple constructions can evolve into highly complex constructions – that invites research into the gradual elaboration of constructional knowledge in advanced phases of language development. A more nuanced understanding of advanced L2 learning will thus benefit from a more detailed description of constructional knowledge that goes beyond the basic properties of ETCs mentioned above, and will include additional properties of the construction. We will provide a description of the targeted constructional properties and their systematization in Section 2.1.
A fundamental issue in the assessment of language proficiency and knowledge states is the choice of the benchmark against which L2 learner knowledge is to be evaluated. A substantial amount of second language acquisition research has employed native-speaker knowledge as a standard of comparison. Correspondingly, previous learner corpus studies have typically compared L2 written production to a native-speaker benchmark, and have then interpreted observed differences between learners’ and native speakers’ (conditional) usage frequencies as expressions of deficits of the underlying L2 proficiency. Recently, however, several studies have shown that native and non-native novice academic writers struggle with similar challenges on their way to developing academic discourse competence, i.e., native speakers also have to learn the language of academic writing. In the parlance of usage-based constructionist accounts, native speakers, too, have to fine-tune their register-contingent constructicons (cf. Biber et al., 2011; Bolton, Nelson, & Hung, 2002; Jenkins, 2006; Römer, 2009; Swales, 2004; Wiechmann & Kerz, 2014b, inter alia). Or, as Swales (2004: 52) put it:
[t]he difficulties typically experienced by NNS [non-native speaker] academics in writing English are (certain mechanics such as article usage aside) au fond pretty similar to those typically experienced by native speakers.
These observations align with L2 acquisition research that has challenged the empirical reality of native speaker competence as a homogeneous quantity, and that has raised questions about the utility of treating it as the target that L2 learners seek to converge on (see, e.g., Mitchell, Myles, & Marsden, 2012). More generally, there is a trend among scholars to investigate the acquisition of English as a global language and to cast off the notion of a standard native-speaker target, giving rise to a range of new terms that mirror a more open position towards language learning goals, such as novice/apprentice vs. expert/professional language users, etc. (cf. Duff, 2012; Römer, 2009). In contrast to a general class of native-speaker productions, the language of domain experts – especially in formal registers – tends to be much more homogeneous (cf. Wiechmann & Kerz, 2014a, 2014b). The existence of a statistically robust baseline makes any attempt at estimating degrees of probabilistic learner deficits much more sensible and feasible. This, and the fact that being a native speaker does not imply advanced knowledge in a specialized domain, led to the decision to use expert (professional) academic writing as the target and benchmark of comparison.
1.4. aims and scope of the study
Any account of L2 learning must be able to state the conditions under which we are inclined to say that learners behave in a target-like fashion. In UB constructionist approaches, target-like behaviour is expected to arise when learners have developed robust patterns that are adapted to the functional needs of the language domain in question. One approach to measuring this, which we pursue in the present study, is to examine the shapes of the distributions of produced outputs and compare them with target-like distributions. Based on an L2 learner corpus of advanced written productions in a narrowly defined register, and an expert-writer corpus representing the target, the study sets out to reverse-engineer aspects of the proficiency level, i.e., the states of constructional knowledge, of a group of German advanced L2 learners of English through the comparison of the statistical regularities underlying their written productions with those of domain experts. This approach rests on the assumption that learners’ linguistic productions are constrained by the probabilistic biases of the systems that embody their knowledge of a language, so that analyses of the statistical properties of a learner’s output permit inferences about aspects of the underlying knowledge system (see Wiechmann & Kerz, 2013, 2014a, 2014b, for a detailed description of the rationale of this approach). Our goal here is to describe systematically exactly what learners have and have not yet learned at the stage investigated, and what the results suggest with respect to the nature of construction learning.
In the systematization of L2 learning theories in Cummins (1983) (see also Ellis, 1999; Gregg, 1993), we are thus concerned with issues at the level of a property theory – rather than a transition theory – of L2 learning, meaning that we seek to model aspects of the nature of the to-be-acquired language system. Footnote 7
2. Method
We approached the identification of differences in the knowledge systems underlying the productions of advanced L2 learners and those of expert-level academic writers by setting up classification models that, based on distributional information about a number of features of the target construction, seek to decide whether a given utterance type is more likely to be produced by a learner or an expert. Features that are highly discriminative in the task are interpreted as marking aspects of knowledge that L2 learners have not yet mastered.
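The logic of this classification approach can be illustrated with a minimal sketch in Python. It is not the model used in the study: it assumes a plain logistic regression fitted by gradient descent, two hypothetical binary features, and invented toy counts, and the names sigmoid, fit_logistic, make_rows, and TOY_COUNTS are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, epochs=2000, lr=0.5):
    """Fit a logistic regression by batch gradient descent.

    rows   -- list of feature vectors (lists of numbers)
    labels -- list of 0/1 outcomes (here: 1 = expert, 0 = learner)
    Returns (bias, weights).
    """
    n = len(rows)
    n_features = len(rows[0])
    bias = 0.0
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad_b = 0.0
        grad_w = [0.0] * n_features
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return bias, weights


def make_rows(cell_counts):
    """Expand {(f0, f1): (n_expert, n_learner)} into rows and labels."""
    rows, labels = [], []
    for (f0, f1), (n_expert, n_learner) in cell_counts.items():
        rows += [[f0, f1]] * (n_expert + n_learner)
        labels += [1] * n_expert + [0] * n_learner
    return rows, labels


# Invented toy counts (NOT data from the study): feature 0 separates the
# groups (3:1 vs. 1:3), feature 1 is distributed identically in both.
TOY_COUNTS = {(1, 1): (3, 1), (1, 0): (3, 1), (0, 1): (1, 3), (0, 0): (1, 3)}
ROWS, LABELS = make_rows(TOY_COUNTS)
BIAS, WEIGHTS = fit_logistic(ROWS, LABELS)
```

In this toy setting the first weight converges to roughly 2.2, the log odds ratio implied by the invented counts, while the second stays near zero. On the reasoning above, only the first feature would be read as marking an aspect of knowledge that distinguishes the two groups.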
2.1. corpus data and constructional features investigated
The study employed a design based on a corpus consisting of fifty written samples produced by advanced L2 learners of English (L1 German) in their second and third year of studies (BA English Linguistics) at RWTH Aachen University (Ntotal ∼ 187.5k words), and a roughly same-sized expert-writers control corpus compiled from twenty research articles about language (Ntotal ∼ 185.5k words; see ‘Appendix’ for details of compilation; available at <http://dx.doi.org/10.1017/langcog.2015.6>). The L2 learners whose written productions are investigated in this study have had about ten years of formal exposure to English and have an attested proficiency level of B2–C1. Their explicit instruction pertaining to writing skills resided exclusively in the domain of essay writing, meaning that they did not receive explicit instruction in the domain of academic/scientific writing investigated here. The data used in the present study were collected by first extracting all instances of there and then manually identifying existential usages of there. All locative/deictic there-constructions were removed from the sample, leading to 370 ETC instances in the learner sample and 324 ETC instances in the expert sample. The cleaned data were subsequently manually annotated in terms of ten features relevant to the description of ETCs. These features are briefly introduced below (Table 1).
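The extraction step and the raw counts can be made concrete with a short sketch in Python. The study does not name its extraction tool, so the regular expression, the context window, and the function names there_candidates and per_10k are assumptions made for illustration. The normalized rates are back-of-the-envelope figures derived from the approximate corpus sizes given above, not results reported by the study.

```python
import re

# Match the word 'there' in any capitalization, but not 'therefore' etc.
THERE = re.compile(r"\bthere\b", re.IGNORECASE)


def there_candidates(text, window=40):
    """Return (offset, context) pairs for every token 'there' in text.

    The contexts are candidates only: existential and locative/deictic
    uses still have to be told apart by hand.
    """
    hits = []
    for match in THERE.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        hits.append((match.start(), text[start:end]))
    return hits


def per_10k(count, corpus_size):
    """Frequency normalized to occurrences per 10,000 words."""
    return 10000 * count / corpus_size


# Counts from the text; corpus sizes are the approximate figures given there.
learner_rate = per_10k(370, 187_500)
expert_rate = per_10k(324, 185_500)
```

With these figures, the learner sample contains roughly 19.7 ETCs per 10,000 words and the expert sample roughly 17.5.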
table 1. Constructional features investigated in this study (overview)

The two-level factor tense distinguishes past tense forms from all other tense variants. Modal marks the presence or absence of a modal auxiliary, the choice of which is subject to certain semantic constraints (Quirk, Greenbaum, Leech, & Svartvik, 1985). Quantifier marks the presence or absence of a quantifying expression in the postverbal argument. neg.polar concerns the presence of a negative polarity item, which indicates that the ETC encodes a statement about the absence of something (as in There was no effect of X on Y). The feature np length records the length of the postverbal argument in words on a logarithmic scale; length is often used as a proxy of structural complexity (Arnold, Wasow, Losongco, & Ginstrom, 2000; Bresnan, Cueni, Nikitina, & Baayen, 2007; Hawkins, 2004; Jäger & Rosenbach, 2006; Szmrecsanyi, 2004). The feature np definiteness describes the ETCs with respect to the definiteness of the postverbal argument, which relates to the information-structural constraints mentioned above. The features syntax, premod, freq.head, and extension all relate to register-specific adaptations of the construction leading to syntagmatic growth. syntax describes whether, and if so how, the ETC is integrated into a larger structure. Here we distinguished ETCs that are stand-alone constructions from ETCs embedded within subordinate clauses that function as either complements of a matrix verb or adverbials. premod indicates whether or not the postverbal argument contained adjectival premodification.
Footnote 8 Freq.head indicates if the head of the postverbal argument is among the top 100 most frequent nouns in the academic writing sections of the British National Corpus (BNC; Burnard & Aston, Reference Burnard and Aston1998) or Corpus of Contemporary American English (COCA; Davies, Reference Davies2008). We chose to discretize head frequency as its effect is likely to be non-linear and we wanted to keep our models simple and interpretable. While being arbitrary, a cut-off at the top-100 mark constitutes a conservative binning into high- and low-frequency bands. Finally, the feature extension describes the presence and type of constructions that extend the minimal form of the ETC in postnominal position. Standard reference grammars of English typically distinguish extension types based on a heterogeneous set of semantic and syntactic criteria (cf., e.g., Huddleston & Pullum’s, Reference Huddleston and Pullum2002, classification into locative, temporal, predicative, infinitival, participial, and relative clause extensions). To minimize the difficulty of data annotation, we followed the classification scheme employed by Palacios-Martínez and Martínez-Insua (Reference Palacios-Martínez and Martínez-Insua2006), which employs exclusively formal categories. Illustrations of the distinguished values of that feature (taken from our data) are provided below.
No extension, i.e., minimal ETCs (no)
(3) There are no universally valid answers.
(4) There was no interaction of word repetition and speaker repetition.
Due to their strong semantic connection to the head noun, prepositional phrases headed by of that contain analytic genitives were not counted as extensions (see the examples below for further illustration).
Prepositional Phrase (PP)
(5) Given that there is no verb movement, […].
(6) Austin states that there is a high level of energetic action, […].
Relative Clauses (RC)
(7) wh-relative: There was also a main effect of sentence type, F (2, 240) = 5.15, p < .01,
(8) that-relative: There were other secondary sources
(9) Zero relative: There are certain similarities
(10) Non-finite (present participial): There is a potentially uncountable number of factors.
(11) Non-finite (past participial): There is limited evidence.
To-infinitival clauses (inf)
(12) There must have been many benefits.
Fact-S construction / Appositive that-clause (that)
(13) […] there is some evidence.
Multiple extension types (multi)
(14) PP + PP: […] and that there is a difference PP PP […].
(15) PP + RC: […] there are studies PP [on multilinguals] RC […].
(16) PP + AC: […] there is a connection PP, AC […].
(17) Other complex: There are other ways PP INF […].
For an item to be treated as an instance of the category ‘multi’, it had to contain some sequence of the basic categories, which extensionally resulted in the combinations listed for the feature extension in Table 1. In the annotation of the data regarding this feature, we looked at linear sequencing only, and did not distinguish differences that pertain to the hierarchical order of phrasal constituents. For instance, the subcategory ‘PP + PP’ could, in principle, be treated as a subcase of the ‘single PP’ extension-type, in which a higher-level PP would dominate the two PPs in question. This treatment, however, would make invisible the differences between complex and simple PP extensions. As indicated earlier, the only type of PP that was excluded from phrasal counts, on the basis of its semantic closeness to the head nominal, was the genitive PP headed by of. The decision not to consider hierarchical structure also served the goal of minimizing annotation errors.
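The linear-sequencing rule described above can be sketched in code. This is only an illustration under stated assumptions, not the procedure used in the study, where annotation was manual: the tag names, the `OF_GEN` tag for genitive of-PPs, and the function `classify_extension` are all hypothetical.

```python
# Hypothetical tags for postnominal constituents, in linear order.
# "OF_GEN" (assumed name) marks a genitive of-PP, which is not counted.
# "AC" occurs in the source only inside multi-type extensions; returning it
# as a single-extension label is an assumption of this sketch.
LABELS = {"PP": "PP", "RC": "RC", "INF": "inf", "THAT": "that", "AC": "AC"}


def classify_extension(constituents):
    """Map a linear sequence of constituent tags to an extension category."""
    counted = [tag for tag in constituents if tag != "OF_GEN"]
    unknown = [tag for tag in counted if tag not in LABELS]
    if unknown:
        raise ValueError(f"unknown tag(s): {unknown}")
    if not counted:
        return "no"      # minimal ETC
    if len(counted) == 1:
        return LABELS[counted[0]]
    return "multi"       # any sequence of two or more basic categories


if __name__ == "__main__":
    print(classify_extension(["PP", "PP"]))  # a flat PP + PP sequence
```

Because only the linear sequence is inspected, a PP + PP sequence is labelled as a multi-type extension whether or not one PP dominates the other, which mirrors the decision to ignore hierarchical structure.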
2.2. data analysis
We analyzed the data using logistic regression modelling with stepwise model simplification (backward selection) via the Bayesian Information Criterion from a model with all main effects and all 2-way interactions (Venables & Ripley, Reference Venables and Ripley2002). Footnote 9 In a secondary step, to assess the degree of within-group variation in the learner and expert sample, we fitted linear mixed models to expert and learner data separately. In this step, we reversed the functional roles of the to-be-related features and set the models up so as to predict the value of a constructional feature FCx solely based on the random variable ‘ID’, which encoded the text from which a given instance was extracted. Footnote 10 To increase the robustness of the derived estimates, we considered only texts that contributed at least ten instances of the construction, which reduced our data to about 90% in the case of the expert data and 75% in the case of the learner data.
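Two ingredients of this procedure can be illustrated with a short sketch: the BIC comparison that drives backward selection, and the ten-instance threshold applied before fitting the mixed models. The function names and any log-likelihood values passed to them are hypothetical; the sketch does not reproduce the actual model fits.

```python
import math
from collections import Counter


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def prefer_simpler(loglik_full, k_full, loglik_reduced, k_reduced, n_obs):
    """True if the reduced model's BIC is no larger than the full model's."""
    return bic(loglik_reduced, k_reduced, n_obs) <= bic(loglik_full, k_full, n_obs)


def texts_with_min_instances(text_ids, minimum=10):
    """Return the IDs of texts that contribute at least `minimum` instances."""
    counts = Counter(text_ids)
    return {text_id for text_id, count in counts.items() if count >= minimum}


# Total number of annotated ETC instances reported above (learners + experts).
N_OBS = 370 + 324
# BIC cost of keeping one additional coefficient in the model.
penalty_per_param = math.log(N_OBS)
```

With 694 observations, each additional coefficient costs about 6.5 BIC units, so a term survives simplification only if dropping it would raise the deviance (−2 log-likelihood) by more than roughly that amount per parameter.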
3. Results
We found a slight overuse of ETCs in learner language (learners: 370 instances / 187.5k words versus experts: 324 / 185.5k words). Table 2 presents an overview of distributional statistics of the features investigated in this study.
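The size of this overuse can be made explicit by normalizing the raw counts to a common base. The following sketch uses the approximate corpus sizes given above, so the resulting rates are approximate as well; the function name is hypothetical.

```python
def per_thousand_words(instances, corpus_words):
    """Relative frequency of a construction per 1,000 words of running text."""
    return 1000.0 * instances / corpus_words


# Counts and (approximate) corpus sizes as reported in the text.
learner_rate = per_thousand_words(370, 187_500)
expert_rate = per_thousand_words(324, 185_500)

# Ratio above 1 indicates proportionally more ETCs in learner writing.
rate_ratio = learner_rate / expert_rate
```

On these figures, learners produce roughly 1.97 ETCs per 1,000 words against roughly 1.75 for experts, i.e., about 13% more.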
table 2. Distributional statistics of investigated features (numbers in parentheses indicate observed expert and learner frequencies respectively, i.e., Frequency expert : Frequency learner )

The logistic regression model achieved a classification accuracy of 77% (corresponding to an error rate of 0.23), which constitutes a substantial improvement over the baseline of 0.53 resulting from the slightly larger number of instances of learner productions in the sample. While this is, of course, a very coarse-grained evaluation of model performance, it indicates that learner usage of the construction is clearly different from expert usage. A more detailed comparison of the performance of the regression model relative to alternative models explored here in terms of receiver operating characteristic curves can be found in the ‘Appendix’. After stepwise model simplification, the minimal adequate model contained the features syntax, extension, premod, freq.head, and tense as well as two 2-way interactions tense:extension and freq.head:extension (LRchi2 = 280.77, d.f. = 20, Pr(> chi2) < 0.0001, R2 = 0.44 (0.40 after resampling validation (1,000 repetitions)), Dxy = 0.67 (0.65 after resampling validation)). The regression model thus asserts that the features np definiteness, modal, quantifier, and neg.polar are relatively unimportant in the discrimination of expert and learner language, suggesting that the aspects of constructional knowledge expressed through these features have been successfully learned. Footnote 11 Figure 1 visualizes the effects of the five variables that figure in the final regression model. Table A1, with the statistics of the regression coefficients, can be found in the ‘Appendix’.
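The baseline and the rank-discrimination index can be checked with a small calculation. The sketch below assumes that the baseline is the accuracy of always predicting the larger class, and uses the standard relation between Somers' Dxy and the concordance index C (the area under the ROC curve) for binary outcomes, Dxy = 2(C − 0.5); the function names are hypothetical.

```python
def majority_baseline(n_learner, n_expert):
    """Accuracy obtained by always predicting the more frequent class."""
    return max(n_learner, n_expert) / (n_learner + n_expert)


def c_index_from_dxy(dxy):
    """Convert Somers' Dxy to the concordance index C via Dxy = 2 * (C - 0.5)."""
    return dxy / 2.0 + 0.5


# Instance counts and Dxy values as reported in the text.
baseline = majority_baseline(370, 324)
c_apparent = c_index_from_dxy(0.67)
c_validated = c_index_from_dxy(0.65)
```

Under these assumptions the baseline comes out at about 0.53, matching the value reported above, and the reported Dxy values correspond to C indices of about 0.835 and, after resampling validation, 0.825.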

Fig. 1. Effect plots of the variables in the minimal adequate logistic regression model describing the differences between expert and learner productions. Estimates near the value 0.5 indicate similar usage dispositions. Estimates above the 0.5 mark indicate tendencies towards proportional overuse in learner productions. Correspondingly, estimates below the 0.5 mark indicate tendencies towards proportional underuse in learner productions.
The only general variable to distinguish learner from expert productions was tense. The results suggest that learners underuse past-tense forms, especially with more complex extension types (multi). Footnote 12 All other variables that distinguish expert and learner language (extension, premod, syntax, freq.head) concern aspects of syntagmatic growth. In this regard, it is interesting to note that it is not the sheer number of words used to express the postverbal argument that distinguishes learner and expert language, as length, which is often used as a proxy of structural complexity (Szmrecsanyi, Reference Szmrecsanyi, Purnelle, Fairon and Dister2004), plays only a minor role in the discrimination of the two compared groups. For illustration, Figure 2 presents a comparison of univariate density estimates of (logarithmic) length.

Fig. 2. Density of (logarithmic) length, i.e. length of postverbal argument in words, across groups. The light blue area denotes a reference band indicating where a density estimate is likely to lie, when there is no difference between the groups (under an assumed normal distribution).
Figure 2 shows that for experts a larger portion of the probability mass is located to the right of the reference band, indicating that experts produce slightly longer utterances on average. Furthermore, there is some evidence for a slightly stronger reliance on premodifying material in expert production. The regression model identifies the effect of premod as statistically significant, but rather weak in terms of its effect size. Further hypotheses regarding differences relating to premodification, i.e., interactions with other variables, e.g., the interaction with the presence or absence of postmodifying material, were not supported by the data. Turning to postnominal modification, we found that extension is by far the most discriminative factor in the model: the learners clearly overused minimal ETCs as well as relative clause extensions, and underused more complex extensions, i.e., extension types with multiple phrasal components. A more nuanced picture about preferred and dispreferred extension types and their interactions with freq.head is shown in the extended mosaic plot in Figure 3.

Fig. 3. Extended mosaic plot visualizing a log-linear model relating group, freq.head, and a more detailed description of extension, which distinguishes ten factor levels. The area of each tile is proportional to the corresponding cell entries’ size and the significance of the corresponding residuals is indicated through coloring (Meyer, Zeileis, & Hornik, Reference Meyer, Zeileis and Hornik2006).
An interesting finding concerns the interaction of extension and freq.head. Figure 3 reveals that the most complex extensions in expert language tend to appear with frequent head nominals. With respect to syntax, i.e., the integration of the ETC into a larger syntactic context, we found that learners’ productions are characterized by a stronger preference towards a nominal/complement clause integration of ETCs. There is also some evidence for underuse of ETCs within adverbial clauses. However, the predominant use of stand-alone ETCs (∼60% of the ETCs in both datasets are stand-alone constructions) results in relatively little statistical power to detect significant group differences with respect to the integration of ETCs.
3.1. individual differences
Research into L2 learning has often highlighted the pronounced individual variation relative to the variation observed in L1 learning (cf., e.g., Dörnyei, Reference Dörnyei2005). In order to estimate the amount of variability across different learners and experts, we investigated the adjustments to the intercept in linear mixed models, in which a given ETC variable was modelled only as a function of the random effect ID, which identifies the text from which an instance was extracted. We focus here on the variation regarding the strongest discriminating variable, extension, but found similar results – viz. little variation between individuals – for all investigated variables. Figure 4 presents graphically the 95% prediction intervals for each of the source texts (ID) estimated from a linear mixed model fitted using restricted maximum likelihood (REML) estimation.

Fig. 4. Adjustments of intercept estimated from a linear mixed model fitted by REML with extension modelled only as a function of the random effect ID, which describes the source text of a given example.
Learner productions exhibited even less pronounced individual differences than expert productions (Learner model: Fixed Effect (Intercept) = 4.6; Random Effect ID (Intercept) SD = 0.15, Residual 1.36; Expert model: Fixed Effects (Intercept) = 3.62; Random Effect ID (Intercept) SD = 0.33, Residual 1.23). As the learners’ 95% prediction intervals all overlap zero comfortably, the variation across individuals can be considered negligible.
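One way to put the reported variance components on a common scale is the intraclass correlation, i.e., the share of total variance attributable to differences between source texts. This summary is not reported in the study; the sketch below merely derives it from the standard deviations given above, and the function name is hypothetical.

```python
def intraclass_correlation(sd_between, sd_residual):
    """Share of total variance due to between-text (random intercept) variance."""
    var_between = sd_between ** 2
    var_residual = sd_residual ** 2
    return var_between / (var_between + var_residual)


# Standard deviations as reported for the two intercept-only mixed models.
icc_learner = intraclass_correlation(0.15, 1.36)
icc_expert = intraclass_correlation(0.33, 1.23)
```

On these figures, source text accounts for about 1% of the variance in the learner data and about 7% in the expert data, which is consistent with the small between-text differences described above.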
4. Discussion
One of the goals of this study was to draw attention to the importance of turning to written production in specialized registers to investigate advanced stages of L2 constructional learning. Part of reaching advanced levels of L2 proficiency involves mastering the specific distributional properties of a given language domain (or register). The distributional properties of the register of English academic writing, i.e., the learning target investigated here, are to be derived from the full extension of texts in that register, or – more realistically – from the subset of texts that constitute the learners’ experience with that register. As pointed out by Biber et al. (Reference Biber, Gray and Poonpon2013), native speakers and L2 learners have a common target at advanced levels of proficiency, namely the control of register-adequate patterns of language use. The design employed here marks the first step in an analytical research pipeline that is geared to infer limits of current knowledge states from produced outputs and identify the arenas in which these non-target-like aspects of constructional knowledge are situated. Using techniques from supervised machine learning, we fitted models to rich descriptions of natural written production data that instantiate a particular test-construction, English ETCs. The models were set up to identify constructional properties that strongly discriminate between advanced L2 learners and domain experts that define the target language. For purposes of exposition, we framed the presentation of our results in the language of regression modelling – as we assume greatest familiarity with this approach – but sanity-checked all reported effect sizes against the results from functionally equivalent expressions from other machine-learning algorithms (cf. Figures A1 and A2 in the ‘Appendix’).
We found that learners exhibit clear deficits in the adaptational fine-tuning of their constructional knowledge, which relies on mechanisms of implicit learning and pattern finding. We observed further that learner productions were target-like with respect to nearly all constructional features that describe fundamental properties of ETCs that are largely register-independent: the relative heaviness and the indefiniteness of the postverbal argument follow from general information-structural regularities in English, which can be picked up from reference grammars of the target language. The only basic grammatical feature to discriminate between learners and experts was tense. Compared to the expert writer benchmark, the L2 productions exhibit a greater proportion of ETCs in the present tense. However, the question of whether this is due to L2 learners’ unsuccessful induction of this constructional feature cannot be answered conclusively on the basis of our data due to substantial observed variation regarding tense usage in expert production. This variation of tense usage in expert language was in fact observed in two dimensions: we found rather pronounced variation across texts in regard to the dominant tense in the academic articles. There was also substantial systematic within-text variation in expert language: ETCs in the early sections of an academic paper are more likely to be in the present tense than those in later sections, reflecting discourse-functional differences of ETCs with different tenses. Present tense ETCs are typically used to introduce new discourse referents, whereas past tense ETCs typically appear in shell noun contexts that are useful for the presentation of results. The pronounced difference in the usage of tense thus motivates additional research into the spectrum of discourse-pragmatic functions of ETCs and their course of development (cf. Hiltunen & Tyrkkö, Reference Hiltunen, Tyrkkö, Paul, Hoffmann and Leech2011, for a discussion).
Our results showed that – with the exception of tense – all features that discriminate between advanced learners and experts concern aspects of syntagmatic growth: we found differences with regard to how the investigated construction is integrated into the larger syntactic context and, more pronouncedly, how the prominent phrasal slot of the construction, the postverbal argument, was realized. The most pronounced differences between expert and learner productions concern the extension of the postverbal argument of ETCs. These differences were not so much a function of the weight (or ‘heaviness’) of the NP, which is often investigated as a proxy of structural complexity in studies of sentence processing (cf. Arnold et al., Reference Arnold, Wasow, Losongco and Ginstrom2000). Rather, our data suggest that learners’ productions differ from the target with respect to both the degree of internal complexity of the NP and the types of phrase-internal modifiers. That is, our advanced learners clearly showed different patterns with respect to how they expand on the nominal heads of the focus phrase. Crucially, they relied too much on finite relative clause extensions and too little on chains of phrasal postmodifiers, i.e., multi-type extensions. Footnote 13 The distance to expert-like linguistic behaviour thus supports the developmental pathway described in Biber et al. (Reference Biber, Gray and Poonpon2011), which describes a course of development from clausal to phrasal complexity. Our results also support general considerations of the role of the detectability of a feature and the difficulty of its acquisition. Humans most readily learn detectable features (or cues), i.e., statistical regularities among elements that are perceptually salient and temporally proximal. 
Functional similarities (without perceptual similarity) and temporally non-adjacent generalizations are harder to detect and thus harder to learn (Creel, Newport, & Aslin, Reference Creel, Newport and Aslin2004; Endress, Nespor, & Mehler, Reference Endress, Nespor and Mehler2009; cf. Bates, McNew, MacWhinney, Devescovi, & Smith, Reference Bates, McNew, MacWhinney, Devescovi and Smith1982; MacWhinney, Reference MacWhinney, Robinson and Ellis2008, for discussions of cue detectability in the Competition Model). Effects of feature detectability are typically observed during the early stages of language learning. Turkish children, for example, tend to pick up accusative marking earlier than Hungarian children because the Turkish accusative marker is easier to perceive (MacWhinney, Pleh, & Bates, Reference MacWhinney, Pleh and Bates1985). The general idea underlying feature (or cue) detectability, however, is easily extended to learning difficulties related to the complexity of the mappings of forms and meanings: in a study investigating the development of probabilistic constraints on clause ordering, Wiechmann and Kerz (Reference Wiechmann and Kerz2014a) found that advanced L2 learners relied more strongly on perceivable lexical cues than on distributional regularities involving abstract semantic categories. Similarly, our learners had less trouble learning the distributions of modal verbs, quantifiers, and negative polarity items within ETCs, which are encoded through a closed and relatively small set of lexical items. Furthermore, we found some evidence for item-specificity and formulaicity in the productions of both learners and experts (cf. 
Ellis, Reference Ellis1996; Ellis & Cadierno, Reference Ellis and Cadierno2009; Granger & Meunier, Reference Granger and Meunier2008; O’Donnell et al., Reference O’Donnell, Römer and Ellis2013; Pawley & Syder, Reference Pawley, Syder, Richards and Schmidt1983; Sinclair, Reference Sinclair1991, Reference Sinclair2004; Wray Reference Wray2002, Reference Wray2008). Effects of item-specificity and formulaicity are expected in UB-accounts of language learning as “generalizations arise from conspiracies of memorized utterances collaborating in productive schematic linguistic productions” (Ellis, Reference Ellis, Cenoz and Hornberger2008, p. 125). We observed that the productions of our learners are characterized by target-like proportional usage of stand-alone constructions, in which the ETC functions as the main clause of the sentence. However, there was some evidence in our data that learner productions were less target-like when the ETC was integrated into a subordinate structure. Specifically, the embedded ETCs that learners tend to produce are nominal clauses, whereas ETCs produced by experts were proportionally more often integrated into an adverbial clause structure. Closer inspection of the data revealed that learner productions are typically organized around a small set of communication verbs, most notably ‘X {report|argue|claim} that ETC’. The lexical specificity of such item-specific language use and chunking effects cannot be meaningfully quantified on the basis of the available data, and is presented here as a suggestive finding to be explored in future work. We also found that experts produce structurally complex postverbal arguments with multiple phrasal postmodifiers typically in combination with frequent head nominals (cf. Figure 3).
A unifying explanation for these findings could link the use of frequent lexical items and formulaic expressions to processing demand: the use of frequent structural anchors, e.g., frequent heads of matrix clause VPs or frequent heads of postverbal arguments, reduces the overall processing demand of a complex pattern. Their employment in complex structural environments could thus be conceived of as a compensation strategy (Rohdenburg, Reference Rohdenburg1996).
Having discussed the specific findings of this study, and their interpretation, we would like to address some general issues. Specifically, we would like to address three points: the first point concerns the definition of the learning target and the use of pooled data from linguistic corpora in second language learning research. The second point addresses issues concerning the study of probabilistic errors and the validity of inferences from usage frequency to inadequate usage. Finally, we present our stance on the role of transfer and connect the work presented here to research into implicit learning. Turning to the first point: we have described above that language learning is a lifelong process, in which knowledge is continuously modified and relativized to situational contexts. Consequently, to “know a construction isn’t an all-or-nothing state” (Arnon, Reference Arnon and Kidd2011, p. 82). Both L1 learners and L2 learners have to learn the register-specific adaptations of already known constructions. At any given level of language proficiency, knowledge states can only be reconstructed from produced behaviours. As Biber and colleagues pointed out, corpus analysis constitutes an important tool in the uncovering of intermediate and target states:
All normal native speakers of English participate in conversational interactions and control the grammatical structures typical of conversation. In contrast, comparatively few native speakers productively control the register of academic writing. So there must be a process of writing development: academic professionals had to acquire the phrasal grammatical style of academic writing […] [T]he eventual end point [of the process of writing development] can be demonstrated from empirical corpus analysis: we can fully describe the grammatical characteristics of advanced academic writing. And there can be no doubt that writing development must occur: somewhere along the way, advanced students and professionals learn how to produce discourse of this type, whether they are native or L2 English speakers. (Biber et al., Reference Biber, Gray and Poonpon2013, p. 196)
In UB perspectives, what Biber and colleagues refer to as the “end point[s]” in this process is arguably better characterized as attractors in a dynamical system that is linguistic knowledge (de Bot et al., Reference de Bot, Lowie, Thorne, Verspoor, Mayo, Gutierrez-Mangado and Adrián2013; Elman, Reference Elman, Port and van Gelder1995). But the essential points remain: there are states of knowledge that can be considered the learning target, and the analysis of naturalistic language data promises to be a valuable methodology for their description. In this study we made use of pooled data from multiple learners. The validity of using pooled data for the assessment of target-like behaviour depends on the degree of variation in expert language. At least with respect to the construction use investigated here, we found that the language of expert academic writers is remarkably homogeneous, reflecting the high degree of conventionalization of the register. It is interesting to note that there was also surprisingly little variation in probabilistic construction use in the investigated group of advanced learners. In particular, the degree of target-likeness in learner production with regard to phrasal complexity was found to be very similar across individuals. This is compatible with a number of conceivable hypotheses as to why this was observed. It could mean that our learners happen to be remarkably similar along all dimensions typically associated with individual differences including intelligence, learning style, learner strategies, aptitude towards teacher and learning materials, cognitive style, motivation, personality, etc. (cf. Dewaele & Furnham, Reference Dewaele and Furnham1999; Dörnyei, Reference Dörnyei2005; R. Ellis Reference Ellis1985). Another – arguably more plausible – possibility is that these factors affect only relatively weakly the register-contingent adaptation of constructional knowledge at advanced levels of proficiency.
A strong version of current UB accounts will hold that learner knowledge is derived from the subconscious distributional analysis of the input, and will predict that individual-level variation is best explained with reference to differences in perceived inputs, and differences in rote learning and inductive learning ability, as well as differences in associative memory capacity. To the best of our knowledge, UB theory has not developed testable proposals of how factors studied in differential psychology are best integrated into existing accounts of UB language learning.
The second point we would like to address concerns the validity of inferences from usage frequency to inadequate usage, which we employed in this study and which are employed in most corpus-based analyses of advanced learner language, where errors tend to be of a probabilistic nature. A type of argument against the validity of such inferences will hold that proportional under- or overuse of some construction Ci does not necessarily imply inadequate usage. After all, differences in usage frequency of Ci may reduce to the fact that, in the case of alleged underuse, a learner simply did not intend to express some communicative function Fi that is adequately expressed through Ci or, in the case of alleged overuse, felt that Fi needed to be expressed rather frequently. It seems conceivable that we could observe significant differences in the usage frequency of a construction even though each and every usage event, if carefully assessed, would be considered target-like, rendering invalid any inference from differences of usage frequency to inadequate usage (cf. Gries & Deshors, Reference Gries and Deshors2014, for discussion and an interesting proposal of a multi-step statistical procedure addressing the general issue). Our reaction to the argument against inference from usage frequency to inadequate usage (henceforth the UIU argument) is based on the considerations of constructional utility and register-specificity detailed above: under the assumption of a principle of no synonymy, which is routinely invoked in constructionist approaches to language (cf. Goldberg, Reference Goldberg1995, for discussion), the force of the UIU argument seems to depend systematically on how much variability in construction choice is permitted by the situational context. Clearly, the set of contextually appropriate constructions gets smaller as the situational context becomes more specific. 
In this study, the register was delimited to the narrowly defined situational context of academic writing about language. To say that learner productions are dissimilar to those of experts but are still adequate (at least potentially) presupposes that there is a contextually appropriate subset of constructions that learners but not experts happened to choose from. However, the idea that such a subset and such viable choices exist is clearly at odds with the observed fact that the productions of experts and learners each exhibit little group-internal variation. We believe that this defeats the UIU argument.
Finally, we would like to address our stance towards the role of transfer and the connection of this work to research into implicit learning. We have demonstrated the empirical reality of systematic non-target-likeness in the behaviour of advanced learners, which was interpreted as reflecting a not fully adapted constructicon. On the basis of the available data, we cannot, however, quantify the role of transfer effects, i.e., interactions with L1 knowledge. It is generally observed that L2 learners typically attempt to first transfer knowledge from the L1 whenever they can perceive correspondences between items in L1 and L2 (Robinson & Ellis, Reference Robinson and Ellis2008; also MacWhinney, Reference MacWhinney, Gass and Mackey2011). For the phenomenon investigated here, however, there is reason to believe that transfer effects play only a subsidiary role. Prior research suggests that transfer of item-based syntactic patterns is very limited, as such patterns cannot be readily matched across languages, meaning that item-specific preferences must be learned from the bottom up without any support from the L1 (MacWhinney, Reference MacWhinney, Gass and Mackey2011). Future studies will have to address whether and to what extent transfer plays a role in the production of complex structures in L2 production.
We believe that the methodology presented here can inform research into implicit/explicit L2 learning, which concerns cognitive processes, and implicit/explicit L2 knowledge, which concerns the products of these processes (for overviews and discussion of the extensive literature in the field, see Ellis, Loewen, Elder, Erlam, Philp, & Reinders, Reference Ellis, Loewen, Elder, Erlam, Philp and Reinders2009; Rebuschat, Reference Rebuschat2013, and references therein). Since implicit knowledge is tacit and procedural, while explicit knowledge is conscious and declarative (cf. R. Ellis Reference Ellis1994, Reference Ellis2002, Reference Ellis2004), the former is intrinsically harder to demonstrate than the latter: demonstrating successful implicit learning cannot meaningfully involve asking a learner to state their knowledge, but rather requires the investigation of produced behaviours. Behavioural experimental studies can aim to disclose non-target-like procedural rules. We believe that the study presented here offers a valuable complementary approach to the study of implicit knowledge. By focusing on naturalistic productions instantiating a phenomenon that – at the current point of linguistic description and pedagogical application – must be mastered through implicit learning, viz. complex register-adequate constructional adaptation, we can draw inferences from products to implicit condition–action rules, which learners construct as part of their implicit knowledge (Ellis, Reference Ellis, Cenoz and Hornberger2008). We have previously demonstrated this for binary sequencing choices (Wiechmann & Kerz, Reference Wiechmann and Kerz2014a) and binary constructional selection choices (Wiechmann & Kerz, Reference Wiechmann and Kerz2014b), both of which involve rules of the type ‘under condition C, perform action A’. In this study, we have applied a similar rationale to a constructional adaptation scenario.
5. Conclusion
UB constructionist accounts of language conceive of linguistic knowledge as a complex adaptive system, whose information processing dynamics cause it to be in a state of constant change. In this view, L2 learning is construction learning, which is characterized as a lifelong, item-based, gradual, and locally contingent (situated) process involving the extraction of statistical regularities from experience with language, from which L2 knowledge gradually emerges. We have highlighted here that “the forms of language use are created, governed, constrained, acquired and used in the service of communicative functions” (Bates & MacWhinney, Reference Bates, MacWhinney, MacWhinney and Bates1989, p. 3), leading to pronounced differences in the patterns underlying functionally dissimilar domains of language use (registers). Over time, exposure to new language domains results in the register-contingent readjustment of already established constructions so as to adapt them to the specific discourse functions of that language domain. Focusing on aspects of syntagmatic growth, we have argued that the nature of language learning – as portrayed in UB models – requires not only the investigation of early stages of language learning but also the investigation of advanced stages, which are most pronouncedly shaped by the processing of written language.
Supplementary materials
For supplementary material for this paper, please visit <http://dx.doi.org/10.1017/langcog.2015.6>.