Linguistic performance elicited by language tasks has generally been operationalized in terms of complexity, accuracy and fluency (CAF). However, as has been argued in a number of studies (e.g., De Jong et al., 2012; Révész et al., 2016), assessment of L2 proficiency is impossible without considering the efficacy and appropriacy of L2 performance (henceforth 'functional adequacy', FA). From the perspective of task-based language assessment (TBLA; Long, 2015, 2016; Norris, 2016), FA is conceived of as a multi-layered, goal-directed, task-related construct, in terms of successful task completion by the speaker/writer in conveying a message to the listener/reader. A rating scale of FA for the assessment of oral and written performance has been developed, which distinguishes four dimensions: Task Requirements, Content, Comprehensibility, and Coherence & Cohesion (Kuiken & Vedder, 2017).
In order to investigate the reliability, validity and applicability of the FA scale, a number of experimental studies have been conducted in which FA was assessed by both expert and non-expert raters, in different learning contexts, involving various source and target languages, proficiency levels (A2-C1), task types and modalities. Some of these studies have also investigated the relationship and mutual development of FA and CAF, resulting in mixed findings. The main outcome of these studies was that the FA scale is a reliable, valid and user-friendly tool and that its scope of application is sufficiently broad. A number of issues and challenges for future research, however, still remain.
The goal of our presentation is to discuss perspectives and challenges of research on FA for TBLA and SLA, in particular regarding the following topics:
1. Standardization
An important issue concerns the reliability, validation and/or adaptation of the FA scale in relation to learning context, target language, task type and task modality. In order to ensure comparability across studies, it is necessary to standardize the test instrument, methodology, assessment tasks (use of 'proto-typical tasks'), data analysis and rater training.
2. FA in relation to (sub)components of CAF
Although (sub)components of FA and CAF appear to be connected to some degree, the overall picture is still unclear. Further investigation is needed, e.g., of the associations between FA descriptors and CAF measures, or of the extent to which the relationship between FA and CAF is moderated by proficiency level and task type.
3. FA in interactional tasks
So far, the FA rating scale has been employed exclusively for the assessment of monologic tasks. An important question is whether and how the rating scale can be used (adapted and/or extended) for interactional tasks.
4. FA in classroom practice
Another issue that needs to be further explored is the role of FA in classroom and assessment practice, and how it can be incorporated into the field of instructed second language acquisition (ISLA). Future research should also examine the impact of different instructional treatments on the development of FA, and the possibility of using the scale for self-assessment by learners and/or peer feedback.