Towards a discourse theory of abstracts and abstracting
This Ph.D. thesis investigates the extent to which certain linguistic variables affect the perceived success of an abstract. In his opening chapter, the author situates the work as a contribution to discourse theory and formulates a set of seven basic research questions the study seeks to answer. Chapter 2 considers the most satisfactory design for the research. Part Two provides a survey of the relevant source material, consisting of reviews of: the Linguistics and Psychology literature; Artificial Intelligence computational summarising systems; and standards and guidelines taken from Information and Library Science. Part Three discusses the data collection: first, the collection of naturally occurring abstracts from a teaching establishment; second, the collection of judgements elicited by means of a set of questionnaires. The remainder of the thesis constitutes an attempt to reconcile these two types of data, words and opinions. The author draws a distinction between the qualitative and quantitative opinions of the judges (which he refers to as 'external measures') and the linguistic features present in the different abstracts ('internal measures') which determine these subjective opinions. Part Four discusses the data analysis, which draws on grammatical techniques from Systemic-Functional theory. In Part Four, hypotheses are investigated which relate the success of the different abstracts as perceived by the judges to the linguistic features present in the texts. Five different types of analysis are piloted using a small number of texts; three of these analyses are taken further and applied to all the abstracts. Part Five consists of two chapters. The first details the conclusions to be drawn from the study and explicitly answers the seven basic research questions introduced in Chapter 1. The second provides some suggestions for further research, chiefly concerning the collection of further external and internal measures. 
Finally, techniques from multivariate statistics are briefly sketched as a means of reconciling the two types of measure in the future.
Abstract 2
This Ph.D. thesis investigates the extent to which certain linguistic variables affect the perceived success of an abstract. More specifically, answers to seven basic research questions are sought. These include: What reasons do readers give for preferring one abstract over another? Is 'success' better explained by correlation with one, or with many, linguistic variables? To what extent do readers agree with each other in their various preferences? Which linguistic features can help to explain readers' preferences? In order to answer these questions, a total of 42 naturally occurring abstracts were collected from 29 second-year Library and Information Science students at Brighton Polytechnic. 26 of these abstracts summarised General Knowledge source texts. 17 summarised Information Science (I.S.) source texts of three different types: journal articles, newspaper items and book chapters. The shortest I.S. abstract consisted of 111 words, the longest 651. Subjective data in the form of opinions of these abstracts were elicited by means of a set of six questionnaires. These questionnaires were administered to the students, to their two lecturers, and to 14 judges representing model consumers of such abstracts. The questionnaires elicited both quantitative and qualitative data. The 8 I.S. judges, for example, were asked to rank up to five different abstract versions summarising the same source text according to how helpful they believed them to be. They were also asked to provide reasons for their preferences. Different grammatical analyses from Systemic-Functional theory were employed to discover to what extent certain linguistic features in the abstracts determine their overall quality as perceived by the judges.
Although some suggestions were made to overcome problems with the descriptive frameworks, analyses of generic structure and of cohesive harmony were found to be insufficiently reliable to enable precise hypothesis testing. However, the following linguistic phenomena were investigated more extensively and yielded interesting results: lexical texture, grammatical intricacy and choice of Theme. The answers to the above research questions are as follows. The reasons judges provide for preferring one abstract over another are many and varied; the two most common concern content and what might be termed 'reader-friendliness'. Success in text is a multivariate notion; any one linguistic measure cannot explain all the variation in judges' preferences. Judges hold widely differing views of what constitutes a successful abstract: scores for Kendall's Coefficient of Concordance, W, a measure of inter-judge agreement, range from 0.109 to 0.597, suggesting that there are different drivers of success and that judges prioritise the importance of these drivers differently. In answer to the question of which linguistic features can help to explain readers' preferences, the following results were obtained from the various hypotheses tested. Counter-intuitively, it was found that the more successful abstracts were characterised by lower levels of lexical density and were described as being 'clear'. Low levels of lexical density and lexical variation seem to be more the mark of 'reader-friendly' abstract writing, whereas higher levels of lexical density and lexical variation characterise abstracts which contain more information, but are correspondingly harder to process. In contrast to what is claimed in the literature, the hypothesis which stated that abstracts with a larger amount of clause level complexity would be generally preferred over abstracts with a smaller amount of clause level complexity was generally supported.
Also, some clause combining strategies were noticeably preferred by the judges, while others were noticeably dispreferred. However, these preferences were not shared across the different abstract sets. Judges were found to be particularly sensitive to choice of Theme. A new type of Theme was identified to complement the two already existing sub-types of topical Theme, interactional and informational: Themes are to be regarded as discoursal if they refer to aspects of the source material, or to studies which are themselves discussed by the source material. Eight hypotheses concerning choice of Theme were investigated. For example, hypothesis 8 claimed that abstracts with more informational Themes would be preferred over abstracts with fewer informational Themes. This was supported for the Tanzanian set (H8c), but falsified for the other three. The judges seem to be indicating that they deem an informational style to be more appropriate for the Tanzanian source text. The three different types of topical Theme serve different functions: informational Themes primarily reflect the writer's desire to enlighten, by presenting the raw facts of the message for readers' consideration; discoursal Themes primarily reflect the writer's desire to orient their readers, by providing a way of navigating through the various channels in which the information is presented; interactional Themes primarily reflect the writer's desire to make it easy for readers to integrate the knowledge, by showing readers how the information relates to the various people involved in its transfer.
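Illustrative note on the agreement statistic. The inter-judge agreement figures quoted above (W between 0.109 and 0.597) are values of Kendall's Coefficient of Concordance. As a minimal sketch, assuming complete rankings with no ties, W for m judges ranking n abstract versions can be computed as W = 12S / (m^2 (n^3 - n)), where S is the sum of squared deviations of each version's rank total from the mean rank total. The function name and the example rankings below are hypothetical and are not taken from the thesis, which may have handled ties or incomplete rankings differently.

```python
def kendalls_w(rankings):
    """Kendall's W for complete, untied rankings.

    rankings: one list per judge; each list gives the rank (1..n)
    that judge assigned to each of the same n items.
    Returns a value between 0 (no agreement) and 1 (perfect agreement).
    """
    m = len(rankings)          # number of judges
    if m < 2:
        raise ValueError("need at least two judges")
    n = len(rankings[0])       # number of items ranked
    if n < 2:
        raise ValueError("need at least two items")
    for ranks in rankings:
        # This sketch assumes every judge ranks every item exactly once.
        if sorted(ranks) != list(range(1, n + 1)):
            raise ValueError("each ranking must be a permutation of 1..n")

    # Rank total per item, summed across judges.
    totals = [sum(ranks[i] for ranks in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    # S: squared deviations of the rank totals from their mean.
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical data: three judges ranking five abstract versions.
example_rankings = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
example_w = kendalls_w(example_rankings)
```

A value near 1 would indicate that the judges order the versions almost identically; values such as those reported above indicate only weak to moderate agreement.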