Book Reviews

The Oxford Handbook of Computational Linguistics
Ruslan Mitkov (editor)
(University of Wolverhampton)
Oxford: Oxford University Press
The Oxford Handbook of Computational Linguistics features thirty-eight articles commissioned from experts all over the world, describing the major concepts and methods of the field.
Word sense disambiguation. Part III provides overviews of important areas such as machine translation, information retrieval, information extraction, question answering, and summarization. Measures such as precision and recall are useful yardsticks, but the real issue is, what value does the system deliver to an end user? These chapters will be particularly attractive to practitioners in these fields, as they provide succinct and realistic overviews of what can and cannot be achieved by current technology. Such is not the case here.
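Precision and recall are easy to compute; as a minimal sketch, using hypothetical sets of retrieved and relevant document identifiers:

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for a retrieval result.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: the system returns 4 documents, 3 of which are
# relevant, out of 6 relevant documents in the collection.
p, r = precision_recall([1, 2, 3, 4], [1, 2, 3, 7, 8, 9])
print(p, r)  # 0.75 0.5
```

As the reviewer notes, such yardsticks say nothing by themselves about the value delivered to an end user.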
As the title of this book suggests, it is an update of the first edition of the Handbook of Natural Language Processing, which was edited by Robert Dale, Hermann Moisl, and Harold Somers.
Speech and Language Processing is a general textbook on natural language processing, with excellent coverage of the area and an unusually broad scope of topics. It includes statistical and symbolic approaches to NLP, as well as the main methods of speech processing.
Mathematical foundations. Linguistic essentials. Corpus-based work.
Statistical inference: word sense disambiguation. Though not a proper history of MT, Readings in Machine Translation is certainly a historical collection. For this alone, Nirenburg, Somers, and Wilks deserve our gratitude. The volume begins with the famous memorandum that Warren Weaver sent out to some professional acquaintances in 1949, which is generally taken to mark the genesis of machine translation; the most recent paper included dates back to the fourth MT Summit. The editors cite three criteria for inclusion; as criteria go, they certainly set a high standard!
And yet many of these articles seem to meet it with ease. One reads these papers today, decades after they were written, and one still cannot help but be impressed.
Needless to say, not all the articles included in Readings in Machine Translation come up to this high standard; that would be too much to expect. In some cases, one wishes the editors had made more liberal use of their prerogative to abridge. Another reason for the excessive length of Readings in Machine Translation is that the book is divided into three distinct sections, each under the responsibility of one of the editors.
There are obvious overlaps between these divisions, in the sense that articles included in one section could just as well fit into another. The editors acknowledge this, and in itself it is not very serious. In his introduction, for example, Nirenburg cites numerous, often lengthy passages from the articles by the early MT pioneers that purportedly support his preferred approach to meaning-based MT. A more serious criticism of Readings in Machine Translation is that the book is somewhat dated.
This is a rather paradoxical charge for a collection of historical articles; what I mean by it is this: the field has moved on since the selection was made (in fact, I was sent a preliminary version by the publisher some years before publication). In the last few years, for example, there has been an impressive resurgence of activity in machine translation, particularly in the United States, where statistical methods drawn from speech recognition and various techniques borrowed from machine learning have proven remarkably successful.
Had the editors been more aware of the profound impact of these new influences on the field, they would perhaps have modified their selection of articles.
As it is, only two of the thirty-six papers in the collection explicitly address data-driven or statistical methods in MT. Which brings me to my final criticism of this otherwise wonderful volume. One of the papers, for example, describes the project at the IBM Watson Research Center in the late 1950s that eventually produced the Mark I system, later installed at the U.S. Air Force.
And where did the article included in this collection first appear? It would have been so much easier and more helpful to display this information on the first page of each contribution! Indeed, one wishes the editors had seen fit to include a short introductory note to each article, providing a few words of historical background on the author, or at least his or her affiliation at the time the paper was published.
But these are more or less minor quibbles, and they do not significantly detract from the value of this generous volume.
The book under review is published by Kluwer Academic Publishers in the Kluwer International Series on Information Retrieval, edited by W. Bruce Croft. As the editors note, these papers provide a cross-section of current work on the language-modeling approach to information retrieval, which has become a very active area of research in the past few years. The editors place the papers in this volume in three broad categories. In the basic language-modeling approach, each document is treated as a sample from a language model, and documents are ranked by the probability that their models generate the query; by casting the document retrieval problem in this way, language-modeling techniques that have been developed over many years, particularly for speech recognition, can be applied to document retrieval.
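The query-likelihood idea can be sketched in a few lines: each document defines a unigram language model, and documents are ranked by the (smoothed) probability their model assigns to the query. The following is an illustrative toy with made-up documents and a made-up interpolation weight, not the formulation of any particular chapter:

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection, lam=0.5):
    """log P(query | document model), with Jelinek-Mercer smoothing:
    P(w|d) = lam * P_ml(w|d) + (1 - lam) * P_ml(w|collection)."""
    d, c = Counter(doc), Counter(collection)
    dlen, clen = sum(d.values()), sum(c.values())
    score = 0.0
    for w in query:
        p = lam * d[w] / dlen + (1 - lam) * c[w] / clen
        score += math.log(p)  # -inf only if w is unseen in the whole collection
    return score

# Hypothetical two-document collection.
docs = {"d1": "the cat sat on the mat".split(),
        "d2": "dogs chase cats in the park".split()}
collection = [w for d in docs.values() for w in d]
query = "cat mat".split()
ranking = sorted(docs, reverse=True,
                 key=lambda n: query_log_likelihood(query, docs[n], collection))
print(ranking)  # ['d1', 'd2']
```

Smoothing with the collection model is what keeps unseen query words from zeroing out a document's score, which is one place speech-recognition experience carries over directly.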
On the other hand, at least in this simple statement, the language-modeling approach appears to ignore the concept of relevance. By contrast, the traditional approach to probabilistic information retrieval, as expressed, for example, in the probability-ranking principle, explicitly states that the goal of probabilistic information retrieval is to predict the probability that a document will be judged relevant by a user, taking into account all evidence available to the retrieval system, which then ranks the documents in decreasing order of probability of relevance.
As mentioned by the editors in the preface, this concern with the relationship of the language-modeling approach to relevance was one of the underlying themes of the workshop, and the issue is taken up by some of the authors. In particular, they disagree with Sparck Jones et al.
According to Sparck Jones et al., the result, although incomplete, is complex, and it is unclear whether there are any theoretical advantages in abandoning the standard probabilistic model. Lavrenko and Croft also take up the issue of relevance, introducing the concept of a relevance model for information retrieval, that is, a language model reflecting word frequencies in the class of documents relevant to a particular information need. One of their motivations for introducing relevance models is to overcome one of the major disadvantages of the language-modeling framework: the absence of an explicit representation of relevance. They present many experimental results supporting the retrieval effectiveness of their formal model.
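The relevance-model idea can be caricatured as follows: estimate a word distribution for the unknown relevant class by mixing document language models, weighting each by how well it matches the query. This is a deliberate simplification of Lavrenko and Croft's method, run on hypothetical data:

```python
from collections import Counter

def relevance_model(query, docs):
    """Approximate P(w | R) as a query-likelihood-weighted mixture of
    document language models (a simplified relevance model)."""
    weights, models = [], []
    for doc in docs:
        c = Counter(doc)
        n = sum(c.values())
        model = {w: c[w] / n for w in c}
        # Weight = P(query | doc model); zero if any query word is absent.
        w = 1.0
        for q in query:
            w *= model.get(q, 0.0)
        weights.append(w)
        models.append(model)
    z = sum(weights) or 1.0
    p_w_r = Counter()
    for w, m in zip(weights, models):
        for word, p in m.items():
            p_w_r[word] += (w / z) * p
    return p_w_r

# Hypothetical two-document collection and one-word query.
docs = ["cat mat cat".split(), "dog park".split()]
rm = relevance_model("cat".split(), docs)
print(rm.most_common(2))  # 'cat' dominates; 'mat' gets mass by co-occurrence
```

The point of the construction is visible even in this toy: words that merely co-occur with query words ("mat") receive probability mass in the estimated relevant class, which is what lets the model expand the query.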
Four chapters discuss language modeling in the context of ad hoc information retrieval. Thus the good experimental results for the language-modeling approach reported throughout this book may be due more to its improved statistical estimation techniques than to the use of language modeling as a theoretical framework.
The remaining three chapters describe applications of language modeling to related information retrieval tasks. Information retrieval and computational linguistics have been more or less closely related fields for decades. Statistical approaches have always played an important role in information retrieval.
Within the field of computational linguistics, corpus linguistics has emerged over the past 20 years as an increasingly influential approach.
Language-modeling techniques originally developed to support speech recognition are now transforming the field of probabilistic information retrieval.
However, as alluded to by Sparck Jones et al., it is to be expected that other areas of human-language technology, such as dialogue modeling and mixed-initiative interaction, might also find more application in information retrieval research.
His research interests include probabilistic information retrieval and natural language processing. Thompson is at Dartmouth College.
A specific countercriticism of the quantitative practices in natural language processing is that the kinds of reasoning employed are unnatural and in many cases ignore important observations about linguistics; examples of this include the use of Markov processes or n-gram statistics to describe phonology or syntax, instead of the more linguistically plausible and traditional grammar-based formalisms.
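For concreteness, the style of reasoning at issue looks roughly like this: a bigram (first-order Markov) model scores a phoneme string purely by adjacent-symbol frequencies, with no grammar in sight. The training strings below are invented, with '#' marking word boundaries:

```python
from collections import Counter

# Toy training data: hypothetical phoneme strings.
training = ["#kat#", "#kab#", "#tak#"]

bigrams, unigrams = Counter(), Counter()
for word in training:
    for a, b in zip(word, word[1:]):
        bigrams[(a, b)] += 1
        unigrams[a] += 1  # count of a as the left symbol of a bigram

def bigram_prob(word):
    """P(word) under a maximum-likelihood bigram model;
    0.0 for any unseen transition."""
    p = 1.0
    for a, b in zip(word, word[1:]):
        if unigrams[a] == 0:
            return 0.0
        p *= bigrams[(a, b)] / unigrams[a]
    return p

print(bigram_prob("#kat#") > bigram_prob("#tka#"))  # True: 'tk' never occurs
```

The model happily ranks attested sequences above unattested ones, but it encodes no phonotactic generalization beyond pair frequencies, which is exactly the critics' complaint.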
In this book, Bod, Hay, and Jannedy present some of the evidence in favor of a role for probability in linguistic reasoning. Moreover, they also argue that incorporating probability theory into linguistic theory is not only legitimate but necessary: probability theory should be compatible with, instead of a substitute for, more traditional linguistic observations. The book itself is an outgrowth of a symposium organized by the editors at the Linguistic Society of America annual meeting held in Washington, D.C.
Much of the book is thus dedicated to presenting background material specifically for noncomputational linguists, who might not have a strong mathematical and statistical background. The rest of the book is taken up by individual chapters by noted experts in the various subfields of psycholinguistics, sociolinguistics, historical linguistics and language change, phonology, morphology, syntax, and semantics.
The experts are well chosen and include leading lights such as Baayen, Jurafsky, Manning, and Pierrehumbert, as well as the editors themselves. Most of the individual chapters are similarly structured: the author(s) observe that categorial theories do not neatly account for observed variation and then present a probabilistic model that accounts for both the categorial and the observed continuous data.
These models are fairly specific to the problems studied by each expert, and together they make an interesting collection of different ways to solve linguistic problems from a variety of standpoints; students of modeling could do much worse than to simply page through and look at each model in turn to see whether it could be adapted to their own studies.
In each case, the effects of frequency are directly modeled in a probabilistic framework and an appropriate causal role is assigned.
He addresses many of the challenges that traditional linguistics might present to probabilists. The bibliography is extensive, and there is a useful glossary of probabilistic terms to help readers keep definitions in mind. From the standpoint of computational linguistics, the book is slightly disappointing in not discussing computational processes more, as the reader is usually left to infer the exact mechanisms and algorithms used to implement the equations discussed in the book.
Another weakness is the lack of discussion of statistical inference, which is often necessary to interpret the probabilistic models themselves (is it reasonable, for example, to expect readers who actually need chapter 2 to understand generalized linear modeling?).
The most significant omission is a discussion of how these individual models interact, either with each other or with more traditional categorial models of other linguistic subfields.
These are minor weaknesses in an otherwise significant work. For those who hold to categorial theories of language, this book will at least provide a single source for some major arguments, evidence, and theories supporting probabilistic processes underlying linguistic competence.
And for those who believe that probability is really only a description (a quantification of our ignorance, if you will), this volume neatly summarizes ways in which probability may play a key role in human language processing. Patrick Juola has been working in computational and statistical applications of psycholinguistics since his days as a Ph.D. student.