Handbook of Linguistic Annotation
Provisional Chinese title: 語言標註手冊

Nancy Ide, James Pustejovsky

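  • Publisher: Springer
  • Publication date: 2018-09-06
  • List price: $15,790
  • VIP price: $15,001 (95% of list price)
  • Language: English
  • Pages: 1459
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 9402414266
  • ISBN-13: 9789402414264
  • Related categories: Text-mining, Computer-Science, Data Science
  • Overseas special-order title (must be checked out separately)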

Product description

This handbook offers a thorough treatment of the science of linguistic annotation. Leaders in the field guide the reader through the process of modeling, creating an annotation language, building a corpus, and evaluating it for correctness. It is essential reading for both computer scientists and linguistic researchers.

Linguistic annotation is an increasingly important activity in computational linguistics because of its critical role in the development of language models for natural language processing applications. Part one of the book covers all phases of the linguistic annotation process, from annotation scheme design and choice of representation format through manual and automatic annotation, evaluation, and iterative improvement of annotation accuracy. Part two presents case studies of annotation projects across the spectrum of linguistic annotation types: morpho-syntactic tagging; syntactic analyses; a range of semantic analyses (semantic roles, named entities, sentiment and opinion); time, event, and spatial analyses; and discourse-level analyses such as discourse structure and co-reference. Each case study addresses the phases and processes discussed in the chapters of part one.

About the authors

Nancy Ide is Professor of Computer Science at Vassar College in Poughkeepsie, New York, USA. She has worked in computational linguistics for over 30 years and has made significant contributions to research in word sense disambiguation, computational lexicography, discourse analysis, and the use of semantic web technologies for language data. She is founder of the Text Encoding Initiative (TEI), the first major standard for representing electronic language data, and later developed the XML Corpus Encoding Standard (XCES). More recently, she co-developed the ISO LAF/GrAF representation format for linguistically annotated data. She has also developed major corpora for American English, including the Open American National Corpus (OANC) and the Manually Annotated Sub-Corpus (MASC), and has been a pioneer in efforts to foster open data and resources. Professor Ide is Co-Editor-in-Chief of the journal Language Resources and Evaluation and Editor of the Springer book series Text, Speech, and Language Technology.

James Pustejovsky is the TJX Feldberg Professor of Computer Science at Brandeis University in Waltham, Massachusetts, USA. His expertise includes theoretical and computational modeling of language, specifically computational linguistics, lexical semantics, knowledge representation, temporal and spatial reasoning, and information extraction. His main research area is natural language processing, in particular the computational analysis of linguistic meaning. He proposed Generative Lexicon theory in lexical semantics. His other interests include temporal reasoning, event semantics, spatial language, language annotation, and machine learning.
