Syntactic N-Grams in Computational Linguistics

Sidorov, Grigori

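  • Publisher: Springer
  • Publication date: 2019-04-11
  • List price: $2,420
  • VIP price: $2,299 (5% off)
  • Language: English
  • Pages: 92
  • Binding: Quality Paper - also called trade paper
  • ISBN: 3030147703
  • ISBN-13: 9783030147709
  • Overseas special-order title (requires separate checkout)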

Product description

This book presents a new approach in computational linguistics: constructing n-grams in a non-linear manner. The traditional approach, by contrast, takes its data from the surface structure of a text, i.e., its linear structure.

In this book, we propose and systematize the concept of syntactic n-grams, which makes it possible to use syntactic information in automatic text processing methods related to classification or clustering. It is an interesting example of applying linguistic information in automatic (computational) methods. Roughly speaking, the suggestion is to follow syntactic trees and construct n-grams from paths in these trees. There are several types of non-linear n-grams; future work should determine which types are more useful in which natural language processing (NLP) tasks.
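As a rough illustration of "n-grams based on paths in these trees", the sketch below walks a small dependency tree from each head down to its dependents and collects every path of exactly n words. It is a minimal sketch under stated assumptions, not the book's own algorithm: the function names `children_map` and `syntactic_ngrams` are hypothetical, the parse is hand-annotated rather than produced by a parser, and only non-branching paths are covered.

```python
from typing import Dict, List, Tuple


def children_map(heads: List[int]) -> Dict[int, List[int]]:
    """Map each token index to the indices of its dependents (head -1 marks the root)."""
    children: Dict[int, List[int]] = {i: [] for i in range(len(heads))}
    for dep, head in enumerate(heads):
        if head >= 0:
            children[head].append(dep)
    return children


def syntactic_ngrams(words: List[str], heads: List[int], n: int) -> List[Tuple[str, ...]]:
    """Return all head-to-dependent paths in the tree that contain exactly n words."""
    children = children_map(heads)
    result: List[Tuple[str, ...]] = []

    def extend(path: List[int]) -> None:
        # A path of n tokens is a complete n-gram; otherwise descend to each dependent.
        if len(path) == n:
            result.append(tuple(words[i] for i in path))
            return
        for child in children[path[-1]]:
            extend(path + [child])

    for start in range(len(words)):
        extend([start])
    return result


# Hand-annotated toy parse (an assumption for illustration, not parser output):
# "chased" is the root; "dog" and "cat" depend on it; each article depends on its noun.
words = ["the", "dog", "chased", "a", "cat"]
heads = [1, 2, -1, 4, 2]

bigrams = syntactic_ngrams(words, heads, 2)
trigrams = syntactic_ngrams(words, heads, 3)
```

In this toy parse, the pair ("chased", "cat") is a syntactic bigram even though the two words are not adjacent in the sentence, which is the contrast with n-grams taken from the linear surface order.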

This book is intended for specialists in computational linguistics. However, we have made an effort to explain clearly how to use n-grams and provide a large number of examples, so we believe the book is also useful for graduate students who already have some background in the field.

About the author

Grigori Sidorov is a full professor and researcher at the "Centro de Investigación en Computación" (Center for Computing Research, CIC), which is part of the "Instituto Politécnico Nacional" (National Polytechnic Institute, IPN) in Mexico City, Mexico.
