Transfer Learning for Harmful Content Detection

Mohtaj, Salar

  • Publisher: Springer
  • Publication date: 2025-09-27
  • List price: $6,370
  • VIP price (5% off): $6,052
  • Language: English
  • Pages: 105
  • Binding: Hardcover (also called cloth, retail trade, or trade)
  • ISBN-10: 3032008492
  • ISBN-13: 9783032008497
  • Related categories: Natural Language Processing
  • Imported title, ordered from overseas (must be checked out separately)

Description

This book provides an in-depth exploration of the effectiveness of transfer learning approaches for detecting deceptive content (i.e., fake news) and inappropriate content (i.e., hate speech). The author first addresses the issue of insufficient labeled data by reusing knowledge gained from other natural language processing (NLP) tasks, such as language modeling. He then examines the connection between harmful content and emotional signals in text, integrating emotional cues into the classification models to evaluate their impact on performance. Additionally, since pre-processing plays an essential role in NLP tasks by enriching raw data (a step that is especially critical for tasks with limited data, such as fake news detection), the book analyzes various pre-processing strategies in a transfer learning context to improve the detection of fake stories online. Optimal settings for transferring knowledge from pre-trained models across subtasks, including claim extraction and check-worthiness assessment, are also investigated. The findings indicate that incorporating these features into check-worthiness models can improve overall performance, although integrating emotional signals did not significantly affect classifier results. Finally, the experiments highlight the importance of pre-processing for enhancing input text, particularly in social media contexts where content is often ambiguous and lacks context, yielding notable performance improvements.
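The kind of social-media pre-processing the description refers to can be sketched minimally as follows. This is an illustration only, not the author's actual pipeline; the placeholder tokens and normalization rules here are assumptions chosen for the example:

```python
import re

def preprocess_post(text: str) -> str:
    """Normalize noisy social-media text before feeding it to a classifier.

    Illustrative rules only: the book evaluates several pre-processing
    strategies, and its exact pipeline may differ from this sketch.
    """
    text = text.lower()
    # Replace URLs and @-mentions with placeholder tokens so a model
    # generalizes instead of memorizing specific links or user handles.
    text = re.sub(r"https?://\S+", "<url>", text)
    text = re.sub(r"@\w+", "<user>", text)
    # Strip the '#' from hashtags so the word itself stays in the vocabulary.
    text = re.sub(r"#(\w+)", r"\1", text)
    # Collapse leftover runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()

print(preprocess_post("BREAKING: @newsbot says #FakeNews spreads fast https://t.co/xyz"))
# → breaking: <user> says fakenews spreads fast <url>
```

Steps like these are cheap to apply and, as the description notes, matter most when the input is short, ambiguous, and low-context, as social-media posts typically are.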

About the Author

Salar Mohtaj is a Research Scientist at the German Research Center for Artificial Intelligence (DFKI) and a postdoctoral researcher in the Speech & Language Technology group. He completed his PhD at Technische Universität Berlin, focusing on fake news and hate speech detection, and holds a Master's degree in Information Technology from Tehran Polytechnic (Amirkabir University of Technology), specializing in natural language processing. Previously, he led the development of a Persian plagiarism detection system at the ICT Research Institute in Tehran. With over 40 publications in journals and conferences, Salar has contributed to a range of natural language processing tasks, publishing research and creating datasets for tasks ranging from plagiarism detection and German text readability assessment to fake news detection.
