Incremental learning for large scale data stream analytics in a complex environment

Za'in, Choiru

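  • Publisher: Routledge
  • Publication date: 2023-04-16
  • List price: $2,810
  • VIP price: $2,670 (5% off)
  • Language: English
  • Pages: 356
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1805263242
  • ISBN-13: 9781805263241
  • Overseas special-order title (must be checked out separately)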

Product description

Nowadays, in the era of the Internet of Things (IoT), data are generated by devices and sensors in the form of text, images, videos, and more. Sensor data arrive continuously from multiple sources and different environments as data streams [1]. As a result, vast volumes of data can accumulate in the cloud over time. Furthermore, data streams drawn from real-world applications are characterized by non-stationary environments [2]. Because of high business demands, these enormous volumes of streaming data must be learned as soon as they arrive to support decision making [3]. While large-scale data streams have great potential to improve decision making, learning from them is challenging. According to [4], big data is generally characterized by the 5Vs: Variety, Velocity, Volume, Value, and Veracity. In the machine learning literature, at least two of these five characteristics pose major issues for learning from data streams and call for further investigation: volume and velocity. Another issue is the non-stationary nature of data streams [5]. Volume and velocity go together: the high arrival speed of data streams is what produces big data. The non-stationary characteristic, in contrast, concerns changes in the data distribution over time. These challenges open up many research directions.
