Feasibility analysis of AsterixDB and Spark streaming with Cassandra for stream-based processing
Posted on 2016-04-08 - 05:00
Abstract For getting up-to-date insight into online services, extracted data has to be processed in near real time. For example, major big data companies (Facebook, LinkedIn, Twitter) analyse streaming data for development of new services. Several technologies have been developed, which could be selected for implementation of stream processing functionalities. The contribution of this paper is feasibility analysis of technologies for stream-based processing of semi-structured data. Particularly, feasibility of a Big Data management system for semi-structured data (AsterixDB) will be compared to Spark streaming, which has been integrated with Cassandra NoSQL database for persistence. The study focuses on stream processing in a simulated social media use case (tweet analysis), which has been implemented to Eucalyptus cloud computing environment on a distributed shared memory multiprocessor platform. The results indicate that AsterixDB is able to provide significantly better performance both in terms of throughput and latency, when data feed functionality of AsterixDB is used, and stream processing has been implemented with Java. AsterixDB also scaled on the same level or better, when the amount of nodes on the cloud platform was increased. However, stream processing in AsterixDB was delayed by batching of data, when tweets were streamed into the database with data feeds.
CITE THIS COLLECTION
DataCiteDataCite
3 Biotech3 Biotech
3D Printing in Medicine3D Printing in Medicine
3D Research3D Research
3D-Printed Materials and Systems3D-Printed Materials and Systems
4OR4OR
AAPG BulletinAAPG Bulletin
AAPS OpenAAPS Open
AAPS PharmSciTechAAPS PharmSciTech
Abhandlungen aus dem Mathematischen Seminar der Universität HamburgAbhandlungen aus dem Mathematischen Seminar der Universität Hamburg
ABI Technik (German)ABI Technik (German)
Academic MedicineAcademic Medicine
Academic PediatricsAcademic Pediatrics
Academic PsychiatryAcademic Psychiatry
Academic QuestionsAcademic Questions
Academy of Management DiscoveriesAcademy of Management Discoveries
Academy of Management JournalAcademy of Management Journal
Academy of Management Learning and EducationAcademy of Management Learning and Education
Academy of Management PerspectivesAcademy of Management Perspectives
Academy of Management ProceedingsAcademy of Management Proceedings
Academy of Management ReviewAcademy of Management Review
PääkkÜnen, Pekka (2017). Feasibility analysis of AsterixDB and Spark streaming with Cassandra for stream-based processing. figshare. Collection. https://doi.org/10.6084/m9.figshare.c.3698710.v1