Embargoed and Restricted Access
Reason: Under embargo until May 2020. After this date a copy can be supplied under Section 51 (2) of the Australian Copyright Act 1968 by submitting a document delivery request through your library, or by emailing document.delivery@monash.edu
Distributed Associative Memory Approach for Cloud Computing Environments
thesis
posted on 2017-04-09, 23:25 authored by Amir Hossein Basirat
With emerging interest in leveraging the massive amounts of data available in open sources such as the Web to solve long-standing information retrieval problems, the question of how to process immense datasets effectively is becoming increasingly relevant. This raises the further question of whether our capability to recognise and process such data keeps pace with our ability to generate it. This thesis addresses that question by first examining the capability of existing large-scale data-processing schemes to scale with this growth of data. To address some of their highlighted limitations, particularly regarding computational complexity and scalability, this research proposes a novel associative-memory-based scheme for big data processing that is scalable, distributable and lightweight, and that overcomes some of the issues encountered in traditional data access mechanisms for data storage and retrieval. To achieve this goal, a distributed data access scheme that enables data storage and retrieval by association is first developed to circumvent the partitioning issue experienced within referential data access mechanisms. In our model, data records are treated as patterns. As a result, data storage and retrieval are performed using a distributed pattern recognition approach that is implemented through the integration of loosely coupled computational networks, followed by a divide-and-distribute approach that facilitates the dynamic distribution of these networks within the cloud.
To date, all implementations of MapReduce, including the Hadoop version, have interpreted data in a relational model, which limits their functionality when dealing with complex and unstructured data such as images. To address this, an associative-memory-based MapReduce is introduced to elevate the MapReduce key-value scheme to a higher level of functionality by replacing the purely quantitative key-value pairs with scalable associative-memory-based data structures that will improve parallel processing of data with complex relations. With an associative key-value model, we can deal with data in any form and in any representation simply by using a pattern-matching model that treats data records as patterns and provides a distributed data access scheme that enables data storage and retrieval by association, thereby circumventing the scaling issue experienced within referential data access mechanisms. The principle of associative-memory-based learning is implemented through connected layers arranged hierarchically, with local feature learning happening at the lowest layer while features are combined to form higher representations at upper layers.
In addition, this thesis investigates the extension of the proposed distributed data management scheme to different data-intensive scenarios by improving upon the existing cloud data management models for fault tolerance and scalability and by reducing MapReduce communication overheads through data locality. In particular, three data-intensive scenarios are considered in detail: dealing with large datasets, handling large training volumes, and handling a neural network with an excessive number of processing neurons. Moreover, the application of our associative-memory-based approach is examined as a case study in a cloud of wireless sensor networks (Cloud-WSNs) to investigate the capabilities of the scheme in performing large-scale pattern recognition operations in resource-constrained WSNs.
Campus location
Australia
Principal supervisor
Asad I. Khan
Additional supervisor 1
Balasubramaniam Srinivasan
Year of Award
2017
Department, School or Centre
Information Technology (Monash University Clayton)
Course
Doctor of Philosophy
Degree Type
DOCTORATE
Faculty
Faculty of Information Technology