This book provides readers the “big picture” and a comprehensive survey of the domain of big data processing systems. For the past decade, the Hadoop framework has dominated the world of big data processing, yet recently academia and industry have started to recognize its limitations in several application domains and thus, it is now gradually being replaced by a collection of engines that are dedicated to specific verticals (e.g. structured data, graph data, and streaming data). The book explores this new wave of systems, which it refers to as Big Data 2.0 processing systems.
Big Data 2.0 Processing Systems A Systems Overview (Second Edition)
Encyclopedia of Big Data Technologies
Amazon Link: https://www.amazon.com/dp/331977526X
The Encyclopedia of Big Data Technologies provides researchers, educators, students and industry professionals with a comprehensive authority over the most relevant Big Data Technology concepts. With over 300 articles written by worldwide subject matter experts from both industry and academia, the encyclopedia covers topics such as big data storage systems, NoSQL database, cloud computing, distributed systems, data processing, data management, machine learning and social technologies, data science. Each peer-reviewed, highly structured entry provides the reader with basic terminology, subject overviews, key research results, application examples, future directions, cross references and a bibliography. The entries are expository and tutorial, making this reference a practical resource for students, academics, or professionals. In addition, the distinguished, international editorial board of the encyclopedia consists of well-respected scholars, each developing topics based upon their expertise.
Linked Data: Storing, Querying, and Reasoning
Amazon Link: https://www.amazon.com/dp/3319735144
This book describes efficient and effective techniques for harnessing the power of Linked Data by tackling the various aspects of managing its growing volume: storing, querying, reasoning, provenance management and benchmarking. Providing an updated overview of methods, technologies and systems related to Linked Data this book is mainly intended for students and researchers who are interested in the Linked Data domain. It enables students to gain an understanding of the foundations and underpinning technologies and standards for Linked Data, while researchers benefit from the in-depth coverage of the emerging and ongoing advances in Linked Data storing, querying, reasoning, and provenance management systems. Further, it serves as a starting point to tackle the next research challenges in the domain of Linked Data management.
Handbook of Big Data Technologies
Amazon Link: https://www.amazon.com/dp/3319493396
This handbook offers comprehensive coverage of recent advancements in Big Data technologies and related paradigms. Chapters are authored by international leading experts in the field, and have been reviewed and revised for maximum reader value. The volume consists of twenty-five chapters organized into four main parts. Part one covers the fundamental concepts of Big Data technologies including data curation mechanisms, data models, storage models, programming models and programming platforms. It also dives into the details of implementing Big SQL query engines and big stream processing systems. Part Two focuses on the semantic aspects of Big Data management including data integration and exploratory ad hoc analysis in addition to structured querying and pattern matching techniques. Part Three presents a comprehensive overview of large scale graph processing. It covers the most recent research in large scale graph processing platforms, introducing several scalable graph querying and mining mechanisms in domains such as social networks. Part Four details novel applications that have been made possible by the rapid emergence of Big Data technologies such as Internet-of-Things (IOT), Cognitive Computing and SCADA Systems. All parts of the book discuss open research problems, including potential opportunities, that have arisen from the rapid progress of Big Data technologies and the associated increasing requirements of application domains.
Large-Scale Graph Processing Using Apache Giraph
Amazon Link: https://www.amazon.com/dp/3319474308
This book takes its reader on a journey through Apache Giraph, a popular distributed graph processing platform designed to bring the power of big data processing to graph data. Designed as a step-by-step self-study guide for everyone interested in large-scale graph processing, it describes the fundamental abstractions of the system, its programming models and various techniques for using the system to process graph data at scale, including the implementation of several popular and advanced graph analytics algorithms. This book serves as an essential reference guide for students, researchers and practitioners in the domain of large scale graph processing. It offers step-by-step guidance, with several code examples and the complete source code available in the related github repository. Students will find a comprehensive introduction to and hands-on practice with tackling large scale graph processing problems using the Apache Giraph system, while researchers will discover thorough coverage of the emerging and ongoing advancements in big graph processing systems.
Big Data 2.0 Processing Systems: A Survey
Amazon Link: http://www.amazon.com/dp/3319387758/qid=1461320575
This book provides readers the “big picture” and a comprehensive survey of the domain of big data processing systems. For the past decade, the Hadoop framework has dominated the world of big data processing, yet recently academia and industry have started to recognize its limitations in several application domains and big data processing scenarios such as the large-scale processing of structured data, graph data and streaming data. Thus, it is now gradually being replaced by a collection of engines that are dedicated to specific verticals (e.g. structured data, graph data, and streaming data). The book explores this new wave of systems, which it refers to as Big Data 2.0 processing systems. Overall, the book offers a valuable reference guide for students, researchers and professionals in the domain of big data processing systems. Further, its comprehensive content will hopefully encourage readers to pursue further research on the subject.
Process Analytics: Concepts and Techniques for Querying and Analyzing Process Data
Amazon Link: http://www.amazon.com/dp/3319250361/
The book chiefly focuses on concepts, techniques and methods. It covers a large body of knowledge on process analytics – including process data querying, analysis, matching and correlating process data and models – to help practitioners and researchers understand the underlying concepts, problems, methods, tools and techniques involved in modern process analytics. Following an introduction to basic business process and process analytics concepts, it describes the state of the art in this area before examining different analytics techniques in detail. In this regard, the book covers analytics over different levels of process abstractions, from process execution data and methods for linking and correlating process execution data, to inferring process models, querying process execution data and process models, and scalable process data analytics methods. In addition, it provides a review of commercial process analytics tools and their practical applications. The book is intended for a broad readership interested in business process management and process analytics. It provides researchers with an introduction to these fields by comprehensively classifying the current state of research, by describing in-depth techniques and methods, and by highlighting future research directions. Lecturers will find a wealth of material to choose from for a variety of courses, ranging from undergraduate courses in business process management to graduate courses in business process analytics. Lastly, it offers professionals a reference guide to the state of the art in commercial tools and techniques, complemented by many real-world use case scenarios.
Large Scale and Big Data: Processing and Management
Publisher: CRC Press
Amazon Link: http://www.amazon.com/dp/1466581506/
This book provides readers with a central source of reference on the data management techniques currently available for large-scale data processing. Presenting chapters written by leading researchers, academics, and practitioners, it addresses the fundamental challenges associated with Big Data processing tools and techniques across a range of computing environments. After reading this book, you will gain a fundamental understanding of how to use Big Data-processing tools and techniques effectively across application domains. Coverage includes cloud data management architectures, big data analytics visualization, data management, analytics for vast amounts of unstructured data, clustering, classification, link analysis of big data, scalable data mining, and machine learning techniques.
Cloud Data Management
Amazon Link: http://www.amazon.com/dp/3319047647
In practice, the design and architecture of a cloud varies among cloud providers. We present a generic evaluation framework for the performance, availability and reliability characteristics of various cloud platforms. We describe a generic benchmark architecture for cloud databases, specifically NoSQL database as a service. It measures the performance of replication delay and monetary cost. Service Level Agreements (SLA) represent the contract which captures the agreed upon guarantees between a service provider and its customers. The specifications of existing service level agreements (SLA) for cloud services are not designed to flexibly handle even relatively straightforward performance and technical requirements of consumer applications. We present a novel approach for SLA-based management of cloud-hosted databases from the consumer perspective and an end-to-end framework for consumer-centric SLA management of cloud-hosted databases. The framework facilitates adaptive and dynamic provisioning of the database tier of the software applications based on application-defined policies for satisfying their own SLA performance requirements, avoiding the cost of any SLA violation and controlling the monetary cost of the allocated computing resources. In this framework, the SLA of the consumer applications are declaratively defined in terms of goals which are subjected to a number of constraints that are specific to the application requirements. The framework continuously monitors the application-defined SLA and automatically triggers the execution of necessary corrective actions (scaling out/in the database tier) when required. The framework is database platform-agnostic, uses virtualization-based database replication mechanisms and requires zero source code changes of the cloud-hosted software applications.
Graph Data Management: Techniques and Applications
Publisher: IGI Global
Amazon Link: https://www.amazon.com/dp/161350053X
Graphs are a powerful tool for representing and understanding objects and their relationships in various application domains. The growing popularity of graph databases has generated data management problems that include finding efficient techniques for compressing large graph databases and suitable techniques for visualizing, browsing, and navigating large graph databases. Graph Data Management: Techniques and Applications is a central reference source for different data management techniques for graph data structures and their application. This book discusses graphs for modeling complex structured and schemaless data from the Semantic Web, social networks, protein networks, chemical compounds, and multimedia databases and offers essential research for academics working in the interdisciplinary domains of databases, data mining, and multimedia technology.