Oracle Big Data Handbook
Transform Big Data into Insight
"In this book, some of Oracle's best engineers and architects explain how you can make use of big data. They'll tell you how you can integrate your existing Oracle solutions with big data systems, using each where appropriate and moving data between them as needed." -- Doug Cutting, co-creator of Apache Hadoop
Cowritten by members of Oracle's big data team, Oracle Big Data Handbook provides complete coverage of Oracle's comprehensive, integrated set of products for acquiring, organizing, analyzing, and leveraging unstructured data. The book discusses the strategies and technologies essential for a successful big data implementation, including Apache Hadoop, Oracle Big Data Appliance, Oracle Big Data Connectors, Oracle NoSQL Database, Oracle Endeca, Oracle Advanced Analytics, and Oracle's open source R offerings. Best practices for migrating from legacy systems and integrating existing data warehousing and analytics solutions into an enterprise big data infrastructure are also included in this Oracle Press guide.
• Understand the value of a comprehensive big data strategy
• Maximize the distributed processing power of the Apache Hadoop platform
• Discover the advantages of using Oracle Big Data Appliance as an engineered system for Hadoop and Oracle NoSQL Database
• Configure, deploy, and monitor Hadoop and Oracle NoSQL Database using Oracle Big Data Appliance
• Integrate your existing data warehousing and analytics infrastructure into a big data architecture
• Share data among Hadoop and relational databases using Oracle Big Data Connectors
• Understand how Oracle NoSQL Database integrates into the Oracle Big Data architecture
• Deliver faster time to value using in-database analytics
• Analyze data with Oracle Advanced Analytics (Oracle R Enterprise and Oracle Data Mining), Oracle R Distribution, ROracle, and Oracle R Connector for Hadoop
• Analyze disparate data with Oracle Endeca Information Discovery
• Plan and implement a big data governance strategy and develop an architecture and roadmap
Big Data over Networks
Examines the crucial interaction between big data and communication, social and biological networks using critical mathematical tools and state-of-the-art research.
Big Data for Chimps
Finding patterns in massive event streams can be difficult, but learning how to find them doesn't have to be. This unique hands-on guide shows you how to solve this and many other problems in large-scale data processing with simple, fun, and elegant tools that leverage Apache Hadoop. You'll gain a practical, actionable view of big data by working with real data and real problems. Perfect for beginners, this book's approach will also appeal to experienced practitioners who want to brush up on their skills. Part I explains how Hadoop and MapReduce work, while Part II covers many analytic patterns you can use to process any data. As you work through several exercises, you'll also learn how to use Apache Pig to process data.
• Learn the necessary mechanics of working with Hadoop, including how data and computation move around the cluster
• Dive into map/reduce mechanics and build your first map/reduce job in Python
• Understand how to run chains of map/reduce jobs in the form of Pig scripts
• Use a real-world dataset (baseball performance statistics) throughout the book
• Work with examples of several analytic patterns, and learn when and where you might use them
Big Data Management
This book focuses on the analytic principles of business practice and big data. Specifically, it provides an interface between the main disciplines of engineering/technology and the organizational and administrative aspects of management, serving as a complement to books in other disciplines such as economics, finance, marketing and risk analysis. The contributors present their areas of expertise, together with essential case studies that illustrate the successful application of engineering management theories in real-life examples.
Big Data Computing
This book unravels the mystery of Big Data computing and its power to transform business operations. The approach it uses will be helpful to any professional who must present a case for realizing Big Data computing solutions or to those who could be involved in a Big Data computing project. It provides a framework that enables business and technical managers to make optimal decisions necessary for the successful migration to Big Data computing environments and applications within their organizations.
Networking for Big Data
Networking for Big Data supplies an unprecedented look at cutting-edge research on the networking and communication aspects of Big Data. Starting with a comprehensive introduction to Big Data and its networking issues, it offers deep technical coverage of both theory and applications. The book is divided into four sections: introduction to Big Data, networking theory and design for Big Data, networking security for Big Data, and platforms and systems for Big Data applications. Focusing on key networking issues in Big Data, the book explains network design and implementation for Big Data. It examines how network topology impacts data collection and explores Big Data storage and resource management.
• Addresses the virtual machine placement problem
• Describes widespread network and information security technologies for Big Data
• Explores network configuration and flow scheduling for Big Data applications
• Presents a systematic set of techniques that optimize throughput and improve bandwidth for efficient Big Data transfer on the Internet
• Tackles the trade-off problem between energy efficiency and service resiliency
The book covers distributed Big Data storage and retrieval as well as security, trust, and privacy protection for Big Data collection, storage, and search. It discusses the use of cloud infrastructures and highlights their benefits in overcoming the identified issues and providing new approaches for managing huge volumes of heterogeneous data. The text concludes by proposing an innovative user data profile-aware policy-based network management framework that can help you exploit and differentiate user data profiles to achieve better power efficiency and optimized resource management.
Big Data Analytics
With this book, managers and decision makers are given the tools to make more informed decisions about big data purchasing initiatives. Big Data Analytics: A Practical Guide for Managers not only supplies descriptions of common tools, but also surveys the various products and vendors that supply the big data market. Comparing and contrasting the different types of analysis commonly conducted with big data, this accessible reference presents clear-cut explanations of the general workings of big data tools. Instead of spending time on HOW to install specific packages, it focuses on the reasons WHY readers would install a given package. The book provides authoritative guidance on a range of tools, including open source and proprietary systems. It details the strengths and weaknesses of incorporating big data analysis into decision-making and explains how to leverage the strengths while mitigating the weaknesses.
• Describes the benefits of distributed computing in simple terms
• Includes substantial vendor/tool material, especially for open source decisions
• Covers prominent software packages, including Hadoop and Oracle Endeca
• Examines GIS and machine learning applications
• Considers privacy and surveillance issues
The book further explores basic statistical concepts that, when misapplied, can be the source of errors. Time and again, big data is treated as an oracle that discovers results nobody would have imagined. While big data can serve this valuable function, all too often these results are incorrect, yet are still reported unquestioningly. The probability of having erroneous results increases as a larger number of variables are compared unless preventative measures are taken. The approach taken by the authors is to explain these concepts so managers can ask better questions of their analysts and vendors as to the appropriateness of the methods used to arrive at a conclusion.
Because the world of science and medicine has been grappling with similar issues in the publication of studies, the authors draw on their efforts and apply them to big data.
Advancing Big Data Benchmarks
This book constitutes the thoroughly refereed joint proceedings of the Third and Fourth Workshops on Big Data Benchmarking. The Third WBDB was held in Xi'an, China, in July 2013 and the Fourth WBDB was held in San José, CA, USA, in October 2013. The 15 papers presented in this book were carefully reviewed and selected from 33 presentations. They focus on big data benchmarks; applications and scenarios; tools, systems and surveys.
Handbook of Big Data
Handbook of Big Data provides a state-of-the-art overview of the analysis of large-scale datasets. Featuring contributions from well-known experts in statistics and computer science, this handbook presents a carefully curated collection of techniques from both industry and academia. Thus, the text instills a working understanding of key statistical and computing ideas that can be readily applied in research and practice. Offering balanced coverage of methodology, theory, and applications, this handbook:
• Describes modern, scalable approaches for analyzing increasingly large datasets
• Defines the underlying concepts of the available analytical tools and techniques
• Details intercommunity advances in computational statistics and machine learning
Handbook of Big Data also identifies areas in need of further development, encouraging greater communication and collaboration between researchers in big data sub-specialties such as genomics, computational biology, and finance.
Principles of Big Data
Principles of Big Data helps readers avoid the common mistakes that endanger all Big Data projects. By stressing simple, fundamental concepts, this book teaches readers how to organize large volumes of complex data, and how to achieve data permanence when the content of the data is constantly changing. General methods for data verification and validation, as specifically applied to Big Data resources, are stressed throughout the book. The book demonstrates how adept analysts can find relationships among data objects held in disparate Big Data resources, when the data objects are endowed with semantic support (i.e., organized in classes of uniquely identified data objects). Readers will learn how their data can be integrated with data from other resources, and how the data extracted from Big Data resources can be used for purposes beyond those imagined by the data creators.
• Learn general methods for specifying Big Data in a way that is understandable to humans and to computers
• Avoid the pitfalls in Big Data design and analysis
• Understand how to create and use Big Data safely and responsibly with a set of laws, regulations and ethical standards that apply to the acquisition, distribution and integration of Big Data resources
