A revelatory exploration of the hottest trend in technology and the dramatic impact it will have on the economy, science, and society at large.
Which paint color is most likely to tell you that a used car is in good shape? How can officials identify the most dangerous New York City manholes before they explode? And how did Google searches predict the spread of the H1N1 flu outbreak?
The key to answering these questions, and many more, is big data. “Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena, from the price of airline tickets to the text of millions of books, into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.
In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.
Published by: Eamon Dolan/Houghton Mifflin Harcourt | Publication date: 03/05/2013 | Kindle book details: Kindle Edition, 257 pages
Foreword by Steven Pinker
Blending the informed analysis of The Signal and the Noise with the instructive iconoclasm of Think Like a Freak, a fascinating, illuminating, and witty look at what the vast amounts of information now instantly available to us reveal about ourselves and our world, provided we ask the right questions.
By the end of an average day in the early twenty-first century, human beings searching the internet will amass eight trillion gigabytes of data. This staggering amount of information, unprecedented in history, can tell us a great deal about who we are: the fears, desires, and behaviors that drive us, and the conscious and unconscious decisions we make. From the profound to the mundane, we can gain astonishing knowledge about the human psyche that, less than twenty years ago, seemed unfathomable. Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender, and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school affect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives and who’s more self-conscious about sex, men or women? Investigating these questions and a host of others, Seth Stephens-Davidowitz offers revelations that can help us understand ourselves and our lives better. Drawing on studies and experiments on how we really live and think, he demonstrates in fascinating and often funny ways the extent to which all the world is indeed a lab.
With conclusions ranging from strange-but-true to thought-provoking to disturbing, he explores the power of this digital truth serum and its deeper potential: revealing biases deeply embedded within us, information we can use to change our culture, and the questions we’re afraid to ask that might be essential to our health, both emotional and physical. All of us are touched by big data every day, and its influence is multiplying. Everybody Lies challenges us to think differently about how we see it and the world.
Published by: Dey Street Books | Publication date: 05/09/2017 | Kindle book details: Kindle Edition, 357 pages
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?
In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications.
- Peer under the hood of the systems you already use, and learn how to use and operate them more effectively
- Make informed decisions by identifying the strengths and weaknesses of different tools
- Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity
- Understand the distributed systems research upon which modern databases are built
- Peek behind the scenes of major online services, and learn from their architectures
Published by: O'Reilly Media | Publication date: 03/16/2017 | Kindle book details: Kindle Edition, 624 pages
Less than 0.5 per cent of all data is currently analysed and used. However, business leaders and managers cannot afford to be unconcerned or sceptical about data. Data is revolutionizing the way we work and it is the companies that view data as a strategic asset that will survive and thrive. Bernard Marr's Data Strategy is a must-have guide to creating a robust data strategy. Explaining how to identify your strategic data needs, what methods to use to collect the data and, most importantly, how to translate your data into organizational insights for improved business decision-making and performance, this is essential reading for anyone aiming to leverage the value of their business data and gain competitive advantage.
Packed with case studies and real-world examples, advice on how to build data competencies in an organization and crucial coverage of how to ensure your data doesn't become a liability, Data Strategy will equip any organization with the tools and strategies it needs to profit from big data, analytics and the Internet of Things.
Published by: Kogan Page | Publication date: 04/03/2017 | Kindle book details: Kindle Edition, 200 pages
This book fills the need for a concise and conversational book on the growing field of Data Science. Easy to read and informative, this lucid book covers everything important, with concrete examples, and invites the reader to join this field. The chapters in the book are organized for a typical one-semester course. The book contains case-lets from real-world stories at the beginning of every chapter. There is also a running case study across the chapters as exercises. This book is designed to provide a student with the intuition behind this evolving area, along with a solid toolset of the major data mining techniques and platforms. Finally, it includes a tutorial for the R platform. The 2018 edition includes a new primer chapter on Artificial Intelligence. The 2017 edition added four new chapters in response to the thoughts and suggestions expressed by many reviewers. The book has proved very popular throughout the world. Many universities in the US and around the world have adopted it as a textbook for their courses. Students across a variety of academic disciplines, including business, computer science, statistics, engineering, and others attracted to the idea of discovering new insights and ideas from data, can use this as a textbook. Professionals in various domains, including executives, managers, analysts, professors, doctors, accountants, and others, can use this book to learn in a few hours how to make sense of and develop actionable insights from the enormous data coming their way.
This is a flowing book that one can finish in one sitting, or one can return to it again and again for insights and techniques.
Table of Contents
- Chapter 1: Wholeness of Data Analytics
- Chapter 2: Business Intelligence Concepts & Applications
- Chapter 3: Data Warehousing
- Chapter 4: Data Mining
- Chapter 5: Data Visualization
- Chapter 6: Decision Trees
- Chapter 7: Regression Models
- Chapter 8: Artificial Neural Networks
- Chapter 9: Cluster Analysis
- Chapter 10: Association Rule Mining
- Chapter 11: Text Mining
- Chapter 12: Naïve Bayes Analysis
- Chapter 13: Support Vector Machines
- Chapter 14: Web Mining
- Chapter 15: Social Network Analysis
- Chapter 16: Big Data
- Chapter 17: Data Modeling Primer
- Chapter 18: Statistics Primer
- Chapter 19: Artificial Intelligence Primer
- Chapter 20: Data Science Careers
- Appendix: Data Mining Tutorial using R
Publication date: 05/01/2014 | Kindle book details: Kindle Edition, 156 pages
This book fills the need for an easy and holistic book on essential Big Data technologies. Written in a lucid and simple language free from jargon and code, this book provides an intuition for Big Data from business as well as technological perspectives. This book is designed to provide the reader with the intuition behind this evolving area, along with a solid toolset of the major big data processing technologies such as Hadoop, MapReduce, Spark Streaming, and NoSQL databases. A complete case study of developing a web log analyzer is included. The book also contains two primers on Cloud computing and Data Mining. It also contains two tutorials on installing Hadoop and Spark. The book contains caselets from real-world stories.
Students across a variety of academic disciplines, including business, computer science, statistics, engineering, and others attracted to the idea of harnessing Big Data for new insights and ideas, can use this as a textbook. Professionals in various domains, including executives, managers, analysts, professors, doctors, accountants, and others, can use this book to learn in a few hours how to make the most of Big Data to monitor their infrastructure, discover new insights, and develop new data-based products. It is a flowing book that one can finish in one sitting, or one can return to it again and again for insights and techniques.
Table of Contents
1. Wholeness of Big Data
2. Big Data Applications
3. Big Data Architectures
4. Distributed Systems with Hadoop
5. Parallel Programming with MapReduce
6. Advanced NoSQL databases
7. Stream programming with Spark
8. Data Ingest with Kafka
9. Cloud Computing Primer
10. Web Log Analyzer development
11. Data Mining Primer
12. Appendix 1 on Installing Hadoop on AWS cloud
13. Appendix 2 on Installing Spark
Publication date: 06/28/2016 | Kindle book details: Kindle Edition, 301 pages
Find the right big data solution for your business or organization
Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it matters, and how to choose and implement solutions that work.
- Effectively managing big data is an issue of growing importance to businesses, not-for-profit organizations, government, and IT professionals
- Authors are experts in information management, big data, and a variety of solutions
- Explains big data in detail and discusses how to select and implement a solution, security concerns to consider, data storage and presentation issues, analytics, and much more
- Provides essential information in a no-nonsense, easy-to-understand style that is empowering
Published by: For Dummies | Publication date: 04/02/2013 | Kindle book details: Kindle Edition, 336 pages
Big Data: Using SMART Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance
Convert the promise of big data into real-world results
There is so much buzz around big data. We all need to know what it is and how it works - that much is obvious. But is a basic understanding of the theory enough to hold your own in strategy meetings? Probably. But what will set you apart from the rest is actually knowing how to USE big data to get solid, real-world business results - and putting that in place to improve performance. Big Data will give you a clear understanding, blueprint, and step-by-step approach to building your own big data strategy. This is a much-needed practical introduction to actually putting the topic into practice. Illustrated with numerous real-world examples from a cross section of companies and organisations, Big Data will take you through the five steps of the SMART model: Start with Strategy, Measure Metrics and Data, Apply Analytics, Report Results, Transform.
- Discusses how companies need to clearly define what it is they need to know
- Outlines how companies can collect relevant data and measure the metrics that will help them answer their most important business questions
- Addresses how the results of big data analytics can be visualised and communicated to ensure key decision-makers understand them
- Includes many high-profile case studies from the author's work with some of the world's best-known brands
Published by: Wiley | Publication date: 01/09/2015 | Kindle book details: Kindle Edition, 233 pages
New York Times Bestseller
After twenty consecutive losing seasons for the Pittsburgh Pirates, team morale was low, the club's payroll ranked near the bottom of the sport, game attendance was down, and the city was becoming increasingly disenchanted with its team. Pittsburghers joked their town was the city of champions…and the Pirates. Big Data Baseball is the story of how the 2013 Pirates, mired in the longest losing streak in North American pro sports history, adopted drastic big-data strategies to end the drought, make the playoffs, and turn around the franchise's fortunes. Award-winning journalist Travis Sawchik takes you behind the scenes to expertly weave together the stories of the key figures who changed the way the small-market Pirates played the game. For manager Clint Hurdle and the front office staff to save their jobs, they could not rely on a free agent spending spree; instead, they had to improve the sum of their parts and find hidden value. They had to change. From Hurdle shedding his old-school ways to work closely with Neal Huntington, the forward-thinking data-driven GM and his team of talented analysts; to pitchers like A. J. Burnett and Gerrit Cole changing what and where they threw; to Russell Martin, the undervalued catcher whose expert use of the nearly invisible skill of pitch framing helped the team's pitchers turn more balls into strikes; to Clint Barmes, a solid shortstop and one of the early adopters of the unconventional on-field shift which forced the entire infield to realign into positions they never stood in before. Under Hurdle's leadership, a culture of collaboration and creativity flourished as he successfully blended whiz kid analysts with graybeard coaches, a kind of symbiotic teamwork which was unique to the sport.
Big Data Baseball is Moneyball on steroids. It is an entertaining and enlightening underdog story that uses the 2013 Pirates season as the perfect lens to examine the sport's burgeoning big-data movement.
With the help of data-tracking systems like PitchF/X and TrackMan, the Pirates collected millions of data points on every pitch and ball in play to create a tome of color-coded reports that revealed groundbreaking insights for how to win more games without spending a dime. In the process, they discovered that most batters struggled to hit two-seam fastballs, that an aggressive defensive shift on the field could turn more batted balls into outs, and that a catcher's most valuable skill was hidden. All these data points, which aren't immediately visible to players and spectators, are the bit of magic that led the Pirates to spin straw into gold, finish the 2013 season in second place, and end a twenty-year losing streak.
Published by: Flatiron Books | Publication date: 05/19/2015 | Kindle book details: Kindle Edition, 255 pages
Since long before computers were even thought of, data has been collected and organized by diverse cultures across the world. Once access to the Internet became a reality for large swathes of the world's population, the amount of data generated each day became huge, and continues to grow exponentially. It includes all our uploaded documents, video, and photos, all our social media traffic, our online shopping, even the GPS data from our cars.
'Big Data' represents a qualitative change, not simply a quantitative one. The term refers both to the new technologies involved, and to the way it can be used by business and government. Dawn E. Holmes uses a variety of case studies to explain how data is stored, analysed, and exploited by a variety of bodies from big companies to organizations concerned with disease control. Big data is transforming the way businesses operate, and the way medical research can be carried out. At the same time, it raises important ethical issues; Holmes discusses cases such as the Snowden affair, data security, and domestic smart devices which can be hijacked by hackers.
ABOUT THE SERIES: The Very Short Introductions series from Oxford University Press contains hundreds of titles in almost every subject area. These pocket-sized books are the perfect way to get ahead in a new subject quickly. Our expert authors combine facts, analysis, perspective, new ideas, and enthusiasm to make interesting and challenging topics highly readable.
Published by: OUP Oxford | Publication date: 11/16/2017 | Kindle book details: Kindle Edition, 151 pages