A revelatory exploration of the hottest trend in technology and the dramatic impact it will have on the economy, science, and society at large.
Which paint color is most likely to tell you that a used car is in good shape? How can officials identify the most dangerous New York City manholes before they explode? And how did Google searches predict the spread of the H1N1 flu outbreak?
The key to answering these questions, and many more, is big data. “Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena, from the price of airline tickets to the text of millions of books, into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.
In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.
Published by: Eamon Dolan/Houghton Mifflin Harcourt | Publication date: 03/05/2013 | Kindle book details: Kindle Edition, 257 pages
Foreword by Steven Pinker
Blending the informed analysis of The Signal and the Noise with the instructive iconoclasm of Think Like a Freak, a fascinating, illuminating, and witty look at what the vast amounts of information now instantly available to us reveal about ourselves and our world, provided we ask the right questions.
By the end of an average day in the early twenty-first century, human beings searching the internet will amass eight trillion gigabytes of data. This staggering amount of information, unprecedented in history, can tell us a great deal about who we are: the fears, desires, and behaviors that drive us, and the conscious and unconscious decisions we make. From the profound to the mundane, we can gain astonishing knowledge about the human psyche that, less than twenty years ago, seemed unfathomable. Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender, and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school affect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives, and who’s more self-conscious about sex, men or women? Investigating these questions and a host of others, Seth Stephens-Davidowitz offers revelations that can help us understand ourselves and our lives better. Drawing on studies and experiments on how we really live and think, he demonstrates in fascinating and often funny ways the extent to which all the world is indeed a lab.
With conclusions ranging from strange-but-true to thought-provoking to disturbing, he explores the power of this digital truth serum and its deeper potential: revealing biases deeply embedded within us, information we can use to change our culture, and the questions we’re afraid to ask that might be essential to our health, both emotional and physical. All of us are touched by big data every day, and its influence is multiplying. Everybody Lies challenges us to think differently about how we see it and the world.
Published by: Dey Street Books | Publication date: 05/09/2017 | Kindle book details: Kindle Edition, 357 pages
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?
In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications.
- Peer under the hood of the systems you already use, and learn how to use and operate them more effectively
- Make informed decisions by identifying the strengths and weaknesses of different tools
- Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity
- Understand the distributed systems research upon which modern databases are built
- Peek behind the scenes of major online services, and learn from their architectures
Published by: O'Reilly Media | Publication date: 03/16/2017 | Kindle book details: Kindle Edition, 624 pages
New York Times Bestseller
After twenty consecutive losing seasons for the Pittsburgh Pirates, team morale was low, the club's payroll ranked near the bottom of the sport, game attendance was down, and the city was becoming increasingly disenchanted with its team. Pittsburghers joked their town was the city of champions…and the Pirates. Big Data Baseball is the story of how the 2013 Pirates, mired in the longest losing streak in North American pro sports history, adopted drastic big-data strategies to end the drought, make the playoffs, and turn around the franchise's fortunes. Award-winning journalist Travis Sawchik takes you behind the scenes to expertly weave together the stories of the key figures who changed the way the small-market Pirates played the game. For manager Clint Hurdle and the front office staff to save their jobs, they could not rely on a free agent spending spree; instead, they had to improve the sum of their parts and find hidden value. They had to change: from Hurdle shedding his old-school ways to work closely with Neal Huntington, the forward-thinking, data-driven GM, and his team of talented analysts; to pitchers like A. J. Burnett and Gerrit Cole changing what and where they threw; to Russell Martin, the undervalued catcher whose expert use of the nearly invisible skill of pitch framing helped the team's pitchers turn more balls into strikes; to Clint Barmes, a solid shortstop and one of the early adopters of the unconventional on-field shift, which forced the entire infield to realign into positions they had never stood in before. Under Hurdle's leadership, a culture of collaboration and creativity flourished as he successfully blended whiz kid analysts with graybeard coaches, a kind of symbiotic teamwork that was unique to the sport.
Big Data Baseball is Moneyball on steroids. It is an entertaining and enlightening underdog story that uses the 2013 Pirates season as the perfect lens to examine the sport's burgeoning big-data movement.
With the help of data-tracking systems like PitchF/X and TrackMan, the Pirates collected millions of data points on every pitch and ball in play to create a tome of color-coded reports that revealed groundbreaking insights for how to win more games without spending a dime. In the process, they discovered that most batters struggled to hit two-seam fastballs, that an aggressive defensive shift on the field could turn more batted balls into outs, and that a catcher's most valuable skill was hidden. All these data points, which aren't immediately visible to players and spectators, are the bit of magic that led the Pirates to spin straw into gold, finish the 2013 season in second place, and end a twenty-year losing streak.
Published by: Flatiron Books | Publication date: 05/19/2015 | Kindle book details: Kindle Edition, 255 pages
Find the right big data solution for your business or organization
Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it matters, and how to choose and implement solutions that work.
- Effectively managing big data is an issue of growing importance to businesses, not-for-profit organizations, government, and IT professionals
- Authors are experts in information management, big data, and a variety of solutions
- Explains big data in detail and discusses how to select and implement a solution, security concerns to consider, data storage and presentation issues, analytics, and much more
- Provides essential information in a no-nonsense, easy-to-understand style that is empowering
Published by: For Dummies | Publication date: 04/02/2013 | Kindle book details: Kindle Edition, 336 pages
This book fills the need for an easy and holistic book on essential Big Data technologies. Written in lucid and simple language free from jargon and code, this book provides an intuition for Big Data from business as well as technological perspectives. This book is designed to provide the reader with the intuition behind this evolving area, along with a solid toolset of the major big data processing technologies such as Hadoop, MapReduce, Spark Streaming, and NoSQL databases. A complete case study of developing a web log analyzer is included. The book also contains two primers on Cloud Computing and Data Mining. It also contains two tutorials on installing Hadoop and Spark. The book contains case-lets from real-world stories.
The 2019 edition includes four new chapters. These are full primers on Data Modeling, Data Analytics, Artificial Intelligence, and Data Science Careers. Students across a variety of academic disciplines, including business, computer science, statistics, engineering, and others attracted to the idea of harnessing Big Data for new insights and ideas from data, can use this as a textbook. Professionals in various domains, including executives, managers, analysts, professors, doctors, accountants, and others, can use this book to learn in a few hours how to make the most of Big Data to monitor their infrastructure, discover new insights, and develop new data-based products.
It is a flowing book that one can finish in one sitting, or one can return to it again and again for insights and techniques.
Table of Contents
- Chapter 1: Wholeness of Big Data
- Chapter 2: Big Data Applications
- Chapter 3: Big Data Architectures
- Chapter 4: Distributed Systems with Hadoop
- Chapter 5: Parallel Programming with MapReduce
- Chapter 6: Advanced NoSQL databases
- Chapter 7: Stream programming with Spark
- Chapter 8: Data Ingest with Kafka
- Chapter 9: Cloud Computing Primer
- Chapter 10: Web Log Analyzer development
- Chapter 11: Data Modeling Primer
- Chapter 12: Data Analytics Primer
- Chapter 13: Artificial Intelligence Primer
- Chapter 14: Data Science Careers
- Appendix 1: Installing Hadoop on Linux
- Appendix 2: Installing Hadoop on AWS cloud
- Appendix 3: Installing and Running Spark
Publication date: 06/28/2016 | Kindle book details: Kindle Edition, 315 pages
Less than 0.5 per cent of all data is currently analyzed and used. However, business leaders and managers cannot afford to be unconcerned or sceptical about data. Data is revolutionizing the way we work and it is the companies that view data as a strategic asset that will survive and thrive. Bernard Marr's Data Strategy is a must-have guide to creating a robust data strategy. Explaining how to identify your strategic data needs, what methods to use to collect the data and, most importantly, how to translate your data into organizational insights for improved business decision-making and performance, this is essential reading for anyone aiming to leverage the value of their business data and gain competitive advantage.
Packed with case studies and real-world examples, advice on how to build data competencies in an organization and crucial coverage of how to ensure your data doesn't become a liability, Data Strategy will equip any organization with the tools and strategies it needs to profit from big data, analytics and the Internet of Things.
Published by: Kogan Page | Publication date: 04/03/2017 | Kindle book details: Kindle Edition, 200 pages
This book fills the need for a concise and conversational book on the hot and growing field of Data Science. Easy to read and informative, this lucid book covers everything important, with concrete examples, and invites the reader to join this field. The chapters in the book are organized for a typical one-semester course. The book contains case-lets from real-world stories at the beginning of every chapter. There is also a running case study across the chapters as exercises. This book is designed to provide a student with the intuition behind this evolving area, along with a solid toolset of the major data mining techniques and platforms. Finally, it includes a tutorial for R. The 2019 edition contains expanded primers on Big Data, Artificial Intelligence, and Data Science careers. For the first time, it now includes a full tutorial on Python. The book has proved very popular throughout the world. Dozens of universities around the world have adopted it as a textbook for their courses. Students across a variety of academic disciplines, including business, computer science, statistics, engineering, and others attracted to the idea of discovering new insights and ideas from data can use this as a textbook. Professionals in various domains, including executives, managers, analysts, professors, doctors, accountants, and others can use this book to learn in a few hours how to make sense of and develop actionable insights from the enormous data coming their way. 
This is a flowing book that one can finish in one sitting, or one can return to it again and again for insights and techniques.
Table of Contents
- Chapter 1: Wholeness of Data Analytics
- Chapter 2: Business Intelligence Concepts & Applications
- Chapter 3: Data Warehousing
- Chapter 4: Data Mining
- Chapter 5: Data Visualization
- Chapter 6: Decision Trees
- Chapter 7: Regression Models
- Chapter 8: Artificial Neural Networks
- Chapter 9: Cluster Analysis
- Chapter 10: Association Rule Mining
- Chapter 11: Text Mining
- Chapter 12: Naïve Bayes Analysis
- Chapter 13: Support Vector Machines
- Chapter 14: Web Mining
- Chapter 15: Social Network Analysis
- Chapter 16: Big Data
- Chapter 17: Data Modeling Primer
- Chapter 18: Statistics Primer
- Chapter 19: Artificial Intelligence Primer
- Chapter 20: Data Science Careers
- Appendix R: Data Mining Tutorial using R
- Appendix P: Data Mining Tutorial using Python
Publication date: 05/01/2014 | Kindle book details: Kindle Edition, 317 pages
Longlisted for the National Book Award
New York Times Bestseller
A former Wall Street quant sounds an alarm on the mathematical models that pervade modern life and threaten to rip apart our social fabric
We live in the age of the algorithm. Increasingly, the decisions that affect our lives (where we go to school, whether we get a car loan, how much we pay for health insurance) are being made not by humans, but by mathematical models. In theory, this should lead to greater fairness: Everyone is judged according to the same rules, and bias is eliminated.
But as Cathy O’Neil reveals in this urgent and necessary book, the opposite is true. The models being used today are opaque, unregulated, and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination: If a poor student can’t get a loan because a lending model deems him too risky (by virtue of his zip code), he’s then cut off from the kind of education that could pull him out of poverty, and a vicious spiral ensues. Models are propping up the lucky and punishing the downtrodden, creating a “toxic cocktail for democracy.” Welcome to the dark side of Big Data.
Tracing the arc of a person’s life, O’Neil exposes the black box models that shape our future, both as individuals and as a society. These “weapons of math destruction” score teachers and students, sort résumés, grant (or deny) loans, evaluate workers, target voters, set parole, and monitor our health.
O’Neil calls on modelers to take more responsibility for their algorithms and on policy makers to regulate their use. But in the end, it’s up to us to become more savvy about the models that govern our lives.
This important book empowers us to ask the tough questions, uncover the truth, and demand change.
- Longlist for National Book Award (Non-Fiction)
- Goodreads, semi-finalist for the 2016 Goodreads Choice Awards (Science and Technology)
- Kirkus, Best Books of 2016
- New York Times, 100 Notable Books of 2016 (Non-Fiction)
- The Guardian, Best Books of 2016
- WBUR's "On Point," Best Books of 2016: Staff Picks
- Boston Globe, Best Books of 2016, Non-Fiction
Published by: Broadway Books | Publication date: 09/06/2016 | Kindle book details: Kindle Edition, 254 pages
Big Data in Practice: How 45 Successful Companies Used Big Data Analytics to Deliver Extraordinary Results
The best-selling author of Big Data is back, this time with a unique and in-depth insight into how specific companies use big data. Big data is on the tip of everyone's tongue. Everyone understands its power and importance, but many fail to grasp the actionable steps and resources required to utilise it effectively. This book fills the knowledge gap by showing how major companies are using big data every day, from an up-close, on-the-ground perspective. From technology, media and retail, to sports teams, government agencies and financial institutions, learn the actual strategies and processes being used to understand customers, improve manufacturing, spur innovation, improve safety and so much more. Organised for easy dip-in navigation, each chapter follows the same structure to give you the information you need quickly. For each company profiled, learn what data was used, what problem it solved and the processes put in place to make it practical, as well as the technical details, challenges and lessons learned from each unique scenario.
- Learn how predictive analytics helps Amazon, Target, John Deere and Apple understand their customers
- Discover how big data is behind the success of Walmart, LinkedIn, Microsoft and more
- Learn how big data is changing medicine, law enforcement, hospitality, fashion, science and banking
- Develop your own big data strategy by accessing additional reading materials at the end of each chapter
Published by: Wiley | Publication date: 03/22/2016 | Kindle book details: Kindle Edition, 277 pages