The Informed Company. Dave Fowler
us. People who have likely acquired a variety of data skills over the years, whether that's Excel VLOOKUPs, Google Analytics dashboards, SQL, or any of a thousand other options. People who have always felt like it shouldn’t be so hard to do basic stuff when it comes to data (but of course, for some reason, it always has been).
What I've come to realize in the years since my initial experience with Redshift is that the modern data stack has exactly two very important impacts for us:
1. With this far better tooling, we can grow our impact on the organizations we work for dramatically. I'm going to say something really stupid and obvious, but: if your queries return 100x faster, you can know 100x more stuff. And that means that you'll just be tremendously more valuable in the insights you're able to provide and the decisions you can make.
2. As a result of #1, our career options are just far, far greater. When I started my career, “data analyst” was a junior position that you attempted to graduate out of as quickly as possible to move on to other things. Now data analysts are high‐leverage, strategic employees with earning potential that mirrors that of software engineers.
There really has been a change in what a single analyst is able to accomplish with some fairly simple tooling and some very accessible knowledge and skills. That single individual can now construct an entire sophisticated data infrastructure by plugging together some inexpensive off‐the‐shelf tools, and then get to work getting insights. Or, that individual can operate as an integral part of a large and sophisticated data team, forming a part of the nervous system for the modern enterprise. The same technology, the same skill sets, and the same knowledge are required regardless of how big or small the organization is.
This is the modern data stack. To harness its power, you need to know SQL (learnable in a day!) and some basic best practices that folks like the authors of this book have been deep in the weeds developing and recording over the past half‐decade. It's surprisingly achievable: one of the magical things about these tools is just how little advanced technical knowledge is required.
This book provides the foundational knowledge you'll need to navigate the modern data stack. From there, you'll be able to dive in yourself, get your hands dirty, and start asking questions of your fellow practitioners. And when you do, make sure to head over to join me in the dbt Community—my Slack handle there is @tristan and I'd love to hear from you!
Introduction
Knowledge is power. Knowing more about our surroundings, like which fruit is safe to eat and which will harm us, keeps us safe and successful. Knowing more about the relationships between events, like the inverse relationship between washing hands and influenza outbreaks, minimizes inevitable difficulties. And knowing which strategies lead to success, like which mass‐reforestation strategies lead to sustainable ecosystems, helps us optimize our efforts. In the modern world, information about our environment often comes in the form of data.
Whether by an individual or a team, data used intentionally and critically is a key part of success. While it's true that anyone can get lucky and stumble upon the right answer, having an idea of which tactics confer the most stability or which metrics indicate big opportunities for growth can drive teams forward more quickly.
Our thesis is this: being informed implies being successful. But what does it mean for a person or an organization to be informed exactly? If you are sufficiently informed, you can make decisions with a sense of conviction that your thinking is correct a majority of the time. In the end, your gut feeling (intuition) factors into decision making. However, with well‐organized and accessible information added into the mix, your “gut” can make more thoughtful (and better) decisions.
On the other hand, bad decisions are the result of not knowing enough: not knowing the root causes of numeric anomalies, not knowing where there is potential to invest more capital, and not knowing what the next few months could look like based on previous quarterly performance. Intuition is a powerful and necessary tool, but we believe that the individual and the company owe it to themselves to hone their intuitions. We believe that anyone with access to performance metrics should integrate that information into their decision making. Being informed is a key piece of the puzzle in avoiding bad decisions. But what does it look like to be informed in the twenty‐first century?
Companies and their operations continue to become more digitized. Companies no longer have direct access to each customer and must increasingly rely on data to improve and compete. So, organizations end up tracking a lot of data taken from many input streams. But why are so many teams struggling with organizing and leveraging insights from their data? We believe that the problem begins with tooling. We believe that the modern data workflow requires a data stack and operational structure that is ready for today's challenges.
In this book, we will outline what the modern data stack looks like, from a brand‐new startup all the way to a data‐driven enterprise. We will cover architectures, tools, team organizations, common pitfalls, and best practices. Armed with these, we are confident that your team can identify the evolving needs of your business and implement solutions in your data infrastructure.
Merging Business Context with Data Information
There is a chasm between the people who know the technical aspects of how to work with data and the people who know the context behind that data. The technical people can write structured query language (SQL), whereas the business people know which marketing campaigns they've run and what types of users comprise an analysis. This creates friction between the stakeholders who want access to data insights (without always knowing what that data looks like) and the analysts who know their data but not all of its history or significance.
Figure I.1 Business context vs. technical know‐how chart.
A modern data stack seeks to bridge that gap. It enables an entire organization, not just its analysts, to work with data. The modern data stack combines disparate input sources into a single understandable format and then stores that data in a single location that a business intelligence (BI) product can connect to. It should put the information people need at their fingertips, and it should help everyone ask more questions. The modern data stack should empower teams to organize around a key performance indicator (KPI) and make decisions using data (Figure I.1).
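As a rough illustration of "disparate sources, one format, one location," here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two source systems, the field names, the sample figures, and the use of an in‐memory SQLite database as a stand‐in for a real data warehouse. It is not a description of any particular tool covered in this book.

```python
import sqlite3

# Hypothetical raw records from two separate source systems.
# Note that each system names the campaign field differently.
AD_SPEND = [  # e.g. exported from an ad platform
    {"campaign": "spring_sale", "spend_usd": 500.0},
    {"campaign": "newsletter", "spend_usd": 100.0},
]
ORDERS = [  # e.g. exported from a billing system
    {"order_id": 1, "utm_campaign": "spring_sale", "total_usd": 800.0},
    {"order_id": 2, "utm_campaign": "spring_sale", "total_usd": 700.0},
    {"order_id": 3, "utm_campaign": "newsletter", "total_usd": 50.0},
]


def build_warehouse(ad_spend, orders):
    """Load both sources into one database with consistent column names."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    conn.execute("CREATE TABLE ad_spend (campaign TEXT, spend_usd REAL)")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, campaign TEXT, revenue_usd REAL)"
    )
    conn.executemany(
        "INSERT INTO ad_spend VALUES (:campaign, :spend_usd)", ad_spend
    )
    # Source-specific field names are mapped onto the shared column names here.
    conn.executemany(
        "INSERT INTO orders VALUES (:order_id, :utm_campaign, :total_usd)", orders
    )
    return conn


def revenue_per_dollar(conn):
    """KPI: revenue earned per advertising dollar, by campaign."""
    rows = conn.execute(
        """
        SELECT s.campaign, SUM(o.revenue_usd) / s.spend_usd AS revenue_per_dollar
        FROM ad_spend AS s
        JOIN orders AS o ON o.campaign = s.campaign
        GROUP BY s.campaign, s.spend_usd
        ORDER BY s.campaign
        """
    ).fetchall()
    return dict(rows)


conn = build_warehouse(AD_SPEND, ORDERS)
kpi = revenue_per_dollar(conn)
print(kpi)
```

Once both sources live in one place under shared column names, the KPI is a single SQL query that anyone with access can run or adapt, which is the kind of shared footing between technical and business people that the paragraph above describes.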
Here are examples of being uninformed versus being informed:
Marketers who keep doubling down on all of their channels versus ones who iterate on or drop them depending on the success or failure of each campaign.
Sales leaders using anecdotes to support what type of customers they should target versus ones who know what customers are coming in and which are closing and can focus their teams accurately on the biggest opportunities.
Support persons who respond to an issue with “what browser are you using?” versus ones who have all of the error information, including the browser details, the account information, and recent actions.
Journalists writing about what matters to them versus ones who can see what resonates.
Product managers who justify their feature by showing the initial results versus ones who set up a test and let the data speak for itself.
Getting data into an understandable format that everyone has access to makes technical and business people collaborators rather than adversaries. It empowers everyone in the organization to make more informed decisions. This book aims to show how to set up a data stack to do just that.
The Four Stages of Agile Data Organization
After working with thousands of companies, we