In the fall of 2021, in the midst of the Covid-19 pandemic, a whistleblower famously exposed multiple controversial practices at Facebook (now known as Meta). That whistleblower was a data scientist. Some have speculated that the controversy surrounding the former Facebook data scientist’s revelations to world media and the United States Congress hastened Facebook’s decision to rebrand and reorganize as Meta. The rebranding and reorganization, it was thought, were meant to draw attention away from the controversy.
Ethics in data science is a meaningful centerpiece for the topic of data culture because ethical norms serve to illustrate what data culture is.
What Are Data?
Data are raw material that we use to make information. Eventually, after sufficient effort, data transform into new knowledge: information about how the world works that did not exist before.
What Is Data Culture?
Culture is a shared set of thoughts, understandings, values, traditions, practices, institutions, language and other customs that groups of people pass from one generation to the next. All organizations have a culture; some are more intentional about it than others.
Therefore, data culture comprises data-related thoughts, data-related understandings, data-related values, data-related traditions, data-related practices, data-related institutions, data-related language and other data-related customs that a group passes from one generation to the next.
Because data culture is a specific, observable and measurable phenomenon, it is possible to build it. Those working in data science either as individual contributors or as leaders can build data culture through multiple strategies. Most of these strategies map back to a specific exercise.
The specific exercise is to speak as a group about a specific data-related question and then arrive at a shared understanding or conclusion.
How Can I Build a Data Culture?
What is data? This is one of the questions I most enjoy. There is no single correct answer. In light of the definition of data culture given above, the best answer for any given organization is the one that its members find most meaningful and useful for themselves.
If you seek to build data culture at your organization, consider setting aside time at a gathering or a series of gatherings. Weekly or monthly staff meetings often work well for this. Give everyone in the meeting a plain, clean sheet of paper. Then ask everyone to quietly write on the paper what they believe is the definition of data. After sufficient time passes, ask everyone to give their paper to someone else. Ask everyone to read what the previous colleague wrote and then to write out a response. Continue passing the papers around so that everyone can read and respond to what others say.
Later you can collect the papers, compile the thoughts, and then use the compilation as reading material that your organization can use for more discussion. The goal is to determine, for your own purposes and your own use, what the definition of data is.
A common finding from exercises like this one is that data are raw material that we use to make information. Eventually, after sufficient effort, data transform into new knowledge: information about how the world works that did not exist before. Your results will vary.
What Is an Analysis?
This is a question that many organizations would do well for themselves to answer. Again, there is no right or wrong answer.
The goal is not to arrive at a textbook answer. The goal is to find a common and shared understanding that members of the organization find useful and can pass from one generation to the next.
A failed, stalled or disappointing analytical project often traces back to an insufficient shared understanding of what it means to conduct an analysis. Related questions that organizations should consider under this heading are:
- What are the expected inputs for an analysis?
- What are the expected outputs of an analysis?
- Who will prioritize what analyses we perform? How will we know which analyses to perform ourselves? How can we know which to delegate? And, on what basis will we know that it is safe to postpone an analysis for later work?
- How do we know what success is when we conduct an analysis?
A common outcome of the discussions under the “what is an analysis” heading is a documented process that organizations can follow through the course of an analysis. For organizations that wish to further reinforce the data culture and infuse it into other traditions, an option is to place the results of this discussion in an operational manual or handbook.
What Is Our Analytical Process?
Knowing and having documented your analytical process is an important aspect of building and maintaining a strong data culture. There are many ways to know about and document your process. Often it can begin with simple discussions in which members of the organization share their own thoughts. Later the documentation work can proceed by writing summaries of the discussions. Building diagrams related to those discussions is also an effective way of documenting the process.
As it turns out, discussing, formulating, writing out and diagramming your process will build team cohesion and also build culture. As a team, through these discussions and planning activities you will build shared values and understandings that can be passed down from one generation to the next.
It is useful to go through the experience of building and documenting your own process. You need to arrive at a process that works for you and that represents your own behaviors, habits and practices.
To effectively answer the question “What is our analytical process?” and also to effectively document that answer, you will need to consider the steps or stages of your analytical process. Determine what inputs are required for each step and what outputs are generated. Additionally, it is important to consider how you will assign authority and responsibility for initiating an analysis. By documenting your process, you can create a shared understanding of how data analysis is conducted in your organization, which is critical for a strong data culture.
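For teams that prefer to keep such documentation in a structured form, the steps, inputs, outputs and responsibilities described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed method: the Step class, the undocumented_inputs helper, and every step name, material and role shown are hypothetical examples, and your own process will look different.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One stage of an analytical process (all names here are hypothetical)."""
    name: str
    inputs: list[str]   # what this step needs before it can begin
    outputs: list[str]  # what this step produces
    owner: str          # role with authority and responsibility for the step


def undocumented_inputs(steps: list[Step], starting_materials: set[str]) -> dict[str, list[str]]:
    """Return, per step, any inputs that neither the starting materials
    nor an earlier step's outputs account for."""
    available = set(starting_materials)
    gaps: dict[str, list[str]] = {}
    for step in steps:
        missing = [item for item in step.inputs if item not in available]
        if missing:
            gaps[step.name] = missing
        # Outputs of this step become available to later steps.
        available.update(step.outputs)
    return gaps


# A made-up four-step process, used only to illustrate the idea.
process = [
    Step("Frame the question", ["stakeholder request"], ["analysis brief"], "team lead"),
    Step("Gather data", ["analysis brief"], ["raw data"], "analyst"),
    Step("Analyze", ["raw data", "analysis brief"], ["findings"], "analyst"),
    Step("Report", ["findings", "style guide"], ["written report"], "team lead"),
]

gaps = undocumented_inputs(process, {"stakeholder request"})
for name, missing in gaps.items():
    print(f"{name}: no documented source for {missing}")
```

In this made-up example the check flags that the "Report" step expects a "style guide" that no earlier step produces. A gap like that is the kind of question a team discussion can then resolve.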
This extract is from Confident Data Science by Adam Ross Nelson ©2023 and is reproduced and adapted with permission from Kogan Page Ltd. Discover the fundamentals of data science and develop the skills you need for achieving success in this important sector in Confident Data Science.