Whereto Data Literacy?

Over the past decade, there has been increased attention to the idea that students should learn how to make sense of data and use it in decision-making. It could be the next new big thing in education, hot on the heels of coding. But, depending on how the interest in data literacy develops, being the next big thing might not be an altogether good thing.

Earlier this month I was invited to participate in a meeting to think about the future of “data literacy.” There seemed to be a sense that we could be poised to either make some real progress or to screw things up.  It was not quite as dramatic as the image on the right suggests, but it had that feel to it for me. In this post, I share some of the ideas and opinions aired in that meeting.

Forum and Participants

The meeting took place at ASU GSV X, which bills itself as the “World’s Leading Education and Workforce Innovation Summit.” Speakers included Andre Agassi, Tony Blair, Karl Rove, Priscilla Chan (Mark Zuckerberg’s spouse and partner in the Chan Zuckerberg Initiative), Sal Khan, Geoffrey Canada, and similar luminaries. The price for general admission was $3,195. So, a “high end” event.

The session that I participated in virtually (whew! much cheaper) was titled “Is There a Need for Data Literacy Standards?”  It was hosted by ETS and Tuva, a company that provides online tools, curated datasets, and other curriculum supports for data literacy. The meeting was described as follows in the conference program:

This workshop will bring together subject matter experts to discuss their individual work on data literacy. We will also discuss whether any collective activities like standards development would aid individual efforts and the goal of improving data literacy in general in K-12 classrooms and, consequently higher education and the workplace. Besides standards, other proposals or ideas for collaboration are welcome. The desired outcome of the workshop is to determine whether and how we can advance data literacy by working together.

My fellow virtual participants were an impressive group, including Helen Quinn (Chair of the committee that produced A Framework for K-12 Science Education, the foundation document for the Next Generation Science Standards), Rob Gould (Director of the Center for Teaching Statistics at UCLA), William Finzer (who led development of Fathom, a software precursor to current tools), Chad Dorsey (CEO of the Concord Consortium, which developed CODAP, an online data analysis tool), and many others.

A Big Umbrella

Suppose you are at a conference. It could be ASU GSV X, NSTA, CitSci, or any other. Someone you meet says that she thinks that data literacy is really important.  You agree.  Do you REALLY agree?  Last week’s meeting suggests that the answer is “Maybe not so much.”

The meeting began with the questions of whether there would be value in bringing people together to develop a set of data literacy standards and why one might need such standards.  Among the first answers was, “To certify that someone has the qualifications to be a data scientist.”

This was not what I expected. I was thinking about kids in schools. But, hey … this was a “workforce innovation summit.” If we are training data scientists, we need a way to know that they can do the job. Later in the meeting, there was a discussion about the knowledge that school teachers need, but it had nothing to do with their ability to teach students to make sense of data. In that discussion, “data literate” teachers were those who are proficient at using data about students to make instructional decisions. The conversation about the need to prepare and certify data scientists did eventually include programs at the middle school and high school level, but it was oriented more toward the “pipeline” metaphor: We need to ensure that students are developing required knowledge and skills early in their learning so that they are in the pipeline to become data scientists later on.

Coding and Computer Science as Models?

One participant connected data literacy with the much broader idea of “digital literacy.” After some discussion, the takeaway, as I understood it, was that there may be some low-level intersection between the two kinds of literacy, but that it is useful to keep them separate. Even so, there was broad recognition that computer science has emerged over the past 5-10 years as a topic of study of its own, now with its own Advanced Placement courses, instructional standards, and examinations.  Many high schools now offer it as a course, just like other science and math courses.  Could/should instruction about data literacy develop along a similar path?  This is where the conversation got interesting.

Silos vs. Crossing Disciplinary Boundaries

Helen Quinn, as I noted above, has been at the forefront of thinking about science education standards for more than a decade. She argued that data literacy should NOT be viewed as a separate discipline, or, as she put it, “another silo.” Her argument wove together a number of ideas:

  • We need to see data literacy as a capacity that everyone in a 21st-century democracy needs to possess, not as a specialty.
  • Data literacy is inherently trans-disciplinary. Making sense of data is how one creates meaning in any discipline, and disciplinary contexts are essential to making sense of data.
  • Related to the above, data literacy is not the same as proficiency in statistics. It is more about knowing how to ask the right question, given the data available.
  • To the extent that we do develop ways of assessing the development of data literacy, it should not be in terms of standards as they are usually conceived–as specific proficiencies that the student should demonstrate–but something more like a set of data challenges that students should master at different grade levels.

Someone asked her if she could provide an example of such a challenge. She said that she could not, but that such standards would look different from the relatively detailed disciplinary ideas that we are now using for science standards. Harshil Parikh, CEO of Tuva, returned to the distinction between the emergence of “coding” in schools and the emergence of data literacy. While there seems to be increasing agreement that the teaching of coding skills needs to be done as something that is separate from other subject matter, Harshil said that “data literacy does not need to be taught that way, in fact, cannot be taught that way.”

This presents those of us who would like to see more productive, well-informed and well-designed attention to data literacy with a mix of opportunities and problems. On the positive side, data literacy can be incorporated into many subjects, just as reading and writing can be incorporated into many subjects. On the more problematic side, not having a reserved, special place in the curriculum for data literacy means that we need to work with decision makers and influencers in each of the other, more “siloed” subject matter areas to persuade them to increase the time and focus given to data literacy. Since there is never enough time in the school year to do all the things that these other disciplinary experts would like to do, successfully making this case will never be easy. Further, even if teachers and curriculum designers in biology, environmental science, earth science, physical science, sociology, and other disciplines do include data literacy, there remains the challenge of doing this coherently and productively.

Data Challenges

I think that Helen Quinn has it right: the way forward involves the framing of something like “data challenges.” I am thinking, for example, of something like having students present an argument, based upon inspection of the distribution of a dataset, as to whether the mean or the median would be a better measure of central tendency for that particular dataset, or whether they are both equally good. Or, as another example, comparing the distributions of data from two different experimental treatments to make an informed decision about whether there is a difference that would turn up consistently across many samples. Here is a third example: making an argument about whether data that appear to be “outliers” should be included or excluded from the analysis.
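To make the first of these challenges a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the income figures are invented, and the function name compare_center is hypothetical. It only shows the kind of evidence a student might gather before arguing for the mean or the median; it is not a prescribed method or a real assessment item.

```python
from statistics import mean, median


def compare_center(values):
    """Return the mean, the median, and the gap between them.

    A large gap relative to the spread of the data hints that the
    distribution is skewed and that the median may describe a
    "typical" value better than the mean does.
    """
    m = mean(values)
    med = median(values)
    return {"mean": m, "median": med, "gap": m - med}


# Invented data (assumption): nine modest household incomes and one
# very large one, all in thousands of dollars.
incomes = [32, 35, 38, 40, 41, 43, 45, 47, 50, 400]

summary = compare_center(incomes)
print(summary)

# The same comparison with the largest value set aside, which feeds
# the separate argument about whether an apparent outlier belongs
# in the analysis at all.
without_largest = compare_center(sorted(incomes)[:-1])
print(without_largest)
```

With the single large value included, the mean sits well above the median; with it set aside, the two nearly coincide. The code does not settle which measure is "better." That judgment, and the argument for it, is the student's part of the challenge.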

Some Closing Thoughts

My participation on the periphery of ASU GSV X persuaded me that data literacy is turning a corner and doing so quickly.  Here are a few of the changes that I see …

First, the intensity of interest in data literacy is much different than it was less than a decade ago. In 2010, when Molly Schauffler and I began digging into the research literature about helping kids make sense of data, we did find good studies, but their number was small compared to other areas of student learning and the number of researchers was even smaller. A gathering such as the one that I joined virtually at ASU GSV X was inconceivable back then.  Data literacy was not at the forefront of thinking for many people. That has changed.

Second, part of the reason for this change is that there is now a market for products related to data literacy. This is a good thing. Having a market for data literacy means that more people can see opportunities to grow not-for-profit and for-profit businesses focused on data literacy. This will result in better tools for teachers and students. Just a few years ago, most teachers did not have easy access to tools that would enable students to create a box plot or, in other ways, explore data visually without mastering the details of working with spreadsheets. Now they do, in part because there is enough demand for such products to support their creation and maintenance.

Third, the existence of a market drives specialization and differentiation. When there were just a few, more academically oriented folks thinking about and doing research on data literacy, there was a good chance that, if you brought them together, they might be talking more or less about the same thing.  As ASU GSV X demonstrated, that, too, has changed.

The increased attention to data literacy is a good thing … wonderful to see. But, as we have seen before, the emergence of new markets for products and services does not necessarily result in improved learning, particularly if one is concerned about learning for all. This is where the drive toward specialization concerns me. Data literacy could become another specialty and pipeline,  and I am aware that, from a business standpoint, framing it as a pipeline with well-defined outcomes and standards can lead to products and solutions that can be brought to market in the next year or two. But this is data literacy for the few.  It is Path 1 in the little graph at the top of this posting.

I am in the Helen Quinn camp: Data literacy is a capacity that everyone in a 21st-century democracy needs to possess. This is Path 2 in the graph. Building that capacity will be difficult and will take more than a year or two. Fortunately, there are also non-profit and for-profit organizations that are taking that longer view.

I do believe that the increased attention to data literacy that has emerged over the past few years will result in better supports for acquiring such literacy and will make it more widespread. So, I came away from ASU GSV X feeling hopeful. But I also came away a little wary. Those of us who think of data literacy, like reading, as something for everyone will need to be active in promoting that vision, and we should be mindful that others who use the term “data literacy” may mean something quite different.

This entry was posted in Assessment, Data Literacy, Equity.

1 Response to Whereto Data Literacy?

  1. Thanks, Bill — this frames the wide-ranging discussion during the ASU GSV session in a way that can help the conversation move forward.

    I agree, Helen’s points are key. I’ve tried to think of analogies to describe some of the points that came up in the discussion. Here is one, concerning the question: Do we need data literacy standards?

    Sets of content standards are the warp. Data literacy for living in a 21st-century democracy (i.e., your green arrow) is the weft. The weft is the thread that unites and gives strength to the independent warp strands (the content standards). The warp is not very useful without the weft.

    The weft (data literacy skills) would have an entirely different kind of structure than the warp (standards). The kinds of problem challenges that Helen suggested could be it. This winter we’ve started developing a set of applied challenges at Tuva, perhaps something like what Helen suggests, to pitch to students who are ready to synthesize basic skills.

