Word Searching as a Tool in the Study of Dreams, or, Dream Research in the Era of Big Data

I’m giving a presentation with that title on Saturday, June 23, at the annual conference of the International Association for the Study of Dreams, held in Berkeley, California. The presentation is part of a panel session, “What’s New in the Scientific Study of Dreams.” I’m giving an overview of the word searching method I’ve been developing over the past several years, with a special focus on four “blind analysis” studies I’ve performed with the help of Bill Domhoff. A YouTube video preview of the presentation can be found here.

Here’s how I define blind analysis in the paper:

A blind analysis involves an exclusive focus on word usage frequencies, bracketing out the narrative reports and personal details of the dreamer’s life and making inferences based solely on statistical patterns in word usage—not reading the dreams at all, and basing one’s analysis strictly on numerical data.  The aim is to assess the patterns of dream content with the fewest possible preconceptions, as objectively as possible, before reading through the narratives and learning about the individual’s waking activities and concerns.
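To make the idea of working only from word usage frequencies concrete, here is a minimal Python sketch. It assumes that a category’s frequency is the percentage of reports using at least one word from that category. The two tiny word lists, the sample reports, and the function names are all hypothetical illustrations, not the actual SDDb template or its code.

```python
import re

# Hypothetical mini-categories for illustration only; the real SDDb
# template has many more categories with much longer word lists.
CATEGORIES = {
    "fear": {"afraid", "scared", "frightened", "terrified"},
    "water": {"water", "ocean", "river", "rain"},
}

def tokenize(report):
    """Lowercase a report and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", report.lower())

def category_frequencies(reports, categories):
    """Per category, the percentage of reports using at least one of its words."""
    total = len(reports)
    if total == 0:
        return {}
    counts = {name: 0 for name in categories}
    for report in reports:
        words = set(tokenize(report))
        for name, vocab in categories.items():
            if words & vocab:  # at least one category word appears
                counts[name] += 1
    return {name: 100.0 * n / total for name, n in counts.items()}

# Made-up sample reports, not drawn from any real dream series.
reports = [
    "I was scared because the river was rising.",
    "We walked along the ocean and talked.",
    "My teacher handed me a book.",
    "Something chased me and I was terrified.",
]
freqs = category_frequencies(reports, CATEGORIES)
print(freqs)
```

The analyst in a blind analysis would look only at a table like `freqs`, never at the narratives themselves.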

Zeo Sleep Data and the Ur-Patterns of Dream Content

So far I’ve done word search analyses on 20 series of dreams from individuals and 9 sets of dreams from large groups of people, a total of more than 18,000 dream reports. It’s too early to say anything definite about the patterns that have emerged from this data. More reports need to be gathered from a wider variety of people, and more improvements need to be made in the SDDb word search template.

Still, a few basic patterns have appeared in nearly all the collections I’ve studied. I’m calling them ur-patterns because they seem to represent deep structural elements of dream content (ur- as in “original” or “primal”). That’s my general hypothesis, anyway, and each new set of dreams is another chance to test and refine it.

Here are the ur-patterns I’ve identified so far:

  1. Of the five senses, sight words are used most often, smell and taste the least.
  2. Of the five major emotions (fear, anger, sadness, confusion, happiness), fear words are used most often.
  3. Of all the categories of cognitive activity, speech words are used most often.
  4. Of the four natural elements, water words are used most often.
  5. Falling words are used more often than flying words.
  6. There are more references to family characters than animal characters, and more to animals than to fantastic beings.
  7. There are more references to friendliness than physical aggression.

Looking at the KB DJ 2009-2010 series with Zeo sleep data (available at Google Docs), a scan for these patterns finds good but not perfect evidence for each one.

Vision-related words are used more frequently across all the Zeo measurements, with smell and taste words almost entirely absent. Fear words are used more frequently than other emotion words. Speech words are the most used among the cognition categories, and water is the highest among the natural elements, though earth is a consistently high second. The usage of falling words is always higher than, or equal to, flying words.

The family > animals > fantastic beings pattern was not as clear-cut. Fantastic beings always had the lowest word usage, but animals were not always lower than family. When the names of the dreamer’s immediate family were added to the search for characters, the total frequency of family-related words rose higher than the usage of animal words in 15 of the 17 subgroups.

The friendliness > physical aggression pattern was not perfectly evident either. In part this is due to a “false positive” problem in the SDDb template. The word search category for physical aggression includes the word “bit,” which the dreamer used in almost 10% of all the reports as a term meaning “small amount,” not a physical bite. I’ll provide revised numbers once I’ve fixed this. For now, looking at how often the word “bit” is used in each Zeo subgroup, it appears the physical aggression frequencies will drop below the friendliness frequencies in most, but not all, subgroups.
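One rough way to gauge how much a single ambiguous word inflates a category is to run the search twice, once with the word and once without it. The Python sketch below does this with a hypothetical, shortened word list and made-up reports. It is not the SDDb template, and it is only one possible way of handling the “bit” problem.

```python
import re

# Hypothetical word list; the actual SDDb physical aggression category is longer.
PHYSICAL_AGGRESSION = {"hit", "kick", "kicked", "punch", "punched", "bit", "bite"}

def percent_matching(reports, vocab):
    """Percentage of reports that use at least one word from vocab."""
    hits = sum(
        1 for r in reports
        if set(re.findall(r"[a-z]+", r.lower())) & vocab
    )
    return 100.0 * hits / len(reports)

# Made-up sample reports for illustration.
reports = [
    "I felt a bit lost in the old house.",
    "A dog bit my hand and I kicked it away.",
    "We were a little bit late for the train.",
    "My sister and I baked bread.",
]

with_bit = percent_matching(reports, PHYSICAL_AGGRESSION)
without_bit = percent_matching(reports, PHYSICAL_AGGRESSION - {"bit"})
print(with_bit, without_bit)
```

Dropping the word entirely also discards genuine bites (the second sample report still counts only because of “kicked”), so the gap between the two numbers is an upper bound on the false positives rather than an exact correction.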

In sum, the ur-patterns appear across virtually all the subgroups of Zeo sleep measurement. No matter what aspect of sleep was measured, the dream reports used the same basic frequencies of words in several major categories. High or low proportions of sleep did not correlate with any major change of dream content, at least at this level of analysis.

In future posts I’ll look at the few variations from these patterns (high physical aggression, animal, flying, and earth references) in relation to the dreamer’s waking life concerns, taking the possibility of metaphorical meaning into account.

I will also look at each of the five types of Zeo data and see if I can identify any particular variations that rise to the level of statistically significant correlation. If any such correlations emerge, they may guide us toward specific areas where a measurable aspect of sleep does interact with basic patterns of dream content.

 

Children’s Dreams: A Word Search Analysis (part 2)

Once you’re ready to perform a word search analysis—once you’ve formulated a question, chosen a dream series, and acknowledged the limits of your approach—you have to decide the length of the dream reports you’re going to study.

 If you search for reports of any length, your results will include lots of short reports saying “none,” “no dream,” etc.  You’ll also get answers like “dreamed of a whale,” or “plane crash,” reports so short that it’s hard to work with them. You might also get super-long reports with elaborately detailed descriptions and additional waking commentary, which are also hard to work with.

 Unless you specifically want to study the shortest or longest reports, my advice is to set minimum and maximum word lengths for your searches. 

 I started by setting the searches for dreams between 20 and 300 words.  That gave me 622 reports to study.  After I learn more about the series I’ll look at the shorter and longer dreams and find a way to integrate them into the analysis.

 One factor I’m always thinking about is how to make my findings commensurable with those of other researchers. For example, the Hall and Van de Castle content analysis system, which has been used as a base of comparison by many researchers, focuses on dream reports between 50 and 300 words in length. Eventually I’ll look at that narrower range of reports, but at the beginning of my analysis I want to cast a wider net and include more reports in my initial assessment, hence the lower minimum length.
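The length limits described above amount to a simple filter on word counts. Here is a minimal Python sketch, assuming words are separated by whitespace and that both limits are inclusive. The function names and sample reports are hypothetical.

```python
def word_count(report):
    """Count whitespace-separated words in a report."""
    return len(report.split())

def filter_by_length(reports, min_words=20, max_words=300):
    """Keep reports whose word count falls within the inclusive range."""
    return [r for r in reports if min_words <= word_count(r) <= max_words]

# Made-up reports: two very short answers, one mid-length, one very long.
reports = [
    "none",
    "dreamed of a whale",
    " ".join(["word"] * 25),
    " ".join(["word"] * 400),
]

wide_net = filter_by_length(reports)        # 20 to 300 words
narrower = filter_by_length(reports, 50)    # 50 to 300 words
print(len(wide_net), len(narrower))
```

Keeping the limits as parameters makes it easy to rerun the same search on the narrower 50 to 300 word range later on.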

 So, where to start the word searching? 

 My immediate, admittedly vain concern was to know whether I was right or wrong about a recent prediction I made about this dream series.

A couple of weeks ago I wrote a post about people’s dreams of Harry Potter, drawing on results from another survey on highly memorable dreams I commissioned from Harris Interactive. In that survey 1003 American adults 18 years and older reported dreams between 20 and 300 words in length, and I found two reports using at least one of the following words:

 “harry potter” voldemort hogwarts hagrid dumbledore malfoy snape hermione draco

At the end of that post I predicted there would be more HP-related dreams in the Harris children’s survey, which I had not yet studied. Now that I have the children’s survey uploaded into the SDDb, I can quickly put that prediction to the test.

 Of the 622 dreams between 20 and 300 words in the children’s survey, 6 of the dreams used at least one of these HP-related words, 1% of the total vs. 0.2% in the adult survey.
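For readers who want to try this kind of search on their own collection, here is a minimal Python sketch. It assumes a case-insensitive, whole-word match and counts each report once no matter how many of the terms it uses. The sample reports are made up, and this is not the SDDb search code.

```python
import re

# The search terms listed above; "harry potter" is matched as a phrase.
HP_TERMS = ["harry potter", "voldemort", "hogwarts", "hagrid", "dumbledore",
            "malfoy", "snape", "hermione", "draco"]

# A single pattern with word boundaries, so a term is not matched
# inside a longer word.
HP_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in HP_TERMS) + r")\b",
    re.IGNORECASE,
)

def count_matching(reports, pattern):
    """Number of reports using at least one of the search terms."""
    return sum(1 for r in reports if pattern.search(r))

# Made-up sample reports for illustration.
reports = [
    "I was at Hogwarts and Hermione showed me a spell.",
    "Harry Potter flew past my window.",
    "I was potting plants with a man named Harry.",
    "My dog ran away.",
]
n = count_matching(reports, HP_PATTERN)

# Percentages from the counts reported in the text.
children_pct = round(100 * 6 / 622, 1)
adults_pct = round(100 * 2 / 1003, 1)
print(n, children_pct, adults_pct)
```

The third sample report shows why the phrase is searched as a unit: “Harry” on its own is not counted.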

 It’s not an epic difference, but it’s statistically significant (p=.036), and it makes sense in terms of the different roles the HP novels have played in the waking lives of children vs. adults.      
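The post does not say which test produced that p-value. As one illustration of how two such proportions can be compared, here is a pooled two-proportion z-test in plain Python. The choice of test is an assumption, so the result need not match the reported figure exactly.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided pooled z-test for a difference between two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail area under the standard normal curve.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts from the text: 6 of 622 children's reports, 2 of 1003 adult reports.
z, p_value = two_proportion_z_test(6, 622, 2, 1003)
print(round(z, 2), round(p_value, 3))
```

On these counts the z-test gives a p-value of roughly .03, in the same neighborhood as the figure above. With counts this small, other methods (an exact test, for example) can give somewhat different values.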

 OK, I wasn’t completely wrong.  That’s the first thing I wanted to settle.

Part Three: Using the pre-set template of 40 word categories

Children’s Dreams: A Word Search Analysis (part 1)

I’ve just begun a new project using word search methods to study dream reports from children and adolescents.  I thought that showing in real time the steps of my analytic process might help other people learn how to apply these methods to their own dream studies.

Any research project starts with a question. In this case my question was about “big dreams” in childhood (the subject of a book-in-progress). I wanted to know more about recurrent patterns in the dreams that children and adolescents remember most vividly. Other researchers like David Foulkes have studied normal, average dream patterns in children, but my question focused on the distinctive features of highly memorable dreams in the early stages of life.

Earlier this spring I contacted Harris Interactive, an opinion research company, regarding their “YouthQuery” survey, which enables a researcher to ask a single question and receive online answers from about 1,000 American children ages 8-18, along with a few other demographic data points. (The cost of this survey, while considerable, was no more than I’ve paid research assistants to help with other projects in the past.)

There are pros and cons to online surveys. On the downside, it’s impossible to validate a person’s answers, and the format favors participants who are educated and affluent enough to use computers. On the upside, participants can give their answers in a private setting in their own words, which is extra valuable for a word search approach.

I try not to let excessive angst about methodology slow me down. Every study has its limits. Once you’ve identified them, you move on and do the work. I’m more interested in discovering what a method can do than in dwelling on what it can’t do.

The Harris people and I decided to word the survey question as follows:

“We are interested in hearing about a dream that you had and remember a lot about.  Please try to tell us everything you remember about the dream, including where you were, who else was there, what happened, how you felt, what you were thinking during the dream and how it ended.  Please also tell us about how old you were when you had the dream.”

The other questions asked in the standard YouthQuery survey regarded age, gender, race/ethnicity, current grade in school, school location (urban, suburban, rural), and school type (public, private, parochial).

Harris conducted the survey in early April, and then I uploaded the info (thanks to Kurt Bollacker) into the Sleep and Dream Database (SDDb). The dream reports and answers to the other questions can be seen at:

Making this information publicly available enables others to check my work and test my claims, always a good thing in empirical research.  More importantly, it allows other researchers to explore facets of the data beyond what I or any single analyst can pursue.

Dream researchers have operated for too long with isolated sources of data that never receive more than one investigator’s systematic attention.  Digital databases can help our field move forward into a more dynamic and collaborative future.

Next: testing my first predictions