Does Ernest Hemingway really use the fewest adverbs? Do authors write about their own gender more than others’? In his new book, Nabokov’s Favorite Word Is Mauve, Ben Blatt uses statistical analysis to deconstruct popular and classic literature and interrogate truisms about writing fiction. Many of the claims he makes are intriguing. He finds that male writers tend to use the pronoun “he” far more often than “she” in their books, whereas female writers use “he” and “she” almost equally. Blatt also finds that over the last 200 years, writers’ tendency to use qualifiers (rather, very, little, pretty, etc.) in their fiction has decreased substantially. Blatt’s quantitative approach to literature is novel — and very entertaining — but the book is undermined by poor copyediting and methodologies that call into question the conclusions Blatt reaches.
To a bibliophile, the flaw that jumps out is Blatt’s seeming unfamiliarity with some of the fiction he calls on to support his findings. In Chapter 4, “Write by Example,” Blatt claims that writers’ use of qualifiers has been declining for two centuries. (Between 1900 and 1999, he writes, qualifier use per 10,000 words dropped from more than 200 to a little more than 100.) He cites Jane Austen as a prime example of 19th-century qualifier abuse: “Jane Austen is one of the English language’s most celebrated authors but her use of words like very is off the charts.” Blatt’s claim, broadly speaking, is believable, but the excerpt he cites is poor evidence that writers of a different era conformed to different stylistic standards regarding qualifiers. The passage he chooses from Emma is Austen’s summary of Harriet’s dialogue:
She was very fond of singing. He could sing a little himself. She believed he was very clever, and understood every thing. He had a very fine flock, and, while she was with them, he had been bid more for his wool than any body in the country. She believed every body spoke well of him. His mother and sisters were very fond of him. (Emphasis Blatt’s)
Later in the chapter, Blatt quotes Dead Poets Society to explain how qualifiers can vitiate speech: “…avoid using the word ‘very’ because it’s lazy.” Or, to put it another way, using “very” too often can make a person sound dumb. In Emma, Harriet Smith is an airhead, and her vacancy is crucial to the novel’s plot. Thus, the abundant “verys” in the passage indicate neither laziness on Austen’s part nor conformity to the style conventions of another era; Austen uses them deliberately to telegraph to the reader that Harriet is dense. Blatt’s reliance on this passage as an illustration of unconsciously absorbed literary standards suggests either shallow familiarity with his source material or a failure of literary analysis.
Blatt is also sometimes careless about the conclusions he draws from his data. For example, when he compares the relative frequency with which male and female writers use the pronouns “he” and “she,” Blatt reports that “Of the 50 classic books by men, 44 used he more than she and 6 did the opposite” and “Of the 50 classic books by women, 29 used she more than he and 21 did the opposite.” Both statements, however, are misleading at best and possibly false, because Blatt identifies two books in his samples, one by a man and one by a woman, that use “he” and “she” at equal rates. Blatt rounds to the nearest percentage point, so it is possible that what he writes is, strictly speaking, true; there may be barely more appearances of “he” in Lady Chatterley’s Lover and barely more appearances of “she” in A Good Man Is Hard to Find, both of which round to equal representation. If this is the case, however, why does Blatt not make it clear in the text?
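The rounding ambiguity can be illustrated with a short sketch. The counts below are hypothetical, chosen only to show how a book can round to an even split and still land on one side of a tally; they are not figures from Blatt’s book, and the function names are illustrative.

```python
def he_share(he_count, she_count):
    """Percentage of 'he' among all uses of 'he' and 'she'."""
    return 100 * he_count / (he_count + she_count)


def classify(he_count, she_count):
    """Say which pronoun is used more, or 'tie' if the counts are equal."""
    if he_count > she_count:
        return "he"
    if she_count > he_count:
        return "she"
    return "tie"


# Hypothetical counts (an assumption, not Blatt's data).
he, she = 1004, 996

print(round(he_share(he, she)))  # rounds to 50, an apparently even split
print(classify(he, she))         # yet the raw counts favor "he"
```

A table that reports only the rounded percentage cannot tell the reader which of these two readings applies, which is why the tallies need an explicit note about ties.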
Perhaps more importantly, Nabokov’s Favorite Word did not get the attention it needed from a copy editor. (On page 70, for instance, Blatt titles a list “Most Probable to be Richard Bachman” [Stephen King’s pseudonym], when what he means is “Most Probable to be Robert Galbraith” [J. K. Rowling’s pseudonym].) In a book of statistical analysis especially, Blatt’s lack of care in defining criteria for inclusion in his samples (and in adhering to those criteria invariably) calls into question the conclusions he draws from his analysis. For instance, in the aforementioned analysis of gendered pronouns, Blatt waffles about whether his analysis is confined to novels or extends to “books” in general. On page 41, he writes that he drew his data from the “100 novels on [the] classic literature list.” This list of “novels,” however, contains several collections of short stories, including A Good Man Is Hard to Find and Winesburg, Ohio. It is unlikely that including short stories would bias the results determining how often writers use “he” and “she”; it may, however, mislead the reader about how writers use gendered pronouns in fiction in general, as opposed to novels in particular. Blatt’s sloppiness in choosing his samples is not limited to this analysis. In another case, “Breakout Debut Novels,” he states that to qualify, a work had to be “an author’s first novel.” Nevertheless, he includes in his sample Alice McDermott’s second novel, That Night, published in 1987, though McDermott’s first novel was A Bigamist’s Daughter, published in 1982.
Blatt’s problem defining criteria for his samples and adhering to them most profoundly undermines his investigation of different writers’ favorite words. Blatt concludes, for example, that Virginia Woolf’s favorite words are “flushing, blotting, mantelpiece”; Marilynne Robinson’s are “soapy, checkers, baptized”; and Lemony Snicket’s are “siblings, orphans, squalor.” Blatt designates only four criteria to determine whether a word is a favorite, one of which is that the word “is not a proper noun.” Blatt does omit all words that are unmistakably proper nouns; you won’t find Chicago, Arkansas, or Sahara among any writer’s favorites. He neglects, however, to exclude common words that writers use as proper nouns. This is most obvious in his choice of “flushing” as one of Virginia Woolf’s favorite words. Based on searches performed on Google Books, Woolf uses “flushing” (not as a proper noun) only eight times in the nine novels that constitute Blatt’s Woolf sample. There are, however, 55 occurrences of “Flushing” in Woolf’s novel The Voyage Out, in which Woolf repeatedly refers to the characters Mr. and Mrs. Flushing. To determine a favorite word, Blatt also uses the criterion that the word “must be used in half an author’s books.” Excluding The Voyage Out, which never uses “flushing” as anything other than a character’s name, the word appears in only four of the nine novels in the sample, which falls short of half. Thus, Blatt must be counting the erroneous appearances of “Flushing,” used as a proper noun, to arrive at his ranking.
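The effect of letting character names through the proper-noun filter can be sketched in a few lines. This is not Blatt’s method, which the book does not spell out at this level; it is a minimal illustration under stated assumptions. The sample texts are invented stand-ins rather than Woolf’s prose, the rule is read as “at least half,” and treating every capitalized occurrence as a proper noun is a crude heuristic that would also discard a common word at the start of a sentence.

```python
import re


def count_uses(text, word, exclude_capitalized=False):
    """Count whole-word occurrences of `word`, ignoring case.

    With exclude_capitalized=True, capitalized occurrences are treated
    as proper nouns and skipped (a crude heuristic, see lead-in).
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    matches = [m.group(0) for m in pattern.finditer(text)]
    if exclude_capitalized:
        matches = [m for m in matches if not m[0].isupper()]
    return len(matches)


def meets_half_rule(books, word, exclude_capitalized=False):
    """True if `word` is used in at least half of `books` (assumed reading)."""
    used_in = sum(
        1 for text in books if count_uses(text, word, exclude_capitalized) > 0
    )
    return 2 * used_in >= len(books)


# Invented stand-in texts, one string per "book" (hypothetical data).
books = [
    "Mrs. Flushing laughed. Mr. Flushing nodded.",
    "She felt herself flushing at the remark.",
    "The tide went out.",
    "Nothing happened that day.",
]

print(meets_half_rule(books, "flushing"))                            # True
print(meets_half_rule(books, "flushing", exclude_capitalized=True))  # False
```

In this toy sample the word qualifies only when the character name is counted. Applied to the figures cited above, four of nine novels, the same check fails, since 2 × 4 is less than 9.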
Though I cannot prove it with the same certainty, Blatt likely repeats this flaw in several other authors’ favorite word lists. One of Marilynne Robinson’s favorite words, according to Blatt, is “soapy.” In her novel Gilead, “Soapy” is the name of the cat, who is mentioned by name 11 times. Excluding references to the cat, however, Robinson uses the word “soapy” only twice in all the novels in Blatt’s Robinson sample and never in Gilead. It is also possible that the same error occurred in the identification of “squalor” as one of Lemony Snicket’s favorite words; there is a supporting character named Esmé Squalor in A Series of Unfortunate Events. Blatt could argue that names should be included in the analysis because a writer handpicks them for her characters. Nevertheless, Blatt needs either to redefine his criteria to make this inclusion clear or to exclude from his sample instances where a writer uses common words as proper nouns.
Nabokov’s Favorite Word Is Mauve is a thoroughly entertaining romp, but the mistakes — especially Blatt’s lack of rigor in sticking to the criteria he defines for his samples — mean one should approach it with several grains of salt. Given the problems of methodology observed, one often can’t put faith in Blatt’s conclusions. It is unfortunate that his intriguing approach is compromised by lackluster execution. His analyses, approached with more rigor, could offer meaningful insight into the way great writers compose.