Scientists can now predict whether you’ll write a best-selling novel

They say there’s a book in every one of us, but that doesn’t necessarily mean it will sell thousands of copies. However, scientists have now come up with a set of rules that can predict whether your novel will be successful, and it all has to do with leaving out clichés and not overusing verbs.

Using an algorithm, computer scientists can now tell, with 84 per cent accuracy, whether a book will be a best-seller. The technique is called statistical stylometry, and it grew out of earlier methods used to detect authenticity and authorial identity.

It uses statistics to identify commonly used words, verbs and grammatical patterns, and when the scientists applied it to best-selling novels, they found it to be “surprisingly effective” in determining how well a book had done.

Based at Stony Brook University in New York, the team behind the algorithm stated that a number of factors were involved in predicting whether a book would be successful, ranging from writing style and an engaging storyline to a certain amount of luck.

By using the algorithm, however, the scientists found definite trends that the successful books had in common. One was heavy use of conjunctions such as “and” and “but”, along with large numbers of nouns and adjectives. Less successful books, by contrast, tended to contain more verbs and adverbs, concentrating on words that explicitly described actions and emotions, such as “wanted”, “took” or “promised”. Best-selling books were also more likely to feature verbs describing thought processes, such as “recognised” or “remembered”.
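As a rough illustration of the kind of counting involved, the sketch below measures what share of a text's words fall into a few categories. It is not the researchers' method: the word lists and the `style_profile` function are hypothetical, built only from the example words quoted above, and a real stylometric study would use far richer features.

```python
import re
from collections import Counter

# Hypothetical, hand-picked word lists for illustration only.
# They contain just the example words quoted in the article.
CONJUNCTIONS = {"and", "but"}
ACTION_VERBS = {"wanted", "took", "promised"}
THOUGHT_VERBS = {"recognised", "remembered"}


def style_profile(text):
    """Return the share of words in the text that fall into each list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text

    def share(words):
        return sum(counts[w] for w in words) / total

    return {
        "conjunctions": share(CONJUNCTIONS),
        "action_verbs": share(ACTION_VERBS),
        "thought_verbs": share(THOUGHT_VERBS),
    }


# A made-up sentence, not taken from any of the books studied.
sample = "She remembered the house and the garden, but he wanted more."
profile = style_profile(sample)
```

Comparing such profiles across many books is one simple way to see whether successful and unsuccessful titles lean on different kinds of words.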

To test their theory, the team downloaded books from the Project Gutenberg archive and ran the algorithm to see whether its predictions would hold. When the algorithm’s results were compared with each book’s historical performance, its predictions proved correct 84 per cent of the time.
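A success rate of this kind is simply the fraction of books for which the prediction matched what actually happened. The sketch below shows that arithmetic on made-up labels for five hypothetical books; the `accuracy` function and the data are assumptions for illustration, not figures from the study.

```python
def accuracy(predicted, actual):
    """Fraction of books whose predicted label matches the real outcome."""
    if len(predicted) != len(actual):
        raise ValueError("need one prediction per book")
    if not actual:
        raise ValueError("need at least one book")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up labels for five hypothetical books (True = successful).
predicted = [True, False, True, True, False]
actual = [True, False, False, True, False]
score = accuracy(predicted, actual)  # four of the five predictions match
```

On the same logic, a rate of 84 per cent would mean roughly 84 correct calls in every 100 books tested.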

Eight genres were considered: adventure, mystery, historical fiction, fiction, science fiction, love stories, short stories and poetry. The algorithm was then applied to all of them.

Assistant Professor Yejin Choi, one of the scientists on the team, said: “For a small number of novels, we also considered award recipients – such as Pulitzer and Nobel prizes – and Amazon sales records in order to define a novel’s success. Additionally, we extended our empirical study to movie scripts, where we quantified a film’s success based on the average review scores at imdb.com.”

Choi added: “Predicting the success of literary works poses a massive dilemma for publishers and aspiring writers alike. To the best of our knowledge, our work is the first that provides quantitative insights into the connection between the writing style and the success of literary works.

“Previous work has attempted to gain insights into the ‘secret recipe’ of successful books. But most of these studies were qualitative, based on a dozen books, and focused primarily on high-level content – the personalities of protagonists and antagonists and the plots. Our work examines a considerably larger collection – 800 books – over multiple genres, providing insights into lexical, syntactic, and discourse patterns that characterise the writing styles commonly shared among the successful literature.”

The findings of this research paper can be found at ACLWeb.