Despite all the efforts of publishers, it has always seemed impossible to predict whether or not a book will be a runaway bestseller. This isn't too surprising: it's the kind of thing that is inherently unpredictable, because there are simply so many variables involved. Yet a newly published book suggests it is possible to do just that. Are the authors crazed or brilliant? Neither, really. They have put together a mechanism, based on computerised text analysis, that is good at spotting bestsellers - and yet, oddly, this doesn't contradict that inherent unpredictability. Why? Because there are two different levels of bestsellerdom involved - and because I think there's one piece of information missing from the book (apologies to the authors if I've missed it).
So what does the software do? By looking at various word uses, patterns and shaping, it can take a good shot at predicting whether or not a book is likely to have featured on the New York Times bestseller list. This is very impressive - and, along the way, Jodie Archer and Matthew Jockers give some excellent advice on things that authors can do (or at least try to do) to make their books more like these bestsellers.
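The authors don't spell out their model in reproducible detail, but the general approach - extract stylistic features from a text, then combine them with learned weights into a score - can be sketched in a few lines. This is purely illustrative: the features and weights below are invented for the example, not taken from the book.

```python
# Toy sketch of feature-based text scoring. The features and weights here
# are made up for illustration; the authors' actual model is not disclosed
# in this form.
from collections import Counter
import re

def features(text):
    """Extract a few simple stylistic features from a text sample."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)
    return {
        # proportion of contractions ("didn't", "she's") - a proxy for informal voice
        "contraction_rate": sum(c for w, c in counts.items() if "'" in w) / total,
        # average word length - a rough proxy for plain vs ornate prose
        "avg_word_length": sum(len(w) for w in words) / total,
        # frequency of a very common function word
        "the_rate": counts["the"] / total,
    }

# Hypothetical hand-set weights standing in for a trained model.
WEIGHTS = {"contraction_rate": 2.0, "avg_word_length": -0.5, "the_rate": 1.0}

def score(text):
    """Combine features into a single 'bestseller-likeness' score."""
    f = features(text)
    return sum(WEIGHTS[k] * v for k, v in f.items())

sample = "She didn't wait. The door opened and she ran."
print(round(score(sample), 3))
```

In the real system, the weights would come from training on texts with known bestseller outcomes, and the feature set would be far richer (plot shapes, topic proportions and so on) - but the scoring structure is essentially this.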
This isn't a panacea. In fact the authors admit that what their algorithms spot is not what most would regard as great fiction. The system laps up the likes of Dan Brown's output and 50 Shades of Grey. But interestingly, it is also a useful counter to those who say they can't understand why these kinds of books sell when they are so terribly written. In fact, in a number of respects these books are very well written - it's just that the criteria for 'well written' are not those used by the lit. crit. brigade.
Not only is this not a recipe for producing great literature, it's not about producing books everyone would like either. Taking a quick skim through the top 100 books selected by the analysis, there are perhaps three I would consider reading. But many of us are not 'bestseller' readers. We like our own little niches, and that's fine. This system isn't for us - it is about finding likely hits for the traditional bestseller market.
This genuinely is all very interesting, although the book has surprisingly little content for a full price hardback (it's large print, and there's a lot of dancing around exactly what they are doing). However, what absolutely isn't true is the assertion made here that 'mega-bestsellers are not black swans'. The system uses a number of measures, and though it's true that most mega-sellers like Harry Potter and 50 Shades do well on some of the measures, they pretty well all fall down on others. So, for instance, to write a bestseller we are encouraged to avoid fantasy, very British topics, sex and descriptions of bodies. What the model seems to do well is recognise what you might call run-of-the-mill bestsellers, rather than pick out most of the real runaway successes as stand-out.
There was also that missing piece of information. The authors are keen to tell us how many of the books that scored highly in their system were on the bestseller list, and that really is impressive. But they don't mention false positives - how many books the system thought should be bestsellers but weren't. It would have been interesting to know more about that.
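What's being asked for here is, in effect, the model's precision: of the books the system flags as likely bestsellers, what fraction actually charted? A quick sketch makes the distinction concrete (the numbers below are invented for illustration, not the authors' data):

```python
# Illustration of why false positives matter when judging a predictor.
# All labels here are invented; the book does not report these figures.

def precision_recall(predicted, actual):
    """predicted/actual are sets of book ids flagged / confirmed as bestsellers."""
    true_pos = len(predicted & actual)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall

# Suppose the system flags 10 books: 8 really charted, but 2 never sold
# (false positives), and 2 genuine bestsellers were missed entirely.
predicted = set(range(10))           # books the model flagged
actual = set(range(8)) | {20, 21}    # books that actually charted
p, r = precision_recall(predicted, actual)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.80 recall=0.80
```

Reporting only "how many bestsellers scored highly" is a recall-style figure; without the false positive count, the precision - and so the practical value to a publisher deciding what to buy - is unknowable.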
I'm sure we'll hear more of this kind of analysis, but I really hope publishers don't put too much stock in it - because it is very much a lowest common denominator approach (certainly from the viewpoint of someone who wouldn't consider reading more than 95% of its recommendations). That's not to say that the book isn't interesting - and for an author, there are some excellent insights into the things that attract this generic group of readers (or put them off), worth considering even if you do write science fiction or British crime fiction (say).
A fascinating piece of analysis, provided you don't take it all too seriously.
The Bestseller Code is available from amazon.co.uk and amazon.com.