Rating an iPhone app takes just a second, maybe two. “Enjoying Skype?” a prompt will ask, and you click on a 1-5 star rating. Millions of people respond to these requests, giving little thought to their fleeting whim.
Behind the scenes, though, an entire industry has spent countless hours and lines of code to craft this moment. The prompt, seemingly random, can be orchestrated to hit your glowing screen only at times when you are most likely to leave a five-star review.
Gaming apps will solicit a rating just after you reach a high score. Banking apps will ask when they know it’s payday. Gambling apps will prompt users after they are dealt the perfect Blackjack hand. A sporting app will give the nudge only when a user’s team is winning.
Apple has for a decade clamped down on “ratings farms” and “download bots” that companies use to fraudulently garner five-star scores and manipulate App Store rankings. And it has had some success. But these are blunt instruments trying to cheat the system in clear violation of Apple’s rules. The more sophisticated techniques stay within the rules but draw on behavioural psychology to understand your mood, emotions and behaviour — they are not hacking the system; they are hacking your brain.
“The algorithms that are used are very hush-hush,” says Saoud Khalifah, chief executive of Fakespot, a service that analyses the authenticity of reviews on the web. “They can target you when you are euphoric, when you have a lot of dopamine . . . They can use machine learning to determine [when] a user will be more inclined to leave positive reviews.”
Conversely, developers know when not to ask: a news app won’t solicit reviews from someone reading a story about death and destruction. The person who keeps getting their password wrong will certainly not be asked. This helps to prevent negative scores from becoming public, raising the overall average.
“We call it latent value sensing,” says Michael Sikorsky, chief executive of Robots and Pencils, which helps businesses in the mobile economy. “When you think you’ve got someone into a dark corner of the app, that is not the time to ask for a review.”
Such tactics — hidden from the public but an open secret among developers — have sparked widespread ratings inflation and become so prevalent that “among major enterprises, it’s hard to find ones that don’t do this”, says Brian Levine, vice-president of strategy and analytics at Mobiquity, a consultancy.
Even those who are reluctant, he adds, have come to realise it’s the cost of entry into Apple’s curated marketplace. “And so what’s happening is the App Store ratings are becoming meaningless for customers.”
The implications of the inflation are far-reaching. Millions of companies use some kind of mobile app to reach Apple’s nearly 1bn users. Commerce on the App Store grew to more than $500bn last year — more than most nations’ gross domestic product.
The average user, globally, spends 27 per cent of their daily waking hours on a mobile device, according to App Annie, a mobile data and analytics provider. Apple faces criticism — and a lawsuit from Fortnite developer Epic Games — for the 30 per cent fee it charges on revenue generated through the App Store. But that applies only to the 16 per cent of apps that charge fees, whereas ratings inflation impacts every app.
Competition between apps is intense, so garnering a high score is critical. Apptentive, a reputation management group, calls ratings the “lifeblood of the mobile app world”. Its research suggests that jumping from two stars to three stars can increase downloads by 306 per cent, while the leap from three stars to four delivers a 92 per cent boost. Gummicube, which helps companies with App Store optimisation, says four-fifths of users don’t trust an app with a rating below four stars.
“Everyone is incentivised to paint this positive world,” says Mr Khalifah. “The developer gets more installs, Apple gets more commission — it’s this snowball effect where you get more and more positivity.
“The problem,” he adds, “is that the truth gets rained on.”
The rise of the ‘in-app prompt’
The trigger for this inflation was Apple’s seemingly innocuous update to boost consumer engagement in September 2017. Users no longer had to proactively go to the App Store to rate an app — a system that often only attracted frustrated users.
Instead, with the introduction of iOS 11, Apple granted developers the ability to offer “in-app prompts”. The virtue of these prompts was that they generated participation and, arguably, overcame the “responder bias” that gave a loudspeaker to negative voices. Targeting a wider swath of people was meant to increase accuracy.
In one sense, it was a big success. Engagement soared. The average app went from receiving 19,000 ratings in 2017 to more than 100,000 in 2019, according to Apptentive. By contrast, ratings in Google’s Play Store — which did not offer in-app ratings in this period — only climbed from 33,000 to 43,000.
But the way Apple designed the system has allowed developers to exploit multiple loopholes and steer consumers into inflating their ratings, say critics. Because developers can request the in-app prompt at a time of their choosing, they can achieve “sample bias” by zeroing in on their fans and avoiding users deemed a risk.
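The effect of this kind of sample bias can be illustrated with a toy calculation. The sketch below is a minimal illustration in Python, not any developer's actual code: the eight users, their ratings and the "good moment" flag are invented for the example. It compares the average rating when everyone is prompted with the average when only users who have just had a good moment are asked.

```python
from statistics import mean

# Hypothetical users: (stars they would give, whether they just had a good moment).
# These figures are invented for illustration and are not taken from any real app.
users = [
    (5, True), (5, True), (4, True), (5, True),
    (3, False), (2, False), (1, False), (3, False),
]

def average_rating(users, only_prompt_happy):
    """Average star rating among the users who actually see the prompt."""
    prompted = [
        stars
        for stars, happy in users
        if happy or not only_prompt_happy
    ]
    return mean(prompted)

# Prompting everyone samples the whole user base.
everyone = average_rating(users, only_prompt_happy=False)

# Prompting only after a good moment samples just the satisfied users.
targeted = average_rating(users, only_prompt_happy=True)

print(f"prompt everyone: {everyone:.2f} stars")
print(f"prompt only after a good moment: {targeted:.2f} stars")
```

In this invented sample the underlying opinions never change; only the choice of who is asked moves the published average from 3.5 to 4.75 stars.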
Apple requires developers to use a standard interface that asks for a 1-5 star rating, which it says is designed to collect honest feedback. However, developers can introduce “framing bias”. If they prompt users with a positive note, such as “congratulations on hitting a high score!”, and then solicit a rating just after, the chances of a five-star rating improve.
The tech giant prohibits developers from prompting users with a message that says “How would you rate this app?”, seeing the answer, and then asking for an official App Store rating. However, developers can still “prime” consumers by tweaking the question. Video conferencing apps can ask “How was the quality of your call?” to suss out the five-star responses, and only then ask Apple for the official ratings prompt.
“What they are doing is tipping the scales in their favour, in a public rating,” says Rob Markey, a consultant at Bain & Company and co-creator of Net Promoter Score, a metric that helps companies measure, manage and improve customer loyalty. “As companies get better and better at manipulating the scores, the ratings systems themselves become less and less useful to consumers.”
Other platforms have experienced issues with inflated ratings. Amazon is investigating the most prolific reviewers on its UK website after a Financial Times investigation found evidence that they were profiting from posting thousands of five-star ratings.
Apple users are allowed to opt out of receiving in-app prompts. Moreover, they can, at any time, go to the App Store and write a negative review, and Apple doesn’t allow developers to block them. However, it does allow app makers to “reset” their ratings, and because in-app prompts are so effective at getting ordinary users to tap five stars, negative reviews can be drowned out. Mr Sikorsky cites one client whose app had 1,090 one-star reviews, but within weeks of changing the feedback mechanism the app received more than 35,000 ratings, with 90 per cent giving it five stars.
“It has very much been engineered,” says Wendy Johansson, a user experience designer at Publicis Sapient, a consultancy.
Apple has tried to prevent developers from nudging users into giving a higher rating and threatens to ban developers who violate the rules. In response to questions from the Financial Times, Apple says it has removed apps from the App Store, and developers from its Apple Developer Program, for breaking its rules.
“Our App Store Review Guidelines make it clear that any developer who attempts to cheat the system, such as by manipulating ratings or how their app appears in search results, may have their app taken down and could be removed from the Developer Program,” Apple adds.
Yet there is evidence that developers have found numerous ways to game the system without violating Apple’s rules. When asked about their tactics, developers point to Apple’s own in-app prompt guidelines, which state: “Make the request when users are most likely to feel satisfaction with your app, such as when they’ve completed an action, level, or task.”
For Mr Khalifah, one unintended consequence stemmed from Apple’s decision to limit developers to asking each user for a rating just three times a year, per app. The limit was designed to avoid irritating consumers, but in effect it made the in-app prompts a scarce commodity. That incentivised developers to build “Black Mirror-style algorithms”, a reference to the British dystopian technology TV series, to figure out when users were happiest, he says.
As a result, App Store ratings have been compromised, to the benefit of dominant players, says Mr Levine. “It’s anti-competitive, because only the big companies with more money are able to take advantage of this situation effectively,” he adds.
He argues that higher ratings can stifle innovation, because developers can create a mediocre app and still garner a 4.5-star rating. “A lot of apps aren’t being worked on, as much as they should, because all the indications are that customers like them,” he says.
Exactly how much ratings have soared is difficult to pinpoint, as Apple does not provide complete ratings data and history. But third parties have documented widespread ratings inflation after the introduction of iOS 11.
Among America’s seven biggest banking apps, ratings that varied between 1.2 and 4.9 stars in early 2017 are now 4.8 stars for all. In the Google Play store for Android devices, the highest rated among these same apps is 4.7 — the lowest is 4.4.
Even the apps ranked 50th most popular in the categories for shopping, lifestyle, finance, travel and entertainment are all rated at least 4.8 stars in the App Store. In the Play Store, apps with the same ranking vary between 3.8 and 4.7 stars, according to App Annie.
When Mr Levine analysed a cluster of eight popular apps that had introduced the in-app ratings prompt, he found the average score climbed from 3 stars to 4.7 stars within six months, while the number of user ratings shot up by a factor of 62.
Subway, the sandwich chain, struggled with poor app ratings for years before its score jumped from 1.7 stars to 4 stars within two weeks in early 2018. A note for the software update said it resolved a few minor bugs, while the main new feature was “[making] it easier to rate the app and provide feedback”.
The idea that higher ratings simply reflect better quality apps for the iPhone and iPad is contradicted by data showing that ratings with written reviews attached have experienced no inflation at all.
“We actually see a drop in the average review score on iOS among all apps and games from 4.2 in August 2017 to 3.9 in Sept 2017, to 3.4 by July 2020,” says Lexi Sydow, senior market insights manager at App Annie.
‘You only had one job’
Written reviews no longer carry much weight, as developers can filter out many of the one-star ratings and amplify higher scores without even using sophisticated techniques.
The simplest method, says Apptentive, whose clients include eBay, CNN and Alaska Airlines, is called “the love dialogue”.
It recommends that developers prime users with a simple message: “Do you love [this app]?” When a user clicks “no”, they are directed towards a private feedback channel. When they click “yes”, they receive Apple’s official “rate this app” interface.
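The routing logic described here is simple enough to sketch in a few lines. The example below is a hypothetical illustration in Python, not Apptentive’s or Apple’s actual code (a real iOS app would be written in Swift and would call Apple’s own prompt). The function names, step labels and sample users are all assumptions made for the example.

```python
# Hypothetical labels for the two destinations described in the article.
PUBLIC_PROMPT = "show_official_rating_prompt"
PRIVATE_FEEDBACK = "open_private_feedback_form"

def love_dialogue(answer: str) -> str:
    """Route a user based on their answer to 'Do you love this app?'."""
    if answer.strip().lower() == "yes":
        return PUBLIC_PROMPT
    # Anything other than "yes" is kept out of the public store rating.
    return PRIVATE_FEEDBACK

def public_ratings(users):
    """Return the star ratings that would reach the public store.

    `users` is a list of (answer, stars_they_would_give) pairs.
    """
    return [
        stars
        for answer, stars in users
        if love_dialogue(answer) == PUBLIC_PROMPT
    ]

# Invented sample: three fans and two unhappy users.
sample = [("yes", 5), ("yes", 4), ("no", 1), ("no", 2), ("yes", 5)]
visible = public_ratings(sample)
print(visible)
```

In this invented sample, the one-star and two-star opinions still exist, but they are diverted to the private channel and never count towards the public score.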
Ashley Sefferman, Apptentive’s head of content, says she does not consider this “gaming”; rather, it helps developers channel “actionable” feedback and hear more from their fans.
However, Apptentive statistics show that about two-fifths of users who click “no” to the love dialogue are deemed a risk and are steered away from a public review. Ms Sefferman has been recommending the technique since at least 2016 and calls it so effective that there is little excuse for having a low rating.
“The reason your app doesn’t have five stars is because the way you ask for in-app feedback is incorrect,” says an online Apptentive “how to” guide.
Google’s Android has long resisted offering in-app ratings, despite pressure from developers. Before 2017, the percentage of five-star ratings for Android apps downloaded from the Play Store was higher than Apple’s for all five categories tracked by Apptentive. But since 2017 App Store ratings have taken a commanding lead.
That is now likely to change. On August 5, Android relented and began offering in-app rating prompts. Like Apple, Android says its intention is for developers to get more “honest and unbiased” feedback. But it also cites developers praising the tool for helping them achieve, as one put it, an “all-time highest rating just a week after we implemented in-app reviews” — a clear acknowledgment that developers can expect a boost in ratings irrespective of whether they actually improve their app.
Bain’s Mr Markey says creating a marketplace with fair ratings should be critical for any platform provider. “It’s like, you have one job,” he says. “If you don’t do that, you lose buyers or you lose sellers, eventually.”
But both developers and consumers face the same problem: aside from Apple and Google, smartphone users have no place else to go.