Original Link: https://www.anandtech.com/show/16794/oneplus-9-performance-examination
Examining OnePlus' Performance Behaviour: Optimization or Misrepresentation?
by Andrei Frumusanu on July 6, 2021 6:00 AM EST
Benchmarks and performance measurements are a mainstay of device evaluation and an integral part of the review process for a lot of people, including both actual consumers and publications or analysts such as ourselves. In the past, when this relationship between benchmarks and real-world apps was broken, we’ve always attempted to expose such behaviour in order to have the vendors correct their ways, which led to quite a few articles over the years:
- Mobile Benchmark Cheating: When a SoC Vendor Provides It As A Service
- Huawei & Honor's Recent Benchmarking Behaviour: A Cheating Headache
- They're (Almost) All Dirty: The State of Cheating in Android Benchmarks
Every now and then, these topics resurface as vendors attempt to “differentiate” their devices amongst the crowd. It’s a repeated process which unfortunately no longer really surprises us when it happens.
Today’s piece fits within this class of articles, and more specifically covers OnePlus’ newest flagship phone, the OnePlus 9 Pro, and how its performance behaviour manages to be unique in the current mobile landscape. It’s unusual and baffling because it truly blurs the line between battery optimisation, performance cheating, and general device specification misrepresentation.
We have detected that OnePlus is blacklisting popular applications away from its fastest cores, causing slowdowns in typical workloads such as web browsing. We have confirmed that (a) benchmarks and (b) unknown apps get full performance, while most of the top popular non-benchmark apps get notably reduced performance. This is perhaps to improve battery life at the expense of performance, but it does mean that the regular benchmark results are somewhat useless as an indicator of user experience.
Starting off with weird benchmark numbers
As always with these stories, it all starts with discovering some oddity while going through the usual review process. The OnePlus 9 Pro was released in early April, however due to other work in the pipeline we never got to fully review the phone until now, and that review is also a bit delayed due to today’s piece.
In testing, I encountered something that really perplexed me and caught my attention: inexplicably slow browser benchmark figures which were not in line with any other Snapdragon 888 device on the market, reaching only a fraction of the scores and performance of other devices.
OnePlus 9 Pro - Chrome & Vivaldi Performance
In particular, Chrome seemed to be suffering from extremely weird behaviour that at worst ended up with the browser only being able to use the SoC’s little Cortex-A55 cores.
In the first (left) video, I’m starting Chrome fresh and running the browser-based Speedometer 2.0 benchmark. During the first run, the phone manages a score of 61.5, a low score that’s very abnormal for a Snapdragon 888. Monitoring the CPU’s behaviour during the run shows that the system never loads the Cortex-X1 core of the Snapdragon 888, and instead runs the benchmark on the Cortex-A78 cores. Furthermore, these are running at only 2GHz instead of their maximum 2.41GHz. What’s more perplexing is that when re-running the test immediately afterwards, the workload is completely isolated to the little Cortex-A55 cores, with an expectedly horrible score of 16.8.
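For readers who want to observe this kind of per-cluster behaviour themselves, the standard Linux cpufreq sysfs nodes can be polled while a workload runs. The following is a minimal sketch, not the tooling used for this article. It assumes the usual Snapdragon 888 core numbering (CPUs 0-3 Cortex-A55, 4-6 Cortex-A78, 7 Cortex-X1) and that `scaling_cur_freq` is readable on the device; the `CLUSTERS` table and the function name are illustrative. Note that a frequency reading is only a proxy for where threads are actually placed.

```python
from pathlib import Path

# Assumed Snapdragon 888 core numbering; verify on the actual device.
CLUSTERS = {"A55": range(0, 4), "A78": range(4, 7), "X1": range(7, 8)}


def read_cluster_freqs(root="/sys/devices/system/cpu"):
    """Return {cluster: [current frequency in kHz per core]} from cpufreq sysfs."""
    freqs = {}
    for name, cpus in CLUSTERS.items():
        values = []
        for cpu in cpus:
            node = Path(root) / f"cpu{cpu}" / "cpufreq" / "scaling_cur_freq"
            try:
                values.append(int(node.read_text().strip()))
            except (OSError, ValueError):
                values.append(None)  # core offline or node unreadable
        freqs[name] = values
    return freqs
```

Calling `read_cluster_freqs()` in a loop during a benchmark run and logging the result gives a rough picture of which cluster is ramping up and how far.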
We’ll get into more detail about browsers a bit later, but the only way I ever managed to achieve the full performance of the Snapdragon 888 and have the X1 core loaded in the benchmark was in Vivaldi, resulting in a score of 107, which is in line with other Snapdragon 888 phones. What’s utterly perplexing, however, is that while this score was repeatable back-to-back, it was only ever achievable on a fresh installation of the browser. Closing the app and re-launching it caused it to again no longer run on the Cortex-X1 core, and only run on the Cortex-A78 cores, this time around at the full 2.41GHz.
In any other WebView container integrated in any app, I was never able to get any web content to run on the X1 core, or if it did, it acted like Vivaldi in that it worked once and then never again.
This resulted in some really oddball benchmark numbers that portray the OnePlus 9 Pro as an early-2010s budget device with horrible performance.
The thing is that these figures did not fall in line with any other benchmark scores of the device. All our in-house benchmarks as well as third-party benchmarks presented normal performance figures in line with what you’d expect from a Snapdragon 888 phone, showing nothing particularly different.
Diving deeper: Traces of detection, OnePlus' kernel code
Naturally this perplexing behaviour piqued my interest as I was trying to figure out what’s happening and what’s going wrong. Investigating the device’s OS logs, I managed to detect a repeatable difference between applications that behaved extremely weirdly and those that didn’t.
In particular, there are entries relating to some sort of OnePlus Performance service running on the phone that handles Quality of Service requests. Generally, these kinds of mechanisms aren’t particularly interesting, as many vendors have OS framework side mechanisms to allow for better performance experiences, for example when launching or switching between apps. What was weird about OnePlus here is that it didn’t treat all apps equally:
I/OPPerf: perfAcquire # perflock change #: SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
I/OPPerf: perfAcquire # SCHEDTUNE change # : SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
I/OPPerf: mayPerfRelease # : SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
I/OPPerf: mayPerfRelease # reset SCHEDTUNE # : SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
I/OPPerf: perfAcquire # SPerfInfo{com.android.chrome 160 cpu_bouncing_02 0}
I/OPPerf: perfAcquire # set SCHEDTUNE #: SPerfInfo{com.android.chrome 160 cpu_bouncing_02 0}
In this log snippet, we see that the service acquires some schedtune performance lock (essentially some QoS level) when entering the default launcher, and when switching away from it, it releases this lock. When switching to Chrome, it also acquires a lock with some parameter called “cpu_bouncing_02”; more details on this later.
I/OPPerf: perfAcquire # SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
I/OPPerf: perfAcquire # set SCHEDTUNE #: SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
I/OPPerf: mayPerfRelease # : SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
I/OPPerf: mayPerfRelease # reset SCHEDTUNE # : SPerfInfo{net.oneplus.launcher 160 cpu_bouncing_01 0}
The odd thing is that when switching to a third-party browser such as Vivaldi, the lock doesn’t appear, and the performance service never actually does anything after switching away from the launcher.
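The comparison between detected and undetected apps can be automated by scanning captured log lines for `perfAcquire` entries. The sketch below is a hedged illustration built only on the line format visible in the excerpts above. The meaning of the two numeric fields inside `SPerfInfo{...}` is not known, so `field_a` and `field_b` are placeholder names, and `profiles_by_package` is a hypothetical helper rather than anything from OnePlus’ software.

```python
import re
from collections import defaultdict

# Matches the payload seen in the log excerpts, for example:
#   SPerfInfo{com.android.chrome 160 cpu_bouncing_02 0}
# The two numeric fields are undocumented; their names here are placeholders.
SPERF_RE = re.compile(
    r"SPerfInfo\{(?P<package>\S+) (?P<field_a>\d+) (?P<profile>\S+) (?P<field_b>\d+)\}"
)


def profiles_by_package(log_lines):
    """Map each package to the set of profiles it acquired via perfAcquire."""
    seen = defaultdict(set)
    for line in log_lines:
        # Only lock acquisitions matter; release lines are skipped.
        if "I/OPPerf" not in line or "perfAcquire" not in line:
            continue
        match = SPERF_RE.search(line)
        if match:
            seen[match.group("package")].add(match.group("profile"))
    return dict(seen)
```

An app that is brought to the foreground but never shows up in the returned mapping, as was the case with Vivaldi, would be one the service does not act on.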
I was curious what cpu_bouncing is supposed to be and sure enough, there’s a OnePlus module within the company’s kernel modifications which appears to be related to custom CPU frequency governor policies and configurations.
Beyond that, it also seems that OnePlus has made some scheduler modifications, adding what it calls TPD, a thread placement decision mechanism that allows for customised CPU affinity masks, limiting thread and task placements based on a classification that goes beyond the usual generic system CPU affinity, or even Android’s default framework app cpusets. To me, this looks to be the mechanism that is keeping workloads from being placed on the Cortex-X1 core, or even the A78 cores in some cases.
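As a refresher on what an affinity mask does, the sketch below encodes sets of CPU indices as bitmasks and pins a process using the standard Linux `sched_setaffinity` call exposed by Python’s `os` module. This is the generic userspace mechanism, shown purely for illustration; it is not OnePlus’ TPD implementation, which lives in the kernel scheduler. The core numbering is again an assumption about the Snapdragon 888 layout.

```python
import os

# Assumed Snapdragon 888 core numbering (not taken from OnePlus' sources).
LITTLE = {0, 1, 2, 3}  # Cortex-A55
MIDDLE = {4, 5, 6}     # Cortex-A78
PRIME = {7}            # Cortex-X1


def to_mask(cpus):
    """Encode a set of CPU indices as an affinity bitmask (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def from_mask(mask):
    """Decode an affinity bitmask back into a sorted list of CPU indices."""
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


def restrict_current_process(cpus):
    """Pin the calling process to `cpus` (Linux only) and return the new set."""
    os.sched_setaffinity(0, cpus)  # raises OSError if no listed CPU is usable
    return os.sched_getaffinity(0)
```

A mask of `to_mask(LITTLE | MIDDLE)`, which is `0x7F`, would keep a task off the prime core entirely, while `to_mask(LITTLE)`, which is `0x0F`, would confine it to the little cores.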
A blacklist instead of a whitelist – Spoofing popular apps
So far, what we found is that OnePlus’ OS seems to be detecting the currently running app, and imposing different CPU DVFS and scheduler behaviours depending on what you’re running. Given the behaviour observed earlier in the OS logs when switching to Chrome or other apps (I’m using the official Twitter app here as an example), a way to confirm the performance discrepancy is to simply spoof a custom app so that it identifies itself as one of those detected apps, which I did with our custom toolset app:
In our custom CPU frequency scaling tracking test, which works on absolute timescales in microsecond granularities, we immediately see that there’s a large difference between running the test as a nondescript workload, and running it spoofed as the Chrome or Twitter apps. As the anonymous test, the CPU behaviour is generally in line with what we’d see in a Qualcomm device, though quite a bit more aggressive due to OnePlus using an 8ms load tracking window. The CPU quickly reaches the X1 core at 2.84GHz as one would expect.
The Chrome and Twitter spoofed variants of the same test behave very differently and the scaling is much slower, reaching maximum performance states 3-4x slower. That’s generally still fine, but what’s really concerning here is again that the workloads never actually reach the X1 core, or only at very diminished frequencies far below the 2.84GHz state.
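The “3-4x slower” comparison boils down to measuring how long each run takes to first hit a given frequency. A minimal sketch of that measurement follows, assuming a trace is available as time-ordered `(timestamp_us, freq_khz)` pairs. The function names and trace format are illustrative assumptions and do not describe the in-house test tool.

```python
def time_to_reach(samples, target_khz):
    """Return microseconds from the first sample until the frequency is first
    at or above target_khz, or None if the target is never reached.

    samples: iterable of (timestamp_us, freq_khz) pairs in time order.
    """
    samples = list(samples)
    if not samples:
        return None
    start = samples[0][0]
    for timestamp_us, freq_khz in samples:
        if freq_khz >= target_khz:
            return timestamp_us - start
    return None


def slowdown(baseline_us, candidate_us):
    """How many times longer the candidate took to ramp up than the baseline."""
    return candidate_us / baseline_us
```

A run that never reaches the 2.84GHz state, as with the spoofed variants, returns `None` for that target, so the ramp comparison has to be made against the highest state that run does reach.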
We’ve confirmed that the scaling is different, but what exactly is the resulting performance? Running the SPEC suite disguised as both Chrome and Twitter, we can see some massive differences in the resulting performance compared to running the workload as a non-detected application.
In the Chrome-spoofed variant, it does appear that the workload is allowed to scale up to the Cortex-X1 core of the Snapdragon 888, but it is limited to 2.26GHz instead of 2.84GHz. This 20% reduction in frequency comes with a corresponding 20% reduction in the test score.
In the Twitter-spoofed variant, the workload never reaches the X1 core and instead settles at a steady state of 2GHz on the Cortex-A78 cores. The performance here is correspondingly quite meagre, showcasing only 64% of what the Snapdragon 888 is supposed to achieve. This figure also falls in line with the browser benchmark figures earlier when they do get scheduled on the A78 cores.
We’ve also reached out to Primate Labs’ John Poole to replicate the behaviour in a popular benchmark such as GeekBench. Spoofing GeekBench 5 as either Chrome or Twitter also results in notably lower benchmark figures that are 20% below the peak performance of the Snapdragon 888. The X1 core here appears to go to 2380MHz, which lines up with the limitation that’s found in OnePlus’ kernel sources, and the A78 cores also never go beyond 2GHz in the multi-threaded tests.
I’m not too sure why we're seeing a single-threaded behaviour discrepancy between SPEC and GB5 here, and why my test toolkit degraded far lower in performance, but it appears to be related to OnePlus’ extremely convoluted thread placement policies.
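The frequency caps quoted above can be turned into percentages with a quick calculation. This is plain arithmetic on the figures in the text, with variable names chosen for illustration; it assumes nothing about how scores scale with frequency.

```python
def reduction_pct(full, limited):
    """Percentage by which `limited` falls short of `full`."""
    return (full - limited) / full * 100


# Frequencies in GHz as quoted in the text.
X1_PEAK = 2.84
X1_SPEC_SPOOFED = 2.26  # SPEC run spoofed as Chrome
X1_GB5_SPOOFED = 2.38   # GeekBench 5 run spoofed as Chrome or Twitter (2380MHz)

spec_cut = reduction_pct(X1_PEAK, X1_SPEC_SPOOFED)  # roughly 20%
gb5_cut = reduction_pct(X1_PEAK, X1_GB5_SPOOFED)    # roughly 16%
```

The 2.26GHz cap matches the roughly 20% figure cited for SPEC, whereas the 2380MHz cap on its own is only about a 16% frequency cut, so frequency alone does not account for the whole GeekBench gap.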
PCMark is supposed to be more of a real-world representation of device performance in terms of dynamic UI workloads. Here too, changing the application ID to something like LinkedIn results in dramatic performance decreases in some of the tests. The “Writing” sub-test, which I consider one of the most important and representative for overall device responsiveness, sees a massive 34% reduction in its score. The Data Manipulation test is also very single-threaded bound, and because the X1 core is neutered, it also sees a large 36% reduction.
What apps are affected, and the real elephant in the room
We mainly talked about Chrome and Twitter until now, but the big question here is what other applications are affected and detected by OnePlus’ performance limiting mechanism. I haven’t discovered a smoking gun in the form of the actual blacklist used for application detection, which is most likely buried deep in the OS’ frameworks. The next best thing is to simply test out various popular applications and confirm whether they’re being detected or not:
cpu_bouncing_02:
us.zoom.videomeetings
com.whatsapp
com.facebook.katana
com.zhiliaoapp.musically (TikTok)
com.instagram.android
com.snapchat.android
com.google.android.youtube
com.chrome.beta
com.android.chrome
cpu_bouncing_01:
com.android.settings
net.oneplus.launcher
net.oneplus.forums
net.oneplus.weather
com.oneplus.backuprestore
com.oneplus.filemanager
com.oneplus.note
com.oneplus.gallery
com.oneplus.camera
com.reddit.frontpage
com.twitter.android
com.amazon.mShop.android.shopping
com.android.vending
com.dropbox.android
org.mozilla.firefox
com.google.android.dialer
com.google.android.gm
com.google.android.documentsui
com.google.android.apps.docs.editors.docs
com.google.android.apps.photos
com.google.android.apps.meetings
com.google.android.apps.messaging
com.linkedin.android
com.discord
com.netflix.mediaclient
com.king.candycrushsaga
com.adobe.lrmobile
com.adobe.reader
tv.twitch.android.app
com.microsoft.emmx
com.brave.browser
com.nianticlabs.pokemongo
com.microsoft.teams
com.adobe.scan.android
org.videolan.vlc
com.strava
com.amazon.avod.thirdpartyclient
com.airbnb.android
com.ubercab
com.ubercab.eats
com.microsoft.office.outlook
com.microsoft.office.excel
com.microsoft.office.powerpoint
com.microsoft.office.officehubrow
com.microsoft.office.word
This list is by no means exhaustive, but simply represents the apps which I tested out before deeming it sufficient to get the point across.
What’s evident here is that this is not a mechanism solely applying to a handful of apps, but one that applies to pretty much everything that has any level of popularity in the Play Store, including the whole of Google’s app suite, all of Microsoft’s Office apps, all popular social media apps, and any popular browser such as Firefox, Samsung Internet, or Microsoft Edge. Vivaldi was one of the browsers which wasn’t detected, and subsequently one of the only ones I managed to get any reasonable performance out of.
The only apps which were notably absent from detection were some of the popular games out there: while the likes of Candy Crush were performance limited, Genshin Impact was not. Of course, on top of games, no benchmark app was detected. Other applications which also were not detected were less popular alternatives; for example, while Uber and Uber Eats are detected, Lyft and Grubhub were not.
Beyond all popular applications, what also really stands out is that all of OnePlus’ own first-party apps are included within this list, even going as far as the OS’ system settings, and this is where things become problematic.
How do we quantify performance in such a scenario?
At this point, there’s evidently a large disconnect between the performance that’s exhibited in benchmarks and the experiences that users will be having within the most popular applications on the market, and even OnePlus’ own apps.
The one open question that remains is how exactly this whole mechanism affects the subjective user experience. After all, the phone has been out for a few months now and essentially nobody has remarked on the general performance of the phone. The reason here is that while performance peaks are evidently limited, it remains a responsive phone, and there are mechanisms at play which fight against the limitations, such as OS framework boosters and touch boosters. For example, while you will find web content pretty much limited to the little Cortex-A55 cores most of the time, this doesn’t apply when you interact with the phone, as a temporary touch booster will migrate things over to the middle A78 cores.
I wouldn’t blame anybody if they hadn’t necessarily noticed the performance discrepancy. I hadn’t immediately noticed it myself beyond the device’s extremely slow momentum scrolling speed setting. However, having it side-by-side with a Samsung Galaxy S21 Ultra or a Xiaomi Mi11 (Ultra) and paying attention, I do very much notice that the OnePlus 9 Pro is less responsive.
The problem with claims such as “less responsive” is that we cannot quantify them properly. While there are legitimate reviewers out there who are satisfied with writing about subjective performance evaluations, our pedigree here at AnandTech is all about being able to justify those claims with objective measurements. In this case, OnePlus is leaving us with very limited options.
Optimisation, or misrepresentation of what you’re being sold?
A further question is why exactly OnePlus has created such a mechanism in the first place. What exactly is the goal the company’s software teams were trying to achieve? Generally, in the past, application detection mechanisms were included as attempts to paint devices in a better light in regards to their performance. That is actually still the case here; it’s just that instead of increasing the benchmark performance, the company is reducing real-world application performance to below that of the theoretical hardware capabilities.
The only sensible rationale for such a decision is to improve the device’s power efficiency and battery life. The OnePlus 9 Pro, even though it advertises itself as using the same latest LTPO OLED technology as Samsung’s Galaxy S21 Ultra, for example, still suffers from notably worse power characteristics and worse power efficiency. In our web-browsing battery life test, even with this performance-crippling mechanism in place, with both devices at 120Hz under the same test conditions, the OnePlus 9 Pro achieves 11.75 hours of runtime versus the S21 Ultra’s 13.98 hours, the latter of which runs at the SoC’s full performance potential. I’ll be running the same test within an undetected browser such as Vivaldi to see what it ends up at, but I suspect it’ll be notably worse for the OP9Pro.
While application behaviour and performance varies case by case, the one aspect that holds true in almost all scenarios is that the OnePlus 9 Pro doesn’t deliver on the full characteristics of the Snapdragon 888. In blacklisted/detected applications, when and if the X1 core is being used at all, frequencies beyond 2.38GHz are unreachable save for brief booster moments. The vast majority of apps fall back to 2GHz Cortex-A78 cores. This is all a bit ironic, as the reason the larger, more performant X-series cores were created in the first place was to serve high transient response performance workloads, something they’re not allowed to do here.
The one argument I have for interpreting this mechanism as a misrepresentation of device performance rather than an overall power efficiency optimisation is the very fact that it doesn’t apply equally to all apps. If you’re using some more obscure app out there, you’ll be getting better performance than in a more popular app. Benchmark applications are also of course not representing the “intended” performance of the device, and I claim the limited behaviour to be the intended performance of the device, as this is how OnePlus configures its own OS and first-party apps. Out of the box, almost all preinstalled apps behave in this performance limited fashion.
The whole situation is rather baffling, and certainly represents the first case of a vendor implementing application and benchmark detection in this manner, with performance differing to such a degree. I’m not too sure what to make of it, bar simply exposing it and letting users come to their own conclusions.
Editor’s note: Today’s article focuses solely on the OnePlus 9 Pro, tested on firmware 11.2.6.6. Due to lacking sibling or past devices, we were not able to confirm whether this mechanism is present on other OnePlus phones.
Update: OnePlus has published a statement to our colleagues at XDA:
“Our top priority is always delivering a great user experience with our products, based in part on acting quickly on important user feedback. Following the launch of the OnePlus 9 and 9 Pro in March, some users told us about some areas where we could improve the devices’ battery life and heat management. As a result of this feedback, our R&D team has been working over the past few months to optimize the devices’ performance when using many of the most popular apps, including Chrome, by matching the app’s processor requirements with the most appropriate power. This has helped to provide a smooth experience while reducing power consumption. While this may impact the devices’ performance in some benchmarking apps, our focus as always is to do what we can to improve the performance of the device for our users.”
The reasoning is generally in line with what we suspected: it aims at improving the battery life of the device. However, OnePlus doesn’t further explain why only specific apps were targeted, and why the mechanism was opaque to users. Furthermore, the statement seems to read as if it was a post-launch update change; however, the behaviour can be confirmed to have already existed in day-1 reviews from other sites such as PhoneArena or NoteBookCheck prior to the launch date of the phone on March 31st.