Original Link: https://www.anandtech.com/show/18955/asrock-industrial-nuc-box1360pd5-review-raptor-lakep-on-the-leading-edge
ASRock Industrial NUC BOX-1360P/D5 Review: Raptor Lake-P on the Leading Edge
by Ganesh T S on July 18, 2023 10:30 AM EST
ASRock Industrial has been a key player in the ultra-compact form-factor (UCFF) PC space over the last few years. They have managed to release 4"x4" systems based on the latest AMD and Intel platforms well ahead of other vendors. The company's NUC(S) BOX-1300 series was launched along with Intel's introduction of Raptor Lake-P in January. The NUCS BOX-1360P/D4 was made available in early February because it broadly carried over the same board components and DDR4 support of the previous generation (NUC BOX-1200 series) product.
The NUC BOX-13xxP/D5 series was introduced in January, but has taken a couple of quarters to appear in retail. The system supports DDR5 SODIMMs and uses a Hayden Bridge retimer on the Thunderbolt 4 port in order to enable USB 3.2 Gen 2x2 support in the front panel's Type-C ports.
Our review of the NUCS BOX-1360P/D4 already brought out the performance benefits of Raptor Lake-P over the previous generation -P series offerings. We also took a second deep dive into Raptor Lake-P in our review of the Arena Canyon NUC. Both systems used DDR4-3200 SODIMMs. Does DDR5 make a difference in Raptor Lake-P performance over DDR4? This review provides a comprehensive look at the NUC BOX-1360P/D5 and attempts to answer that question at different power limits for the processor.
Introduction and Product Impressions
Intel's Raptor Lake-P is an evolved version of Alder Lake-P on a more efficient manufacturing process. It retains the heterogeneous computing architecture with a mixture of performance and efficiency cores. Desktop Raptor Lake is equipped with larger caches for the performance cores and a greater number of efficiency cores compared to desktop Alder Lake. However, the improvements in the -P series come entirely from the updated V-F curves. The turbo clocks are higher, allowing for better performance and power efficiency within the same nominal TDP as Alder Lake-P. Raptor Lake-P also includes additional Thunderbolt 4 ports and USB 3.2 Gen 2x2 support, subject to the adoption of specific board components.
ASRock is a well-known vendor in the consumer motherboard and mini-PC market. In 2011, the company set up the ASRock Industrial business unit to focus on industrial motherboards. The division branched out in 2018 as an independent vendor with exclusive focus on B2B products. The company has products for deployment in small businesses (offices), automation, robotics, security, and other industrial / IoT applications. As a company with a primarily B2B focus, its emphasis is on the development and sale of motherboards to various system integrators who can add their own value. The company also sells mini-PCs based on the developed motherboards into the retail channel. We have taken a close look at the performance profile of various ASRock Industrial UCFF PCs before, including that of the NUC BOX-1260P based on the Core i7-1260P Alder Lake-P processor and the NUCS BOX-1360P/D4 based on the Core i7-1360P Raptor Lake-P processor.
The company provided us with a barebones sample of the NUC BOX-1360P/D5 a few months back. It is their first Intel-based UCFF PC with DDR5 support. Unlike the NUCS BOX-1360P/D4 slim version without 2.5" drive support, the NUC BOX-1360P/D5 falls back to the I/O and chassis design seen in the NUC BOX-1200 series. Users get a 2.5" SATA drive bay, dual LAN capabilities, and both HDMI and DisplayPort display output options. The key differences compared to the NUCS BOX-1360P/D4 are summarized below.
- Replacement of DDR4-3200 SODIMM slots with DDR5-4800 SODIMM slots
- Additional 2.5 GbE RJ-45 port
- Replacement of an HDMI 2.0a port with a full-sized DisplayPort 1.4a port
- USB 3.2 Gen 2x2 (20 Gbps) support on both Type-C ports
- Hayden Bridge retimer for the Thunderbolt 4 port (compared to Burnside Bridge in the NUCS BOX-1360P/D4)
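The DDR4-3200 to DDR5-4800 change in the list above can be put in perspective with a quick peak-bandwidth calculation. The sketch below is illustrative only: it assumes both SODIMM slots are populated and that each SODIMM contributes 64 data bits per transfer (the function name and parameters are ours, not from any vendor tool).

```python
def peak_bandwidth_gbs(transfer_rate_mts: float,
                       bits_per_channel: int = 64,
                       channels: int = 2) -> float:
    """Peak theoretical memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfer_rate_mts: transfers per second in millions (e.g. 4800 for DDR5-4800).
    bits_per_channel:  data bits moved per transfer per SODIMM (assumed 64).
    channels:          number of populated SODIMM slots (assumed 2).
    """
    bytes_per_transfer = bits_per_channel / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9


ddr4_3200 = peak_bandwidth_gbs(3200)
ddr5_4800 = peak_bandwidth_gbs(4800)

print(f"DDR4-3200: {ddr4_3200:.1f} GB/s")
print(f"DDR5-4800: {ddr5_4800:.1f} GB/s")
print(f"Ratio:     {ddr5_4800 / ddr4_3200:.2f}x")
```

These are upper bounds on paper. Whether a given workload benefits depends on how bandwidth-bound it is and on memory latency, which is what the benchmarks in the rest of the review attempt to capture.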
ASRock Industrial delivers the NUC BOX-1360P/D5 in a nondescript box (with no indication of the model inside the package). In addition to the main unit, the package includes a VESA mount with screws, M.2 mounting screws, a quick guide for assembling the system, and a 120W (19V @ 6.32A) adapter with a geo-specific power cord.
ASRock Industrial markets its mini-PCs in a barebones configuration, with the choice of RAM and SSD left to the end user. Installing these components involves removing four screws from the underside of the unit, slotting in the SODIMMs, and affixing the M.2 SSD with a screw. Like the previous members in the NUC BOX series, the screw slot for the M.2 2280 SSD is on a separate plastic tab. The sides of the chassis are perforated for air intake, and the rear has the air vent that allows the laptop-style blower fan to exhaust air after passing it through the heat spreader. Pictures of the chassis as well as the board are available in the gallery below.
The barebones version of the NUC BOX-1360P/D5 needs DDR5-4800 SODIMMs and an M.2 SSD or 2.5" SATA drive to complete the build. We opted to install 2x 16GB of G.Skill's RipJaws DDR5-4800 SODIMMs along with a 500 GB Samsung SSD 980 PRO M.2 2280 NVMe SSD.
Windows 11 Enterprise 21H2 along with the latest updates was installed prior to proceeding with the performance evaluation. Similar to other UCFF PCs from ASRock Industrial, the NUC BOX-1360P also allows the CPU operation mode to be set to either 'Normal' (default) or 'Performance'. The latter setting keeps the fan at full speed at all times, but has the advantage of a higher PL1 limit (40W vs. 28W) for the processor. The full specifications of the review sample in both modes are provided in the table below.
Systems Specifications (as tested)
Aspect | ASRock NUC BOX-1360P-D5 (Performance) | ASRock NUC BOX-1360P-D5 (Normal) |
Processor | Intel Core i7-1360P; Raptor Lake 4P + 8E / 16T, up to 5.0 GHz (P), up to 3.7 GHz (E); Intel 7, 18MB L3; Min / Max / Base TDP: 20W / 64W / 28W; PL1 = 40W, PL2 = 64W | Intel Core i7-1360P; Raptor Lake 4P + 8E / 16T, up to 5.0 GHz (P), up to 3.7 GHz (E); Intel 7, 18MB L3; Min / Max / Base TDP: 20W / 64W / 28W; PL1 = 28W, PL2 = 64W |
Memory | G.Skill RipJaws F5-4800S3434A16GA2-RS DDR5-4800 SODIMM; 34-34-34-76 @ 4800 MHz; 2x16 GB | G.Skill RipJaws F5-4800S3434A16GA2-RS DDR5-4800 SODIMM; 34-34-34-76 @ 4800 MHz; 2x16 GB |
Graphics | Intel Iris Xe Graphics (96EU @ 1.50 GHz) | Intel Iris Xe Graphics (96EU @ 1.50 GHz) |
Disk Drive(s) | Samsung SSD 980 PRO (500 GB; M.2 2280 PCIe 4.0 x4 NVMe) (Samsung 128L V-NAND 3D TLC; Samsung Elpis Controller) | Samsung SSD 980 PRO (500 GB; M.2 2280 PCIe 4.0 x4 NVMe) (Samsung 128L V-NAND 3D TLC; Samsung Elpis Controller) |
Networking | 1x 2.5 GbE RJ-45 (Intel I226-LM); 1x 2.5 GbE RJ-45 (Intel I226-V); Intel Wi-Fi 6E AX210 (2x2 802.11ax - 2.4 Gbps) | 1x 2.5 GbE RJ-45 (Intel I226-LM); 1x 2.5 GbE RJ-45 (Intel I226-V); Intel Wi-Fi 6E AX210 (2x2 802.11ax - 2.4 Gbps) |
Audio | Realtek ALC256 (3.5mm Audio Jack in Front); Digital Audio with Bitstreaming Support over HDMI and DisplayPort | Realtek ALC256 (3.5mm Audio Jack in Front); Digital Audio with Bitstreaming Support over HDMI and DisplayPort |
Video | 1x HDMI 2.0b (Rear); 1x DisplayPort 1.4a (Rear); 1x DisplayPort 2.1 (Front / USB4); 1x DisplayPort 1.4a over Type-C Alt-Mode | 1x HDMI 2.0b (Rear); 1x DisplayPort 1.4a (Rear); 1x DisplayPort 2.1 (Front / USB4); 1x DisplayPort 1.4a over Type-C Alt-Mode |
Miscellaneous I/O Ports | 1x USB4 / Thunderbolt 4 Type-C (Front, up to 40 Gbps); 1x USB 3.2 Gen 2x2 Type-C (Front, with DP Alt Mode); 1x USB 3.2 Gen 2 Type-A (Front); 2x USB 3.2 Gen 2 Type-A (Rear) | 1x USB4 / Thunderbolt 4 Type-C (Front, up to 40 Gbps); 1x USB 3.2 Gen 2x2 Type-C (Front, with DP Alt Mode); 1x USB 3.2 Gen 2 Type-A (Front); 2x USB 3.2 Gen 2 Type-A (Rear) |
Operating System | Windows 11 Enterprise (22000.2124) | Windows 11 Enterprise (22000.2124) |
Pricing | US $670 (barebones); US $810 (as configured, no OS) | US $670 (barebones); US $810 (as configured, no OS) |
Full Specifications | ASRock Industrial NUC BOX-1360P/D5 Specifications | ASRock Industrial NUC BOX-1360P/D5 Specifications |
The next section takes a look at the various BIOS options and follows it up with a detailed platform analysis.
Setup Notes and Platform Analysis
Our evaluation of the NUC BOX-1360P/D5 (after completion of the build using the G.Skill SODIMMs and Samsung M.2 SSD) began with a look at the options available in the BIOS interface. As is typical for systems targeting the industrial market primarily, the main BIOS interface is a vanilla one. It does provide plenty of configuration options. The video below presents the entire gamut of available options.
The key feature is under Advanced > CPU Configuration > CPU Operating Mode, with the option to either keep it at 'Normal' or change it to 'Performance'. The latter setting increases the power budget available to the processor.
The block diagram below presents the overall high-speed I/O distribution.
The key updates over the NUCS BOX-1360P/D4 are evident in the above block diagram. The JHL9040R retimer enables DisplayPort 2.1 support as well as USB 3.2 Gen 2x2 (20 Gbps) support on the Thunderbolt 4 Type-C port. The HDMI port uses an ITE IT66318 retimer. The Realtek ALC256 fulfils the analog audio codec duties. A dedicated SATA port is brought out on the board along with the required power pins. A TPM device from Infineon communicates over an SPI interface with the Core i7-1360P.
The integration of the PCH inside the package limits board-design flexibility in allocating the HSIO lanes. Despite that, ASRock Industrial has delivered a compelling set of I/O options given the form-factor constraints. Making both Type-C ports in the front panel Thunderbolt 4-capable would have been a welcome improvement over the previous Intel-based UCFF systems from the company.
In today's review, we compare the NUC BOX-1360P/D5 and a host of other systems based on processors with TDPs ranging from 15W to 35W. The systems do not target the same market segments, but they have a few key aspects in common, making the comparisons relevant.
Comparative PC Configurations
Aspect | ASRock NUC BOX-1360P-D5 (Performance) |
CPU | Intel Core i7-1360P; Raptor Lake 4P + 8E / 16T, up to 5.0 GHz (P), up to 3.7 GHz (E); Intel 7, 18MB L3; Min / Max / Base TDP: 20W / 64W / 28W; PL1 = 40W, PL2 = 64W |
GPU | Intel Iris Xe Graphics (96EU @ 1.50 GHz) |
RAM | G.Skill RipJaws F5-4800S3434A16GA2-RS DDR5-4800 SODIMM; 34-34-34-76 @ 4800 MHz; 2x16 GB |
Storage | Samsung SSD 980 PRO (500 GB; M.2 2280 PCIe 4.0 x4 NVMe) (Samsung 128L V-NAND 3D TLC; Samsung Elpis Controller) |
Networking | 1x 2.5 GbE RJ-45 (Intel I226-LM); 1x 2.5 GbE RJ-45 (Intel I226-V); Intel Wi-Fi 6E AX210 (2x2 802.11ax - 2.4 Gbps) |
Price (in USD, when built) | US $700 (barebones); US $840 (as configured, no OS) |
The rest of this review deals with the comparative benchmark numbers for the UCFF systems outlined in the table above. All of the systems are based on 4"x4" motherboards, though the PL1 and PL2 configurations vary.
System Performance: UL and BAPCo Benchmarks
Our 2022 Q4 update to the test suite for Windows 11-based systems carries over some of the standard benchmarks we have been using over the last several years, including UL's PCMark and BAPCo's SYSmark. New additions include BAPCo's CrossMark multi-platform benchmarking tool, as well as UL's Procyon benchmark suite.
UL PCMark 10
UL's PCMark 10 evaluates computing systems for various usage scenarios (generic / essential tasks such as web browsing and starting up applications, productivity tasks such as editing spreadsheets and documents, gaming, and digital content creation). We benchmarked select PCs with the PCMark 10 Extended profile and recorded the scores for various scenarios. These scores are heavily influenced by the CPU and GPU in the system, though the RAM and storage device also play a part. The power plan was set to Balanced for all the PCs while processing the PCMark 10 benchmark. The scores for each contributing component / use-case environment are also graphed below.
UL PCMark 10 - Performance Scores
The NUC BOX-1360P/D5 shows marked improvement over the D4 model across the board. In the 'Performance' setting, the scores are better than the Arena Canyon's numbers. The AMD models enjoy a significant lead in the productivity benchmark. The graphics performance of the RDNA2 iGPU in the 4X4 BOX-7735U also lifts its overall scores, with both operating modes of the Rembrandt refresh model leapfrogging the NUC BOX-1360P/D5.
UL Procyon v2.1.544
PCMark 10 utilizes open-source software such as Libre Office and GIMP to evaluate system performance. However, many of UL's professional benchmark customers have been requesting evaluation with commonly-used commercial software such as Microsoft Office and Adobe applications. In order to serve their needs, UL introduced the Procyon benchmark in late 2020. There are five benchmark categories currently - Office Productivity, AI Inference, Battery Life, Photo Editing, and Video Editing. The battery life benchmark is applicable to Windows devices such as notebooks and tablets. We present results from our processing of three of the remaining benchmarks - Office Productivity, Photo Editing, and Video Editing.
UL Procyon - Office Productivity Scores
In the Office workloads, the Raptor Lake-P systems all perform quite similarly to each other irrespective of the memory technology used.
However, on the energy front, the 28W PL1 setting coupled with DDR5 SODIMMs results in the lowest consumption numbers for workload completion. The Arena Canyon NUC fares slightly worse in terms of scores while consuming the same amount of energy.
Moving on to the evaluation of Adobe Photoshop and Adobe Lightroom, we find the normal mode configuration of the NUC BOX-1360P/D5 performing similarly to the Arena Canyon NUC while consuming less energy. From a raw performance viewpoint, the 40W PL1 setting is enough to make the system climb up to the top spot.
UL Procyon evaluates performance for video editing using Adobe Premiere Pro. The two operating modes of the NUC BOX-1360P/D5 hold on to the top two spots.
In terms of energy consumption, the DDR5 configuration in the Normal mode manages to be the most efficient of the lot.
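The review does not spell out how the energy numbers are derived. A minimal sketch, assuming energy is obtained by integrating sampled at-wall power over the duration of the workload, illustrates why a lower PL1 can win on energy even when the run takes longer. All sample values below are hypothetical and are not measurements from this review.

```python
def energy_wh(timestamps_s, power_w):
    """Integrate power samples over time (trapezoidal rule) and return watt-hours.

    timestamps_s: sample times in seconds, in increasing order.
    power_w:      power readings in watts taken at those times.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    joules = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        joules += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return joules / 3600.0  # 1 Wh = 3600 J


# Hypothetical runs: a faster, higher-power run vs. a slower, lower-power one.
fast_run = energy_wh([0, 50, 100], [40.0, 40.0, 40.0])   # 100 s at 40 W
slow_run = energy_wh([0, 60, 120], [28.0, 28.0, 28.0])   # 120 s at 28 W

print(f"40 W for 100 s: {fast_run:.3f} Wh")
print(f"28 W for 120 s: {slow_run:.3f} Wh")
```

In this made-up case the slower run finishes with less total energy, because the extra run time does not offset the lower power draw.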
BAPCo CrossMark 1.0.1.86
BAPCo's CrossMark aims to simplify benchmark processing while still delivering scores that roughly tally with SYSmark. The main advantage is the cross-platform nature of the tool - allowing it to be run on smartphones and tablets as well.
BAPCo CrossMark 1.0.1.86 - Sub-Category Scores
The two modes of the NUC BOX-1360P/D5 take the top two spots, but the performance of the DDR5 configuration is quite similar to the DDR4 ones. Since CrossMark attempts to consolidate different workloads together without idle time intervals and play them back in a non-real-time environment, it is not entirely representative of real-world performance in the way SYSmark 25 is.
System Performance: Miscellaneous Workloads
Standardized benchmarks such as UL's PCMark 10 and BAPCo's SYSmark take a holistic view of the system and process a wide range of workloads to arrive at a single score. Some systems are required to excel at specific tasks - so it is often helpful to see how a computer performs in specific scenarios such as rendering, transcoding, JavaScript execution (web browsing), etc. This section presents focused benchmark numbers for specific application scenarios.
3D Rendering - CINEBENCH R23
We use CINEBENCH R23 for 3D rendering evaluation. R23 provides two benchmark modes - single threaded and multi-threaded. Evaluation of different PC configurations in both supported modes provided us the following results.
Intel's Arena Canyon NUC calls the shots in the CINEBENCH workloads - both in the single-threaded and multi-threaded versions. The NUC BOX-1360P/D5 manages to sneak into the second spot, but surprisingly does it in the normal mode (28W PL1) for the single-threaded case, and in the performance mode (40W PL1) for the multi-threaded one.
Transcoding: Handbrake 1.5.1
Handbrake is one of the most user-friendly open source transcoding front-ends in the market. It allows users to opt for either software-based higher quality processing or hardware-based fast processing in their transcoding jobs. Our new test suite uses the 'Tears of Steel' 4K AVC video as input and transcodes it with a quality setting of 19 to create a 720p AVC stream and a 1080p HEVC stream.
Intel's Arena Canyon NUC again has the upper hand over the NUC BOX-1360P/D5 in both software transcoding cases. Additionally, the presence of eight high-performance cores in the 4X4 BOX-7735U makes it an excellent performer in the x265 case. In both workloads, the 40W PL1 setting handily outperforms the 28W PL1 setting for the NUC BOX-1360P/D5 - just as one would expect.
Hardware transcoding frame rates within the same GPU generation are a matter of clock rates and power budget. Here, we see the NUC BOX-1360P/D5 and Arena Canyon NUCs performing very similarly to each other, with the 40W PL1 setting not providing the former with any significant advantage.
Archiving: 7-Zip 21.7
The 7-Zip benchmark is carried over from our previous test suite with an update to the latest version of the open source compression / decompression software.
The higher power budget enables the performance mode-enabled NUC BOX-1360P to deliver the highest compression rates. However, that budget is not quite helpful for decompression, as the AMD models rule the roost by taking up all the top four spots.
Web Browsing: JetStream, Speedometer, and Principled Technologies WebXPRT4
Web browser-based workloads have emerged as a major component of the typical home and business PC usage scenarios. For headless systems, many applications based on JavaScript are becoming relevant too. In order to evaluate systems for their JavaScript execution efficiency, we are carrying over the browser-focused benchmarks from the WebKit developers used in our notebook reviews. Hosted at BrowserBench, JetStream 2.0 benchmarks JavaScript and WebAssembly performance, while Speedometer measures web application responsiveness.
From a real-life workload perspective, we also process WebXPRT4 from Principled Technologies. WebXPRT4 benchmarks the performance of some popular JavaScript libraries that are widely used in websites.
The NUC BOX-1360P/D5 configurations make up the top spots in all three benchmarks, but there exists no significant gulf between the units representing the latest generation of products from both processor vendors.
Application Startup: GIMP 2.10.30
A new addition to our systems test suite is AppTimer - a benchmark that loads up a program and determines how long it takes for it to accept user inputs. We use GIMP 2.10.30 with a 50MB multi-layered xcf file as input. What we test here is the first run as well as the cached run - normally on the first time a user loads the GIMP package from a fresh install, the system has to configure a few dozen files that remain optimized on subsequent opening. For our test we delete those configured optimized files in order to force a 'fresh load' every second time the software is run.
As it turns out, GIMP does optimizations for every CPU thread in the system, which means that higher thread-count processors take a lot longer to run. So the test runs quickly on systems with fewer threads, though fast cores are also needed. Interestingly, the 28W PL1 setting is the best across the board, but the Arena Canyon NUC performs just as well.
Cryptography Benchmarks
Cryptography has become an indispensable part of our interaction with computing systems. Almost all modern systems have some sort of hardware-acceleration for making cryptographic operations faster and more power efficient. In the case of IoT servers, many applications - including web server functionality and VPN - need cryptography acceleration.
BitLocker is a Windows feature that encrypts entire disk volumes. While drives that offer encryption capabilities are dealt with using that feature, most legacy systems and external drives have to use the host system implementation. Windows has no direct benchmark for BitLocker. However, we cooked up a BitLocker operation sequence to determine the adeptness of the system at handling BitLocker operations. We start off with a 4.5GB RAM drive in which a 4GB VHD (virtual hard disk) is created. This VHD is then mounted, and BitLocker is enabled on the volume. Once the BitLocker encryption process gets done, BitLocker is disabled. This triggers a decryption process. The times taken to complete the encryption and decryption are recorded. This process is repeated 25 times, and the average of the last 20 iterations is graphed below.
Hardware acceleration is available for the operations in all of the systems. The time taken for processing is directly dependent on the available power budget. The recent AMD systems fare better than the Intel ones, and the NUC BOX-1360P/D5 lands in the middle of the pack in both workload components.
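The reporting step of the BitLocker procedure (25 iterations, with only the last 20 averaged) can be sketched as follows. This is an illustration of the averaging only, not the actual test harness; the function name and the sample timings are hypothetical.

```python
from statistics import mean


def steady_state_average(times_s, warmup=5):
    """Average the recorded times after discarding the first `warmup` iterations.

    With 25 recorded iterations and warmup=5, this averages the last 20,
    so that early runs influenced by caching or other start-up effects
    do not skew the reported number.
    """
    if len(times_s) <= warmup:
        raise ValueError("need more iterations than warm-up runs")
    return mean(times_s[warmup:])


# Hypothetical encryption times (seconds): 5 slower warm-up runs, then 20 steady runs.
encryption_times = [10.0] * 5 + [8.0] * 20
print(f"Reported encryption time: {steady_state_average(encryption_times):.2f} s")
```

The same helper would be applied separately to the encryption and decryption timings.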
GPU Performance: Synthetic Benchmarks
Intel did not make significant changes in the integrated GPU when moving from Alder Lake to Raptor Lake. Process maturity has allowed it to clock the iGPU a bit higher, but the number of EUs remains the same as in the previous generation. GPU performance evaluation typically involves gaming workloads, and for select PCs, GPU compute. Prior to that, we wanted to take a look at the capabilities of the iGPU in the Core i7-1360P.
The Intel Iris Xe Graphics in the Core i7-1360P handily outperforms the iGPU in the Ryzen 5000U. However, we saw in the review of the 4X4 BOX-7735U that RDNA2 managed to wrest back the crown for AMD in the iGPU space. The evaluation of GPU workloads in the NUC BOX-1360P/D5 and comparison against previous results can allow us to check whether DDR5 can provide some extra performance benefits.
GFXBench
The DirectX 12-based GFXBench tests from Kishonti are cross-platform, and available all the way down to smartphones. As such, they are not very taxing for discrete GPUs and modern integrated GPUs. We processed the offscreen versions of the 'Aztec Ruins' benchmark.
At lower resolutions, the higher power budget is a boon for the NUC BOX-1360P/D5. Even the normal mode outperforms the best possible numbers from the 7735U. However, a resolution increase results in the RDNA2 iGPU reclaiming the title.
UL 3DMark
Four different workload sets were processed in 3DMark - Fire Strike, Time Spy, Night Raid, and Wild Life.
3DMark Fire Strike
The Fire Strike benchmark has three workloads. The base version is meant for high-performance gaming PCs. It uses DirectX 11 (feature level 11) to render frames at 1920 x 1080. The Extreme version targets 1440p gaming requirements, while the Ultra version targets 4K gaming systems, and renders at 3840 x 2160. The graph below presents the overall score for the Fire Strike Extreme and Fire Strike Ultra benchmarks across all the systems that are being compared.
UL 3DMark - Fire Strike Workloads
The GFXBench results had already revealed that the RDNA2 iGPU in the 7735U outperforms the Iris Xe iGPU in the Core i7-1360P. So, it is no surprise that the 7735U takes the top two spots in both versions of the Fire Strike benchmark.
3DMark Time Spy
The Time Spy workload has two levels with different complexities. Both use DirectX 12 (feature level 11). However, the plain version targets high-performance gaming PCs with a 2560 x 1440 render resolution, while the Extreme version renders at 3840 x 2160 resolution. The graphs below present both numbers for all the systems that are being compared in this review.
UL 3DMark - Time Spy Workloads
The usage of 1440p and 2160p resolutions again means that the NUC BOX-1360P/D5 has to settle for third place. There is a bit of an advantage for it over the DDR4-based Arena Canyon NUC and the NUCS BOX-1360P/D4 systems.
3DMark Wild Life
The Wild Life workload was initially introduced as a cross-platform GPU benchmark in 2020. It renders at a 2560 x 1440 resolution using Vulkan 1.1 APIs on Windows. It is a relatively short-running test, reflective of mobile GPU usage. In mid-2021, UL released the Wild Life Extreme workload that was a more demanding version that renders at 3840 x 2160 and runs for a much longer duration reflective of typical desktop gaming usage.
UL 3DMark - Wild Life Workloads
The Wild Life workloads finally see the NUC BOX-1360P/D5's 40W PL1 avatar successfully overtake the 7735U's best performance mode.
3DMark Night Raid
The Night Raid workload is a DirectX 12 benchmark test. It is less demanding than Time Spy, and is optimized for integrated graphics. The graph below presents the overall score in this workload for different system configurations.
This workload sees a repeat of the Time Spy rankings, with the RDNA2 iGPU being miles ahead of the Iris Xe iGPU in the Core i7-1360P-based systems.
Workstation Performance - SPECworkstation 3.1
SFF PCs traditionally do not lend themselves to workstation duties. However, a recent trend towards miniaturized workstations has been observed. While UCFF systems are still not capable enough to become workstations, the rapid performance improvements over the years have encouraged us to benchmark some of the systems for both content creation workloads as well as professional applications. Towards this, we processed the SPECworkstation 3.1 benchmark from SPEC.
The SPECworkstation 3.1 benchmark measures workstation performance based on a number of professional applications. It includes more than 140 tests based on 30 different workloads that exercise the CPU, graphics, I/O and memory hierarchy. These workloads fall into different categories.
- Media and Entertainment (3D animation, rendering)
- Product Development (CAD/CAM/CAE)
- Life Sciences (medical, molecular)
- Financial Services
- Energy (oil and gas)
- General Operations
- GPU Compute
Individual scores are generated for each test and a composite score for each category is calculated based on a reference machine (HP Z240 tower workstation using an Intel E3-1240 v5 CPU, an AMD Radeon Pro WX3100 GPU, 16GB of DDR4-2133, and a SanDisk 512GB SSD). Official benchmark results generated automatically by the benchmark itself are linked in the table below for the systems being compared.
SPECworkstation 3.1 Official Results (2K) | |
ASRock NUC BOX-1360P-D5 (Performance) | Run Summary |
ASRock NUC BOX-1360P-D5 (Normal) | Run Summary |
ASRock 4X4 BOX-7735U (Performance) | Run Summary |
ASRock 4X4 BOX-7735U (Normal) | Run Summary |
Intel NUC13ANKi7 (Arena Canyon) | Run Summary |
ASRock NUC BOX-1260P | Run Summary |
ASRock 4X4 BOX-5800U (Performance) | Run Summary |
ASRock NUCS BOX-1360P-D4 | Run Summary |
Intel NUC12WSKi7 (Wall Street Canyon) | Run Summary |
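As a rough illustration of scoring against a reference machine, the sketch below normalizes each test result to the reference and combines the ratios with a geometric mean. The use of a geometric mean and the time-based normalization are our assumptions for illustration; the linked run summaries and SPEC's documentation remain the authority on how the benchmark actually computes its composites. All timings below are hypothetical.

```python
import math


def normalized_score(reference_time_s: float, measured_time_s: float) -> float:
    """Ratio > 1.0 means the tested system finished faster than the reference."""
    return reference_time_s / measured_time_s


def composite_score(ratios) -> float:
    """Geometric mean of per-test ratios (assumed aggregation method)."""
    if not ratios:
        raise ValueError("at least one ratio is required")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical (reference, measured) timings in seconds for three tests in one category.
timings = [(120.0, 60.0), (90.0, 90.0), (200.0, 100.0)]
ratios = [normalized_score(ref, meas) for ref, meas in timings]
print(f"Per-test ratios: {ratios}")
print(f"Category composite: {composite_score(ratios):.3f}")
```

A geometric mean keeps a single outlier test from dominating the category score, which is one reason it is a common choice for composites of normalized results.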
Details of the tests in each category, as well as an overall comparison of the systems on a per-category basis are presented below.
Media and Entertainment
The Media and Entertainment category comprises workloads from five distinct applications:
- The Blender workload measures system performance for content creation using the open-source Blender application. Tests include rendering of scenes of varying complexity using the OpenGL and ray-tracing renderers.
- The Handbrake workload uses the open-source Handbrake application to transcode a 4K H.264 file into a H.265 file at 4K and 2K resolutions using the CPU capabilities alone.
- The LuxRender workload benchmarks the LuxCore physically based renderer using LuxMark.
- The Maya workload uses the SPECviewperf 13 maya-05 viewset to replay traces generated using the Autodesk Maya 2017 application for 3D animation.
- The 3ds Max workload uses the SPECviewperf 13 3dsmax-06 viewset to replay traces generated by Autodesk's 3ds Max 2016 using the default Nitrous DX11 driver. The workload represents system usage for 3D modeling tasks.
Product Development
The Product Development category comprises eight distinct workloads:
- The Rodinia (CFD) workload benchmarks a computational fluid dynamics (CFD) algorithm.
- The WPCcfd workload benchmarks another CFD algorithm involving combustion and turbulence modeling.
- The CalculiX workload uses the Calculix finite-element analysis program to model a jet engine turbine's internal temperature.
- The Catia workload uses the catia-05 viewset from SPECviewperf 13 to replay traces generated by Dassault Systemes' CATIA V6 R2012 3D CAD application.
- The Creo workload uses the creo-02 viewset from SPECviewperf 13 to replay traces generated by PTC's Creo, a 3D CAD application.
- The NX workload uses the snx-03 viewset from SPECviewperf 13 to replay traces generated by the Siemens PLM NX 8.0 CAD/CAM/CAE application.
- The Solidworks workload uses the sw-04 viewset from SPECviewperf 13 to replay traces generated by Dassault Systemes' SolidWorks 2013 SP1 CAD/CAE application.
- The Showcase workload uses the showcase-02 viewset from SPECviewperf 13 to replay traces from Autodesk's Showcase 2013 3D visualization and presentation application.
Life Sciences
The Life Sciences category comprises four distinct test sets:
- The LAMMPS set comprises five tests simulating different molecular properties using the LAMMPS molecular dynamics simulator.
- The NAMD set comprises three tests simulating different molecular interactions.
- The Rodinia (Life Sciences) set comprises four tests - the Heartwall medical imaging algorithm, the Lavamd algorithm for calculation of particle potential and relocation in a 3D space due to mutual forces, the Hotspot algorithm to estimate processor temperature with thermal simulations, and the SRAD anisotropic diffusion algorithm for denoising.
- The Medical workload uses the medical-02 viewset from SPECviewperf 13 to determine system performance for the Tuvok rendering core in the ImageVis3D volume visualization program.
Financial Services
The Financial Services workload set benchmarks the system for three popular algorithms used in the financial services industry - the Monte Carlo probability simulation for risk assessment and forecast modeling, the Black-Scholes pricing model, and the Binomial Options pricing model.
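For readers unfamiliar with these models, the sketch below prices a European call option with the closed-form Black-Scholes formula and with a Cox-Ross-Rubinstein binomial tree, which converges to the same value as the number of steps grows. It is a textbook illustration only, not the code SPECworkstation runs, and the option parameters are hypothetical.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s: float, k: float, r: float, sigma: float, t: float) -> float:
    """European call price: spot s, strike k, rate r, volatility sigma, years t."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def binomial_call(s: float, k: float, r: float, sigma: float, t: float,
                  steps: int = 500) -> float:
    """European call price on a Cox-Ross-Rubinstein binomial tree."""
    dt = t / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    p = (math.exp(r * dt) - down) / (up - down)  # risk-neutral up probability
    discount = math.exp(-r * dt)
    # Payoffs at expiry, indexed by the number of up moves.
    values = [max(s * up ** j * down ** (steps - j) - k, 0.0) for j in range(steps + 1)]
    # Roll the tree back to the present one step at a time.
    for _ in range(steps):
        values = [discount * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]


closed_form = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
tree_price = binomial_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"Black-Scholes: {closed_form:.4f}")
print(f"Binomial tree: {tree_price:.4f}")
```

The tree version does far more arithmetic for the same answer, which hints at why workloads of this kind scale with core count and sustained clocks.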
Energy
The Energy category comprises workloads simulating various algorithms used in the oil and gas industry:
- The FFTW workload computes discrete Fourier transforms of large matrices.
- The Convolution workload computes the convolution of a random 100x100 filter on a 400 megapixel image.
- The SRMP workload processes the Surface-Related Multiples Prediction algorithm used in seismic data processing.
- The Kirchhoff Migration workload processes an algorithm to calculate the back propagation of a seismic wavefield.
- The Poisson workload takes advantage of the OpenMP multi-processing framework to solve Poisson's equation.
- The Energy workload uses the energy-02 viewset from SPECviewperf 13 to determine system performance for the open-source OpendTect seismic visualization application.
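To give a sense of scale for the Convolution workload listed above, the sketch below counts the multiply-accumulate operations a direct (non-FFT) convolution would need for a 100x100 filter over a 400 megapixel image. It assumes one output per input pixel and ignores border handling. Whether the benchmark actually uses a direct method is not stated in this review, and the sustained throughput figure is purely hypothetical.

```python
def direct_convolution_macs(image_pixels: int, filter_height: int, filter_width: int) -> int:
    """Multiply-accumulates for a direct 2D convolution, one output per input pixel."""
    return image_pixels * filter_height * filter_width


def seconds_at(macs: int, sustained_flops: float) -> float:
    """Run time if each multiply-accumulate counts as 2 floating-point operations."""
    return 2 * macs / sustained_flops


macs = direct_convolution_macs(400_000_000, 100, 100)
print(f"Multiply-accumulates: {macs:.1e}")
# Hypothetical sustained rate of 100 GFLOP/s, chosen only for illustration.
print(f"Time at 100 GFLOP/s: {seconds_at(macs, 100e9):.0f} s")
```

Work of this size runs for long enough that sustained power limits, rather than short turbo bursts, determine the result.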
General Operations
In the General Operations category, the focus is on workloads from widely used applications in the workstation market:
- The 7zip workload represents compression and decompression operations using the open-source 7zip file archiver program.
- The Python workload benchmarks math operations using the numpy and scipy libraries along with other Python features.
- The Octave workload performs math operations using the Octave programming language used in scientific computing.
- The Storage workload evaluates the performance of the underlying storage device using transaction traces from multiple workstation applications.
GPU Compute
In the GPU Compute category, the focus is on workloads taking advantage of the GPU compute capabilities using either OpenCL or CUDA, as applicable:
- The LuxRender benchmark is the same as the one seen in the media and entertainment category.
- The Caffe benchmark measures the performance of the Caffe deep-learning framework.
- The Folding@Home benchmark measures the performance of the system for distributed computing workloads focused on tasks such as protein folding and drug design.
We only process the OpenCL variants of the benchmark, as the CUDA version doesn't process correctly with default driver installs.
Overall, a large number of high-performance cores helps the 4X4 BOX-7735U come out on top across almost all of the workload categories. Since these workloads demand sustained performance, the PL1 / PL2 tricks are not of much use. On the GPU compute side, Intel's iGPU fails on the Caffe workload, causing the Intel-based mini-PCs to return abysmal scores for the component.
System Performance: Multi-Tasking
One of the key drivers of advancements in computing systems is multi-tasking. On mobile devices, this is quite lightweight - cases such as background email checks while the user is playing a mobile game are quite common. To optimize the user experience in those types of scenarios, mobile SoC manufacturers started integrating heterogeneous CPU cores - some with high performance for demanding workloads, while others were frugal in terms of both power consumption / die area and performance. This trend is now slowly making its way into the desktop PC space.
Multi-tasking in typical PC usage is much more demanding compared to phones and tablets. Desktop OSes allow users to launch and utilize a large number of demanding programs simultaneously. Responsiveness is dictated largely by the OS scheduler allowing different tasks to move to the background. Intel's Alder Lake processors work closely with the Windows 11 thread scheduler to optimize performance in these cases. Keeping these aspects in mind, the evaluation of multi-tasking performance is an interesting subject to tackle.
We have augmented our systems benchmarking suite to quantitatively analyze the multi-tasking performance of various platforms. The evaluation involves triggering an ffmpeg transcoding task in a loop to transform 1716 frames of 3840x1714 video, encoded as 24fps AVC (the Blender Project's 'Tears of Steel' 4K version), into a 1080p HEVC version. The transcoding rate is monitored continuously. One complete transcoding pass is allowed to complete before starting the first multi-tasking workload - the PCMark 10 Extended bench suite. A comparative view of the PCMark 10 scores for various scenarios is presented in the graphs below. Also available for concurrent viewing are scores in the normal case where the benchmark was processed without any concurrent load, and a graph presenting the loss in performance.
UL PCMark 10 Load Testing - Digital Content Creation Scores
UL PCMark 10 Load Testing - Productivity Scores
UL PCMark 10 Load Testing - Essentials Scores
UL PCMark 10 Load Testing - Gaming Scores
UL PCMark 10 Load Testing - Overall Scores
All PCMark 10 workload components see the relative ordering being maintained even after the addition of the concurrent loading.
Following the completion of the PCMark 10 benchmark, a short delay is introduced prior to the processing of Principled Technologies WebXPRT4 on MS Edge. Similar to the PCMark 10 results presentation, the graph below shows the scores recorded with the transcoding load active. Available for comparison are the dedicated CPU power scores and a measure of the performance loss.
Principled Technologies WebXPRT4 Load Testing Scores (MS Edge)
Despite a 50%+ loss in performance, the 40W PL1 configuration of the NUC BOX-1360P/D5 and the Arena Canyon NUC take the top spots even when concurrent loading is active.
The final workload tested as part of the multitasking evaluation routine is CINEBENCH R23.
3D Rendering - CINEBENCH R23 Load Testing - Single Thread Score
3D Rendering - CINEBENCH R23 Load Testing - Multiple Thread Score
The presence of heterogeneous cores is a challenge for handling new multi-threaded workloads when a multi-threaded workload like a transcoding task is already active. That is the primary reason for the AMD-based systems showing minimal performance loss when concurrent loads of different complexities are simultaneously triggered.
After the completion of all the workloads, we let the transcoding routine run to completion. The monitored transcoding rate throughout the above evaluation routine (in terms of frames per second) is graphed below.
The behavior of the NUC BOX-1360P/D5 is very similar to that of the NUCS BOX-1360P/D4, and it is not immediately obvious if Thread Director is working as intended.
ASRock Industrial NUC BOX-1360P/D5 (Performance) ffmpeg Transcoding Rate (Multi-Tasking Test)
Task Segment | Minimum (FPS) | Average (FPS) | Maximum (FPS)
Transcode Start Pass | 3.5 | 13.21 | 46.5
PCMark 10 | 0 | 11.69 | 39.5
WebXPRT 4 | 3.5 | 11.18 | 21
Cinebench R23 | 2.5 | 11.74 | 41
Transcode End Pass | 4 | 13.01 | 42.5
The silver lining seems to be that the drop in transcoding performance is not as heavy as what was seen in other systems.
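The size of that drop can be worked out directly from the average rates in the table above. The short sketch below treats the 'Transcode Start Pass' average as the unloaded baseline, which is our assumption for illustration rather than a metric defined in the test suite.

```python
# Average transcoding rates (FPS) from the table above
average_fps = {
    "Transcode Start Pass": 13.21,
    "PCMark 10": 11.69,
    "WebXPRT 4": 11.18,
    "Cinebench R23": 11.74,
    "Transcode End Pass": 13.01,
}

# Assumption: the start pass represents the rate without a concurrent workload
baseline = average_fps["Transcode Start Pass"]


def percent_drop(rate, reference):
    """Percentage by which rate falls below reference."""
    return 100.0 * (reference - rate) / reference


drops = {
    segment: percent_drop(fps, baseline)
    for segment, fps in average_fps.items()
    if segment != "Transcode Start Pass"
}

for segment, drop in drops.items():
    print(f"{segment}: {drop:.1f}% below the start pass")
```

By this measure, the average transcoding rate falls by roughly 11% to 15% while the foreground benchmarks are running, and recovers to within about 2% of the baseline in the end pass.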
HTPC Credentials
The 2022 Q4 update to our system reviews brings an updated HTPC evaluation suite for systems. After doing away with the evaluation of display refresh rate stability and Netflix streaming evaluation, the local media playback configurations have also seen a revamp. This section details each of the workloads processed on the ASRock NUC BOX-1360P/D5 as part of the HTPC suite.
YouTube Streaming Efficiency
YouTube continues to remain one of the top OTT platforms, primarily due to its free ad-supported tier. Our HTPC test suite update retains YouTube streaming efficiency evaluation as a metric of OTT support in different systems. Mystery Box's Peru 8K HDR 60FPS video is the chosen test sample. On PCs running Windows, it is recommended that HDR streaming videos be viewed using the Microsoft Edge browser after putting the desktop in HDR mode.
YouTube Streaming Statistics - Normal Mode
YouTube Streaming Statistics - Performance Mode
The GPU in the ASRock NUC BOX-1360P/D5 supports hardware decoding of VP9 Profile 2, and we see the stream encoded with that codec being played back. The streaming is perfect in both the normal and performance modes, thanks to the powerful GPU and hardware decoding support - the few dropped frames observed in the statistics below are due to mouse clicks involved in bringing up the overlay.
The streaming efficiency-related aspects such as GPU usage and at-wall power consumption are also graphed below.
YouTube Streaming Efficiency - Normal Mode
YouTube Streaming Efficiency - Performance Mode
The normal mode configuration is one of the most energy efficient playback segments we have seen in our labs, tying with the Panther Canyon NUC and the Akasa Newton TN. Despite the higher PL1 numbers and constantly running fan, the performance mode is also better off in energy numbers compared to the rest of the considered systems.
Hardware-Accelerated Encoding and Decoding
The transcoding benchmarks in the systems performance section presented results from evaluating the QuickSync encoder within Handbrake's framework. The capabilities of the decoder engine are brought out by DXVAChecker.
Video Decoding Hardware Acceleration in ASRock NUC BOX-1360P/D5
The iGPU in the Raptor Lake-P system supports hardware decode for a variety of codecs including AVC, JPEG, HEVC (8b and 10b, 4:2:0 and 4:4:4), and VP9 (8b and 10b, 4:2:0 and 4:4:4). AV1 decode support is also present. This is currently the most comprehensive codec support seen in the PC space.
Local Media Playback
Evaluation of local media playback and video processing is done by playing back files encompassing a range of relevant codecs, containers, resolutions, and frame rates. A note of the efficiency is also made by tracking GPU usage and power consumption of the system at the wall. Users have their own preference for the playback software / decoder / renderer, and our aim is to have numbers representative of commonly encountered scenarios. Our Q4 2022 test suite update replaces MPC-HC (in LAV filters / madVR modes) with mpv. In addition to being cross-platform and open-source, the player allows easy control via the command-line to enable different shader-based post-processing algorithms. From a benchmarking perspective, the more attractive aspect is the real-time reporting of dropped frames in an easily parseable manner. The players / configurations considered in this subsection include:
- VLC 3.0.18
- Kodi 20.2
- mpv 0.35.1 (hwdec auto, vo=gpu-next)
- mpv 0.35.1 (hwdec auto, vo=gpu-next, profile=gpu-hq)
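The two mpv configurations above map onto standard mpv command-line options. The sketch below builds the corresponding argument lists; the helper name and the sample file name are hypothetical, and the exact invocation used in our test harness may differ.

```python
import shlex


def mpv_command(media_path, gpu_hq=False):
    """Build an mpv argument list for the two configurations listed above."""
    args = ["mpv", "--hwdec=auto", "--vo=gpu-next"]
    if gpu_hq:
        # Enables mpv's built-in higher-quality rendering profile
        args.append("--profile=gpu-hq")
    args.append(media_path)
    return args


# Hypothetical file name, for illustration only
default_cmd = mpv_command("sample.mkv")
gpu_hq_cmd = mpv_command("sample.mkv", gpu_hq=True)

print(shlex.join(default_cmd))
print(shlex.join(gpu_hq_cmd))
```

The resulting lists can be handed to `subprocess.run` as-is on a system with mpv installed.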
Fourteen test streams (each of 90s duration) were played back from the local disk with an interval of 30 seconds in-between. Various metrics including GPU usage, at-wall power consumption, and total energy consumption were recorded during the course of this playback.
All our playback tests were done with the desktop HDR setting turned on. It is possible for certain system configurations to automatically turn on/off the HDR capabilities prior to the playback of an HDR video, but we didn't take advantage of that in our testing.
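The total energy figure is simply the at-wall power integrated over the duration of the run. A minimal sketch of that calculation is shown below, using the trapezoidal rule over timestamped power readings. The sample readings are made up for illustration, and the run-length estimate assumes gaps only between streams.

```python
def energy_wh(samples):
    """Integrate (time_s, power_w) samples with the trapezoidal rule; returns watt-hours."""
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += 0.5 * (p0 + p1) * (t1 - t0)
    return joules / 3600.0


# Hypothetical at-wall readings: (seconds since start, watts)
samples = [(0, 8.0), (30, 8.0), (31, 15.0), (120, 15.0), (121, 8.0), (150, 8.0)]
total = energy_wh(samples)
print(f"Energy: {total:.3f} Wh")

# Length of one playback run: fourteen 90 s streams with 30 s gaps in between
# (assumes no gap before the first stream or after the last one)
run_seconds = 14 * 90 + 13 * 30
print(f"Run length: {run_seconds} s")
```

Because idle intervals are part of the run, a player with low active power but high idle power can still end up with a worse total energy number.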
VLC Playback Efficiency - Normal Mode
VLC Playback Efficiency - Performance Mode
The 8Kp60 AV1 stream playback is attempted in software, resulting in extremely high power consumption at the wall (80+ W). Otherwise, at-wall numbers remain below 20W for VLC playback, with the normal mode configuration being particularly energy efficient.
Kodi Playback Efficiency - Normal Mode
Kodi Playback Efficiency - Performance Mode
Similar to VLC, the 8Kp60 AV1 clip is problematic for Kodi 20.2. However, active power consumption is lower for Kodi compared to VLC. Kodi, at idle, does consume more power than just leaving the desktop idle. So, the energy numbers for Kodi are not great compared to VLC. However, within the considered systems, Kodi was more energy-efficient on the normal mode NUC BOX-1360P/D5.
mpv 0.35.1 Playback Efficiency - Normal Mode
mpv 0.35.1 Playback Efficiency - Performance Mode
mpv attempts to play back the 8Kp60 AV1 clip with hardware acceleration, but the playback still dropped half the frames (the same symptom we have been observing in all recent mini-PC reviews). Active power numbers are much higher than Kodi or VLC.
mpv 0.35.1 (GPU-HQ) Playback Efficiency - Normal Mode
mpv 0.35.1 (GPU-HQ) Playback Efficiency - Performance Mode
Adding GPU-HQ shaders to the mix drives up the energy consumption numbers for all systems, with the Wall Street Canyon becoming the most efficient of the lot. Surprisingly, only the AV1 clip continued to drop frames during playback.
Power Consumption and Thermal Characteristics
The power consumption at the wall was measured with a 4K display being driven through the HDMI port of the system. In the graph below, we compare the idle and load power of the ASRock NUC BOX-1360P/D5 with other systems evaluated before. For load power consumption, we ran the AIDA64 System Stability Test with various stress components, as well as our custom stress test with Prime95 / Furmark, and noted the peak as well as idling power consumption at the wall.
The numbers are consistent with the TDP and suggested PL1 / PL2 values for the processors in the systems, and do not come as any surprise. We are glad that ASRock Industrial has finally addressed the idle power consumption numbers issue that had been plaguing their NUC BOX systems since Tiger Lake. At 5.19W idling, only the Arena Canyon and Wall Street Canyon NUCs have better numbers. The higher peak number for Arena Canyon NUC shows that it can possibly outperform the NUC BOX-1360P/D5 in performance mode for specific short-burst workloads.
Stress Testing
Our thermal stress routine is a combination of Prime95, Furmark, and FinalWire's AIDA64 System Stability Test. The following 9-step sequence is followed, starting with the system at idle:
- Start with the Prime95 stress test configured for maximum power consumption
- After 30 minutes, add Furmark GPU stress workload
- After 30 minutes, terminate the Prime95 workload
- After 30 minutes, terminate the Furmark workload and let the system idle
- After 30 minutes of idling, start the AIDA64 System Stress Test (SST) with CPU, caches, and RAM activated
- After 30 minutes, terminate the previous AIDA64 SST and start a new one with the GPU, CPU, caches, and RAM activated
- After 30 minutes, terminate the previous AIDA64 SST and start a new one with only the GPU activated
- After 30 minutes, terminate the previous AIDA64 SST and start a new one with the CPU, GPU, caches, RAM, and SSD activated
- After 30 minutes, terminate the AIDA64 SST and let the system idle for 30 minutes
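The sequence above can be laid out as a simple timeline to see how long a full run takes. The segment labels below are our shorthand for the nine steps, and the sketch assumes each step lasts exactly 30 minutes with no overhead in between.

```python
STEP_MINUTES = 30  # each step in the list above lasts 30 minutes

# Shorthand labels for what is active during each of the nine segments
segments = [
    "Prime95",
    "Prime95 + Furmark",
    "Furmark",
    "Idle",
    "AIDA64 SST (CPU, caches, RAM)",
    "AIDA64 SST (GPU, CPU, caches, RAM)",
    "AIDA64 SST (GPU)",
    "AIDA64 SST (CPU, GPU, caches, RAM, SSD)",
    "Idle",
]


def build_timeline(labels, step_minutes=STEP_MINUTES):
    """Return (start_minute, end_minute, label) tuples for back-to-back segments."""
    return [
        (i * step_minutes, (i + 1) * step_minutes, label)
        for i, label in enumerate(labels)
    ]


timeline = build_timeline(segments)
total_minutes = timeline[-1][1]

for start, end, label in timeline:
    print(f"{start:3d}-{end:3d} min: {label}")
print(f"Total: {total_minutes} minutes ({total_minutes / 60:.1f} hours)")
```

Under these assumptions, one pass of the routine takes four and a half hours, with the SSD stressed only in the 210 to 240 minute window.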
Traditionally, this test used to record the clock frequencies - however, with the increasing number of cores in modern processors and fine-grained clock control, frequency information makes the graphs cluttered and doesn't contribute much to understanding the thermal performance of the system. The focus is now on the power consumption and temperature profiles to determine if throttling is in play.
The 'Normal Mode' graphs throw no surprise whatsoever. The package power remains constant at 28W during active loading, with the iGPU alone getting a 20W allotment at the maximum. On the temperature front, the CPU package and cores are kept below 85C even under extreme stress. The point of concern is the SSD temperature reaching 75C in the disk stress test segment. This is not particularly good for SSD reliability. ASRock Industrial must endeavour to provide some sort of thermal solution for the SSDs in their mini-PCs.
The power numbers in the 'Performance Mode' graph appear good - 40W sustained package power, with the iGPU getting around 32W. SSD temperatures go high in the disk stress segment, as expected. However, the more worrisome aspect is the core and package temperatures hitting 100C. This is bound to create thermal throttling in the system.
Miscellaneous Aspects and Concluding Remarks
Networking and storage are aspects that may be of vital importance in specific PC use-cases. The ASRock NUC BOX-1360P/D5 comes with the Wi-Fi 6E AX210 WLAN card that also includes Bluetooth 5.3 support. On the wired front, we have a couple of 2.5 Gbps ports backed by the Intel I226-LM and I226-V controllers. The system only supports vPro Essentials - so the full extent of remote management over a dedicated LAN port that can be obtained with vPro for Enterprise is sadly absent.
On the storage side, some applications require wide-temperature range and/or high endurance SSDs. The ASRock Industrial NUC BOX-1360P/D5 supports PCIe 4.0 x4 NVMe SSDs (and we used one in our configuration). However, cooling those within the space constraints imposed by the form-factor of the NUC is very challenging, as we saw in the SSD temperature graph in the previous section. In the absence of an effective thermal solution, it might be a better option to stick with a PCIe 3.0 x4 NVMe SSD for this unit. From a benchmarking perspective, we provide results from the WPCstorage test of SPECworkstation 3.1. This benchmark replays access traces from various programs used in different verticals and compares the score against the one obtained with a 2017 SanDisk 512GB SATA SSD in the SPECworkstation 3.1 reference system.
SPECworkstation 3.1.0 - WPCstorage SPEC Ratio Scores
The graphs above present results for different verticals, as grouped by SPECworkstation 3.1. The storage workload consists of 60 subtests. Access traces from CFD solvers and programs such as Catia, Creo, and Solidworks come under 'Product Development'. Storage access traces from the NAMD and LAMMPS molecular dynamics simulators are under the 'Life Sciences' category. 'General Operations' includes access traces from 7-Zip and Mozilla programs. The 'Energy' category replays traces from the energy-02 SPECviewperf workload. The 'Media and Entertainment' vertical includes Handbrake, Maya, and 3dsmax. Given that the comparison is between a wide range of SSDs in the systems - including both Gen 3 and Gen 4 / DRAM-equipped and DRAMless NVMe, the relative numbers for most workloads are not surprising. The Samsung SSD 980 PRO was a high-end flagship model, and the NUC BOX-1360P/D5 is able to bring out its performance capabilities (despite the SSD overheating concern, which is an orthogonal issue).
Closing Thoughts
ASRock Industrial has one of the most comprehensive Raptor Lake UCFF PC lineups in the market today. The company has a wide variety of options to choose from this year - in addition to the NUC(S) BOX-13xx/D4 and the NUC BOX-13xx/D5 reviewed here, the AMD-based 4X4 BOX-7735U is also a very credible contender. In fact, the AMD system fares much better than the Intel one over a number of different benchmarks, thanks to the presence of eight high-performance cores (compared to the 4P + 8E configuration of the Core i7-1360P). Even on the I/O front, the presence of two USB4 ports (compared to 1x USB 4 / Thunderbolt 4, and 1x USB 3.2 Gen 2x2) is also an advantage for the AMD system. It is frugal in terms of peak power consumption while delivering largely similar performance numbers when averaged across multiple workloads.
DDR5 support does give the edge in many of the workloads for the NUC BOX-1360P/D5 over the NUCS BOX-1360P/D4. However, many AMD mini-PCs with DDR5-5200 and DDR5-5600 support have already started appearing in the market. Under such circumstances, the DDR5-4800 support in the NUC BOX-1360P/D5 appears a bit dated. That said, the inclusion of USB 3.2 Gen 2x2 support in the Type-C ports is a very welcome update over the NUC BOX-1200 and the NUCS BOX-13xx/D4 series.
ASRock Industrial is no stranger to UCFF PCs, and that shows in the robust build and performance profile of the NUC BOX-1360P/D5. The system is an excellent Intel option in the mini-PC space, and the feature set and performance profile have been optimized under the constraints imposed by the form-factor. On the pricing front, the company has decided to introduce the NUC BOX-1360P/D5 at $670. The NUC BOX-1340P/D5 with similar features (based on the Core i5-1340P instead of the Core i7-1360P) is available for $540.
The only areas of concern that we had from our evaluation are the absence of a proper thermal solution for the SSD and the inability of the thermal solution to sustain a 40W PL1 indefinitely while maintaining a safe temperature for the processor package. The form-factor adopted by the system is usually meant for 28W TDP systems, so the latter is not a big surprise. Despite these minor quibbles, we have to say that ASRock Industrial's NUC BOX-1360P/D5 presents consumers with a credible alternative to the DDR4-only Arena Canyon NUC in the mainstream class.