Original Link: https://www.anandtech.com/show/7173/samsung-ssd-840-evo-review-120gb-250gb-500gb-750gb-1tb-models-tested



I'm continually amazed by Samsung's rise to power in the SSD space. If you compare its market-dominating products today to what we were reviewing from Samsung just a few years ago, you'd assume they came from a different company. The past three generations of Samsung consumer SSDs have been good, but if you focus exclusively on the past two generations (830/840) they've been really good.

Last year Samsung bifurcated its consumer SSD lineup by introducing the 840 Pro in addition to the vanilla 840. We'd seen other companies explore a similar strategy, but usually by playing with synchronous vs. asynchronous NAND or sometimes just using different NAND suppliers between lines. Samsung also used NAND to differentiate the two lines, but it went even more extreme. The non-Pro version of the 840 was the first large scale consumer SSD made with 3-bit-per-cell MLC NAND, more commonly known as TLC (triple-level-cell) NAND. Companies had toyed with the idea of going TLC well before the 840's release but were usually stopped by either economic or endurance realities. The 840 changed all of that. Although it didn't come with tremendous cost savings initially, over time the Samsung SSD 840 proved to be one of the better values on the market - you'd just have to get over the worry of wearing out TLC NAND.

Despite having a far more limited lifespan compared to its 2bpc MLC brethren, the TLC NAND Samsung used in its 840 turned out to be quite reliable. Even our own aggressive estimates pegged typical client write endurance on the 840 at more than 11 years for the 128GB model.


Samsung 19nm TLC NAND

We haven't seen Samsung's love of TLC embraced by other manufacturers. The most significant contrast actually comes from Micron, another NAND supplier turned SSD manufacturer, and its M500. Relying on 2bpc MLC NAND, the M500 gets its cost down by using a combination of large page/block sizes (to reduce overall die area) as well as aggressively embracing the latest NAND manufacturing processes (in this case 20nm). That's always been the Intel/Micron way - spend all of your time getting to the next process node quickly, and drive down cost that way rather than going TLC. The benefit of the TLC approach is the potential for even more cost reduction, but the downside is it usually takes a while to get production to yield high enough endurance TLC to make it viable for use in SSDs. The question of which is quicker is pretty simple to answer. If we look at the 25nm and 20nm generations from IMFT, the manufacturer was able to get down to new process nodes quicker than Samsung could ship TLC in volume.

The discussion then shifts to whether TLC makes sense at that point, or if you'd be better off just transitioning to the next process node on MLC. Samsung clearly believes its mainstream TLC/high-end MLC split makes a lot of sense, and seeing how the 840 turned out last time, I tend to agree. It's not the only solution, but given how supply constrained everyone is on the latest NAND processes this generation, any good way to get more die per wafer is going to be well received. Samsung doesn't disclose the die areas of its NAND, so we unfortunately can't tell just how much more area efficient its TLC approach is compared to IMFT's area-efficient 128Gbit, 16KB-page 20nm MLC NAND.

As with any other business in the tech industry, it turns out that a regular, predictable release cadence is a great way to build marketshare. Here we are, around 9 months after the release of the Samsung SSD 840 and we have its first successor: the 840 EVO.

As its name implies, Samsung's SSD 840 EVO is an evolution of last year's SSD 840. The EVO still uses 3-bit-per-cell TLC NAND, but it moves to a smaller process geometry. Samsung calls its latest NAND process 10nm-class or 1x-nm, designations that can refer to feature sizes anywhere from 10nm to 19nm; we've also heard it referred to simply as 19nm TLC. The new 19nm TLC is available in capacities of up to 128Gbit per die, like IMFT's latest 20nm MLC process. Unlike IMFT's 128Gbit offering, Samsung remains on an 8KB page size even with this latest generation of NAND. The number of pages per block, at 256, is also more in line with IMFT's previous 64Gbit 20nm MLC:

IMFT vs. Samsung NAND Comparison
  IMFT 20nm MLC (64Gbit) IMFT 20nm MLC (128Gbit) Samsung 19nm TLC Samsung 21nm TLC Samsung 21nm MLC
Bits per Cell 2 2 3 3 2
Single Die Max Capacity 64Gbit 128Gbit 128Gbit 128Gbit 64Gbit
Page Size 8KB 16KB 8KB 8KB 8KB
Pages per Block 256 512 256 192 128
Read Page (max) 100 µs 115 µs ? ? ?
Program Page (typical) 1300 µs 1600 µs ? ? ?
Erase Block (typical) 3 ms 3.8 ms ? ? ?
Die Size 118mm2 202mm2 ? ? ?
Gbit per mm2 0.542 0.634 ? ? ?
Rated Program/Erase Cycles 3000 3000 1000 - 3000 1000 - 3000 3000 (?)
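
If you want to see how the figures above relate, here's a quick Python sketch using only the numbers from the table: erase block size is just page size times pages per block, and areal density is die capacity over die area.

    # NAND geometry sanity check using the figures from the table above
    dies = {
        # name: (page size in KB, pages per block, die capacity in Gbit, die size in mm^2)
        "IMFT 20nm MLC (64Gbit)":  (8, 256, 64, 118),
        "IMFT 20nm MLC (128Gbit)": (16, 512, 128, 202),
        "Samsung 19nm TLC":        (8, 256, 128, None),   # Samsung doesn't disclose die size
    }

    for name, (page_kb, pages, gbit, area_mm2) in dies.items():
        block_mb = page_kb * pages / 1024                  # erase block size in MB
        density = f"{gbit / area_mm2:.3f} Gbit/mm^2" if area_mm2 else "undisclosed"
        print(f"{name}: {block_mb:.0f}MB erase blocks, {density}")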

The high level specs, at least those Samsung gives us, point to an unwillingness to sacrifice latency even further in order to shrink die area. The decision makes sense since TLC is already expected to have 50% longer program times than 2bpc MLC. IMFT on the other hand has some latency to give up with its MLC NAND, which is why we see the move to 2x larger page and block sizes with its 128Gbit NAND die. Ultimately that's going to be the most interesting comparison - how Samsung's SSD 840 EVO with its 19nm TLC NAND stacks up against Crucial's M500, the first implementation of IMFT's 128Gbit 20nm MLC NAND.

Modern Features

Along with the NAND update, the EVO also sees a pretty significant controller upgrade. The underlying architecture hasn't changed: Samsung's MEX controller is still based on the same triple-core Cortex R4 design as the previous generation MDX controller. The cores now run at 400MHz compared to 300MHz previously, which helps enable some of the EVO's higher performance. The MEX controller also sees an update to SATA 3.1, something we first saw with SanDisk's Extreme II. SATA 3.1 brings a number of features, one of the most interesting being support for queued TRIM commands.

The EVO boasts hardware AES-256 encryption, and has its PSID printed on each drive label like Crucial's M500. In the event that you set and lose the drive's encryption key, you can use the PSID to unlock the drive (although all data will be lost). At launch the EVO doesn't support TCG Opal and thus Microsoft's eDrive spec, however Samsung tells us that a firmware update scheduled for September will enable both of these things - again bringing the EVO to encryption feature parity with Crucial's M500.

Samsung is one of the world's most prominent DRAM makers, so it's no surprise to find a ton of DRAM used to cache the firmware and indirection table on the EVO. DRAM size scales with capacity, although Samsung tosses in a bit more than is necessary at a couple of capacity points (e.g. 250GB).

Samsung SSD 840 EVO DRAM
  120GB 250GB 500GB 750GB 1TB
DRAM Size 256MB LPDDR2-1066 512MB LPDDR2-1066 512MB LPDDR2-1066 1GB LPDDR2-1066 1GB LPDDR2-1066

The move to 19nm 128Gbit TLC NAND die paves the way for some very large drive capacities. Similar to Crucial's M500, the 840 EVO is offered in configurations of up to 1TB.

Samsung SSD 840 EVO Specifications (120GB / 250GB / 500GB / 750GB / 1TB)
Controller, Interface Samsung MEX, SATA 3.1
NAND Samsung 19nm 3bpc TLC Toggle DDR 2.0 NAND
Form Factor 2.5" 7mm
Max Sequential Read 540MB/s (all capacities)
Max Sequential Write 410MB/s (120GB), 520MB/s (250GB - 1TB)
Max 4KB Random Read 94K IOPS (120GB), 97K IOPS (250GB), 98K IOPS (500GB - 1TB)
Max 4KB Random Write 35K IOPS (120GB), 66K IOPS (250GB), 90K IOPS (500GB - 1TB)
Encryption AES-256 FDE, PSID printed on SSD label
Warranty 3 years

I'll get to the dissection of performance specs momentarily, but you'll notice some very high peak random and sequential performance out of these mainstream drives. The peak performance improvement over last year's 840 is beyond significant. The keyword there, of course, is peak.

Pricing

Samsung expects the 840 EVO to be available in the channel at the beginning of August. What we have in the table below are suggested MSRPs, which, as long as supply isn't constrained, usually end up being higher than actual street prices:

SSD Pricing Comparison - 7/24/2013
  120/128GB 240/250/256GB 480/500/512GB 750GB 960GB/1TB
Crucial M500 $120.99 $193.56 $387.27 - $599.99
Intel SSD 335 - $219.99 - - -
Samsung SSD 840 $98.44 $168.77 $328.77 - -
Samsung SSD 840 EVO $109.99 $189.99 $369.99 $529.99 $649.99
Samsung SSD 840 Pro $133.49 $230.95 $458.77 - -
SanDisk Extreme II $129.99 $229.77 $449.99 - -
SanDisk Ultra Plus $96.85 $174.29 - - -
OCZ Vertex 450 $129.99 $246.84 - - -

Prices are a bit higher than on the outgoing Samsung SSD 840, which makes sense since we're looking at the beginning of the cost curve for a new process node. Crucial's highly sought after $600 960GB M500 finally seems to be back in stock, just in time for the EVO to go head to head with it. Samsung is expecting roughly a $50 premium for the 1TB EVO over the Crucial solution, but over time I'd expect that gap to shrink to nothing (or swing in Samsung's favor). The EVO is considerably more affordable than Samsung's 840 Pro, and the higher capacity points are at particularly tempting prices.



Inside the Drives & Spare Area

The EVO is offered in a single form factor: 2.5" at a 7mm thickness. There are three torx (T5) screws holding the chassis together; removing them gets you a look at the EVO's very simple internals. Surprisingly enough, there's no thermal pad between Samsung's MEX controller and the chassis.

Samsung, like Intel, does a great job of reducing the number of screws and simplifying the assembly of its drives. I would prefer it if Samsung didn't insist on using torx screws to hold the chassis together, but I'm sure the choice has some impact on reducing returns (it discourages casual disassembly). There's also growing concern over counterfeit SSDs, which I guess screw choice could somewhat address.

There are two PCB sizes used in the EVO lineup, neither of which occupies the full volume of the 2.5"/7mm chassis. The 120 and 250GB drives use the smallest PCB, while the other drives use the larger layout. The larger PCB has room for 8 NAND packages, while the half length PCB can accommodate two. Each of the NAND packages can hold up to 8 x 128Gbit 19nm TLC die.

To deal with the realities of TLC, Samsung sets aside more of the drive for use as spare area on the EVO than it does on its MLC Pro line. Due to TurboWrite however, the percentage is actually a bit less than it was on last year's 840.

Samsung SSD 840 EVO Memory
Advertised Capacity 120GB 250GB 500GB 750GB 1TB
DRAM Size 256MB LPDDR2-1066 512MB LPDDR2-1066 512MB LPDDR2-1066 1GB LPDDR2-1066 1GB LPDDR2-1066
# of NAND Packages 2 2 4 8 8
# of NAND die per Package 4 8 8 6 8
NAND Capacity per Package 64 GiB 128 GiB 128 GiB 96 GiB 128 GiB
Total NAND 128 GiB 256 GiB 512 GiB 768 GiB 1024 GiB
Spare Area 12.7% 9.05% 9.05% 9.05% 9.05%
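
The spare area percentages are simple to derive: advertised capacities are decimal (GB), the raw NAND is binary (GiB), and whatever the user can't address becomes spare. A quick sketch in Python:

    # Spare area = (raw NAND - user addressable capacity) / raw NAND
    def spare_area(advertised_gb, raw_nand_gib):
        user_gib = advertised_gb * 1e9 / 2**30    # decimal GB -> GiB
        return 1 - user_gib / raw_nand_gib

    print(f"{spare_area(120, 128):.1%}")     # 12.7% on the 120GB EVO
    print(f"{spare_area(250, 256):.1%}")     # ~9.0% on the 250GB EVO
    print(f"{spare_area(1000, 1024):.1%}")   # ~9.0% on the 1TB EVO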

I've tossed internal shots of all of the EVO lineup into the gallery below:



Endurance

Samsung isn't quoting any specific TB written values for how long it expects the EVO to last, although the drive comes with a 3 year warranty. Samsung doesn't explicitly expose total NAND writes in its SMART details but we do get a wear level indicator (SMART attribute 177). The wear level indicator starts at 100 and decreases linearly down to 1 from what I can tell. At 1 the drive will have exceeded all of its rated p/e cycles, but in reality the drive's total endurance can significantly exceed that value.

Kristian calculated around 1000 p/e cycles using the wear level indicator on his 840 sample last year, or roughly 242TB of writes, but we've seen reports of much more than that (e.g. this XtremeSystems user who saw around 432TB of writes to a 120GB SSD 840 before it died). I used Kristian's method of mapping sequential writes to the wear level indicator to determine the rated number of p/e cycles on my 120GB EVO sample:

Samsung SSD 840 EVO Endurance Estimation
  Samsung SSD 840 EVO 120GB
Total Sequential Writes 4338.98 GiB
Wear Level Counter Decrease -3 (raw value = 35)
Estimated Total Writes 144632.81 GiB
Estimated Rated P/E Cycles 1129 cycles
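
The math behind that estimate is straightforward: scale the observed writes up to a full 100-point drop of the wear level indicator, then divide by the raw NAND on board. A quick sketch of the calculation using the numbers from the table:

    # Estimate rated P/E cycles from the SMART 177 wear level indicator
    writes_gib = 4338.98       # sequential writes issued to the 120GB EVO sample
    wear_drop = 3              # observed wear level indicator decrease
    raw_nand_gib = 128         # total TLC NAND on the 120GB EVO

    est_total_writes_gib = writes_gib * 100 / wear_drop    # ~144,633 GiB
    est_pe_cycles = est_total_writes_gib / raw_nand_gib    # ~1129 cycles
    print(f"{est_total_writes_gib:.2f} GiB, {int(est_pe_cycles)} P/E cycles")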

Using the 1129 cycle estimate (which is an improvement compared to last year's 840 sample), I put together the table below to put any fears of endurance to rest. I even upped the total NAND writes per day to 50 GiB just to be a bit more aggressive than the typically quoted 10 - 30 GiB for consumer workloads:

Samsung SSD 840 EVO Lifespan Estimates
  120GB 250GB 500GB 750GB 1TB
NAND Capacity 128 GiB 256 GiB 512 GiB 768 GiB 1024 GiB
NAND Writes per Day 50 GiB 50 GiB 50 GiB 50 GiB 50 GiB
Days per P/E Cycle 2.56 5.12 10.24 15.36 20.48
Estimated P/E Cycles 1129 1129 1129 1129 1129
Estimated Lifespan in Days 2890 5780 11560 17341 23121
Estimated Lifespan in Years 7.91 15.83 31.67 47.51 63.34
Estimated Lifespan in Years @ 100 GiB of Writes per Day 3.95 7.91 15.83 23.75 31.67
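
The lifespan figures follow directly from that P/E estimate. The sketch below reproduces them and adds a write amplification knob to show how quickly random-heavy workloads would eat into the rating (the table assumes a write amplification of 1):

    # Lifespan = raw NAND capacity * rated P/E cycles / (daily writes * write amplification)
    def lifespan_years(raw_nand_gib, writes_per_day_gib, pe_cycles=1129, write_amp=1.0):
        total_writes_gib = raw_nand_gib * pe_cycles / write_amp
        return total_writes_gib / writes_per_day_gib / 365.25

    print(f"{lifespan_years(128, 50):.2f} years")               # 120GB @ 50 GiB/day, ~7.9 years
    print(f"{lifespan_years(1024, 100):.2f} years")             # 1TB @ 100 GiB/day, ~31.7 years
    print(f"{lifespan_years(128, 50, write_amp=3):.2f} years")  # 120GB @ 50 GiB/day if WA were 3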

Endurance scales linearly with NAND capacity, and the worst case scenario at 50 GiB of writes per day is just under 8 years of constant write endurance. Keep in mind that this assumes a write amplification of 1; if you're doing 50 GiB of 4KB random writes per day you'll blow through this a lot sooner. For a client system, however, you're probably looking at something much lower than 50 GiB per day of total writes to NAND, random IO included.

I also threw in a line of lifespan estimates at 100 GiB of writes per day. It's only in this configuration that we see the 120GB drive drop below 4 years of endurance, again based on a conservative p/e estimate. Even with 100 GiB of NAND writes per day, once you get beyond the 250GB EVO we're back into absolutely ridiculous endurance estimates.

Keep in mind that all of this is based on 1129 p/e cycles, which is likely less than half of the practical p/e cycle limit of Samsung's 19nm TLC NAND. Go ahead and double those numbers and you're probably looking at reality. Endurance isn't a concern for client systems using the 840 EVO.



TurboWrite: MLC Performance on a TLC Drive

All NAND trends towards lower performance as we move down to smaller process geometries. Clever architectural tricks are what keep overall SSD performance increasing each generation, but if you look at Crucial's M500 you'll see that it's not always possible to pull off. Historically, whenever a level of the memory hierarchy got too slow, the industry would more or less agree to insert another level above it to help hide latency. The problem is exacerbated once you start talking about TLC NAND. Samsung's mitigation for the problem is to dedicate a small portion of each TLC NAND die as an SLC write buffer. The feature is called TurboWrite. Initial writes hit the TurboWrite buffer at very low latency and are quickly written back to the rest of the TLC NAND array.

Since the amount of spare area available on the EVO varies depending on capacity, TurboWrite buffer size varies with capacity. The smallest size is around 3GB while the largest is 12GB on the 1TB EVO:

Samsung SSD 840 EVO TurboWrite Buffer Size vs. Capacity
  120GB 250GB 500GB 750GB 1TB
TurboWrite Buffer Size 3GB 3GB 6GB 9GB 12GB

I spent some time poking at the TurboWrite buffer and it pretty much works the way you'd expect it to. Initial writes hit the buffer first, and as long as they don't exceed the size of the buffer the performance you get is quite good. If your writes stop before exceeding the buffer size, the buffer will write itself out to the TLC NAND array. You need a little bit of idle time for this copy to happen, but it tends to go pretty quickly as it's just a sequential move of data internally (we're talking about a matter of 15 - 30 seconds). Even before the TurboWrite buffer is completely emptied, you can stream new writes into the buffer. It all works surprisingly well. For most light use cases I can see TurboWrite being a great way to deliver more of an MLC-like experience on a TLC drive.
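
To put rough numbers on what the buffer buys you, here's a deliberately simplified model of a large sequential write with an SLC buffer in front of the TLC array. The speeds below are illustrative placeholders, not measured EVO figures:

    # Simplified TurboWrite model: writes land in the SLC buffer at full speed,
    # anything beyond the buffer streams to TLC at a lower rate.
    def seq_write_time_s(total_gb, buffer_gb=3, buffer_mbps=400, tlc_mbps=150):
        fast_gb = min(total_gb, buffer_gb)
        slow_gb = max(total_gb - buffer_gb, 0)
        return fast_gb * 1000 / buffer_mbps + slow_gb * 1000 / tlc_mbps

    print(f"{seq_write_time_s(2):.1f} s")    # a 2GB copy fits entirely in the buffer
    print(f"{seq_write_time_s(10):.1f} s")   # a 10GB copy spills over to the TLC array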

TurboWrite's impact is best felt on the lower capacity drives that don't have as many NAND die to stripe requests across (thus further hiding long program latencies). The chart below shows sequential write performance vs. time for all of the EVO capacities. The sharp drop in performance on each curve is when the TurboWrite buffer is exceeded and sequential writes start streaming to the TLC NAND array instead:

On the 120GB drive the delta between TurboWrite and standard performance is huge. On the larger drives the drop isn't as big and the TurboWrite buffer is also larger; the combination of the two is why the impact isn't felt as much on those drives. It's this TurboWrite buffer that gives the EVO its improvement in max sequential write speed over last year's vanilla SSD 840.



RAPID: PCIe-like Performance from a SATA SSD

The software story around Samsung's SSD 840 EVO is quite possibly the strongest we've ever seen from an SSD manufacturer. Samsung's SSD Magician got a major update not too long ago, giving it a downright awesome UI. Magician gives you access to SMART details about your drive and provides decent visualization of things like total host writes. I'd love to see total NAND writes reported somewhere as well, since host writes alone don't take write amplification into account and can give a false sense of security to users deploying drives in very write intensive environments. There's a prominent drive health indicator which is tied to NAND wear and should draw a lot of attention to itself should things get bad. Samsung's SSD Magician also includes a built-in benchmark, controllable overprovisioning and secure erase functionality.

Samsung sent us a beta of the next version of its Magician software (4.2) which includes support for RAPID mode (Real-time Accelerated Processing of I/O Data). RAPID is a feature exclusive to the EVO (for now) and comes courtesy of Samsung's NVELO acquisition from last year. As NVELO focused on NAND caching software, you shouldn't be too surprised by RAPID's role in improving storage performance. Unlike traditional SSD caches however that use NAND to cache mechanical storage, RAPID is designed to further improve the performance of an SSD and not make a HDD more SSD-like. RAPID uses some of your system memory and CPU resources to cache hot data, serving it out of DRAM rather than your SSD.

The architecture is rather simple to understand. Enabling RAPID installs a filter driver on your Windows machine that keeps track of all reads/writes to a single EVO (RAPID only supports caching a single drive today). The filter driver looks at both file types/sizes and LBAs, but it fundamentally caches at the block level (it simply gets hints from the filesystem to determine what to cache). File types that are meaningless to cache are automatically excluded (think very large media files), but things like Outlook PST files are prime targets for caching. Since RAPID works at the block level you can cache frequently used parts of a file, rather than having to worry about a file being too big for the cache.
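
Samsung doesn't document RAPID's internals beyond that description, but conceptually a block-level read cache keyed by LBA looks something like the toy sketch below (purely illustrative; the real filter driver also weighs file type hints, handles writes and does far more):

    from collections import OrderedDict

    class BlockReadCache:
        """Toy block-level LRU read cache, keyed by LBA-aligned block number."""
        def __init__(self, capacity_blocks, block_size=4096):
            self.capacity = capacity_blocks
            self.lbas_per_block = block_size // 512    # 512-byte sectors per cached block
            self.blocks = OrderedDict()                # block number -> data

        def read(self, lba, read_from_ssd):
            block = lba // self.lbas_per_block
            if block in self.blocks:                   # hit: serve straight from DRAM
                self.blocks.move_to_end(block)
                return self.blocks[block]
            data = read_from_ssd(block)                # miss: go out to the SSD
            self.blocks[block] = data
            if len(self.blocks) > self.capacity:       # evict the least recently used block
                self.blocks.popitem(last=False)
            return data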

The cache resides in main memory and is allocated out of non-paged kernel memory. In fact, that's the easiest way to determine whether or not RAPID is actually working - you'll see non-paged kernel memory jump in size after about a minute of idle time on your machine:

Presently RAPID will use no more than 25% of system memory or 1GB, whichever is smaller. Both reads and writes are cached, but in different ways. The read cache works as you'd expect, while for writes RAPID more accurately does something like buffering/combining. Reads are simple to cache (just look at what addresses are frequently accessed and draw those into the cache), but writes offer a different set of challenges. If you write to DRAM first and write back to the SSD later, you run the risk of losing a ton of data in the event of a crash or power failure. Although RAPID obeys flush commands, there's always the risk that anything pending could be lost in a system crash. Recognizing this potential, Samsung tells me that RAPID tries to instead focus on combining low queue depth writes into much larger bundles of data that can be written more like large transfers across many NAND die. To test this theory I ran our 4KB random write IOmeter test at a queue depth of 1 with RAPID enabled and disabled:

Samsung SSD 840 EVO 250GB - 4K Random Write, QD1, 8GB LBA Space
  IOPS MB/s Average Latency Max Latency CPU Utilization
RAPID Disabled 22769.31 93.26 MB/s 0.0435 ms 0.7512 ms 13.81%
RAPID Enabled 73466.28 300.92 MB/s 0.0135 ms 31.4259 ms 31.18%

Write coalescing seems to work extremely well here. With RAPID enabled the system sees even better random write performance than it would at a queue depth of 32. Average latency drops although the max observed latency was definitely higher. I've seen max latency peaks as high as 10ms on the EVO, so the increase in max latency is a bit less severe than what the data here indicates (but it's still large).
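
The IOmeter numbers hang together if you do the math: at a queue depth of 1 there's only one IO in flight, so average latency is roughly the inverse of IOPS, and throughput is IOPS times the 4KB transfer size.

    # Sanity check on the QD1 results above (matches the table to within rounding)
    for label, iops in (("RAPID disabled", 22769.31), ("RAPID enabled", 73466.28)):
        mbps = iops * 4096 / 1e6        # 4KB transfers, decimal MB/s
        latency_ms = 1000 / iops        # one IO in flight at a time
        print(f"{label}: {mbps:.2f} MB/s, ~{latency_ms:.4f} ms average latency")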

My test system uses a quad-core Sandy Bridge, so we're looking at an additional 60 - 70% CPU load on a single core when running an unconstrained IO workload. In real world scenarios I'd expect that impact to be much lower, but there's no getting around the fact that you're spending extra cycles on doing this DRAM caching. RAPID will revert into a pass-through mode if the CPU is already tied up doing other things. The technology is really designed to make use of excess CPU and DRAM in modern day PCs.

The potential performance upside is tremendous. While the EVO is ultimately limited by the performance of 6Gbps SATA, any requests serviced out of main memory are limited by the speed of your DRAM. In practice I never saw more than 4 - 5GB/s out of the cache, but that's still an order of magnitude better than what you'd get from the SSD itself. I ran a couple of tests with and without RAPID enabled to further characterize the performance gains:

Samsung SSD 840 EVO 250GB
  PCMark 7 Secondary Storage Score ATSB - Heavy 2011 Workload (Avg Data Rate) ATSB - Heavy 2011 Workload (Avg Service Time) ATSB - Light 2011 Workload (Avg Data Rate) ATSB - Light 2011 Workload (Avg Service Time)
RAPID Disabled 5414 229.6 MB/s 1101.0 µs 338.3 MB/s 331.4 µs
RAPID Enabled 5977 307.7 MB/s 247.0 µs 597.7 MB/s 145.4 µs
% Increase 10.4% 34.0%   75.0%  

The gains in these tests range from only 10% in PCMark 7 to as much as 75% in our Light 2011 workload. I'm in the process of running a RAPID enabled drive against our Destroyer benchmark to see how it fares there. In our two storage bench tests here the impact is mostly on the write side; average read performance actually regresses slightly in both cases. I'm not entirely sure why that is, other than that both of these tests were designed to be a bit more write intensive than normal in order to really stress the weaknesses of SSDs at the time. To make sure that reads could indeed be cached I ran ATTO at a couple of different test sizes, starting with our standard 2GB test:

ATTO makes for a great test because we can see the impact transfer size has on RAPID's caching algorithms. Here we see pretty much no improvement until transfers get larger than 32KB, indicating an optimization for caching large block sequential reads. Note that even though ATTO's test file is 2GB in size (and RAPID's cache is limited to 1GB) we're still able to see some increase in performance. At best RAPID boosts sequential read performance by 34%, driving the 250GB EVO beyond 700MB/s. Since the test file is larger than the maximum size of the cache we're ultimately limited by the performance of the EVO itself.

Writes show a different optimization point. Here we see big uplift above 4KB transfer sizes but more or less the same performance once we move to large block sequential transfers. Again this makes sense as Samsung would want to coalesce small writes into large blocks it can burst across many NAND die, but caching large sequential transfers is just risking potential data loss in the event of a crash/unexpected power loss. Here the potential uplift is even larger - nearly 60% over the RAPID-disabled configuration.

To see what would happen if the entire workload could fit within a 1GB cache I reduced the size of ATTO's test set to 512MB and re-ran the tests:

Oh man. Here performance just shoots through the roof. Max sequential read performance tops out at 3.8GB/s. Note that once again we don't see RAPID attempting to cache any of the smaller transfers; only large sequential transfers are of interest. Towards the end of the curve performance appears to regress when the transfer size exceeds 1MB. What's actually happening is that RAPID's performance is exceeding the variable ATTO uses to store its instantaneous performance results. What we're seeing here is a 32-bit integer wrapping itself.

Writes see similarly insane increases in performance. Here the best performance is north of 4GB/s. When the entire workload can fit in the cache, Samsung unfortunately appears to relax its stance on not caching large transfers. The focus extends beyond just small file writes and we see nearly 4GB/s when we're transferring 8MB of data at a time. We're likely also seeing the same issue where RAPID's performance is so high that it's overflowing the 32-bit integer ATTO uses to report it.

While I appreciate the tremendous increase in both read and write performance, part of me wishes that Samsung would be more conservative in buffering writes. Although the cache map is stored on the C: drive and is persistent across boots, any crash or power loss with uncommitted (non-flushed) writes in the DRAM cache runs the risk of data not making it to disk. Samsung is quick to point out that Windows issues flush commands regularly, so the risk should be as low as possible, but you're still risking more than you would had you not deployed another DRAM cache. If you've got a stable system connected to a UPS (or a notebook on a battery) this will sound like paranoia, but it's still a concern.

If, however, you want to get PCIe-like SSD speeds without shelling out the money for a PCIe SSD, Samsung's RAPID is the closest you'll get.



Performance Consistency

In our Intel SSD DC S3700 review I introduced a new method of characterizing performance: looking at the latency of individual operations over time. The S3700 promised a level of performance consistency that was unmatched in the industry, and as a result needed some additional testing to show that. The reason we don't have consistent IO latency with SSDs is because inevitably all controllers have to do some amount of defragmentation or garbage collection in order to continue operating at high speeds. When and how an SSD decides to run its defrag and cleanup routines directly impacts the user experience. Frequent (borderline aggressive) cleanup generally results in more stable performance, while delaying that can result in higher peak performance at the expense of much lower worst case performance. The graphs below tell us a lot about the architecture of these SSDs and how they handle internal defragmentation.

To generate the data below I took a freshly secure erased SSD and filled it with sequential data. This ensures that all user accessible LBAs have data associated with them. Next I kicked off a 4KB random write workload across all LBAs at a queue depth of 32 using incompressible data. I ran the test for just over half an hour - nowhere near as long as we run our steady state tests, but long enough to give me a good look at drive behavior once all the spare area fills up.

I recorded instantaneous IOPS every second for the duration of the test. I then plotted IOPS vs. time and generated the scatter plots below. Each set of graphs features the same scale. The first two sets use a log scale for easy comparison, while the last set of graphs uses a linear scale that tops out at 40K IOPS for better visualization of differences between drives.
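
If you want to reproduce these charts yourself, all you need is a per-second IOPS log from your workload generator of choice. The sketch below (just one way of doing it, assuming a simple two-column "seconds IOPS" text file) produces the same style of scatter plot:

    # Plot instantaneous IOPS vs. time from a hypothetical "seconds iops" log file
    import matplotlib.pyplot as plt

    times, iops = [], []
    with open("iops_log.txt") as log:
        for line in log:
            t, value = line.split()
            times.append(float(t))
            iops.append(float(value))

    plt.scatter(times, iops, s=2)
    plt.yscale("log")                  # log scale, as in the first two sets of graphs
    plt.xlabel("Time (s)")
    plt.ylabel("4KB random write IOPS (QD32)")
    plt.savefig("io_consistency.png")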

The high level testing methodology remains unchanged from our S3700 review. Unlike in previous reviews however, I did vary the percentage of the drive that I filled/tested depending on the amount of spare area I was trying to simulate. The buttons are labeled with the advertised user capacity had the SSD vendor decided to use that specific amount of spare area. If you want to replicate this on your own all you need to do is create a partition smaller than the total capacity of the drive and leave the remaining space unused to simulate a larger amount of spare area. The partitioning step isn't absolutely necessary in every case but it's an easy way to make sure you never exceed your allocated spare area. It's a good idea to do this from the start (e.g. secure erase, partition, then install Windows), but if you are working backwards you can always create the spare area partition, format it to TRIM it, then delete the partition. Finally, this method of creating spare area works on the drives we've tested here but not all controllers may behave the same way.
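
The effective over-provisioning you get from a smaller partition is easy to work out; the 192 GiB figure below is just an example, not a recommendation for any specific drive:

    # Effective spare area when only part of the raw NAND is partitioned/used
    def effective_spare_area(raw_nand_gib, partition_gib):
        return 1 - partition_gib / raw_nand_gib

    # e.g. a 250GB EVO (256 GiB of raw NAND) partitioned down to 192 GiB
    print(f"{effective_spare_area(256, 192):.0%}")   # 25%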

The first set of graphs shows the performance data over the entire 2000 second test period. In these charts you'll notice an early period of very high performance followed by a sharp dropoff. What you're seeing in that case is the drive allocating new blocks from its spare area, then eventually using up all free blocks and having to perform a read-modify-write for all subsequent writes (write amplification goes up, performance goes down).

The second set of graphs zooms in to the beginning of steady state operation for the drive (t=1400s). The third set also looks at the beginning of steady state operation but on a linear performance scale. Click the buttons below each graph to switch source data.

IO Consistency - 4KB Random Write, QD32: Full 2000 Second Run (log scale) - Crucial M500 960GB, Samsung SSD 840 EVO 1TB, Samsung SSD 840 EVO 250GB, SanDisk Extreme II 480GB, Samsung SSD 840 Pro 256GB

Thanks to the EVO's higher default over provisioning, you actually get better out-of-the-box consistency from the EVO than from the 840 Pro. Granted, you can get similar behavior out of the Pro if you simply don't use all of the drive. The big comparison is against Crucial's M500, where the EVO does a bit better. SanDisk's Extreme II however remains the better performer from an IO consistency perspective.

IO Consistency - 4KB Random Write, QD32: Steady State (log scale)

IO Consistency - 4KB Random Write, QD32: Steady State (linear scale)

Zooming in we see very controlled and frequent GC patterns on the 1TB drive, something we don't see in the 840 Pro. The 250GB drive looks a bit more like a clustered random distribution of IOs, but minimum performance is still much better than on the standard OP 840 Pro.

TRIM Validation

Our performance consistency test actually replaces our traditional TRIM test in terms of looking at worst case scenario performance, but I wanted to confirm that TRIM was functioning properly on the EVO so I dusted off our old test for another go. The test procedure remains unchanged: fill the drive with sequential data, run a 4KB random write test (QD32, 100% LBA range) for a period of time (30 minutes in this case) and use HDTach to visualize the impact on write performance:

Minimum performance drops down to around 30MB/s, eugh. Although the EVO can be reasonably consistent, you'll still want to leave some free space on the drive to ensure that performance always stays high (I recommend 15 - 25% if possible).

A single TRIM pass (quick format under Windows 7) fully restores performance as expected:

The short period of time at 400MB/s is just TurboWrite doing its thing.



AnandTech Storage Bench 2013

When I built the AnandTech Heavy and Light Storage Bench suites in 2011 I did so because we didn't have any good tools at the time that would begin to stress a drive's garbage collection routines. Once all blocks have a sufficient number of used pages, all further writes will inevitably trigger some sort of garbage collection/block recycling algorithm. Our Heavy 2011 test in particular was designed to do just this. By hitting the test SSD with a large enough and write intensive enough workload, we could ensure that some amount of GC would happen.

There were a couple of issues with our 2011 tests that I've been wanting to rectify however. First off, all of our 2011 tests were built using Windows 7 x64 pre-SP1, which meant there were potentially some 4K alignment issues that wouldn't exist had we built the trace on a system with SP1. This didn't really impact most SSDs but it proved to be a problem with some hard drives. Secondly, and more recently, I've shifted focus from simply triggering GC routines to really looking at worst case scenario performance after prolonged random IO. For years I'd felt the negative impacts of inconsistent IO performance with all SSDs, but until the S3700 showed up I didn't think to actually measure and visualize IO consistency. The problem with our IO consistency tests is that they are very focused on 4KB random writes at high queue depths across full LBA spans - not exactly a real world client usage model. The aspects of SSD architecture that those tests stress are very important, however, and none of our existing tests were doing a good job of quantifying that.

I needed an updated heavy test, one that dealt with an even larger set of data and one that somehow incorporated IO consistency into its metrics. I think I have that test. I've just been calling it The Destroyer (although AnandTech Storage Bench 2013 is likely a better fit for PR reasons).

Everything about this new test is bigger and better. The test platform moves to Windows 8 Pro x64. The workload is far more realistic. Just as before, this is an application trace based test - I record all IO requests made to a test system, then play them back on the drive I'm measuring and run statistical analysis on the drive's responses.

In the style of most modern benchmarks, I crafted the Destroyer out of a series of scenarios. For this benchmark I focused heavily on Photo editing, Gaming, Virtualization, General Productivity, Video Playback and Application Development. Rough descriptions of the various scenarios are in the table below:

AnandTech Storage Bench 2013 Preview - The Destroyer
Workload Description Applications Used
Photo Sync/Editing Import images, edit, export Adobe Photoshop CS6, Adobe Lightroom 4, Dropbox
Gaming Download/install games, play games Steam, Deus Ex, Skyrim, Starcraft 2, BioShock Infinite
Virtualization Run/manage VM, use general apps inside VM VirtualBox
General Productivity Browse the web, manage local email, copy files, encrypt/decrypt files, backup system, download content, virus/malware scan Chrome, IE10, Outlook, Windows 8, AxCrypt, uTorrent, AdAware
Video Playback Copy and watch movies Windows 8
Application Development Compile projects, check out code, download code samples Visual Studio 2012

While some tasks remained independent, many were stitched together (e.g. system backups would take place while other scenarios were taking place). The overall stats give some justification to what I've been calling this test internally:

AnandTech Storage Bench 2013 Preview - The Destroyer, Specs
  The Destroyer (2013) Heavy 2011
Reads 38.83 million 2.17 million
Writes 10.98 million 1.78 million
Total IO Operations 49.8 million 3.99 million
Total GB Read 1583.02 GB 48.63 GB
Total GB Written 875.62 GB 106.32 GB
Average Queue Depth ~5.5 ~4.6
Focus Worst case multitasking, IO consistency Peak IO, basic GC routines

SSDs have grown in their performance abilities over the years, so I wanted a new test that could really push high queue depths at times. The average queue depth is still realistic for a client workload, but the Destroyer has some very demanding peaks. When I first introduced the Heavy 2011 test, some drives would take multiple hours to complete it - today most high performance SSDs can finish the test in under 90 minutes. The Destroyer? So far the fastest I've seen it run is 10 hours. Most high performance drives I've tested seem to need around 12 - 13 hours per run, with mainstream drives taking closer to 24 hours. The read/write balance is also a lot more realistic than in the Heavy 2011 test. Back in 2011 I just needed something that had a ton of writes so I could start separating the good from the bad. Now that the drives have matured, I felt a test that was a bit more balanced would be a better idea.

Despite the balance recalibration, there's just a ton of data moving around in this test. Ultimately the sheer volume of data here and the fact that there's a good amount of random IO courtesy of all of the multitasking (e.g. background VM work, background photo exports/syncs, etc...) makes the Destroyer do a far better job of giving credit for performance consistency than the old Heavy 2011 test. Both tests are valid; they just stress/showcase different things. Now that the days of begging for better random IO performance and basic GC intelligence are over, I wanted a test that would give me a bit more of what I'm interested in these days. As I mentioned in the S3700 review - having good worst case IO performance and consistency matters just as much to client users as it does to enterprise users.

I'm reporting two primary metrics with the Destroyer: average data rate in MB/s and average service time in microseconds. The former gives you an idea of the throughput of the drive during the time that it was running the Destroyer workload. This can be a very good indication of overall performance. What average data rate doesn't do a good job of is taking into account response time of very bursty (read: high queue depth) IO. By reporting average service time we heavily weigh latency for queued IOs. You'll note that this is a metric I've been reporting in our enterprise benchmarks for a while now. With the client tests maturing, the time was right for a little convergence.
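
Concretely, both metrics fall out of the per-IO records from the trace playback: average data rate is total bytes moved over total wall-clock time, while average service time weights every completed IO equally, so bursts of deep-queue IO pull it up quickly. A minimal sketch, with a made-up three-IO example:

    # Compute the two Destroyer metrics from (bytes, service_time_us) completion records
    def destroyer_metrics(completions, wall_clock_s):
        avg_data_rate_mbs = sum(b for b, _ in completions) / 1e6 / wall_clock_s
        avg_service_time_us = sum(t for _, t in completions) / len(completions)
        return avg_data_rate_mbs, avg_service_time_us

    # toy example: two small random IOs and one large sequential IO
    print(destroyer_metrics([(4096, 85), (4096, 110), (131072, 420)], wall_clock_s=0.002))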

AT Storage Bench 2013 - The Destroyer

There's simply no comparison between the EVO and Crucial's M500. Even at half the capacity, the EVO does a better job in our consistency test. SanDisk's Extreme II remains the king here, but that's more of a performance tuned part vs. something that offers better cost per GB. Note just how impactful the added spare area is in giving the EVO an advantage over even the 840 Pro. It's so very important that 840 Pro owners keep as much free space on the drive as possible to keep performance high and consistent.

AT Storage Bench 2013 - The Destroyer

 



Random Read/Write Speed

The four corners of SSD performance are as follows: random read, random write, sequential read and sequential write speed. Random accesses are generally small in size, while sequential accesses tend to be larger and thus we have the four Iometer tests we use in all of our reviews.

Our first test writes 4KB in a completely random pattern over an 8GB space of the drive to simulate the sort of random access that you'd see on an OS drive (even this is more stressful than a normal desktop user would see). I perform three concurrent IOs and run the test for 3 minutes. The results reported are in average MB/s over the entire time. We use both standard pseudo randomly generated data for each write as well as fully random data to show you both the maximum and minimum performance offered by SandForce based drives in these tests. The average performance of SF drives will likely be somewhere in between the two values for each drive you see in the graphs. For an understanding of why this matters, read our original SandForce article.

Desktop Iometer - 4KB Random Read

Random read speed is very close to that of the 840 Pro. The EVO doesn't look like a mainstream drive here at all.

Desktop Iometer - 4KB Random Write

Even peak random write performance is dangerously close to the 840 Pro. Only the 120GB drive shows up behind the pack. I should add that I'll have to redo the way we test 4KB random writes given how optimized current firmwares/architectures have become. The data here is interesting but honestly the performance consistency data from earlier is a better look at what happens to 4KB random write performance over time.

Desktop Iometer - 4KB Random Write (QD=32)

The relatively small difference between QD3 and QD32 random write performance shows you just how good of a job Samsung's controller is doing at write combining. At high queue depths the EVO is just as fast as the 840 Pro here. So much for TLC being slow.

Sequential Read/Write Speed

To measure sequential performance I ran a 1 minute long 128KB sequential test over the entire span of the drive at a queue depth of 1. The results reported are in average MB/s over the entire test length.

Sequential read and write performance, even at low queue depths, is very good on the EVO. You may notice lower M500 numbers here than elsewhere; the explanation is pretty simple. We run all of our read tests after valid data has been written to the drive. Unfortunately the M500 attempts to aggressively GC data on the drive, so even though we fill the drive and then immediately start reading back, the M500 is already working in the background, which reduces overall performance here.

Desktop Iometer - 128KB Sequential Read

Desktop Iometer - 128KB Sequential Write

AS-SSD Incompressible Sequential Read/Write Performance

The AS-SSD sequential benchmark uses incompressible data for all of its transfers. The result is a pretty big reduction in sequential write speed on SandForce based controllers.

Incompressible Sequential Read Performance - AS-SSD

Incompressible Sequential Write Performance - AS-SSD

 



Performance vs. Transfer Size

ATTO is a useful tool for quickly measuring the impact of transfer size on performance. You can get the complete data set in Bench.

I pointed this out in the 4KB random write section, but Samsung continues to do a great job of dealing with low queue depth transfers on the EVO. Performance is consistently great across all of the EVO capacity points. There's also no difference between the EVO's behavior here and the 840 Pro.



AnandTech Storage Bench 2011

Two years ago we introduced our AnandTech Storage Bench, a suite of benchmarks that took traces of real OS/application usage and played them back in a repeatable manner. I assembled the traces myself out of frustration with the majority of what we have today in terms of SSD benchmarks.

Although the AnandTech Storage Bench tests did a good job of characterizing SSD performance, they weren't stressful enough. All of the tests performed less than 10GB of reads/writes and typically involved only 4GB of writes specifically. That's not even enough to exceed the spare area on most SSDs. Most canned SSD benchmarks don't even come close to writing a single gigabyte of data, but that doesn't mean that simply writing 4GB is acceptable.

Originally I kept the benchmarks short enough that they wouldn't be a burden to run (~30 minutes) but long enough that they were representative of what a power user might do with their system.

Not too long ago I tweeted that I had created what I referred to as the Mother of All SSD Benchmarks (MOASB). Rather than only writing 4GB of data to the drive, this benchmark writes 106.32GB. It's the load you'd put on a drive after nearly two weeks of constant usage. And it takes a *long* time to run.

1) The MOASB, officially called AnandTech Storage Bench 2011 - Heavy Workload, mainly focuses on the times when your I/O activity is the highest. There is a lot of downloading and application installing that happens during the course of this test. My thinking was that it's during application installs, file copies, downloading and multitasking with all of this that you can really notice performance differences between drives.

2) I tried to cover as many bases as possible with the software I incorporated into this test. There's a lot of photo editing in Photoshop, HTML editing in Dreamweaver, web browsing, game playing/level loading (Starcraft II & WoW are both a part of the test) as well as general use stuff (application installing, virus scanning). I included a large amount of email downloading, document creation and editing as well. To top it all off I even use Visual Studio 2008 to build Chromium during the test.

The test has 2,168,893 read operations and 1,783,447 write operations. The IO breakdown is as follows:

AnandTech Storage Bench 2011 - Heavy Workload IO Breakdown
IO Size % of Total
4KB 28%
16KB 10%
32KB 10%
64KB 4%

Only 42% of all operations are sequential, the rest range from pseudo to fully random (with most falling in the pseudo-random category). Average queue depth is 4.625 IOs, with 59% of operations taking place in an IO queue of 1.

Many of you have asked for a better way to really characterize performance. Simply looking at IOPS doesn't really say much. As a result I'm going to be presenting Storage Bench 2011 data in a slightly different way. We'll have performance represented as Average MB/s, with higher numbers being better. At the same time I'll be reporting how long the SSD was busy while running this test. These disk busy graphs will show you exactly how much time was shaved off by using a faster drive vs. a slower one during the course of this test. Finally, I will also break out performance into reads, writes and combined. The reason I do this is to help balance out the fact that this test is unusually write intensive, which can often hide the benefits of a drive with good read performance.

There's also a new light workload for 2011. This is a far more reasonable, typical every day use case benchmark. Lots of web browsing, photo editing (but with a greater focus on photo consumption), video playback as well as some application installs and gaming. This test isn't nearly as write intensive as the MOASB but it's still multiple times more write intensive than what we were running in 2010.

As always I don't believe that these two benchmarks alone are enough to characterize the performance of a drive, but hopefully along with the rest of our tests they will help provide a better idea.

The testbed for Storage Bench 2011 has changed as well. We're now using a Sandy Bridge platform with full 6Gbps support for these tests.

AnandTech Storage Bench 2011 - Heavy Workload

We'll start out by looking at average data rate throughout our new heavy workload test:

Heavy Workload 2011 - Average Data Rate

In lighter workloads than our 2013 workload the EVO still does incredibly well.

Heavy Workload 2011 - Average Read Speed

 

Heavy Workload 2011 - Average Write Speed

AnandTech Storage Bench 2011 - Light Workload

Our new light workload actually has more write operations than read operations. The split is as follows: 372,630 reads and 459,709 writes. The relatively close read/write ratio does better mimic a typical light workload (although even lighter workloads would be far more read centric).

The I/O breakdown is similar to the heavy workload at small IOs, however you'll notice that there are far fewer large IO transfers:

AnandTech Storage Bench 2011 - Light Workload IO Breakdown
IO Size % of Total
4KB 27%
16KB 8%
32KB 6%
64KB 5%

Light Workload 2011 - Average Data Rate

Light Workload 2011 - Average Read Speed

Light Workload 2011 - Average Write Speed

 



Power Consumption

Low power consumption has always been a staple of Samsung's SSDs, and the EVO is no different. Idle and load power are among the best here. I'm also expanding our DIPM testing, first introduced in the SanDisk Extreme II review:

We're introducing a new part of our power consumption testing with this review: measurement of slumber power with host initiated power management (HIPM) and device initiated power management (DIPM) enabled. It turns out that on Intel desktop platforms, even with HIPM and DIPM enabled, SSDs will never go into their lowest power states. In order to get DIPM working, it seems that you need to be on a mobile chipset platform. I modified an ASUS Zenbook UX32VD to allow me to drive power to the drive bay from an external power supply/power measurement rig. I then made sure HIPM+DIPM were enabled, and measured average power with the drive in an idle state. The results are below:

SSD Slumber Power (HIPM+DIPM)

The EVO is almost as good as the Pro from a slumber power perspective, and significantly better than anything else in the list here.

Drive Power Consumption - Idle

Drive Power Consumption - Sequential Write

Drive Power Consumption - Random Write



Final Words

I was extremely excited about Crucial's M500 because it was the first reasonably priced ~1TB SSD. Even though its performance wasn't class leading, it was honestly good enough to make the recommendation a no-brainer. The inclusion of features like eDrive support were just the icing on the cake. With the EVO, Samsung puts forth a formidable competitor to the M500. It's faster, uses less power at idle and carries lower MSRPs for most of the capacity range. Microsoft's eDrive standard isn't supported at launch, but Samsung expects to change that via a firmware update this September.

Endurance isn't a concern with TLC for client workloads, although I wouldn't recommend deploying the EVO in a write heavy database server or anything like that.

The additional features that Samsung threw in the pot this round really show some innovative thinking. TurboWrite does a good job of blurring the lines between MLC and TLC performance, while Samsung's RAPID DRAM cache offers adventurous users a way of getting a taste of high-end PCIe SSD performance out of an affordable TLC SATA drive.

The 1TB version is exciting because it offers a competitive price with the 960GB M500 but with better performance. It's also good to have an alternative there as the 960GB M500 has been supply constrained at times. At first I didn't believe that Samsung's TLC strategy could hold weight against the Intel/Micron approach of aggressively pursuing smaller process nodes with MLC NAND, but the EVO does a lot to change my opinion. I'd have no issues with one of these drives in my system even as primary storage. The performance story is really good (particularly with the larger capacities), performance consistency out of the box is ok (and gets better if you can leave more free space on the drive) and you've got Samsung's firmware expertise supporting you along the way as well.

To say that I really like the EVO is an understatement. If Samsung can keep quantities of the 840 EVO flowing, and keep prices at or below its MSRP, it'll be a real winner and probably my pick for best mainstream SSD.
