Senior Editor Gordon Mah Ung displays his characteristic steely resolve while torturing a system with the Maximum PC benchmark suite.
Maximum PC Benchmarks
Effective April 2005
It’s been almost a year and a half since we last updated the zero-point system and benchmark suite that we judge all systems against. For 2005, we’re updating all but two of our tests to stress the current generation of hardware.
We’ve now officially chucked MusicMatch, which we used to encode WAV files to high-quality MP3 files. We’ve always leaned away from synthetic tests because hardware vendors are too smart. If you use an artificial test developed by a benchmarking house, vendors start to optimize their drivers to perform well in those tests. And believe it or not, that doesn’t always benefit the consumer. There have been documented cases of optimizations for synthetic tests that have adversely affected real-world performance. That’s obviously not the way it should be. If there are driver optimizations (some would go so far as to label them “cheats”), they should at least benefit the consumer.
All but one of the benchmarks adhere to our philosophy of real-world testing, and by “real world,” we mean they’re based on tasks and activities our readers will want to perform, conducted with tools and games that are readily accessible.
OUR NEW TESTS
BAPCo’s SYSmark2004
We’re sticking with BAPCo’s popular tool for measuring a PC’s performance in everyday chores (the nonprofit consortium’s name stands for Business Applications Performance Corporation). SYSmark2004 uses common applications such as Microsoft’s Word and Excel, Adobe’s Photoshop and After Effects, and Discreet’s 3ds max.
In the benchmark routine, the office applications are used to check e-mail, update calendars, and create Word and PowerPoint documents. The creative apps are used to construct a website, create 2D and 3D objects, and produce a low-res movie.
We’ve found SYSmark’s unique “think time” feature to be revealing. Old-fashioned application benchmarks time how fast Word scrolls through a long document, or how fast the machine can enter text. The problem with this approach is that human beings can’t type at 8,000wpm or work at hyper speed. Beginning with SYSmark2002, BAPCo’s benchmark began timing only how long it took to complete certain tasks that normally have you waiting.
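The think-time idea can be illustrated with a minimal sketch. This is not BAPCo’s implementation; the function name, the (think_seconds, task) step format, and the injectable clock and pause parameters are all assumptions made for illustration. The point is simply that user pauses happen outside the timed region, so only the waiting counts.

```python
import time


def measure_response_time(steps, clock=time.perf_counter, pause=time.sleep):
    """Sum only the time spent waiting on tasks, ignoring user think time.

    steps is a list of (think_seconds, task) pairs, where task is a
    zero-argument callable standing in for an application operation.
    The clock and pause hooks are hypothetical conveniences for testing.
    """
    waited = 0.0
    for think_seconds, task in steps:
        pause(think_seconds)      # simulated user pause, deliberately untimed
        start = clock()
        task()                    # the operation the user actually waits on
        waited += clock() - start
    return waited
```

Under this sketch, a task that takes two seconds adds two seconds to the total whether the simulated user paused for one second or ten beforehand.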
SYSmark2004 is the latest iteration of that philosophy, and it’s probably the most accurate measure of how fast a computer is at working with applications, because both AMD and Intel are now members of the BAPCo consortium. Certain applications, such as WinZip and McAfee VirusScan 7, run concurrently with other applications. When dual-core CPUs ship sometime this year, we should see a boost in scores during this multitasking element of the benchmark.
SYSmark2004 reports a right-brain score for “Internet Content Creation,” a left-brain score for “Office Productivity,” and an overall score. The base score of 100 was achieved with a 2GHz P4 Northwood in a system equipped with 512MB of DDR266 running in single-channel mode and an 80GB IBM 7,200rpm DeskStar. Our new zero-point system is roughly 100 percent faster than that.
We’ve found that SYSmark2004 is most sensitive to CPU processing power. After that, memory bandwidth, total RAM, hard drive I/O, and 2D videocard performance have their impact on the final score. We run SYSmark2004 with BAPCo’s Patch 2 applied to resolve incompatibilities with service packs from Microsoft.
Adobe Premiere Pro
Our Premiere Pro test remains as it’s been since the days of Premiere 6.0. We take an AVI sequence shot with a Canon XL1, run through several transitions, add a soundtrack, and export it all to a generic MPEG-2 file using the Adobe Media Encoder. Our Premiere benchmark is primarily a measurement of processor speed, and it has traditionally favored short-pipeline processors such as the Athlon XP, Pentium M (the CPU in Centrino notebooks), and Athlon 64/FX.
With the move to Premiere Pro, however, Adobe rewrote and recompiled most of the program’s core. Premiere Pro now heavily favors the Pentium 4 architecture and enjoys a significant boost from both Hyper-Threading and multi-processor systems. Adobe’s video-editing program will probably be the first of our homegrown benchmarks to get a face-lift. We’re planning to tweak our test to use video from an HD DV cam, and we’ll integrate video filters and transitions that use both the processor and the videocard’s GPU.
Adobe Photoshop CS
We’re trading Photoshop 7 for Adobe’s latest uber image editor, Photoshop CS, but we’ll use the same basic action script we adopted during the heat of the Apple-versus-PC wars. Apple’s claim of superiority over the PC is based partially on Photoshop filters, but the trouble with Apple’s argument is that its tests encompass more fine print than a used-car contract. If you run this test on the second and third Tuesdays of a leap year, with this specific filter turned on, while standing on your head with a ham-and-peanut-butter sandwich in your left pocket, then lo and behold, the Mac is faster. We decided to run ’em all and let the Almighty sort it out.
We updated from Photoshop 7 to Photoshop CS and experienced a drop in performance. Evidence of bloatware, perhaps?
We take a JPG shot with a Canon EOS 10D and apply every freakin’ filter available. This test thrashes the CPU, thrashes main memory, and is also slightly disk-intensive. Interestingly enough, we measured a performance drop moving from version 7 to the CS version, even though we’re running the exact same action script. We plan to update the test when the new version of CS is available, and we’ll probably move to a larger file size to take state-of-the-art digital cameras into account.
DivX Encoding
We dumped our MP3-encoding test in favor of pure video encoding, because it taxes today’s hardware just that much more. It can also turn into a massive time-sink, whether you’re recoding a DVD you own for portable consumption or recoding a video that you’ve downloaded from the Internet.
For our test, we take a featurette from a DVD that we own (yes, we said own) and use DVDTox.com’s #1 DVD Ripper to encode it to DivX (version 5.2.1). To take the optical drive out of the equation, we move the VOB file to the hard drive, where we do a second-pass encode. Right now, the test seems to slightly favor the Pentium 4, but not by much. This test is CPU-intensive, RAM-intensive, and it touches the hard drive, too. It’s slightly less of an overall system test than Photoshop CS, but it is real-world and it’s still pretty rigorous even for today’s fastest systems. It takes roughly 30 minutes to encode our video, and this time should shrink as faster CPUs are brought to market.
3DMark2005
FutureMark’s 3DMark2005 is the most punishing graphics test available. It makes today’s most advanced games look old and decrepit. For our purposes, we run the test with patch 1.2.0 and look at only Game 3. As we noted above, 3DMark2005 is a synthetic benchmark—no commercial game uses its engine—and it’s no wonder: With resolution set to 1280x1024, and 4x antialiasing and 4x anisotropic filtering enabled, even our top-speed, SLI-equipped Athlon 64 FX machine just barely kicks out playable frame rates.
Although this benchmark isn’t indicative of what today’s games can do, it’s probably an accurate simulation of the graphics power that games will demand 18 months from now. This benchmark is mostly a GPU test, but memory bandwidth and overall system speed are also factors. Our zero-point system kicks out a good—but not great—29.3fps.
We would like to use Valve Software’s Half-Life 2 as a DirectX test, but because the game is automatically updated via Steam each time it’s installed on a new system, it’s useless as a repeatable benchmark. If Valve ever creates a benchmark version of the game down the road, we’ll definitely consider adding it to our suite.
Doom 3
Doom 3 performance isn’t important just because it’s a high-profile game from id Software; it’s important because of the myriad games that will use the same engine. If your PC can deliver smooth frame rates playing Doom 3 now, it’ll give you butter when games such as Quake IV reach store shelves.
We decided to run Doom 3 at 1600x1200 resolution as a baseline, to suit the higher resolution of today’s LCD monitors. We also set the game to “high quality” and ran it with 4x antialiasing and 4x anisotropic filtering. The game has an “ultra quality” setting, for cards with 512MB frame buffers, but we decided that might be going a little too far, given that cards with that much local memory aren’t yet available. For the record, our SLI rig plays Doom 3 just fine with that setting.
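Because frame rates only mean something when the settings match, it can help to write the test configuration down as data. The sketch below records the resolution, antialiasing, and filtering values stated in this article for the two graphics tests. The class, field names, and the comparable helper are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphicsTestSettings:
    """Settings for one graphics test (field names are illustrative)."""
    width: int
    height: int
    antialiasing: int      # 4 means 4x antialiasing
    anisotropic: int       # 4 means 4x anisotropic filtering
    quality: str = ""      # in-game preset, where the test has one


# Values as stated in the article text.
THREE_D_MARK = GraphicsTestSettings(1280, 1024, 4, 4)
DOOM_3 = GraphicsTestSettings(1600, 1200, 4, 4, quality="high")


def comparable(a, b):
    """Two runs are comparable only if every recorded setting matches."""
    return a == b
```

A run at the “ultra quality” preset, for example, would compare unequal to DOOM_3 and so would not be judged against the same zero-point number.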
Our new test beds can get 77fps out of Doom 3, even at 1600x1200 with 4x AA.
For our tests, we patch Doom 3 to version 1.1 and use the default demo1. Our zero-point system delivers 77.1fps, which we believe is the minimum frame rate that looks smooth.
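To show how a review system might be compared against these zero-point numbers, here is a minimal sketch. The zero-point values are the ones quoted in this article. The straight-ratio formula, the function name, and the table layout are assumptions, not a description of how the magazine actually computes its percentages. Frame rates are treated as higher-is-better and encode time as lower-is-better.

```python
# Zero-point results quoted in the article: (value, lower_is_better).
ZERO_POINT = {
    "3DMark2005 Game 3 (fps)": (29.3, False),
    "Doom 3 (fps)": (77.1, False),
    "DivX encode (minutes)": (30.0, True),   # "roughly 30 minutes"
}


def percent_vs_zero_point(test, result):
    """Return how much faster (+) or slower (-) a result is, in percent.

    Assumes a plain ratio against the zero-point; for timed tests the
    ratio is inverted so that a shorter time yields a positive number.
    """
    baseline, lower_is_better = ZERO_POINT[test]
    ratio = baseline / result if lower_is_better else result / baseline
    return (ratio - 1.0) * 100.0
```

With this formula, a hypothetical system that finishes the DivX encode in 20 minutes would come out 50 percent faster than the zero-point, and one that matches 29.3fps in 3DMark2005 would come out at zero.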
NEW TEST BEDS
PCI Express and SLI should ensure a long life for our new test rigs
We considered just upgrading our old zero-points, but after considering the benchmark scores that new machines are hitting, and what they’re likely to hit in the next 12 to 24 months, we decided a complete overhaul was in order. The old machines, while still serviceable, wouldn’t have enough power to keep review systems from doubling or tripling our zero-point score. The fact that our old test systems used AGP, and offered no option for PCI Express graphics cards, had us boxed into a corner, as well; our old zero-points simply wouldn’t allow us to test new videocards.
It didn’t take long to nail down a configuration for our new rigs. We needed PCI Express, and we had to have SLI capability. That left us with one option: Asus’ A8N-SLI Deluxe—as of press time, it’s the only SLI motherboard available.
Our previous fleet of test beds was the first to consist of non-Intel chipsets and CPUs. Although we were initially worried about the nVidia nForce3 Pro 150 chipset’s unproven track record, we experienced zippo problems during the past year and a half. We expect similar trouble-free performance from the nForce4 SLI chipset that will serve as the foundation for our new test beds.
Once we chose the motherboard, picking the processor was a no-brainer: We went with AMD’s Athlon 64 FX-55. We crowned AMD’s CPU the speed king several months ago, and its status hasn’t changed since. This Socket 939 proc features 1MB of L2 cache, an on-die memory controller, and support for 64-bit extensions. We won’t be running 64-bit Windows as our standard OS, but having the option gives us a lot more flexibility than any current P4s can provide. We bumped main system RAM up a notch by requisitioning two 1GB “sticks” of Crucial Ballistix Tracer DDR400.
The graphics department was equally easy to spec out. Our zero-point reference system will house two nVidia GeForce 6800 Ultra videocards running in SLI. We won’t configure every Lab system with dual videocards, only those machines that will be used as the basis of comparison for system reviews.
Mass-storage duties fall to Maxtor’s impressive 250GB DiamondMax 10 hard drive. This drive has a huge 16MB buffer, supports NCQ (native command queuing), and is natively SATA; that is, it doesn’t use a PATA-to-SATA chip as most first-gen SATA drives do. It will pair up well with the nForce4 SLI chipset, which features native SATA support.
In the past, we’ve conducted all hardware testing—with the notable exception of hard drives—on test beds based on an Athlon 64 CPU and nForce3 chipset. These systems don’t support SATA, though, so we benchmarked hard drives using test beds based on a Pentium 4 CPU and Intel 865P chipset. We later migrated to Intel’s newer 925XE chipset, because it supports additional SATA features (including NCQ). Because our latest test beds support both NCQ and SATA 3G drives, we’ll transition hard-drive testing to that platform, too.
We haven’t given up on soundcards just yet, so we’ve chosen Creative Labs’ Audigy 2 ZS card as our audio standard bearer. We decided against graduating to the more-expensive Audigy 4 Pro because the only significant differences between the two generations are improved DACs and a fancier break-out box.
It seems a safe bet that 2005 will be the year in which everyone finally pays attention to power supplies, but we’ll stick with PC Power and Cooling’s Turbo-Cool 510 Express and Turbo-Cool 510 SLI models. They get just a tad loud under full load, but supply stable, reliable power to the wide assortment of hardware we review; besides, we’ve never had one expire on us.