NVIDIA RTX 3080: Performance Test

Gary Baker

September 17th 2020

The launch of the RTX 30 series graphics cards in the early hours of September 2nd was blockbuster news for technology enthusiasts. After several postponements, the 30 series finally arrived at the end of the official countdown, and the launch event surprised users all over the world. On one hand, there is the performance: the generational doubling last seen with the 10 series has appeared again with the 30 series. On the other hand, there is the price: double the performance with no price increase, which is enough to make anyone celebrate.

Twenty-one days and 21 years: NVIDIA did not make us wait through the 21-day countdown in vain, and over these 21 years we have witnessed NVIDIA's brilliant achievements in computer graphics.

As early as two months before the launch, a steady stream of rumors, some true and some false, had been leaking out: first that "the 3090 will arrive this year and replace the previous TITAN", then specific figures such as "the 3090 has 5248 CUDA cores", and then that "the power connector changes to a single 12-pin". It was hard to tell what was true.

The main improvements of this 30 series graphics card

At the September 2 launch event, Mr. Jensen Huang emphasized more than once that "this is the greatest performance improvement ever." Judging from what was shown, the RTX 30 series really does offer double the performance with no price increase. The most direct change brought by the second-generation RTX architecture, Ampere, is a sharp jump in performance, which also explains the smokescreen of conflicting rumors before the event. Below is our first review of the NVIDIA GeForce RTX 3080.

NVIDIA GeForce RTX 3080 appearance

Let's take a look at the appearance of the NVIDIA RTX 3080. The outer packaging keeps NVIDIA's usual minimalist style: a square cardboard box, mainly black with rose-gold accents. Unusually, NVIDIA has not used any green this time, and the overall look is somewhat reminiscent of the Tesla V100.

Outer packaging and graphics card

After unboxing the graphics card, the first impression is of a very premium feel; it can be called a model of industrial design. At the launch event we already saw that the RTX 30 series has changed greatly in appearance, with a large part of the card body covered by heat sink fins.

In hand, all of the cooling fins turn out to have a matte coating, which makes them feel warmer to the touch. The shroud is wrapped in a large area of metal with a frosted surface finish.

The heat sink fins all have a matte coating

The first impression of NVIDIA's RTX 3080 in the hand is one of perfection; this is a work of art. We have marveled at the workmanship of Founders Edition cards in past reviews, but this time the large metal surfaces are blended so cleverly that the card combines rigidity with softness. A great deal of effort must have gone into the design from the very beginning, because if this effect had gone wrong the card would have looked like a lump of iron.

GeForce RTX 3080 appearance display

The appearance of the RTX 30 series had to change this much because of the radically new cooling design. The RTX 3080 uses a dual axial fan design, with its two fans placed one toward the front of the card and one toward the back. According to official data, airflow is 55% higher than in the previous design, cooling efficiency is 30% higher, and the card is up to three times quieter.

Cooling system schematic

The working principle is shown in the figure above. This is also the first time an NVIDIA graphics card's cooling system has been designed to work together with the overall airflow of the chassis.

How the cooling system works

In the new cooling system, one fan draws in cool outside air, pushes it across the GPU, and exhausts the hot air directly out of the back of the chassis. The other fan, a pull-through fan at the rear of the card, also draws in cool air, but passes it through the fins on the heat pipes, after which the chassis's own airflow carries it toward the back of the chassis.

PCB version comparison

NVIDIA has also made major changes to the PCB inside the card. To fit the new cooling system, it uses an ultra-high-density PCB with a "V" shape at the front end, which is 50% smaller than before.

In the figure you can see how densely the components are arranged on the board, with the RTX 3080 GPU in the middle, 10 memory chips around it, and two unpopulated memory positions.

GeForce RTX 3080 PCB, full view

The 18-phase power delivery is arranged on the left and right sides of the chip, with tantalum capacitors at the corners. The power connector sits at the upper right of the board, where there is only just enough room for a single connector. The PCB has almost no spare space anywhere.

Power supply adapter cable included

Because this Founders Edition card uses a single 12-pin power connector, an adapter cable is included in the box so that players can use their existing power supplies. The adapter converts two 8-pin connectors to the single 12-pin, but because of the direction the connector faces, the cable ends up partly blocking the "GeForce RTX" logo, which is a small flaw.

Changes brought by the NVIDIA Ampere architecture

NVIDIA calls this the "greatest performance increase ever", so let's see what NVIDIA Ampere changes compared with the first-generation RTX architecture, Turing.

First-generation RTX architecture: Turing

Second-generation RTX architecture: Ampere

First, a brief review of the slides from the September 2 launch event. Compared with the original Turing RTX architecture, NVIDIA Ampere doubles shader throughput: it performs two shader operations per clock where Turing performs one, and shader performance reaches 30 TFLOPS of single-precision compute against 11 TFLOPS for Turing.

Ampere also doubles the throughput of ray-triangle intersection testing: its RT Cores reach 58 RT TFLOPS, against 34 RT TFLOPS for Turing.

In addition, the new Tensor Cores can automatically identify and remove less important DNN weights. They process sparse networks at twice the rate of Turing, with up to 238 Tensor TFLOPS against 89 Tensor TFLOPS for Turing.

Chip description

The new NVIDIA Ampere GPU has 28 billion transistors and a die area of 628 square millimeters. It is built on a Samsung 8nm process customized for NVIDIA and paired with GDDR6X memory from Micron. As noted above, all three types of processing core are rated at up to twice the throughput of their Turing counterparts, which makes Ampere NVIDIA's most powerful GPU architecture to date.

Ampere's performance was not achieved overnight; the Turing architecture of the 20 series laid indispensable groundwork. Let's take a look at the complete GA102 core.

Complete GA102 core

The complete GA102 GPU consists of 7 GPCs (graphics processing clusters), 42 TPCs (texture processing clusters) and 84 SMs (streaming multiprocessors). The GPC is the top-level hardware block and contains all the key graphics processing units; each GPC has its own dedicated raster engine. In the new NVIDIA Ampere architecture, each GPC also contains two ROP partitions, each with 8 ROP units. Let's take a look at what has changed in each SM.
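Before moving on, the hierarchy figures above can be cross-checked with a little arithmetic. This is only a sketch built from the numbers quoted in the text, assuming the units are spread evenly across the GPCs; the variable names are illustrative.

```python
# Figures quoted in the text for the complete GA102 die.
GPCS = 7
TPCS = 42
SMS = 84
ROP_PARTITIONS_PER_GPC = 2
ROPS_PER_PARTITION = 8

# Assumption: TPCs and SMs are distributed evenly across the GPCs.
tpcs_per_gpc = TPCS // GPCS                      # 6 TPCs in each GPC
sms_per_tpc = SMS // TPCS                        # 2 SMs in each TPC
sms_per_gpc = tpcs_per_gpc * sms_per_tpc         # 12 SMs in each GPC
total_rops = GPCS * ROP_PARTITIONS_PER_GPC * ROPS_PER_PARTITION

print(f"TPCs per GPC: {tpcs_per_gpc}")
print(f"SMs per TPC:  {sms_per_tpc}")
print(f"SMs per GPC:  {sms_per_gpc}")
print(f"ROPs on the full die: {total_rops}")
```

Under the same rule of 16 ROPs per GPC, the 96 raster units that GPU-Z reports for the RTX 3080 later in this review would correspond to six active GPCs.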

SM in detail

Each SM has four processing partitions with a total of 128 CUDA cores, four third-generation Tensor Cores, one second-generation RT Core, a 256 KB register file, and 128 KB of L1 cache. The L1 cache can be reconfigured to suit different workloads, to maximize efficiency.

As everyone knows, the CUDA core count of the RTX 3080 has soared to 8704, and that of the RTX 3090 has reached an astonishing 10,496. Yet the GA100 die in the professional compute card, the Tesla A100, has a larger area and more transistors, and even in theory has only 8192 CUDA cores. How does the RTX 3080 manage this?

The answer is that Ampere's SM doubles the number of FP32 arithmetic units per SM compared with Turing.

Complete GeForce RTX 3080 core

A graphics card's CUDA core count is not the sum of every unit in the SM; only the FP32 units are counted. So the answer is clear: the FP32:INT32 ratio in the SM has changed from 1:1 to 2:1. Take the 8704 CUDA cores of the RTX 3080: it actually has only 4352 INT32 units, but because the number of FP32 units has doubled, the count reaches 8704.
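The counting rule is easy to verify from the numbers already given. The sketch below assumes 128 FP32 units per Ampere SM (the figure quoted for the SM above) and a 2:1 FP32:INT32 ratio; the helper name `sm_count` is made up for this example.

```python
FP32_PER_SM = 128  # CUDA cores per Ampere SM, as quoted in the text

def sm_count(cuda_cores: int, fp32_per_sm: int = FP32_PER_SM) -> int:
    """Return how many SMs are needed for the given CUDA core count."""
    sms, remainder = divmod(cuda_cores, fp32_per_sm)
    if remainder:
        raise ValueError("core count is not a whole number of SMs")
    return sms

rtx3080_sms = sm_count(8704)        # active SMs on the RTX 3080
rtx3090_sms = sm_count(10496)       # active SMs on the RTX 3090
full_ga102_cores = 84 * FP32_PER_SM # all 84 SMs of the complete die
rtx3080_int32 = 8704 // 2           # 2:1 FP32:INT32 ratio

print(rtx3080_sms, rtx3090_sms, full_ga102_cores, rtx3080_int32)
```

So the RTX 3080's 8704 cores correspond to 68 of the 84 SMs on the complete die, and halving the count recovers the 4352 INT32 units mentioned above.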

But does this make the number an inflated spec? For current games, floating-point operations are used far more often than integer operations, so the doubled FP32 units can genuinely deliver up to double the performance.

How ray tracing works

With the Ampere architecture, NVIDIA officially introduced the second-generation RT Core. How does it differ from the first generation? First, recall how an RT Core works: the shader issues a ray tracing request and hands it to the RT Core, which performs two kinds of test, box intersection testing and triangle intersection testing. Traversal follows the BVH (bounding volume hierarchy): if the ray hits a box, the search narrows to the contents of that box and testing continues; if it hits a triangle, the result is returned for shading.

Intersection testing is the most time-consuming part of ray tracing, so improving ray tracing performance mainly means accelerating these two kinds of intersection test (BVH box and triangle).
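The two tests described above can be sketched in software. The following is a minimal illustration, assuming an axis-aligned bounding box hierarchy, the standard slab test for ray/box intersection and the Möller-Trumbore algorithm for ray/triangle intersection. The article does not say which methods the RT Core uses internally, and all names here (`Node`, `hits_box`, `hit_triangle`, `trace`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]
Triangle = Tuple[Vec, Vec, Vec]

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hits_box(origin, direction, lo, hi) -> bool:
    """Slab test: does the ray pass through the axis-aligned box [lo, hi]?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if abs(d) < 1e-12:               # ray is parallel to this pair of planes
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def hit_triangle(origin, direction, tri, eps=1e-9) -> Optional[float]:
    """Moller-Trumbore test: distance along the ray to the hit, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:                   # ray is parallel to the triangle
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0 or u > 1:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0 or u + v > 1:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None

@dataclass
class Node:
    lo: Vec
    hi: Vec
    children: List["Node"] = field(default_factory=list)
    triangles: List[Triangle] = field(default_factory=list)

def trace(node, origin, direction) -> Optional[float]:
    """Box hit: descend into the node. Triangle hit: report the distance."""
    if not hits_box(origin, direction, node.lo, node.hi):
        return None
    hits = [hit_triangle(origin, direction, tri) for tri in node.triangles]
    hits += [trace(child, origin, direction) for child in node.children]
    hits = [t for t in hits if t is not None]
    return min(hits) if hits else None

tri = ((0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 1.0, 5.0))
leaf = Node(lo=(0.0, 0.0, 5.0), hi=(1.0, 1.0, 5.0), triangles=[tri])
root = Node(lo=(-1.0, -1.0, 0.0), hi=(2.0, 2.0, 6.0), children=[leaf])

hit = trace(root, (0.25, 0.25, 0.0), (0.0, 0.0, 1.0))   # ray aimed at the triangle
miss = trace(root, (5.0, 5.0, 0.0), (0.0, 0.0, 1.0))    # ray outside the root box
print(hit, miss)
```

The second ray is rejected by a single box test at the root, without touching any triangle, which is the reason a hierarchy is used in the first place.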

Changes in RT Core

In Turing's RT Core, 5 BVH traversals, 4 BVH intersections and one triangle intersection can be completed per cycle. In the second-generation RT Core, NVIDIA has added a triangle position interpolation unit and an additional triangle intersection unit, with the aim of improving ray tracing performance for effects such as motion blur.

Motion blur rendering principle

The second-generation RT Core allows ray tracing and shading to run concurrently, so the more ray tracing a workload does, the greater the speedup. It doubles ray intersection throughput, and according to NVIDIA's own measurements, rendering images with motion blur is up to 8 times faster than on Turing.

Sparse deep learning

Besides the ray tracing improvements, the Tensor Cores of the Ampere architecture have also been greatly enhanced. The third-generation Tensor Core introduces sparsity acceleration, which can automatically identify and remove less important DNN (deep neural network) weights while still maintaining good accuracy.

The network is first trained with its original dense weight matrix, the less important weights are then pruned to leave a sparse matrix, and the sparse network is trained again to recover accuracy. The Tensor Core can skip the removed weights, which raises its throughput.
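The pruning step in the middle of that process can be illustrated with a few lines of code. This sketch keeps the two largest-magnitude weights in every group of four and zeroes the rest. The 2-out-of-4 pattern is an assumption here, since this article does not state which sparsity pattern the hardware expects, and the retraining step is left out; `prune_2_of_4` and the sample weights are hypothetical.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude weights in each group of four."""
    if len(weights) % 4:
        raise ValueError("weight count must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude weights in this group.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return pruned

# Hypothetical dense weights, for illustration only.
dense = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
sparse = prune_2_of_4(dense)
print(sparse)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.25, 0.0]
```

Half of the weights are now zero in a regular pattern, so hardware that understands the pattern only has to do half of the multiplications.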

The processing power of the third generation Tensor Core is greatly improved

The end result is that the Tensor Cores process sparse networks at twice the rate of Turing, with up to 238 Tensor TFLOPS of compute against 89 Tensor TFLOPS for Turing.

At the launch event, Jensen Huang also introduced a new technology, RTX IO. Many games today take up dozens of gigabytes, or even 100 GB, of installation space. Leaving aside the strain on storage, before the graphics card can use data stored on disk, the CPU must first read the compressed data from the drive, decompress it, and then send it to video memory.

Traditional data exchange

This process ties up multiple CPU cores, sharply raises CPU load and uses more system memory, while the GPU sits idle. RTX IO skips the step in which the CPU decompresses the data before passing it on: compressed data is read from the drive over the PCIe bus straight to the GPU, which performs the decompression, reducing CPU usage and improving performance.
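As a rough illustration of why moving decompression off the CPU helps, the sketch below compares how many bytes would cross the bus in each path. It uses Python's `zlib` purely as a stand-in; the compression format, the sample data and the variable names are assumptions for illustration and say nothing about how RTX IO or DirectStorage are actually implemented.

```python
import zlib

# Hypothetical stand-in for a game asset; real assets compress less uniformly.
asset = b"texture-block-0123456789" * 4096
compressed = zlib.compress(asset)

# Traditional path: the CPU decompresses, then sends full-size data to video memory.
cpu_output = zlib.decompress(compressed)
traditional_bus_bytes = len(cpu_output)

# RTX IO-style path: the compressed bytes travel to the GPU, which would
# decompress them itself (the GPU step is not modeled here).
direct_bus_bytes = len(compressed)

print(f"Traditional path: {traditional_bus_bytes} bytes sent to the GPU")
print(f"Direct path:      {direct_bus_bytes} bytes sent to the GPU")
```

In the second path the CPU never touches the decompression work, and less data has to be moved for the same asset.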

RTX IO greatly reduces the burden on the CPU

Of course, because it changes how the system handles I/O at a low level, this technology depends on Microsoft's DirectStorage. For games of today's size the benefit of RTX IO is limited, but if games of several hundred gigabytes become the norm, it will play a major role.

GDDR6X

The RTX 3080 uses GDDR6X memory on a 320-bit bus with a data rate of 19 Gbps per pin. Compared with the GDDR6 used on Turing, this is about 40% faster, and in each transfer cycle GDDR6X can carry twice as much data as GDDR6. This matters most for data-heavy workloads such as ray-traced games, AI training and 8K video rendering.

The newly added HDMI 2.1 port supports 8K video output over a single cable, whereas the previous-generation HDMI 2.0 tops out at 4K 60Hz, so connecting an 8K TV used to require multiple cables.

3DMark theoretical performance test

First, the test platform. To let the RTX 3080 perform at its best in this review, the motherboard and CPU are current desktop flagship parts, as listed below.

Test platform software and hardware configuration

Core components

Component        Brand      Model
CPU              Intel      i9-10900K
Motherboard      Asus       EXTREME ROG RAMPAGE 12
Graphics card    NVIDIA     GeForce RTX 3080 Founders Edition
RAM              Galaxy     HOF EXTREME 8GB DDR4 4266MHz*2
Power supply     CORSAIR    RM Series™ RM750
Monitor          Philips    275M1RZ, 279C9

Testing platform

Item                     Value
Operating system         Microsoft Windows 10
Motherboard driver       Intel chipset driver
Graphics driver          456.16
Graphics API             DirectX 12
Frame rate monitoring    FrameView, benchmark

For results, the synthetic tests use 3DMark, and the game tests use either the game's built-in benchmark or FrameView, averaging the frame rate over the same scene.
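When averaging a captured scene, it matters how the average is taken. The sketch below assumes a capture is available as a list of per-frame times in milliseconds (the sample values are made up, and this is not a description of FrameView's own file format or method). It divides the frame count by the total time, and shows that averaging per-frame FPS values instead gives a more flattering number.

```python
def average_fps(frame_times_ms):
    """Average FPS over a capture: frames rendered divided by elapsed seconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return len(frame_times_ms) / total_seconds

def naive_fps(frame_times_ms):
    """Mean of per-frame FPS values; this overweights the fast frames."""
    return sum(1000.0 / t for t in frame_times_ms) / len(frame_times_ms)

# Hypothetical capture: three 10 ms frames and one 20 ms stutter.
capture = [10.0, 10.0, 20.0, 10.0]

fps = average_fps(capture)
naive = naive_fps(capture)
print(f"frames / time: {fps:.1f} FPS, mean of per-frame FPS: {naive:.1f} FPS")
```

Four frames in 50 ms is 80 FPS, while the per-frame mean reports 87.5 FPS, so the two methods should not be mixed when comparing cards.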

RM Series™ RM750

The RM Series™ RM750 is a 750 W 80 PLUS® Gold power supply built for high-end gaming platforms. Its 750 W rating can meet the power demands of such systems, the 80 PLUS Gold efficiency rating means lower energy losses, and the fully modular cabling keeps the build tidy, which complements the 30 series graphics card.

GPU-Z parameters

First, the GPU-Z readout: the RTX 3080 uses the GA102 GPU on Samsung 8nm with a die area of 628 square millimeters and 8704 CUDA cores. The clock range is 1440-1710MHz, and the card carries 10GB of GDDR6X memory on a 320-bit bus, for a memory bandwidth of 760.3GB/s. It has 96 raster units (ROPs) and 272 texture units.
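The bandwidth and compute figures can be cross-checked from this readout. The sketch below assumes the usual conventions: peak bandwidth is bus width times per-pin data rate divided by 8, and each CUDA core performs one fused multiply-add (2 floating-point operations) per clock. The 19 Gbps per-pin rate is the GDDR6X speed quoted earlier, and the function names are illustrative only.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: pins * Gbit/s per pin, divided by 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

def fp32_tflops(cuda_cores: int, clock_mhz: float) -> float:
    """Peak FP32 throughput, counting one FMA (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * clock_mhz * 1e6 / 1e12

bandwidth = memory_bandwidth_gb_s(320, 19.0)   # 320-bit bus, 19 Gbps per pin
peak = fp32_tflops(8704, 1710)                 # 8704 CUDA cores at the 1710 MHz boost clock

print(f"Memory bandwidth: {bandwidth:.1f} GB/s")
print(f"Peak FP32: {peak:.2f} TFLOPS")
```

This gives 760.0 GB/s, close to the 760.3GB/s that GPU-Z reports, and about 29.8 TFLOPS, in line with the 30 TFLOPS shader figure from the launch slides.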

Next is the 3DMark Fire Strike (FS) suite, which measures theoretical DX11 performance: FS, FSE (Fire Strike Extreme) and FSU (Fire Strike Ultra) correspond to 1080P, 2K and 4K respectively. The measured scores are as follows:

3DMark Fire Strike suite test

In the 3DMark Fire Strike suite, which tests DX11 performance, the RTX 3080 scored 54% higher than the RTX 2080 in FS, 58% higher in FSE and 67% higher in FSU. Clearly, the higher the resolution, the larger the gap. The gap also widens with ray tracing and DLSS enabled, as we will show in detail below.

3DMark Time Spy suite test

In the Time Spy (TS) and Time Spy Extreme (TSE) tests of DX12 performance, the RTX 3080's TS score is 65% higher than the RTX 2080's, and its TSE score is 76% higher. The RTX 3080 performs particularly well under DX12.

3DMark ray tracing test

Port Royal is the 3DMark test dedicated to ray tracing performance. Here the RTX 3080 scores 79% higher than the RTX 2080.

Synthetic scores are an important measure of a graphics card's performance, but actual in-game frame rates are probably what players care about most. Let's move on to the game tests.

Game performance test

For the game tests we selected "Control", "Shadow of the Tomb Raider", "DOOM Eternal", "Wolfenstein: Youngblood", "Far Cry 5" and "Assassin's Creed Odyssey", along with the benchmark builds of the Chinese-developed games "Boundary" and "Bright Memory: Infinite". "Control" and "DOOM Eternal" have no built-in benchmark, so for them we use FrameView to average the frame rate over the same gameplay scene, which is inevitably less repeatable than a built-in benchmark.

"Control" game test

First up is "Control". "Control Ultimate Edition" is now listed on Steam. The game has excellent physical destruction and lighting effects, and because of the many options in its settings we split our test into 2 groups with 6 runs in total. The first group uses the highest quality preset with RTX OFF/DLSS OFF; the second uses the highest quality preset with RTX High/DLSS ON. The results are shown in the figure above.

At 1080P the RTX 3080 scores 58% and 55% higher than the RTX 2080 in the two groups; at 2K it is 73% and 68% higher; at 4K it is 71% and 84% higher. In general, the higher the resolution and the heavier the ray tracing load, the larger the RTX 3080's lead.

"Shadow of the Tomb Raider" game test

In "Shadow of the Tomb Raider", which also supports ray tracing and DLSS, we again ran 2 groups with 6 runs in total. The first group uses the highest quality preset with RTX OFF/DLSS OFF; the second uses the highest quality preset with RTX Ultra/DLSS ON. At 1080P the RTX 3080 is 30% and 45% faster than the RTX 2080; at 2K it is 57% and 55% faster; at 4K it is 70% and 65% faster. Above 1080P, the improvement falls between roughly 55% and 70%.

"DOOM Eternal" game test

"DOOM Eternal" is the latest entry in the Doom series. It is relatively undemanding on hardware and is all about fast, satisfying action. The RTX 3080 is 47% faster than the RTX 2080 at 1080P, 59% faster at 2K and 74% faster at 4K. However, because "DOOM Eternal" also lacks a built-in benchmark, we could only average over a gameplay scene, and the heavy smoke effects in that scene make the frame rates less precise, so treat these figures as a reference only.

"Assassin's Creed Odyssey" game test

Next is "Assassin's Creed Odyssey", nicknamed "all graphics cards are equal" for how hard it is on every GPU. Despite the nickname, the figure shows that there really is a graphics card that can stay above 60 frames per second at 4K. The RTX 3080 scores 38% higher than the RTX 2080 at 1080P, 42% higher at 2K and 54% higher at 4K.

"Far Cry 5" game test

"Far Cry 5" is also an AAA title, and a relatively well-optimized one. The RTX 3080 scores 20% higher than the RTX 2080 at 1080P, 61% higher at 2K and 92% higher at 4K.

Temperature and power consumption test

For the temperature and power tests, at a room temperature of 24°C, we used an open test bench rather than a fully enclosed chassis. This minimizes the influence of external factors such as case airflow, so the results reflect the card's own cooling.

Power consumption test

For the power test we used FurMark as the stress test, and only the power drawn by the graphics card itself is counted. The new NVIDIA Ampere card is clearly power-hungry. The two programs differ slightly at peak, but the overall average is between 310W and 315W.

As for temperature, the RTX 3080 still holds at about 75°C. The die area of the RTX 2080 is 545 square millimeters, while that of the RTX 3080 is 628 square millimeters, a full 15% larger, yet the temperature remains well controlled. NVIDIA has clearly put real work into the RTX 3080's cooling design.
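The 15% figure, like the percentage leads quoted throughout this review, is a simple relative increase. A minimal sketch, with a helper name chosen for this example:

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0

# Die areas in square millimeters, as quoted above.
area_gain = percent_increase(545, 628)
print(f"RTX 3080 die is {area_gain:.1f}% larger than the RTX 2080 die")
```

The same function applies to benchmark scores or frame rates: pass the RTX 2080 result as `old` and the RTX 3080 result as `new`.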

To Sum Up

On price, the 30 series is very fairly positioned. Compared with the RTX 2080, the RTX 3080 delivers up to nearly twice the performance at the same price. And don't forget where the RTX 3080 sits: it is the current flagship, with more performance than most players need.

Some players will ask whether such a "short life" makes the RTX 20 series a failed generation. I think not. Turing opened up a new world of ray tracing and AI for us, laid the foundation for the future direction of GPUs, and marked the real shift from stacking raw performance to qualitative change. Ampere stands on the shoulders of a giant, making the path opened by the previous generation wider and more solid.

© 2020–2022 Rocking Review