We looked at Geekbench 5,^{1} a benchmark for CPU performance. We combined Geekbench’s multi-core scores on its ‘Processor Benchmarks’ page^{2} with release dates and prices that we scraped from Wikichip and Wikipedia.^{3} All our data and plots can be found here.^{4} We then calculated score per dollar and adjusted for inflation using the consumer price index.^{5} For every year, we calculated the 95th percentile score per dollar. We then fit linear and exponential trendlines to those scores.
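In code, the procedure described above looks roughly like the following sketch (the function name and the use of NumPy are our own; this is not the original analysis code):

```python
import numpy as np

def yearly_p95_growth(years, score_per_dollar):
    """For each year, take the 95th percentile of score-per-dollar across
    hardware, then fit an exponential trend (a linear fit in log space).
    Returns the years, the percentile values, and the implied annual growth."""
    years = np.asarray(years)
    vals = np.asarray(score_per_dollar, dtype=float)
    uniq = np.unique(years)
    p95 = np.array([np.percentile(vals[years == y], 95) for y in uniq])
    slope, _ = np.polyfit(uniq, np.log(p95), 1)  # log p95 ≈ slope * year + const
    return uniq, p95, np.exp(slope) - 1.0        # e.g. 0.16 for 16%/year
```

A linear trendline can be fit the same way with `np.polyfit(uniq, p95, 1)`, without the log transform.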

Figure 1 shows all our data for Geekbench score per CPU price.

The data is well-described by a linear or an exponential trendline. Assuming an exponential trend,^{6} Geekbench score per CPU price grew by around 16% per year between 2006 and 2020, a rate that would yield a factor of ten every 16 years.^{7}
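The conversion between an annual rate and a time-to-a-factor-of-ten, used here and throughout these pages, is a one-line change of logarithm base (helper names are ours):

```python
import math

def years_per_factor_of_ten(annual_factor):
    """Years for a quantity changing by `annual_factor` each year to change
    tenfold. For 16%/year growth pass 1.16; for a price falling 36%/year,
    performance per dollar grows by 1/0.64 per year, so pass 1/0.64."""
    return math.log(10) / math.log(annual_factor)

def doubling_time_years(annual_factor):
    """Years for the same steady trend to produce a factor of two."""
    return math.log(2) / math.log(annual_factor)
```

For example, `years_per_factor_of_ten(1.16)` gives about 15.5 years, which rounds to the ~16 years quoted above; small differences from the page's figures are likely due to rounding of the underlying rates.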

This is markedly slower than CPU price-performance growth rates observed in the past, though since it is for a different performance metric from any used earlier, it is unclear how similar one should expect them to be. From 1940 to 2008, Sandberg and Bostrom found that CPU price performance grew by a factor of ten every 5.6 years when measured in MIPS per dollar, and by a factor of ten every 7.7 years when measured in FLOPS per dollar.^{8}

*Primary author: Asya Bergal*

DRAM, “dynamic random-access memory”, is a type of semiconductor memory. It is used as the main memory in modern computers and graphics cards.^{1}

We found two sources for historic pricing of DRAM. One was a dataset of DRAM prices and sizes from 1957 to 2018 collected by technologist and retired Computer Science professor^{2} John C. McCallum.^{3} The other dataset was extracted from a graph generated by Objective Analysis,^{4} a group that sells “third-party independent market research and data” to investors in the semiconductor industry.^{5} We have not checked where their data comes from and don’t have evidence about whether they are a trustworthy source.

Figure 1 shows McCallum’s data.^{6}

Figure 2 shows the average price per gigabyte of DRAM from 1991 to 2019, according to the Objective Analysis graph.^{8}

The two datasets appear to line up (see Figure 3 below),^{9} though we don’t know where the data in the Objective Analysis report came from: it could itself be referencing the McCallum dataset, or both could share data sources.

For both sources, the data appears to follow an exponential trendline. In the McCallum dataset, we calculate that the price / GB of DRAM has fallen at around 36% per year, for a factor of ten every 5.1 years and a doubling time of 1.5 years on average. The Objective Analysis data is similar, with the price / GB of DRAM falling around 33% per year, for a factor of ten every 5.8 years and a doubling time of 1.7 years.

The 1.5 and 1.7 year doubling times are close to the rate at which, according to Moore’s law, the number of transistors in an integrated circuit doubles.^{10} It seems possible to us that the cheaper and denser transistors described by this law are what enabled the falling prices of DRAM, though we haven’t investigated this theory.^{11}

Both datasets show slower progress in recent years. From 2010 onwards, the McCallum dataset falls in price by only 15% a year, for a rate that would yield a factor of ten every 14 years, and the Objective Analysis dataset falls by 12% a year, for a rate that would yield a factor of ten every 18.5 years.

*Primary author: Asya Bergal*

We estimate that GPU prices have fallen at rates that would yield an order of magnitude over roughly:

- 17 years for single-precision FLOPS
- 10 years for half-precision FLOPS
- 5 years for half-precision fused multiply-add FLOPS

GPUs (graphics processing units) are specialized electronic circuits originally used for computer graphics.^{1} In recent years, they have been popularly used for machine learning applications.^{2} One measure of GPU performance is FLOPS, the number of operations on floating-point numbers a GPU can perform in a second.^{3} This page looks at the trends in GPU price / FLOPS of theoretical peak performance over the past 13 years. It does not include the cost of operating the GPUs, and it does not consider GPUs rented through cloud computing.

‘Theoretical peak performance’ numbers appear to be determined by adding together the theoretical performances of the processing components of the GPU, which are calculated by multiplying the clock speed of the component by the number of instructions it can perform per cycle.^{4} These numbers are given by the developer and may not reflect actual performance on a given application.^{5}
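As a rough illustration of that calculation (the core count and clock below are ours, approximately matching a GTX 1080-class card; vendors may count operations slightly differently):

```python
def theoretical_peak_flops(cores, clock_hz, flops_per_core_per_cycle=2):
    """Peak FLOPS = cores × clock × FLOPs per core per cycle. The factor of
    2 assumes each core can issue one fused multiply-add (counted as two
    floating-point operations) per cycle."""
    return cores * clock_hz * flops_per_core_per_cycle

# Illustrative numbers (our assumption, not taken from a vendor sheet):
peak = theoretical_peak_flops(2560, 1.733e9)  # ≈ 8.9e12, i.e. ~8.9 TFLOPS
```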

We collected data on multiple slightly different measures of GPU price and FLOPS performance.

GPU prices are divided into release prices, which reflect the manufacturer suggested retail prices that GPUs are originally sold at, and active prices, which are the prices at which GPUs are actually sold over time, often by resellers.

We expect that active prices better represent the prices available to hardware users, but we also collected release prices as supporting evidence.

Several varieties of ‘FLOPS’ can be distinguished based on the specifics of the operations they involve. Here we are interested in single-precision FLOPS, half-precision FLOPS, and half-precision fused multiply-add FLOPS.

‘Single-precision’ and ‘half-precision’ refer to the number of bits used to specify a floating point number.^{6} Using more bits to specify a number achieves greater precision at the cost of more computational steps per calculation. Our data suggests that GPUs have largely been improving in single-precision performance in recent decades,^{7} and half-precision performance appears to be increasingly popular because it is adequate for deep learning.^{8}
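The precision difference is easy to see in NumPy: machine epsilon, the gap between 1.0 and the next representable number, is about a thousand times larger in half precision than in single precision.

```python
import numpy as np

# Half precision: 16 bits total, 10 fraction bits; single: 32 bits, 23 fraction bits.
half_eps = np.finfo(np.float16).eps    # 2**-10, about 1e-3
single_eps = np.finfo(np.float32).eps  # 2**-23, about 1.2e-7
```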

Nvidia, the main provider of chips for machine learning applications,^{9} recently released a series of GPUs featuring Tensor Cores,^{10} which Nvidia claims deliver “groundbreaking AI performance”. Tensor Core performance is measured in FLOPS, but Tensor Cores exclusively perform certain kinds of floating-point operations known as fused multiply-adds (FMAs).^{11} Performance on these operations is important for certain kinds of deep learning,^{12} so we track ‘GPU price / FMA FLOPS’ as well as ‘GPU price / FLOPS’.

In addition to purely half-precision computations, Tensor Cores are capable of performing mixed-precision computations, where part of the computation is done in half-precision and part in single-precision.^{13} Since explicitly mixed-precision-optimized hardware is quite recent, we don’t look at the trend in mixed-precision price performance, and only look at the trend in half-precision price performance.

Any GPU that performs multiple kinds of computations (single-precision, half-precision, half-precision fused multiply-add) trades off performance on one for performance on the others, because space on the chip is limited and transistors must be allocated to one type of computation or another.^{14} All current GPUs that perform half-precision or Tensor Core fused multiply-add computations also do single-precision computations, so they are splitting their transistor budget. For this reason, our impression is that half-precision FLOPS could be much cheaper now if entire GPUs were devoted to half-precision computation alone, rather than split across several types.

We collected data on theoretical peak performance (FLOPS), release date, and price from several sources, including Wikipedia.^{15} (Data is available in this spreadsheet). We found GPUs by looking at Wikipedia’s existing large lists^{16} and by Googling “popular GPUs” and “popular deep learning GPUs”. We included any hardware that was labeled as a ‘GPU’. We adjusted prices for inflation based on the consumer price index.^{17}

We were unable to find price and performance data for many popular GPUs and suspect that we are missing many from our list. In our search, we did not find any GPUs that beat our 2017 minimum of $0.03 (release price) / single-precision GFLOPS. We put out a $20 bounty on a popular Facebook group to find a cheaper GPU / FLOPS, and the bounty went unclaimed, so we are reasonably confident in this minimum.^{18}

Figure 1 shows our collected dataset for GPU price / single-precision FLOPS over time.^{19}

To find a clear trend for the prices of the cheapest GPUs / FLOPS, we looked at the running minimum prices every 10 days.^{20}
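A running minimum of this kind can be computed as follows (our own sketch, under our assumptions about the data layout; not the code used for the figures):

```python
import numpy as np

def running_minimum(day, price, step=10):
    """Cheapest price/FLOPS seen up to each date, sampled every `step` days."""
    order = np.argsort(day)
    d = np.asarray(day)[order]
    p = np.asarray(price, dtype=float)[order]
    run_min = np.minimum.accumulate(p)        # cheapest seen so far, in date order
    grid = np.arange(d[0], d[-1] + step, step)
    # index of the latest data point on or before each grid date
    idx = np.searchsorted(d, grid, side="right") - 1
    return grid, run_min[idx]
```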

The cheapest GPU price / FLOPS hardware using release date pricing has not decreased since 2017. However, there was a similar period of stagnation between early 2009 and 2011, so this may not represent a slowing of the trend in the long run.

Based on the figures above, the running minimums seem to follow a roughly exponential trend. If we do not include the initial point in 2007 (which we suspect was not in fact the cheapest hardware at the time), we find that the cheapest GPU price / single-precision FLOPS fell by around 17% per year, for a factor of ten in ~12.5 years.^{21}

Figure 3 shows GPU price / half-precision FLOPS for all the GPUs in our search above for which we could find half-precision theoretical performance.^{22}

Again, we looked at the running minimums of this graph every 10 days, shown in Figure 4 below.^{23}

If we assume an exponential trend with noise,^{24} cheapest GPU price / half-precision FLOPS fell by around 26% per year, which would yield a factor of ten after ~8 years.^{25}

Figure 5 shows GPU price / half-precision FMA FLOPS for all the GPUs in our search above for which we could find half-precision FMA theoretical performance.^{26} (Note that this includes all of our half-precision data above, since those FLOPS could be used for fused multiply-adds in particular). GPUs with Tensor Cores are marked in red.

Figure 6 shows the running minimums of GPU price / HP FMA FLOPS.^{27}

GPU price / Half-Precision FMA FLOPS appears to be following an exponential trend over the last four years, falling by around 46% per year, for a factor of ten in ~4 years.^{28}

GPU prices often go down from the time of release, and some popular GPUs are older ones that have gone down in price.^{29} Given this, it makes sense to look at active price data for the same GPU over time.

We collected data on peak theoretical performance in FLOPS from TechPowerUp^{30} and combined it with active GPU price data to get GPU price / FLOPS over time.^{31} Our primary source of historical pricing data was Passmark, though we also found a less trustworthy dataset on Kaggle which we used to check our analysis. We adjusted prices for inflation based on the consumer price index.^{32}

We scraped pricing data^{33} on GPUs between 2011 and early 2020 from Passmark.^{34} Where necessary, we renamed GPUs from Passmark to be consistent with TechPowerUp.^{35} The Passmark data consists of 38,138 price points for 352 GPUs. We guess that these represent most popular GPUs.

Looking at the ‘current prices’ listed on individual Passmark GPU pages, prices appear to be sourced from Amazon, Newegg, and Ebay. Passmark’s listed pricing data does not correspond to regular intervals. We don’t know if prices were pulled at irregular intervals, or if Passmark pulls prices regularly and then only lists major changes as price points. When we see a price point, we treat it as though the GPU is that price only at that time point, not indefinitely into the future.

The data contains several blips where a GPU is briefly sold unusually cheaply. Spot-checking some of these suggests that they correspond to single GPUs or small batches for sale, which we are not interested in tracking: we are trying to predict AI progress, which presumably isn’t influenced by temporary discounts on tiny batches of GPUs.

This Kaggle dataset contains scraped data of GPU prices from the price comparison sites PriceSpy.co.uk, PCPartPicker.com, and Geizhals.eu from the years 2013 – 2018. The Kaggle dataset has 319,147 price points for 284 GPUs. Unfortunately, at least some of the data is clearly wrong, potentially because price comparison sites include pricing data from untrustworthy merchants.^{36} As such, we don’t use the Kaggle data directly in our analysis, but do use it as a check on our Passmark data. The data that we get from Passmark appears to be roughly a subset of the Kaggle data from 2013 – 2018,^{37} which is what we would expect if the price comparison engines picked up prices from the merchants Passmark looks at.

There are a number of reasons why we think this analysis may in fact not reflect GPU price trends:

- We effectively have just one source of pricing data, Passmark.
- Passmark appears to only look at Amazon, Newegg, and Ebay for pricing data.
- We are not sure, but we suspect that Passmark only looks at the U.S. versions of Amazon, Newegg, and Ebay, and pricing may be significantly different in other parts of the world (though we guess it wouldn’t be different enough to change the general trend much).
- As mentioned above, we are not sure if Passmark pulls price data regularly and only lists major price changes, or pulls price data irregularly. If the former is true, our data may be overrepresenting periods where the price changes dramatically.
- None of the price data we found includes quantities of GPUs which were available at that price, which means some prices may be for only a very limited number of GPUs.
- We don’t know how much the prices from these datasets reflect the prices that a company pays when buying GPUs in bulk, which we may be more interested in tracking.

A better version of this analysis might start with more complete data from price comparison engines (along the lines of the Kaggle dataset) and then filter out clearly erroneous pricing information in some principled way.

The original scraped datasets with cards renamed to match TechPowerUp can be found here. GPU price / FLOPS data is graphed on a log scale in the figures below. Price points for the same GPU are marked in the same color. We adjusted prices for inflation using the consumer price index.^{38} All points below are in 2019 dollars.

To try to filter out noisy prices that didn’t last or were only available in small numbers, we took out the lowest 5% of data in every several day period^{39} to get the 95th percentile cheapest hardware. We then found linear and exponential trendlines of best fit through the available hardware with the lowest GPU price / FLOPS every several days.^{40}
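The filtering step can be sketched roughly as follows (our own reconstruction; the original's window handling may differ):

```python
import numpy as np

def p95_cheapest_per_window(days, price_per_flops, window=10):
    """Drop the cheapest 5% of points in each `window`-day bin, then take
    the cheapest remaining point: the '95th percentile cheapest' price."""
    days = np.asarray(days)
    vals = np.asarray(price_per_flops, dtype=float)
    out_d, out_p = [], []
    for start in range(int(days.min()), int(days.max()) + 1, window):
        mask = (days >= start) & (days < start + window)
        if not mask.any():
            continue
        v = np.sort(vals[mask])
        v = v[int(np.ceil(0.05 * len(v))):]  # discard the lowest 5%
        if len(v):
            out_d.append(start)
            out_p.append(v[0])
    return np.array(out_d), np.array(out_p)
```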

Figures 7-10 show the raw data, 95th percentile data, and trendlines for single-precision GPU price / FLOPS for the Passmark dataset. This folder contains plots of all our datasets, including the Kaggle dataset and combined Passmark + Kaggle dataset.^{41}

The cheapest 95th percentile data every 10 days appears to fit relatively well to both a linear and an exponential trendline. However, we assume that progress will follow an exponential, because previous progress has followed an exponential.

In the Passmark dataset, the exponential trendline suggests that from 2011 to 2020, 95th-percentile GPU price / single-precision FLOPS fell by around 13% per year, for a factor of ten in ~17 years,^{45} bootstrap^{46} 95% confidence interval 16.3 to 18.1 years.^{47} We believe the rise in price / FLOPS in 2017 corresponds to a rise in GPU prices due to increased demand from cryptocurrency miners.^{48} If we instead look at the trend from 2011 through 2016, before the cryptocurrency rise, we get that 95th-percentile GPU price / single-precision FLOPS fell by around 13% per year, for a factor of ten in ~16 years.^{49}
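A bootstrap confidence interval of the kind quoted here can be computed along these lines (our sketch; the original analysis may have resampled differently):

```python
import numpy as np

def bootstrap_ci_years_per_10x(years, log10_price, n_boot=1000, seed=0):
    """Bootstrap a 95% confidence interval for the years-per-factor-of-ten
    implied by a log-linear fit of price against time. Resamples the
    (year, log10 price) points with replacement and refits each time."""
    rng = np.random.default_rng(seed)
    x = np.asarray(years, dtype=float)
    y = np.asarray(log10_price, dtype=float)
    estimates = []
    for _ in range(n_boot):
        i = rng.integers(0, len(x), len(x))
        slope, _ = np.polyfit(x[i], y[i], 1)
        estimates.append(-1.0 / slope)  # years for a factor-of-ten *decrease*
    return np.percentile(estimates, [2.5, 97.5])
```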

This is slower than the order of magnitude every ~12.5 years we found when looking at release prices. If we restrict the release price data to 2011 – 2019, we get an order of magnitude decrease every ~13.5 years instead,^{50} so part of the discrepancy can be explained by the different start times of the datasets. To get some assurance that our active price data wasn’t erroneous, we spot-checked the best active price at the start of 2011, which was somewhat lower than the best release price at the same time, and confirmed that its listed price was consistent with surrounding pricing data.^{51} We think active prices are likely to be closer to the prices at which people actually bought GPUs, so we guess that the ~17 years per order of magnitude decrease is a more accurate estimate of the trend we care about.

Figures 11-14 show the raw data, 95th percentile data, and trendlines for half-precision GPU price / FLOPS for the Passmark dataset. This folder contains plots of the Kaggle dataset and combined Passmark + Kaggle dataset.

If we assume the trend is exponential, the Passmark data suggests that from 2015 to 2020, 95th-percentile GPU price / half-precision FLOPS fell by around 21% per year, for a factor of ten over ~10 years,^{55} bootstrap^{56} 95% confidence interval 8.8 to 11 years.^{57} This is fairly close to the ~8 years / order of magnitude decrease we found when looking at release price data, but we treat active prices as a more accurate estimate of the actual prices at which people bought GPUs. As in our previous dataset, there is a noticeable rise in 2017, which we think is due to GPU prices increasing as a result of demand from cryptocurrency miners. If we look at the trend from 2015 through 2016, before this rise, we get that 95th-percentile GPU price / half-precision FLOPS fell by around 14% per year, which would yield a factor of ten over ~8 years.^{58}

Figures 15-18 show the raw data, 95th percentile data, and trendlines for half-precision GPU price / FMA FLOPS for the Passmark dataset. GPUs with Tensor Cores are marked in black. This folder contains plots of the Kaggle dataset and combined Passmark + Kaggle dataset.

If we assume the trend is exponential, the Passmark data suggests that the 95th-percentile GPU price / half-precision FMA FLOPS fell by around 40% per year, which would yield a factor of ten in ~4.5 years,^{62} with a bootstrap^{63} 95% confidence interval of 4 to 5.2 years.^{64} This is fairly close to the ~4 years / order of magnitude decrease we found when looking at release price data, but we think active prices are a more accurate estimate of the actual prices at which people bought GPUs.

The figures above suggest that certain GPUs with Tensor Cores were a significant (~half an order of magnitude) improvement over existing GPU price / half-precision FMA FLOPS.

We summarize our results in the table below.

| Years per factor-of-ten price decrease | Release prices | 95th-percentile active prices | 95th-percentile active prices (pre-crypto price rise) |
|---|---|---|---|
| $ / single-precision FLOPS | 12.5 (11/2007 – 1/2020) | 17 (3/2011 – 1/2020) | 16 (3/2011 – 12/2016) |
| $ / half-precision FLOPS | 8 (9/2014 – 1/2020) | 10 (1/2015 – 1/2020) | 8 (1/2015 – 12/2016) |
| $ / half-precision FMA FLOPS | 4 | 4.5 | — |

Release price data seems to generally support the trends we found in active prices, with the notable exception of trends in GPU price / single-precision FLOPS, which cannot be explained solely by the different start dates.^{65} We think the best estimate of the overall trend for prices at which people recently bought GPUs is the 95th-percentile active price data from 2011 – 2020, since release price data does not account for existing GPUs becoming cheaper over time. The pre-crypto trends are similar to the overall trends, suggesting that the trends we are seeing are not anomalous due to cryptocurrency.

Given that, we guess that GPU prices as a whole have fallen at rates that would yield an order of magnitude over roughly:

- 17 years for single-precision FLOPS
- 10 years for half-precision FLOPS
- 5 years for half-precision fused multiply-add FLOPS

Half-precision FLOPS seem to have become cheaper substantially faster than single-precision in recent years. This may be a “catching up” effect as more of the space on GPUs was allocated to half-precision computing, rather than reflecting more fundamental technological progress.

*Primary author: Asya Bergal*

In 2011, Jon Koomey reported that computations per kWh had doubled roughly every 1.5 years since around 1950, as shown in Figure 1 (taken from his paper).^{1} Wikipedia calls this trend ‘Koomey’s law’. In 2015, Koomey and Naffziger reported in IEEE Spectrum that Koomey’s law had begun to slow around 2000, and that by 2015 electrical efficiency was taking 2.5 years to double.^{2}

We have not investigated beyond this, except to note that there is not obvious controversy on the topic. We do not know the details of the methods involved in this research, for instance how ‘computations’ are measured.


This data was collected from Appendix 2 of *The progress of computing*, using Tabula (a program for turning tables in pdfs into other table formats). We have not checked its accuracy beyond a graph of the resulting data looking visually similar to a graph of the original.

**Important:** we previously noted that this data appears to be orders of magnitude different from other sources, and haven’t had time to look into this discrepancy.

Here is a Google sheet of the data. See ‘Nordhaus via Tabula’ page.

In February 2018, the Google Cloud Platform blog said their TPUs can perform up to 180 TFLOPS and at the time cost $6.50/hour.^{1} This comes to $171,000 to rent one TPU continually for a roughly three-year lifecycle,^{2} or about 1.05 GFLOPS/$.
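Spelled out (our arithmetic, using the figures quoted above):

```python
tpu_hourly = 6.50                          # quoted price, $/hour
hours_3y = 24 * 365 * 3                    # three-year lifecycle, ignoring leap days
rental_cost = tpu_hourly * hours_3y        # ≈ $170,800, rounded to $171,000 above
gflops_per_dollar = 180_000 / rental_cost  # 180 TFLOPS = 180,000 GFLOPS → ≈ 1.05
```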

This service apparently began on February 12, 2018.^{3} So it does not appear to be competitive with the cheapest GPUs in terms of FLOPS/$, or even with the cheapest cloud computing.

We have not finished exploring the apparent discrepancies between 2015 prices for performance and current records of 2015 prices for performance. However in the data described in our 2017 assessment of recent price trends (key figure here), prices appear to have been below $1 since 2008.^{1} The measurements are not entirely comparable, but we would not expect the differences to produce such a large price difference.

*The rest of this page is largely taken from our page written in 2015.*

In April 2015, the lowest recorded GFLOPS prices we knew of were approximately $3/GFLOPS, for various CPU and GPU combinations. Amortized over three years, this was $1.1E-13/FLOPShour. Prices in the $3-5/GFLOPS range seemed to be common, for GPU and CPU combinations and sometimes for supercomputers. Using CPUs, prices were at least $11/GFLOPS, and computing as a service cost more like $160/GFLOPS.

We have written about long term trends in the costs of computing hardware. We were interested in evaluating the current prices more thoroughly, both to validate the long term trend data, and because current hardware prices are particularly important to know about.

We separately investigated CPUs, GPUs, computing as a service, and supercomputers. In all categories, we collected some contemporary instances which we judged heuristically as especially likely to be cost-effective. We did not find any definitive source on the most cost-effective in any category, or in general, so our examples are probably not the very cheapest. Nevertheless, these figures give a crude sense for the cost of computation in the contemporary market. Our full dataset of CPUs, GPUs and supercomputers is here, and contains data on twenty two machines. Our data on computing as a service is all included in this page.

For CPUs and GPUs, we list the price of the CPU and/or GPU (GPUs were always used with a CPU, so we include the cost for both), but not other computer components. We compared prices between one complete rack server and the set of four processors inside it, and found the complete server was around 36% more expensive ($30,000 vs. $22,000). We expect this is representative at this scale, but diminishes with scale.

For computing services, we list the cheapest price for renting the instance for a long period, with no additional features. We do not include spot prices.

For supercomputers, we list costs cited, which don’t tend to come with elaboration. We expect that they only include upfront costs, and that most of the costs are for hardware.

We have not included the costs of energy or other ongoing expenses in any prices. Non-energy costs are hard to find, and we suspect a relatively small and consistent fraction of costs. Energy costs appear to be around 10% of hardware costs. For instance, the Intel Xeon E5-2699 uses 527.8 watts and costs $5,190.^{2} Over three years, with $0.05/kWh this is $694, or 13% of the hardware cost. Titan also uses 13% of its hardware costs in energy over three years.^{3} We might add these costs later for a more precise estimate.
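The Xeon energy estimate works out as follows (our arithmetic, using the figures in the paragraph above):

```python
watts = 527.8
kwh_3y = watts * 24 * 365 * 3 / 1000       # ≈ 13,871 kWh over three years
energy_cost = kwh_3y * 0.05                # ≈ $694 at $0.05/kWh
fraction_of_hardware = energy_cost / 5190  # ≈ 0.13 of the hardware price
```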

To our knowledge we report only empirical performance figures from benchmark tests, rather than theoretical maximums. We sometimes use figures for LINPACK and sometimes for DGEMM benchmarks, depending on which are available. Geekbench in particular does not use the common LINPACK, but LINPACK relies heavily on DGEMM, suggesting DGEMM is fairly comparable. We guess they differ by around 10%.^{4}

We found prices and performance data for five contemporary CPUs, including three different instances of one of them. They ranged from $11-354/GFLOPS with most prices below $100/GFLOPS.^{5} The cheapest of these CPUs still looks several times more expensive than some GPUs and supercomputers, so we did not investigate these numbers in great depth, or search far for cheaper CPUs.

We found performance data for six recent combinations of CPUs and GPUs (with much overlap of CPUs and GPUs between combinations). They ranged from $3.22/GFLOPS to $4.17/GFLOPS.

Note that graphics cards are typically significantly restricted in the kinds of applications they can run efficiently; this performance is achieved for highly regular computations that can be carried out in parallel throughout a GPU (of the sort that are required for rendering scenes, but which have also proved useful in scientific computing).

Another way to purchase FLOPS is via virtual computers.

Amazon Elastic Compute Cloud (EC2) is a major seller of virtual computing. Based on their current pricing, renting a c4.8xlarge instance costs about $1.17 / hour.^{6} This is their largest instance optimized for computing performance (rather than e.g. memory). A c4.8xlarge instance delivers around 97.5 GFLOPS.^{7} This implies that a GFLOPShour costs $0.012. If we suppose this is an alternative to buying computer hardware, then the relevant time horizon is about three years. Over three years, renting this hardware will cost $316/GFLOPS, i.e. around two orders of magnitude more than buying GFLOPS in the form of GPUs.
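The arithmetic, spelled out (ours; the small gap from the $316 quoted above is presumably due to rounding of the hour count):

```python
hourly = 1.17
gflops = 97.5
per_gflops_hour = hourly / gflops               # = $0.012 per GFLOPShour
per_gflops_3y = per_gflops_hour * 24 * 365 * 3  # ≈ $315–316 per GFLOPS over 3 years
```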

Other sources of virtual computing seem to be similarly priced. An informal comparison of computing providers suggests that on a set of “real-world java benchmarks” three providers are quite closely comparable, with all between just above Amazon’s price and just under half Amazon’s price for completing the benchmarks, across different instance sizes. This analysis also suggests Amazon is a relatively costly provider, and suggests a cheap price for virtual computing is closer to $0.006/GFLOPShour or $160/GFLOPS over three years.

Even with this optimistic estimate, virtual computing appears to cost something like fifty times more than GPUs. This high price is presumably partly because there are non-hardware costs which we have not accounted for in the prices of buying hardware, but are naturally included in the cost of renting it. However it is unlikely that these additional costs make up a factor of fifty.

The Titan supercomputer purportedly cost about $97M to produce, or about $4,000 per hour amortized over 3 years. It performs 17,590,000 GFLOPS, which comes to $5.51/GFLOPS. This makes it around the same price as the cheapest GPUs. It is made of a combination of GPUs and CPUs, so this similarity is unsurprising.
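Checking the Titan figures (our arithmetic; the exact amortization comes to roughly $3,700/hour, which the text rounds to ~$4,000):

```python
titan_cost = 97e6
per_hour = titan_cost / (24 * 365 * 3)  # ≈ $3,691/hour amortized over three years
per_gflops = titan_cost / 17_590_000    # ≈ $5.51 per GFLOPS
```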

The other six built supercomputers we looked at were more expensive, ranging up to $95/GFLOPS. Another cost-effective supercomputer, the L-CSC, was being built at the time it was most recently reported on, and while it should be completed now we could not find more data on it. Extrapolating from the figures before it was finished, when completed it should cost $2.39/GFLOPS, and thus be the cheapest source of FLOPS we are aware of.

The lowest recorded GFLOPS prices we know of are approximately $3/GFLOPS, for various CPU and GPU combinations. Amortized over three years, this is $1.1E-13/FLOPShour. Prices in the $3-5/GFLOPS range seem to be common, for GPU and CPU combinations and sometimes for supercomputers. Using CPUs, prices are at least $11/GFLOPS, and computing as a service costs more like $160/GFLOPS.

Computing power available per dollar has increased fairly evenly by a factor of ten roughly every four years in the last quarter of a century (a phenomenon sometimes called the ‘price-performance Moore’s law’). Because this trend is important and regular, it is useful in predictions. For instance, it is often used to determine when the hardware for an AI mind might become cheap. This means that a primary way such predictions might err is if this trend in computing prices were to leave its long run trajectory. This must presumably happen eventually, and has purportedly happened with other exponential trends in information technology recently.^{1}

This page outlines our assessment of whether the long run trend is on track very recently, as of late 2017. This differs from assessing the long run trend (as we do here) in that it requires recent and relatively precise data. Data that may be off by one order of magnitude is still useful when assessing a long run trend that grows by many orders of magnitude. But if we are judging whether the last five years of that trend are on track, it is important to have more accurate figures.

We sought public data on computing performance, initial price, and date of release for different pieces of computing hardware. We tried to cover different types of computing hardware, and to prioritize finding large, consistent datasets using comparable metrics, rather than one-off measurements. We searched for computing performance measured using the Linpack benchmark, or something similar.

We ran into many difficulties finding consistently measured performance in FLOPS for different machines, as well as prices for those same machines. What data we could find used a variety of different benchmarks. Sometimes performance was reported as ‘FLOPS’ without explanation. Twice the ‘same’ benchmarks turned out to give substantially different answers at different times, at least for some machines, apparently due to the benchmarks being updated. Performance figures cited often refer to ‘theoretical peak performance’, which is calculated from the computer’s specifications, rather than measured, and is higher than actual performance.

Prices are also complicated, because each machine can have many sellers, and each price fluctuates over time. We tried to use the release price, the manufacturer’s ‘recommended customer price’, or similar where possible. However, many machines don’t seem to have readily available release prices.

These difficulties led to many errors and confusions, such that progress required running calculations, getting unbelievable results, and searching for an error that could have made them unbelievable. This process is likely to leave remaining errors at the end, and those errors are likely to be biased toward giving results that we find believable. We do not know of a good remedy for this, aside from welcoming further error-checking, and giving this warning.

GPUs appear to be substantially cheaper than CPUs, cloud computing (including TPUs), or supercomputers.^{2} Since GPUs alone are at the frontier of price performance, we focus on them. We have two useful datasets: one of theoretical peak performance, gathered from Wikipedia, and one of empirical performance, from Passmark.

We collected data from several Wikipedia pages, supplemented with other sources for some dates and prices.^{3} We think all of the performance numbers are theoretical peak performance, generally calculated from specifications given by the developer, but we have not checked Wikipedia’s sources or calculations thoroughly. Our impression is that the prices given are recommended prices at launch, by the developers of the hardware, though again we have only checked a few of them.

We look at Nvidia and AMD GPUs and Xeon Phi processors here because they are the machines for which we could easily find data on Wikipedia. Since Nvidia and AMD are the leading producers of GPUs, this should cover the most popular machines. We excluded many machines because they did not have prices listed.

Figure 1 shows performance (single precision) over time for processors for which we could find all of the requisite data.

The recent rate of progress in this figure looks like somewhere between half an order of magnitude in the past eight years and a full order of magnitude in the past ten, implying an order of magnitude roughly every 10-16 years. We don’t think the figure shows particular slowing down—the most cost-effective hardware has not improved in almost a year, but similar plateaus appear elsewhere in the figure.
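The conversion from these two readings to “years per order of magnitude” is straightforward; a minimal sketch, where the two (OOMs, years) pairs are the values read off the figure above:

```python
# Convert observed progress into "years per order of magnitude" and an
# implied annual growth rate. The two observations below are the readings
# from the figure (approximate, read off the plot).
observations = [
    (0.5, 8),   # half an order of magnitude over the past eight years
    (1.0, 10),  # a full order of magnitude over the past ten years
]

for ooms, years in observations:
    years_per_oom = years / ooms
    annual_growth = 10 ** (ooms / years) - 1  # factor-per-year minus one
    print(f"{ooms} OOM in {years} y -> "
          f"{years_per_oom:.0f} y per OOM, ~{annual_growth:.0%}/year")
```

The two observations bracket the 10-16 year range quoted above.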

We also collected double precision performance figures for these machines, but the machines do not appear to be optimized for double precision performance,^{4} so we focus on single precision.

Peak theoretical performance is generally higher than actual performance, but our impression is that the gap is a roughly constant factor over time, so it should not affect the trend.

Passmark maintains a collection of benchmark results online, for both CPUs and GPUs. They also collect prices, and calculate price for performance (though it was not clear to us on brief inspection where their prices come from). Their performance measure is from their own benchmark, which we do not know a lot about. This makes their absolute prices hard to compare to others using more common measures, but the trend in progress should be more comparable.

We used archive.org to collect old versions of Passmark’s page of the most cost-effective GPUs available, to get a history of price for Passmark performance. The prices are from the time of the archive, not necessarily from when the hardware was new: a page collected on January 1, 2013 might contain hardware built in 2010 that had come down in price with age. You might wonder whether this means we are mostly capturing very cheap old hardware with hardly any performance, which might be deficient in other ways and so not represent a realistic price for usable hardware. This is possible; however, given the evident interest in this metric (for instance, Passmark keeps these records), it would be surprising to us if it mostly caught useless hardware.

We are broadly interested in the cheapest hardware available, but we probably don’t want to look at the very cheapest in data like this, because it seems likely to be due to error or other meaningless exploitation of the particular metric.^{5} The 95th percentile machines (out of the top 50) appear to be relatively stable, so are probably close to the cheapest hardware without catching too many outliers. For this reason, we take them as a proxy for the cheapest hardware.
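The percentile step described above can be sketched as follows. The scores here are made up for illustration, and the nearest-rank percentile method is our assumption; the exact interpolation choice is a detail that should not change the trend:

```python
# Sketch: take the top 50 machines in a snapshot by score-per-dollar and use
# the 95th percentile as a proxy for the cheapest hardware, avoiding the very
# cheapest entries (likely errors or metric exploitation).
import random

def percentile_95(values):
    """Nearest-rank 95th percentile (an assumed interpolation choice)."""
    vals = sorted(values)
    rank = max(0, int(round(0.95 * len(vals))) - 1)
    return vals[rank]

random.seed(0)
snapshot = [random.uniform(1, 100) for _ in range(200)]  # fake scores/$
top_50 = sorted(snapshot)[-50:]
cheapest_proxy = percentile_95(top_50)
```

This discards the top couple of outliers in each snapshot while staying close to the frontier.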

Figure 4 shows the 95th percentile fits an exponential trendline quite well, with a doubling time of 3.7 years, for an order of magnitude every 12 years. This has been fairly consistent, and shows no sign of slowing by early 2017. This supports the 10-16 year time we estimated from the Wikipedia theoretical performance above.
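The conversion from the fitted doubling time to an order-of-magnitude time is a factor of log₂(10) ≈ 3.32:

```python
import math

# Convert the doubling time from the exponential fit in Figure 4 into
# "years per order of magnitude": t10 = t2 * log2(10).
doubling_time = 3.7  # years
years_per_oom = doubling_time * math.log2(10)
print(f"{years_per_oom:.1f} years per order of magnitude")  # ~12.3
```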

- The **Wikipedia page on FLOPS** contains a history of GFLOPS over time. The recent datapoints appear to overlap with the theoretical performance figures we have already.
- Google has developed **Tensor Processing Units** (TPUs) that specialize in computation for machine learning. Based on information from Google, we estimate that they perform around 1.05 GFLOPS/$.
- In 2015, **cloud computing** appeared to be around a hundred times more expensive than other forms of computing.^{6} Since then the price appears to have roughly halved.^{7} So cloud computing is not a competitive way to buy FLOPS, all else equal, and the price of FLOPS may be a small influence on the cloud-computing price trend, making the trend less relevant to this investigation.
- Top **supercomputers** perform at around $3/GFLOPS, so they do not appear to be on the forefront of cheap performance. See Price performance trend in top supercomputers for more details.
- **Geekbench** has empirical performance numbers for many systems, but their latest version does not seem to have anything for GPUs. We looked at a small number of popular CPUs on Geekbench from the past five years, and found the cheapest to be around $0.71/GFLOPS. However, there appear to be 5x disparities between different versions of Geekbench, which makes it less useful for fine-grained estimates.
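The sources above quote prices in two reciprocal units ($/GFLOPS and GFLOPS/$). A small helper puts them on one scale for comparison; the numbers are the ones quoted above:

```python
# Normalize price quotes to dollars per GFLOPS so the sources can be compared.
def dollars_per_gflops(value, unit):
    if unit == "$/GFLOPS":
        return value
    if unit == "GFLOPS/$":
        return 1 / value
    raise ValueError(f"unknown unit: {unit}")

quotes = {
    "Google TPU (our estimate)": (1.05, "GFLOPS/$"),
    "Top supercomputers": (3.0, "$/GFLOPS"),
    "Cheapest Geekbench CPU": (0.71, "$/GFLOPS"),
}
for name, (value, unit) in quotes.items():
    print(f"{name}: ${dollars_per_gflops(value, unit):.2f}/GFLOPS")
```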

We have seen that the theoretical peak single-precision performance of GPUs is improving at about an order of magnitude every 10-16 years, and that the Passmark performance/$ trend is improving by an order of magnitude every 12 years. Both are slower than the long-run price-performance trends of an order of magnitude every eight years (75-year trend) or every four years (25-year trend).

The longer run trends are based on a slightly different set of measures, which might explain a difference in rates of progress.

Within these datasets the pace of progress does not appear to be slower in recent years relative to earlier ones.

The price of performance in top supercomputers continues to fall, as of 2016.

TOP500.org maintains a list of top supercomputers and their performance on the Linpack benchmark. The figure below is based on empirical performance figures (‘Rmax’) from Top500 and price figures collected from a variety of less credible sources, for nine of the ten highest performing supercomputers (we couldn’t find a price for the tenth). Our data and sources are here.

The Sunway TaihuLight offers the cheapest GFLOPS, at $2.94/GFLOPS. This is around one hundred times more expensive than the peak theoretical performance of certain GPUs, but we do not know why the difference is so large (peak performance is generally higher than actual performance, but by closer to a factor of two).

There appears to be a downward trend in price, but it is not consistent, and with so few data points its slope is ambiguous. The best price for performance roughly halved in the last 4-5 years, for a 10x drop every 13-17 years at that rate. The K computer in 2011 appears to have been substantially more expensive than both earlier and later computers.
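The 13-17 year figure follows from scaling the halving time by log₂(10); a quick check of that arithmetic:

```python
import math

# If the best price per GFLOPS halves every 4-5 years, the implied time for
# a 10x drop is the halving time scaled by log2(10) ~= 3.32.
for halving_time in (4, 5):
    tenfold_time = halving_time * math.log2(10)
    print(f"halving in {halving_time} y -> 10x in {tenfold_time:.0f} y")
```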


Trends in the cost of computing

Wikipedia history of GFLOPS costs

Brain performance in TEPS (includes the cost of brain-level TEPS performance on current hardware)

The cost of TEPS (includes current costs, trends and relationship to other measures of hardware price)

Information storage in the brain

Costs of human-level information storage

Research topic: hardware, software and AI

Index of articles about hardware

*Preliminary prices for human level hardware (4 April 2015)*

*A new approach to predicting brain-computer parity (7 May 2015)*

*Time flies when robots rule the earth (28 July 2015)*