Wikipedia history of GFLOPS costs

This is a list from Wikipedia, showing hardware configurations that authors claim perform efficiently, along with their prices per GFLOPS at different times in recent history.

In it, prices generally fall at around an order of magnitude every five years, and have continued to do so recently.

Notes

This list is from November 5 2017 (archive version). It is not necessarily credible. We had trouble verifying at least one datapoint, of the few we tried. Performance numbers appear to be a mixture of theoretical peak performance and empirical performance. It is not clear to what extent one should expect the included systems to be especially cost-effective, or why these particular systems were chosen.

The last point is in October 2017, and appears to be roughly in line with the rest of the trend. The last order of magnitude  took around 4.5 years. The overall rate in the figure appears to be very roughly an order of magnitude every five years.

List

Date Approximate cost per GFLOPS Approximate cost per GFLOPS inflation adjusted to 2013 US dollars[54] Platform providing the lowest cost per GFLOPS Comments
1961 US$18,672,000,000 ($18.7 billion) US$145.5 billion About 2400 IBM 7030 Stretch supercomputers costing $7.78 million each The IBM 7030 Stretch performs one floating-point multiply every 2.4 microseconds.[55]
1984 $18,750,000 $42,780,000 Cray X-MP/48 $15,000,000 / 0.8 GFLOPS
1997 $30,000 $42,000 Two 16-processor Beowulfclusters with Pentium Promicroprocessors[56]
April 2000 $1,000 $1,300 Bunyip Beowulf cluster Bunyip was the first sub-US$1/MFLOPS computing technology. It won the Gordon Bell Prize in 2000.
May 2000 $640 $836 KLAT2 KLAT2 was the first computing technology which scaled to large applications while staying under US-$1/MFLOPS.[57]
August 2003 $82 $100 KASY0 KASY0 was the first sub-US$100/GFLOPS computing technology.[58]
August 2007 $48 $52 Microwulf As of August 2007, this 26.25 GFLOPS “personal” Beowulf cluster can be built for $1256.[59]
March 2011 $1.80 $1.80 HPU4Science This $30,000 cluster was built using only commercially available “gamer” grade hardware.[60]
August 2012 $0.75 $0.73 Quad AMD Radeon 7970 GHz System A quad AMD Radeon 7970 desktop computer reaching 16 TFLOPS of single-precision, 4 TFLOPS of double-precision computing performance. Total system cost was $3000; Built using only commercially available hardware.[61]
June 2013 $0.22 $0.22 Sony PlayStation 4 The Sony PlayStation 4 is listed as having a peak performance of 1.84 TFLOPS, at a price of $400[62]
November 2013 $0.16 $0.16 AMD Sempron 145 & GeForce GTX 760 System Built using commercially available parts, a system using one AMD Sempron 145 and three Nvidia GeForce GTX 760 reaches a total of 6.771 TFLOPS for a total cost of $1090.66.[63]
December 2013 $0.12 $0.12 Pentium G550 & Radeon R9 290 System Built using commercially available parts. Intel Pentium G550 and AMD Radeon R9 290 tops out at 4.848 TFLOPS grand total of US$681.84.[64]
January 2015 $0.08 $0.08 Celeron G1830 & Radeon R9 295X2 System Built using commercially available parts. Intel Celeron G1830 and AMD Radeon R9 295X2tops out at over 11.5 TFLOPS at a grand total of US$902.57.[65][66]
June 2017 $0.06 $0.06 AMD Ryzen 7 1700 & AMD Radeon Vega Frontier Edition Built using commercially available parts. AMD Ryzen 7 1700 CPU combined with AMD Radeon Vega FE cards in CrossFire tops out at over 50 TFLOPS at just under US$3,000for the complete system.[67]
October 2017 $0.03 $0.03 Intel Celeron G3930 & AMD RX Vega 64 Built using commercially available parts. Three AMD RX Vega 64 graphics cards provide just over 75 TFLOPS half precision (38 TFLOPS SP or 2.6 TFLOPS DP when combined with the CPU) at ~$2,050 for the complete system.[68]

 

The following is a figure we made, of the above list.

Further discussion

Trends in the cost of computing

1 Comment

1 Trackback / Pingback

  1. AI Impacts – Trends in the cost of computing

Leave a Reply

Your email address will not be published.


*


This site uses Akismet to reduce spam. Learn how your comment data is processed.