Is Hardware chasing Software or vice-versa for Generative AI workloads?

Generative AI has changed the world profoundly, but we did not realize this until ChatGPT was released in November 2022. The significance of ChatGPT is that it did the hard 10% that was missing from prior generations of auto-regressive generative models: scaling up dramatically beyond its predecessors.

In addition, the use of reinforcement learning from human feedback (RLHF) to improve the data and the model was unique.

GPT-2 is a comparatively small model with 1.5 billion parameters, GPT-3 is much larger with 175 billion parameters, and GPT-4 is reported to have 1.76 trillion parameters!

Model      Parameters
GPT-4      1.76 trillion
GPT-3.5    175 billion
PaLM 2     540 billion
Claude     93 billion, 137 billion
Cohere     Not specified
GPT-2      1.5 billion
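To see why these parameter counts translate directly into hardware pressure, a back-of-the-envelope calculation helps: at 16-bit precision every parameter costs two bytes of memory, before any activations, optimizer state, or KV cache. A quick sketch, using the (partly unofficial) figures from the table above:

```python
# Rough memory needed just to hold the weights, assuming 16-bit
# (2-byte) precision per parameter. Counts are the figures quoted
# in the table above; some of them are unofficial estimates.
models = {
    "GPT-2": 1.5e9,
    "GPT-3.5": 175e9,
    "PaLM 2": 540e9,
    "GPT-4": 1.76e12,
}
BYTES_PER_PARAM = 2  # fp16 / bf16

for name, n_params in models.items():
    gib = n_params * BYTES_PER_PARAM / 2**30
    print(f"{name:>8}: ~{gib:,.0f} GiB of weights")
```

Even the mid-sized entries exceed the memory of any single accelerator, which is why serving these models forces software back onto distributed techniques.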

Large Language Models have profound implications for software and hardware
Is Hardware Chasing Software or Vice-Versa?

Since the days of the early microprocessor, the industry has been on two sides of the equation, asking the following questions:

Q1: Will software ever catch up and use the power of the hardware?

Q2: Will hardware ever catch up to software requirements?

Depending on the hardware cycle we were in, Q1 or Q2 was being asked more.

For instance, in the early days of x86 processors, the operating system folks wanted more memory.

When x86 processors hit 1 GHz clock frequency, most software was trying to use the faster hardware to do more, essentially catching up to hardware.

There were always a few small technical segments that required outsized compute resources, such as chip design, simulation, weather prediction, etc.

LLMs' compute requirements are similar in some ways to the 32-bit to 64-bit transition in x86. Prior to 64-bit, the maximum addressable memory was 4 GB. This was not a limitation for the majority of software applications. The applications that felt the 32-bit address limit were video applications, simulation software, EDA and CAD software, etc.
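The 4 GB ceiling falls straight out of the pointer width, which a one-liner makes concrete:

```python
# A 32-bit pointer can distinguish 2**32 distinct byte addresses.
max_addressable = 2**32
print(max_addressable)          # 4294967296 bytes
print(max_addressable / 2**30)  # 4.0 GiB
```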

This limitation led to a tremendous amount of innovation in software architecture, such as distributed multiprocessing, batched computing, etc.
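A minimal sketch of that distributed-multiprocessing pattern, using Python's standard multiprocessing module; the batch-summing work function is a hypothetical stand-in for real per-batch work:

```python
from multiprocessing import Pool

def process_batch(batch):
    # Hypothetical stand-in for real work (a simulation step,
    # a render job, ...): here each worker just sums its batch.
    return sum(batch)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Split the job into batches, then farm them out to worker
    # processes, each with its own independent address space.
    batches = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]
    with Pool(processes=4) as pool:
        partials = pool.map(process_batch, batches)
    print(sum(partials) == sum(data))  # the split changes nothing
```

Each worker having its own address space was exactly the appeal under 32-bit: several processes could collectively touch far more than 4 GB even though no single one could.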

Lightweight multithreading (NPTL, the Native POSIX Thread Library) was introduced into the Linux kernel in 2003. This was a major breakthrough for software developers, but it was again limited by 32-bit addressing.

Mistral: "The question of whether software or hardware will ever catch up to the other has been debated for decades, dating back to the 8088 microprocessor. Initially, the hardware side of the equation was the limiting factor, as the amount of available memory was limited. This prompted the development of software optimization techniques, such as using more memory-efficient data structures and minimizing code size, to run more efficiently on the available hardware. As hardware architecture evolved, such as with the rise of the x86 architecture and its successor, the 64-bit ARM architecture, hardware became faster and more powerful. This led to a new question: will software ever catch up to hardware? This question became more prevalent as hardware continued to advance at a faster rate than software, prompting the development of new programming paradigms, such as parallel computing and distributed computing, to take advantage of new hardware capabilities."

Bard: "In the early 2010s, the introduction of SSDs helped to address the problem of storage speed. SSDs are much faster than traditional hard drives. This allows software to load and save data much faster.

In the late 2010s, the introduction of cloud computing helped to address the problem of computing resources. Cloud computing provides access to computing resources on demand. This allows software to be run on more powerful hardware than is available on a local machine.

As the hardware continues to improve, software will continue to become more complex and demanding. The challenge for software developers is to keep up with the pace of hardware innovation."

SS