Wednesday, 26 February 2025

How to establish the resources needed to run Large Language Models like DeepSeek, LLaMA, Qwen2.5-Coder, etc.


1. GPU configuration with VRAM

2. Processor (CPU)

3. System RAM

4. Hard drive capacity

5. Motherboard

6. Power supply

7. Cooling
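Of the components above, GPU VRAM and system RAM are the ones a model's size pushes against first. A minimal sizing sketch, assuming a common rule of thumb (weights = parameter count × bits per parameter, plus roughly 20% overhead for KV cache and activations; the 1.2× factor is an assumption, not a measured value):

```python
# Rough estimate of memory needed to run a quantized LLM locally.
# The 1.2x overhead factor (KV cache, activations) is an assumption.

def model_memory_gb(params_billions: float, bits_per_param: int,
                    overhead: float = 1.2) -> float:
    """Approximate memory (GB) for the weights plus runtime overhead."""
    bytes_for_weights = params_billions * 1e9 * bits_per_param / 8
    return bytes_for_weights * overhead / 1e9

# DeepSeek-R1 14B at 4-bit: ~8.4 GB -> fits a 12 GB RTX 3060
print(f"14B @ 4-bit: {model_memory_gb(14, 4):.1f} GB")
# LLaMA 3.3 70B at 4-bit: ~42 GB -> far exceeds 12 GB VRAM
print(f"70B @ 4-bit: {model_memory_gb(70, 4):.1f} GB")
```

This back-of-envelope number is what the experiments below probe in practice: the 14B model fits comfortably in 12 GB of VRAM, while the 32B and 70B models do not.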

Upgrading GPU from RTX 3060 to RTX 4060 – Worth It? (Observing the impact on inference speed for LLMs)

https://youtu.be/_WwWNFDw5lo


NVIDIA RTX 3060 with 12GB GDDR6 VRAM - Perfect choice for running DeepSeek-R1 (14 billion parameters) locally

https://www.youtube.com/watch?v=EJ5_2jQfpR0


NVIDIA RTX 3060 12GB GDDR6 VRAM - Inference speed is slower for DeepSeek-R1 (32 billion parameters)

Reason 1 - Seems related to system memory (Action: increase system RAM to 64 GB to see if inference speed improves.)

Reason 2 - Limited dedicated GPU memory (Action: upgrade from the RTX 3060 to an RTX 4060-class card with 12 GB, 16 GB, or 24 GB of VRAM.)

https://youtu.be/GKflDNCjrQo 
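Both reasons point at the same mechanism: once the model's weights exceed VRAM, the runtime offloads layers to system RAM, and those layers run at CPU/PCIe speed rather than GPU speed. A minimal sketch of that placement decision (sizes and thresholds are illustrative assumptions):

```python
# Sketch of why a 32B model is slow on a 12 GB card: layers that do not fit
# in VRAM are offloaded to system RAM and bottleneck the whole pipeline.

def placement(model_gb: float, vram_gb: float, system_ram_gb: float) -> str:
    """Decide where an LLM's weights end up, given available memory."""
    if model_gb <= vram_gb:
        return "all layers on GPU (fast)"
    if model_gb <= vram_gb + system_ram_gb:
        return "split GPU/CPU (slow: offloaded layers bottleneck inference)"
    return "does not fit (swap/disk, unusable)"

# 32B at 4-bit is roughly 19 GB with overhead: too big for 12 GB of VRAM
print(placement(19, vram_gb=12, system_ram_gb=32))
```

Under this view, adding system RAM (Reason 1) keeps the split usable, while adding VRAM (Reason 2) moves more layers back onto the GPU, which is what actually raises tokens per second.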

  

How one can establish the system configuration required for running Large Language Models locally: with an NVIDIA RTX 3060 (12GB GDDR6) and an i7 12th-generation processor, the latest LLaMA 3.3 (70 billion parameters) needs 64 GB of system RAM and runs with slow inference speed. Action items to increase inference speed:

Action 1: Upgrade the GPU to an RTX 4060 to see whether inference speed increases.

Action 2: If not, try an RTX 5070 Ti GPU, which should improve inference speed.

https://www.youtube.com/watch?v=S8ABnWwhRNI 
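The action items above boil down to one question: how much of the 70B model's weights can each candidate GPU hold? A quick comparison sketch, assuming LLaMA 3.3 70B at 4-bit quantization is roughly 42 GB including runtime overhead (an estimate, not a measurement):

```python
# Compare candidate GPUs by the fraction of a 70B model they can keep in VRAM.
MODEL_GB = 42.0  # LLaMA 3.3 70B, 4-bit, with overhead (assumed)

candidates = {
    "RTX 3060 12GB": 12,
    "RTX 4060 Ti 16GB": 16,
    "RTX 5070 Ti 16GB": 16,
}

for name, vram in candidates.items():
    pct = min(100.0, 100.0 * vram / MODEL_GB)
    print(f"{name}: holds about {pct:.0f}% of the model on GPU")
```

Note that none of these cards fits the 70B model entirely, so some CPU offload remains in every case; the newer cards help through a larger on-GPU fraction and faster memory bandwidth, not by eliminating the spill.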
