How to establish the system configuration required for running Large Language Models locally:
NVIDIA RTX 3060 (12 GB GDDR6), Intel Core i7 12th-generation processor, 64 GB RAM: the latest Llama 3.3 70B model needs 64 GB of system RAM and still runs with slow inference speed. Action items to increase inference speed:
Action 1: Upgrade the GPU to an RTX 4060 and check whether inference speed improves.
Action 2: If not, try an RTX 5070 Ti GPU, which should improve inference speed.
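To see why a 70B model needs this much memory, a rough back-of-the-envelope estimate helps. The sketch below is a rule of thumb only (weights-only sizing; the bytes-per-parameter figures for each quantization level are common approximations, and KV cache plus runtime overhead add more on top):

```python
# Rough rule-of-thumb estimate of the memory needed just to hold an
# LLM's weights. KV cache and runtime overhead add more (often 10-30%),
# so treat these numbers as lower bounds.

BYTES_PER_PARAM = {
    "fp16": 2.0,   # full half-precision weights
    "q8":   1.0,   # 8-bit quantization
    "q4":   0.5,   # 4-bit quantization (common default for local runs)
}

def weight_memory_gb(params_billions: float, quant: str) -> float:
    """Approximate gigabytes needed to hold the weights alone."""
    return params_billions * 1e9 * BYTES_PER_PARAM[quant] / 1e9

for model, size in [("Llama 3.3 70B", 70),
                    ("DeepSeek-R1 32B", 32),
                    ("DeepSeek-R1 14B", 14)]:
    for quant in ("q4", "q8", "fp16"):
        print(f"{model} @ {quant}: ~{weight_memory_gb(size, quant):.0f} GB")
```

Even at 4-bit, the 70B weights come to roughly 35 GB, far beyond the 3060's 12 GB of VRAM, so most of the model sits in system RAM and inference is CPU-bound. That is why 64 GB of system RAM is the floor here, and why a GPU swap alone may not fix the speed.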
NVIDIA RTX 3060 (12 GB GDDR6) - a good choice for running the DeepSeek-R1 14B-parameter model locally
https://www.youtube.com/watch?v=EJ5_2jQfpR0
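The reason the 14B model runs well on this card can be sketched as a simple VRAM-fit check. The figures below are assumptions (0.5 bytes per parameter for 4-bit weights, plus ~1.5 GB allowed for context and runtime overhead), not measured values:

```python
# Quick check: do a model's 4-bit-quantized weights fit in GPU VRAM?
# Assumes ~0.5 bytes/param for 4-bit weights plus ~1.5 GB of runtime
# overhead (context / KV cache); both figures are rough assumptions.

def fits_in_vram(params_billions: float, vram_gb: float,
                 overhead_gb: float = 1.5) -> bool:
    weights_gb = params_billions * 0.5  # 4-bit ~ 0.5 bytes per parameter
    return weights_gb + overhead_gb <= vram_gb

# RTX 3060 with 12 GB of VRAM:
print(fits_in_vram(14, 12))  # 14B: 7 + 1.5 = 8.5 GB  -> True, fully on GPU
print(fits_in_vram(32, 12))  # 32B: 16 + 1.5 = 17.5 GB -> False, spills to RAM
```

When the model fits entirely in VRAM, as the 14B one does here, every layer runs on the GPU and inference stays fast.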
NVIDIA RTX 3060 (12 GB GDDR6) - inference speed is slower for the DeepSeek-R1 32B-parameter model
Reason 1 - Likely limited system memory (Action: increase system RAM to 64 GB and check whether inference speed improves.)
Reason 2 - Limited GPU dedicated memory (Action: upgrade the GPU from the RTX 3060 (12 GB) to a card with 16 GB or 24 GB of dedicated memory.)
https://youtu.be/GKflDNCjrQo
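The two reasons above interact: when the 4-bit weights exceed VRAM, only part of the model's layers can be offloaded to the GPU, and the layers left on the CPU dominate the runtime. The sketch below estimates that split under the same rough assumptions as before (0.5 bytes per parameter at 4-bit, uniform layer sizes, ~1.5 GB of VRAM reserved for overhead):

```python
# Estimate what fraction of a model's layers fit on the GPU when the
# 4-bit weights exceed VRAM. Layers left on the CPU dominate runtime,
# which is why partial offload feels so slow. Rough assumptions:
# 0.5 bytes/param at 4-bit, uniform layer sizes, ~1.5 GB VRAM overhead.

def gpu_layer_fraction(params_billions: float, vram_gb: float,
                       overhead_gb: float = 1.5) -> float:
    weights_gb = params_billions * 0.5
    usable = max(vram_gb - overhead_gb, 0.0)
    return min(usable / weights_gb, 1.0)

# DeepSeek-R1 32B on a 12 GB RTX 3060: only about two thirds of the
# layers fit on the GPU; the rest run from system RAM on the CPU.
print(f"{gpu_layer_fraction(32, 12):.0%} of layers on GPU")
# A hypothetical 16 GB card would fit roughly 91% of the layers,
# leaving a smaller but still noticeable CPU-bound remainder.
print(f"{gpu_layer_fraction(32, 16):.0%} of layers on GPU")
```

This suggests that for the 32B model, extra VRAM (Reason 2) attacks the bottleneck directly, while more system RAM (Reason 1) mainly prevents swapping rather than speeding up the CPU-resident layers.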