expired | Posted by SehoneyDP • Mar 15, 2024 4:09 PM
NVIDIA GeForce RTX 3090 Founders Edition Dual Fan 24GB GDDR6X GPU Card (Refurb)
$700 (Select Stores) + Free Store Pickup
Micro Center
Top Comments
Check out /r/localllama on reddit.
148 Comments
If you manage to get this to work, please also keep in mind the limitations of the hardware. Thunderbolt 3 has a bandwidth of about 5 GB/s, while PCIe 3.0 x16 has about 16 GB/s. Bandwidth may or may not matter to your workflow, depending on how often you transfer data between CPU and GPU.
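Quick back-of-the-envelope numbers, using nominal link rates (real-world throughput is lower, especially over Thunderbolt, which reserves part of the link for other traffic):

```python
# Rough interface bandwidth comparison for an eGPU link vs. a desktop slot.
# Nominal figures: Thunderbolt 3 = 40 Gb/s total; PCIe 3.0 ~ 985 MB/s per lane.
tb3_gbps = 40                        # Thunderbolt 3 link rate, gigabits per second
tb3_gb_per_s = tb3_gbps / 8          # ~5 GB/s

pcie3_per_lane_gb_s = 0.985          # PCIe 3.0 effective throughput per lane, GB/s
pcie3_x16_gb_s = pcie3_per_lane_gb_s * 16   # ~15.8 GB/s

print(f"Thunderbolt 3: ~{tb3_gb_per_s:.1f} GB/s")
print(f"PCIe 3.0 x16:  ~{pcie3_x16_gb_s:.1f} GB/s")
print(f"PCIe 3.0 x16 has ~{pcie3_x16_gb_s / tb3_gb_per_s:.1f}x the bandwidth")
```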
I was using ollama on Windows 11, and the 7B and 13B parameter Llama 2 models were somewhat usable without the GPU, but after getting all of the NVIDIA drivers and CUDA stuff installed and the 3090 recognized by ollama, it absolutely flies. Way better experience! And you can use 4-bit quantized models up to about 30B parameters on the 24GB of VRAM.
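Rough math on why ~30B fits, assuming 4 bits per weight and ignoring KV cache and runtime overhead (which add a few GB on top):

```python
# Back-of-the-envelope VRAM estimate for quantized model weights only.
def weights_vram_gb(params_billions: float, bits_per_weight: float = 4.0) -> float:
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

for size in (7, 13, 30, 34):
    print(f"{size}B @ 4-bit: ~{weights_vram_gb(size):.1f} GB of weights")
# A ~30B model needs roughly 15 GB for weights, which is why it still fits
# (with context / KV cache on top) in a 24 GB RTX 3090.
```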
Then I got WSL2 installed, and ollama and open-webui running in a Docker container with GPU access, and it is great to play with uncensored models, multimodal models, and the API to learn about LLMs, LangChain, etc.
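For anyone curious about the API side: ollama serves HTTP on port 11434 by default, so you can poke at it from plain Python. A minimal sketch (the model tag here is just an example, use whatever you've already pulled):

```python
# Minimal sketch: query a locally running ollama server over its HTTP API.
# Assumes ollama is listening on the default port 11434 and the model has
# already been pulled (e.g. `ollama pull llama2`).
import json
import urllib.request

payload = {
    "model": "llama2",                       # any locally pulled model tag
    "prompt": "Explain VRAM in one sentence.",
    "stream": False,                         # return a single JSON response
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```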
I had tried to use oobabooga/text-generation-webui a few months ago but had issues with the virtual environment and dependencies during install, so I love the simplicity of my Docker setup with ollama…
I agree with the others that if you want to play with LLMs, get the biggest vram card you can afford, ideally 24GB.
FYI to everyone: you won't be able to walk out the door with the card you reserved. I had good luck getting mine made right, but there's no guarantee here.