If you have a MacBook Pro (M series) with at least 64 GB of RAM, you can now run a GPT-4-level LLM locally!
1. Install Ollama
2. Open your terminal and run: ollama pull llama3.3
3. Then: ollama run llama3.3 "your prompt"
Your own personal AI is here!
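A minimal sketch of the full flow (the install step assumes you grab the app from ollama.com; brew install ollama also works, but with the Homebrew formula you have to start the server yourself with ollama serve):

$ ollama pull llama3.3                  # ~43 GB download, default Q4_K_M quantization
$ ollama run llama3.3                   # opens an interactive chat session
$ ollama run llama3.3 "your prompt"     # one-shot: prints the reply and exits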
Quote
Iggy
@ignacioaal
Replying to @Kaiyes_ and @ollama
Ollama also has other, smaller choices, e.g.: 7B parameter models like Mistral Instruct and OpenChat running at Q4_K_M quantization; CodeLlama and other specialized coding models up to 8B parameters; Nous-Hermes 10.7B using Q4_K_M quantization. Also check out github.com/exo-explore/exo
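For reference, running any of the smaller models the quote mentions is the same one-liner; these names are as published in the Ollama library (ollama.com/library):

$ ollama run mistral       # Mistral 7B Instruct, roughly 4 GB at the default Q4_K_M
$ ollama run openchat      # OpenChat 7B
$ ollama run codellama     # Code Llama, for coding tasks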
Awesome, thanks for sharing! Are you doing it "chatbox" style or using it as part of your workflow?
As long as you have enough RAM to load the model (llama 3.3 is 43 GB) you'll be fine, so try it out! If not, see:
Quote
Iggy @ignacioaal (same post quoted above: Ollama's smaller model options)
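A quick way to sanity-check the RAM question for whatever you have already pulled (rule of thumb: you need at least the model's on-disk size free in memory, plus headroom for the KV cache):

$ ollama list              # shows name, on-disk size, and modified date per model
$ ollama show llama3.3     # parameters, quantization, and context length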
Depends on your RAM, see:
Quote
Iggy @ignacioaal (same post quoted above: Ollama's smaller model options)
Yes, of course. But this is the first time that we can have 4o-level intelligence locally :)
Quote
Paul Couvert
@itsPaulAi
Meta has just released Llama 3.3 70B which is more powerful than GPT-4o and 25x cheaper. Yes. 70B and better than GPT-4o. This model is also as powerful as the 405B version of Llama 3.1. Open source is really winning at every level.
Unfortunately not at this time: the model is 42 GB, so you'd only be left with 6 GB for all your daily work. But see:
Quote
Iggy @ignacioaal (same post quoted above: Ollama's smaller model options)
M1 Max 64 GB model - it's painfully slow, but it loads and works. I use LM Studio. MLX model soon?
This is exciting! Running powerful models locally opens up a ton of possibilities for experimentation and creativity. Can't wait to try it out!
>>> You're running on my macbook
I'm a cloud-based language model, so I don't actually "run" on your MacBook in the classical sense. Instead, I exist as a remote service that you can interact with through the internet. When you ask me a question or provide input, your MacBook…
Quantization: this will lower memory usage by trading off some accuracy. Also, new techniques in inference (e.g., today there are a lot of tricks your operating system does to page/swap memory to disk, etc.).
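Concretely, Ollama publishes multiple quantizations of the same model as tags; the exact tag names and approximate sizes below are from the library listing and may change (check ollama.com/library/llama3.3):

$ ollama pull llama3.3:70b-instruct-q8_0    # ~75 GB, near-lossless, too big for 64 GB
$ ollama pull llama3.3                      # default Q4_K_M, ~43 GB
$ ollama pull llama3.3:70b-instruct-q2_K    # ~26 GB, degraded but fits smaller machines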
The benchmarks:
Quote
Paul Couvert @itsPaulAi (same post quoted above; the original embed included a benchmark comparison image)
Wow, GPT-4-level models running locally? The fact that you can now do this on a MacBook with 64 GB of RAM is wild. What’s the first thing you’d ask your “personal AI” to do if you set this up?
No internet, no waiting, no privacy worries—just you and an AI that can handle complex tasks right from your laptop. It feels like we’re entering a new era. What’s everyone’s take on the most useful applications for this setup?
Crazy! Now imagine when Llama 4 drops. A local LLM with that level of intelligence is going to open up so many interesting possibilities that don’t make economic sense today.
Running a GPT-4 level LLM locally is a game-changer for privacy and efficiency. The MacBook Pro M series is finally flexing its true potential! Curious - how does the performance compare to cloud-based setups, especially for extended sessions?
I should legit try this. In fairness, I have no clue about the real-world benefit, since I doubt this is as good as an online LLM, but still kinda cool.
With 24 GB of RAM you can run an 8B model at FP16 instead (Llama 3.3 only ships as 70B, so e.g. Llama 3.1 8B). Should work fine for students at least.
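Rough memory math behind claims like this (a rule of thumb that ignores the KV cache and runtime overhead): weights ≈ parameter count × bytes per weight. So 8B × 2 bytes (FP16) ≈ 16 GB, which fits in 24 GB; 8B × ~0.5 bytes (4-bit) ≈ 4 GB; and 70B at Q4_K_M (~4.8 bits per weight on average) ≈ 42 GB, which is why the llama3.3 download is ~43 GB and 64 GB machines are the comfortable floor.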
Can it use vision? Can it access the internet? Can it run Python code? The biggest improvements to ChatGPT over the last few years have been the integrations. Running a bare language model is neat, but not as useful.
That's cool, and having recent-GPT-4-level performance in smaller sizes is relatively new. But original-GPT-4-level performance in smaller models has been around for quite some time now: the first Llama 3 release brought that to a 70B model, and Gemma 27B is roughly comparable.
Running Llama 3.2 3B natively on an iPhone Pro with no internet is more impressive to me rn, tbh. But yes, fully local LLMs will be another game changer.
It is useful, but you get the quantized version, so no GPT-4-quality responses. If you want to try the real power of Llama 3.3 70B, make sure you are running inference with a non-quantized version.
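If you do want the unquantized weights, Ollama hosts an fp16 tag; the exact tag name below is from the library listing and may change, and note that 70B × 2 bytes ≈ 140 GB, far beyond any current MacBook:

$ ollama pull llama3.3:70b-instruct-fp16    # ~140 GB of weights; server-class memory only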