DeepSeek R1 distilled to Qwen 1.5B easily runs on my iPhone 16 with MLX swift. Here's the 4-bit model reasoning entirely on device at almost 60 toks/sec:
David Watson 🥑

Nice cherry picked prompt! 60 tokens/second but 0 intelligence. That’s what you get with RTN quants.
Mobile AI processing power like this could revolutionize our Mars colony operations. Have you tested memory bottlenecks during extended inference runs? Curious about thermal management in mobile deployment.
It reminds me of the fullmoon app. I'm running Llama 3.2 3B Instruct 4-bit pretty fast on the iPhone 15 Pro Max ✨
It runs really fast, but it only has extremely general knowledge. I asked it a bit about CS topics and it struggles even in the CoT; 7B Qwen and 8B Llama don't struggle.
Meanwhile it'll probably be iOS 20 or so before the internal AI gets actually decent. Hope they figure it out soon.
👀
Quote
FRENTAGON 👽
@frentagon
@deepseek_ai you guys are 👽 1 prompt see the result. Deepseek R1 spent 188 seconds of thinking to get to this result. 🔥🔥🔥 Obvs some minor errors but WOW! 😳 #DeepSeek #deepseekr1 #R1 #openai #ai #chatgpt #claude #llama #perplexity #gemini #agentic #AGI
How do I run models locally (without a GPU)? Also, where can I find more information about model distillation?
Quote
Anant Khurana
@KhuranaAnant
Hey, I was just playing around with @deepseek_ai, asked if it was open source, and it totally denied me! It thought it was ChatGPT, not DeepSeek, lol. Even though they explain a lot of its training data comes from ChatGPT, still quite hilarious.
Please forgive my ignorance, I've never used MLX before, but I do not see a configuration for this model? The only config I see for qwen is qwen205b4bit and don't see any DeepSeek configs.
I read somewhere you ran DeepSeek 671B on two M2 Ultras. How do you get 700GB of memory to run that model? Using the SSD as memory?
Karpathy said it: AGI will fit in a 3B model. These small models know things like hash codes and phone numbers, which are useless for reasoning.
lol thinks he's a tech god just cuz he got some fancy mlx swift thingy on his iphone 16. newsflash: just cuz u can run some fancy ai model doesn't mean u can even use it right. btw, what's with the 4-bit model? sounds like something my grandma would use
Have people looked carefully at the source code? We were worried about TikTok stealing our information, and now we are installing a full AI agent developed in China on our phones like it's nothing.
This is the future. Get the message to #Trump. Not a mega AI center that takes gigawatts of energy. Trump is being taken for a ride. 100 billion gone right away. 500 billion gone when open source AI on the phone is better and does not take a single watt of power. Brings people jobs
Ask questions on Taiwan and you will see live how censorship in China works. I am amazed people put this app on their phone. The software is controlled by the Chinese government