Post

Conversation

DeepSeek R1 distilled to Qwen 1.5B easily runs on my iPhone 16 with MLX swift. Here's the 4-bit model reasoning entirely on device at almost 60 toks/sec:

8:38 AM · Jan 22, 2025

Awni Hannun

@awnihannun

Jan 22

This demo is possible thanks to some great work from David Koski on MLX swift, @JohnMai_Dev and @DePasqualeOrg on Swift Jinja and @pcuenq on Swift Transformers (among others).

No way

Gotta try it on my iPhone 16. Is yours a Pro or non-Pro?

Pro

What app is that?

mlx-swift-examples/Applications/LLMEval at main · ml-explore/mlx-swift-examples

Is that "fullmoon"? Works great.

That’s just the plain mlx-swift-example:

mlx-swift-examples/Applications/LLMEval at main · ml-explore/mlx-swift-examples

Let me try on VisionPro

Make sure you use the latest Jinja (update the package)

Share the code

It's all here

mlx-swift-examples/Applications/LLMEval at main · ml-explore/mlx-swift-examples

Nice cherry picked prompt! 60 tokens/second but 0 intelligence. That’s what you get with RTN quants.

I think you are using an outdated version with a missing chat template.
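
For readers unfamiliar with the "RTN quants" remark above: RTN stands for round-to-nearest, the simplest way to map floating-point weights onto a small integer grid. The sketch below is a minimal illustration in plain Python, assuming a per-group affine scheme (one scale and one offset per group of weights). The function names and the sample weights are hypothetical, and this is not claimed to be the exact scheme MLX uses.

```python
def rtn_quantize(weights, bits=4):
    """Quantize one group of floats to `bits`-bit codes by round-to-nearest."""
    levels = (1 << bits) - 1              # 15 steps for 4-bit codes (0..15)
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Each weight snaps to the nearest point on the grid lo, lo+scale, ...
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def rtn_dequantize(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


# Hypothetical group of weights, for illustration only.
group = [-0.42, -0.10, 0.03, 0.27, 0.55, -0.31, 0.12, 0.40]
codes, scale, lo = rtn_quantize(group)
restored = rtn_dequantize(codes, scale, lo)
max_err = max(abs(a - b) for a, b in zip(group, restored))
print(codes)
print(f"max reconstruction error: {max_err:.4f} (grid step {scale:.4f})")
```

The reconstruction error per weight is bounded by half a grid step. Whether that error hurts a small model noticeably depends on the model and the prompt, which is what the exchange above is arguing about.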

This is exciting to see.

Austere News & Opinion

Omg

The possibilities for low-latency applications

Tails

@ChrisTaylo79273

Jan 22

Very nice! Thanks for sharing the repo!

madness…

“ai is bad bc energy consumption” folks have left the chat

The Highly Automated Cat — e/acc

@atlantis__labs

Jan 22

is mlx as fast as llama.cpp now?

This is insane

how much does distilling it to qwen 1.5b reduce its quality?

How do I get this?!!!!

Wow

Gotta try it on my iPhone

Try it on something modern like a Huawei

Omg, they have a 1.5b?!? Any good?

Nova on Mars

@novaonmars

Jan 22

Mobile AI processing power like this could revolutionize our Mars colony operations. Have you tested memory bottlenecks during extended inference runs? Curious about thermal management in mobile deployment.

It reminds me of the fullmoon app. I'm running Llama 3.2 3B Instruct 4-bit pretty fast on the iPhone 15 Pro Max.

It runs really fast, but it only has extremely general knowledge. I asked it a bit about CS topics and it struggles even in the CoT; 7B Qwen and 8B Llama don't struggle.

How did you set this up?

This is insane

We are so back. DeepSeek is soo back.

DeepSeek is Coming for OpenAI’s Neck

I love efficient models, I love on device AI.

Meanwhile it'll probably be iOS 20 or so before the internal AI gets actually decent. Hope they figure it out soon.

Does Qwen or Llama support function calling?

Alex Krause (a19grey.eth)

@a19grey

Jan 22

Can I ask a non-stupid question? "What is the Aharonov-Bohm effect in quantum mechanics?" Or: in the scaling law for bending, does stiffness increase as width squared or cubed?

OneLLM Pro - LLM, All in One

@OneSoSearch

Jan 22

To better overcome the performance constraints of mobile, you can now use #DeepSeek #R1 through #Ollama Connect in OneLLM Pro. Additionally, you can use our app's OpenRouter module to connect #DeepSeek #R1 plus 200+ other LLMs.

Nicely done! How about using solo for this?

How come Apple can’t figure this out and have to rely on integrating OpenAI in their so called Apple Intelligence?

Inference instead of reasoning.

Any way to run it on Android? I've got 16 GB of RAM and I was wondering if I can run a 13B model on my phone.

Why is Apple Intelligence still needed?

Just wow man.

Impressive performance on your iPhone! Innovation at its best.

What applications do you use it for? I found it hallucinated too much for me

I am loving Deepseek

Shepherd of Knowledge

@ShepOfKnowledge

Jan 25

What does “60 tokens a second” mean? So if I’m understanding right, you took Open Source DeepSeek and loaded it to an iPhone? Can you explain how to do that?
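
On the "60 tokens a second" question: a token is a small chunk of text (a word or word fragment), and the rate is simply how many of them the model generates per second of wall-clock time. The sketch below shows the arithmetic, assuming hypothetical counts and timings (300 tokens in 5 seconds, and a 1,000-token reasoning trace); none of these numbers are measurements from the demo.

```python
def tokens_per_second(num_tokens, elapsed_seconds):
    """Generation throughput: tokens produced divided by wall-clock time."""
    return num_tokens / elapsed_seconds


def seconds_to_generate(num_tokens, rate):
    """Time needed to produce `num_tokens` at a steady `rate` (tokens/sec)."""
    return num_tokens / rate


# Hypothetical run: 300 tokens generated in 5.0 seconds.
rate = tokens_per_second(300, 5.0)
print(f"{rate:.0f} tokens/sec")

# At that rate, a hypothetical 1,000-token reasoning trace takes about:
print(f"{seconds_to_generate(1000, rate):.1f} seconds")
```

Reasoning models tend to emit long chains of thought before the final answer, so throughput translates fairly directly into how long the user waits.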

Also now supported in the rather nice Apollo app

Quote

FRENTAGON

@frentagon

Jan 22

@deepseek_ai you guys are

1 prompt see the result. Deepseek R1 spent 188 seconds of thinking to get to this result.

Obvs some minor errors but WOW!

#DeepSeek #deepseekr1 #R1 #openai #ai #chatgpt #claude #llama #perplexity #gemini #agentic #AGI

this is very noice!

Who is doing this for Android?

𝕵𝖔𝖍𝖓

֎ 𝕹𝖔𝖘𝖙𝖙𝖊𝖗 𝔏𝔦𝔫𝔨 𝔦𝔫 𝔟𝔦𝔬 ֍

@JohnStark3D2A

Jan 22

Is there anything like this for Android? I guess the newest models can handle it

Very cool… but 1.5b. Nice preview of what’s to come though.

Anything like that for OCR?

This is so cool honestly

Nice.

Typing to LLMs and writing prompts is a big pain. Voice is the best interface.

Crazy

“Distilled” sounds like putting water in a filter and taking only the purified water

Incredible!

How do I run models locally (without a GPU)? Also, where can I find more information about model distilling?

That's awesome! Thanks for sharing

Any android equivalent for this ??

Cool project

How can I install it on my iPhone 14 Pro Max?

just tried deepseek R1-8b. we are still safe!

Quote

Anant Khurana

@KhuranaAnant

Jan 23

Hey, I was just playing around with @deepseek_ai, asked if it was open source, and it totally denied me! It thought it was ChatGPT, not DeepSeek, lol. Even though they explain that a lot of its training data comes from ChatGPT, it's still quite hilarious.

which app is that?

Please forgive my ignorance, I've never used MLX before, but I do not see a configuration for this model? The only config I see for qwen is qwen205b4bit and don't see any DeepSeek configs.

I read somewhere that you ran DeepSeek 671B on two M2 Ultras. How can you get 700 GB of DDR to run that model? Using an SSD as memory?
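
The memory question above comes down to simple arithmetic: weight storage is roughly the parameter count times the bits per weight, divided by 8 to get bytes. The sketch below is a back-of-the-envelope estimate only. It assumes nominal parameter counts (671B, and 1.5B for the model in the original post), ignores quantization scales, activations, and the KV cache, and is not a statement about how any particular run was configured.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts; real checkpoints differ slightly.
for name, params in (("671B model", 671e9), ("1.5B model", 1.5e9)):
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{gb:g} GB")
```

Under these assumptions, about 700 GB corresponds to roughly 8 bits per weight for a 671B model, and 4-bit weights halve that. The same formula puts a 4-bit 1.5B model at under 1 GB, which is why it fits comfortably on a phone.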

Bro my five year old son can answer this one

zeroedge

@nobody_qwert

Jan 22

Karpathy said AGI will fit in a 3B model. These small models know things like hash codes and phone numbers, which are useless for reasoning.

lol

thinks he's a tech god just cuz he got some fancy mlx swift thingy on his iphone 16. newsflash: just cuz u can run some fancy ai model doesn't mean u can even use it right. btw, what's with the 4-bit model? sounds like something my grandma would use

Have people looked carefully at the source code? We were worried about TikTok stealing our information, and now we are installing a full AI agent developed in China on our phones like it's nothing.

Per Lindholm

@Perrabyte

Jan 22

This is the future. Get the message to #Trump. Not a mega AI center that takes gigawatts of energy. Trump is being taken for a ride. 100 billion gone right away. 500 billion gone when open-source AI on the phone is better and does not take a single watt of power. Brings people jobs.

My iPhone 13 would shit itself if I tried this

Ask questions on Taiwan and you will see live how censorship in China works. I am amazed people put this app on their phone. The software is controlled by the Chinese government

A locally running AI chat app on #Apple: fullmoon

A Locally Running AI Chat App on Apple: fullmoon - Medyateji®

From medyateji.com
