merve
this was in DeepSeek?
merve
RT @MistralDevs: 🧮 Today, we release Leanstral - the first open-source code agent for Lean 4, an efficient proof assistant capable of expre…
merve
RT @MistralDevs: 🧮 Today, we release Leanstral - the first open-source code agent for Lean 4, an efficient proof assistant capable of expre…
merve
RT @0xSero: In 12 hours we've had 323 people sign our petition to protect local AI. Open Source must win, not because anyone else must lo…
merve
I've been going to tech conferences since eternity and I have to say @aiDotEngineer is something else every time I go I meet coolest people, we stay in touch and ship cool things together, it eventually alters @huggingface ecosystem this time I met @0xSero @alexocheema… https://twitter.com/mervenoyann/status/2073073440395768109/photo/1
merve
RT @IlysMoutawwakil: This PR i've been working on for a while (6 months) finally got merged, you can now run Transformers on your graph eng…
merve
RT @andimarafioti: Our book is out and we finally have physical copies! The print is so beautiful 😍 @mervenoyann @micuelll @orr_zohar Preo…
merve
I'm so rooting for @roboflow 💜 SAM was GPT moment for vision and they did god's work to get it to everyone we (with @xenovacom) had a tour by @josephofiowa in their smol data center today 🦝 https://twitter.com/mervenoyann/status/2072898712792080517/photo/1
merve
SAM was GPT moment for computer vision and @roboflow does such a good job on it
merve
RT @vanstriendaniel: Coding agents are real users of the @huggingface Hub! They're searching for models, building and pushing datasets, tr…
merve
RT @llm_wizard: Moderating the Model Compression panel with @mervenoyann @danielhanchen @parthsareen and Asma! https://twitter.com/llm_wizard/status/2072812132861964631/photo/1
merve
testing routing with Nemotron family of models and Pi interestingly at first the agent made plans, and used local routing model (that is also the small main model) as a subagent to execute using Pi wrapper @ben_burtenshaw has built 🙌🏻 https://twitter.com/mervenoyann/status/2072791030483898512/photo/1
merve
super excited by local AI summit today, will start in 30 mins, join us at room 2009 @aiDotEngineer 🔥 kudos @TheAhmadOsman @NVIDIAAI for organizing this 💚 open-source has to win ✊🏻 https://twitter.com/mervenoyann/status/2072732630228128136/photo/1
merve
RT @ben_burtenshaw: Super excited to launch this hackathon to port AI models direct to bare silicon! Its is a community hackathon for peop…
merve
RT @googlegemma: “Agentic kernel optimization is the future of on-device inference” @xenovacom used Fable 5 to write kernels that pushed G…
中文: RT @googlegemma:“机上推理的未来将实现一个目标内核优化 @xenovacom 使用 Fable 5 编写了推送 G 的内核...
merve
I will be speaking at local AI summit tomorrow 2.25 about Compression at Edge, it will be so much fun! https://twitter.com/mervenoyann/status/2072434770412532004/photo/1
中文: 明天2月25日,我将在当地的人工智能峰会上发表演讲,谈论在Edge的压缩时,会非常有趣!
merve
RT @steipete: Price per token != cost per task
中文: RT @steite:每枚代币的价格!= 每个任务的成本
merve
our book is out and I got the first copy! @andimarafioti @micuelll @orr_zohar https://twitter.com/mervenoyann/status/2072060618769924357/photo/1
中文: 我们的书已经出版了,我拿到了第一本!@andimarafioti @micuell @orr_zohar
merve
got the signed copy of my book today! https://twitter.com/mervenoyann/status/2072043787073757667/photo/1
中文: 今天拿到了我这本书的签名副本!
merve
RT @ClementDelangue: Super excited about open-source router systems and routing models like @vllm_project semantic router: https://t.co/YtS…
中文: RT @ClementDelangue:对开源路由器系统和路由模型(如@vllm_project 语义路由器)感到非常兴奋:
merve
waited for this feature so long you can now filter models for your hardware 🔥 and it works not only for GGUF/MLX compatible models but vanilla weights too 😍 https://twitter.com/mervenoyann/status/2071941995514237193/photo/1
中文: 等待这个功能如此之久 现在您可以为硬件筛选模型 🔥 它不仅适用于兼容GGUF/MLX的型号,而且也适用于香草重量😍
merve
RT @JackdeS11: One photo → a full 3D Gaussian splat, generated on Apple Core AI. TripoSplat (VAST) ported to Core AI. Snap a picture, get a…
中文: RT @JackdeS11:一张图片 → 完整的3D高斯式插画,在Apple Core AI上生成。 TripoSplat(VAST)移植到Core AI上。拍张照片,获取一个......
merve
RT @Meituan_LongCat: Introducing LongCat-2.0 🐱 1.6T parameters · MoE with ~48B active · 1M context The full model behind Owl Alpha on @Open…
中文: RT @Meituan_LongCat:介绍 LongCat-2.0 🐱 1.6T 参数 · 具有 ~48B 活性的 MoE · 1M 上下文 @Open 上 Owl Alpha 背后的完整模型
merve
I was just mentioning today how counterintuitive it is because it pushes Chinese labs to be super creative around architecting models
中文: 我今天刚刚提到它的直觉有多违背,因为它促使中国实验室在构建模型上具有超强的创造力
merve
RT @ClementDelangue: Open-source AI is booming, massively impactful for progress, competition, transparency & orders of magnitude less dang…
中文: RT @ClementDelangue:开源人工智能蓬勃发展,对进步、竞争、透明度和发展产生巨大影响;数量数量减少......
merve
hello I'm Merve 👋🏻 I want every single developer to build computer vision applications with agents using open models, but possibilities are endless they get lost here to change it, come to my talk at @aiDotEngineer WF tomorrow to see my new project, here's a sneak peek https://twitter.com/mervenoyann/status/2071712273069232503/photo/1
中文: 你好,我是默夫👋🏻 我希望每一位开发者都能使用开放模型来构建计算机视觉应用程序,但可能性是无穷无尽的 明天请来@aiDotEngineer WF观看我的新项目,请来这里查看:
merve
we just shot a series for @HarperSCarroll's channel about open-source AI 🙌🏼 she breaks down complex topics for people like no one else 👩🏼‍🎤 https://twitter.com/mervenoyann/status/2071698164214595922/photo/1
中文: 我们刚刚为@HarperSCarroll的频道拍摄了一个关于开源人工智能的系列节目🙌🏼 她为其他人分解复杂的话题,而非其他人 👩🏼 🎤
merve
RT @ClementDelangue: Tired: The US government regulating open-source AI models Wired: The US government training and releasing open-source…
中文: RT @ClementDelangue:疲惫不堪:美国政府监管开源人工智能模型 有线:美国政府培训并发布开源......
merve
skill issue
中文: 技能问题
merve
RT @_alejandroao: introducing tau τ — an educational agent harness that teaches you how to build agent harnesses i will be publishing tuto…
中文: RT @_alejandroao:介绍 tau τ——一种教育代理工具,教你如何构建代理线束 我将发布tuto...
merve
RT @ClementDelangue: It's quite rational to regulate frontier API models, especially to get more transparency for the government, without r…
中文: RT @ClementDelangue:对前沿API模型进行监管是相当合理的,尤其是为了提高政府透明度,而无需......
merve
if open models can't be picked into just like closed models, then they're no different, let people release them? it's only when open models threaten your business model you speak up? 😄
中文: 如果打开的模型不能像封闭模型一样被挑选,那么它们也不例外,让人们发布它们? 只有当开放模式威胁到你的商业模式时,你才会发声?😄
merve
RT @MatthewBerman: > mythos is so good at cyber it can't be released also > mythos can't detect 20k fraudulent chinese accounts attacking…
中文: RT @MatthewBerman: > 神话在网络上非常擅长,无法发布 也 神话无法检测到2万个欺诈性中国账号的攻击......
merve
RT @MTSlive: SITUATION EXPLAINED: Anthropic accused Alibaba of conducting a "distillation attack" on Claude. We asked @ClementDelangue his…
中文: RT @MTSlive:情况解释:Anthropic指责阿里巴巴对克劳德实施“蒸馏攻击”。 我们问了@ClementDelangue他的......
merve
RT @Kostoglodov: @Hesamation https://twitter.com/Kostoglodov/status/2071144065857679631/photo/1
merve
we have come a long way friends, remember the darker times all because of a famous Italiano constantly fearmongering
中文: 我们已经为朋友走了很长的路,记住黑暗时期 这一切都是因为一位著名的意大利人不断恐惧
merve
personal non-technical post: as a fan of the critical theory visiting California is an insane experience tbh
中文: 个人非技术性帖子:作为批评理论的爱好者,访问加利福尼亚是一种疯狂的体验
merve
RT @Hesamation: I bet he’s fun at parties. https://twitter.com/Hesamation/status/2070649309939323149/photo/1
中文: RT @Hesamation:我赌他在派对上很有趣。
merve
very cool work 🙌🏼
中文: 非常酷的工作 🙌🏼
merve
RT @Sentdex: Dario's args: "Opensource you can see the source, here you cannot see inside the model" - yes you can that's literally the op…
中文: RT @Sentdex:达里奥的args: 开源,你可以看到源代码,这里看不到模型内部 - 是的,这简直就是行动......
merve
Kimi-K2.7-Code with Pi absolutely slaps
中文: 带有Pi的Kimi-K2.7代码绝对可拍
merve
RT @victormustar: 300tok/s on mobile is insane... open source must win ✊ https://twitter.com/victormustar/status/2070466699229364567/video/1
中文: RT @victormustar:移动端300tok/s太疯狂了......开源必须赢得✊
merve
many thanks for resharing this, since I'm travelling for work I only help with logistics and guidance my kind colleague @halcyonrayes is leading an effort so let me loop you in how this works is that you need to label some examples of people's posts and train a model (it…
中文: 非常感谢您重新分享这一点,因为我出差时只帮助物流和指导 我那位好同事@halcyonrayes正在领导一项努力,让我来帮你 这是如何运作的,你需要标记一些人们帖子的示例,并训练一个模型(它......)
merve
dogfooding at every opportunity I just mounted a @huggingface Bucket 🪣 to dump our team offsite pics it took me 4 mins to upload 6 GB of images and didn't have to empty my computer to do it 😄 https://twitter.com/mervenoyann/status/2070570817226768542/photo/1
中文: 每次机会都吃狗 我刚安装了一个@huggingface Bucket 🪣,以将我们的团队丢弃在现场 我花了4分钟上传了6GB的图片,而不必清空电脑才能完成。😄
merve
RT @halcyonrayes: if you’re in venezuela or are involved in rescue efforts, or know someone who is, please feel free to put us in touch.…
中文: RT @halcyonrayes:如果您在文苏拉,或参与救援工作,或认识相关人员,请随时与我们联系。......
merve
hello Venezuelan friends, we're very heartbroken about the earthquake. 3 years ago we had a similar earthquake in Turkey where we lost 50.000 souls. to help people out we built a "needs map" where we scraped posts online and put address and name and what the person needs (food,… https://twitter.com/mervenoyann/status/2070558700285079828/photo/1
中文: 朋友们,大家好,我们对地震感到非常心碎。 三年前,我们在土耳其也发生了类似的地震,失去了50万名灵魂。为了帮助人们,我们建立了一个“需要”地图,在网上抓取帖子,并设置地址和姓名,以及人们需要什么(食物)。
merve
hot take 🥵 what makes one competent is shifting to the ability to pick the next most impactful thing exciting times
中文: 热度:一个有能力的人,是选择下一个最有影响力的事物的能力 激动人心的时刻
merve
I'm looking for absolute banger AI content that's low level on my flight to SF, please drop them below
中文: 我正在寻找飞往旧金山时水平较低的绝对人工智能内容,请在下方留言
merve
RT @liquidai: Introducing LFM2.5-230M: our smallest model yet, built to run fast anywhere (CPUs, NPUs, and GPUs) to enable agentic tasks on…
中文: RT @ liquitai:推出LFM2.5-230M:我们迄今为止最小的型号,可快速运行在任何地方(CPU、NPU 和 GPU),以实现在...上执行特效任务。
merve
RT @Thom_Wolf: Multi-agents collaborations are among the most interesting agent behaviors right now! We did an experiment the other day wi…
中文: RT @Thom_Wolf:多代理合作是目前最有趣的代理行为之一! 前几天我们做了一次实验......
merve
RT @stevibe: This looks like a toy. It's actually the meanest little vision eval I've built. The task: look at an emoji image, then repain…
中文: RT @stevibe:这看起来像个玩具。这实际上是我打造的最微小视觉椭圆形。 任务:查看表情符号图像,然后重新涂漆......
merve
RT @mishig25: In HF GGUF section of models, we are emphasizing MTP heads with its own sign 𝗠𝗧𝗣 https://twitter.com/mishig25/status/2070143864522887280/photo/1
中文: RT @mishig25:在高高GGUF模型部分,我们强调MTP头与其自有标志的MTP
merve
RT @slimcat0101: PP-OCRv6 is now on @HuggingFace! 🎉 Not just better accuracy— PaddleOCR 3.7 also adds transformers & ONNX Runtime backends…
中文: RT @slimcat0101:PP-OCRv6 现已上线 @HuggingFace!🎉 不仅更精确,PaddleOCR 3.7 还增加了变压器和放大器;ONNX 运行时后端......
merve
RT @ClementDelangue: We just crossed $100M annual run-rate. I know many AI companies are capturing much more $$$ these days, but still prou…
中文: RT @ClementDelangue:我们刚刚突破了每年1亿美元的运行率。我知道如今许多人工智能公司的售价要多得多,但仍然很有价值......
merve
celebrating my birthday at @huggingface Giverny office today it has a pretty garden https://twitter.com/mervenoyann/status/2070097263477637358/photo/1
中文: 今天在@huggingface Giverny办公室庆祝我的生日 它有一个漂亮的花园
merve
RT @ClementDelangue: Qualcomm! https://twitter.com/ClementDelangue/status/2069872844851306775/video/1
中文: RT @ClementDelangue:高通!
merve
icymi we are doing a joint livestream with @UnslothAI to onboard you to open models 🔥 we cover from inference engines to open harnesses, so make sure not to miss! tune in tomorrow 8AM PST/5PM CEST @huggingface X and YT 🤗 https://twitter.com/mervenoyann/status/2069780936397291700/photo/1
中文: 我们正在与@UnslothAI共同进行一个直播,开启您的开放模式 🔥 我们从推理引擎到打开的线束,所以一定要不要错过! 收听明天上午8点(太平洋标准时间)/下午5点 测试 @huggingface X 和 YT 🤗
merve
RT @krea_ai: today, we release the open weights of Krea 2. welcome Krea 2 Raw and Krea 2 Turbo, an undistilled model from mid-training mea…
中文: RT @krea_ai:今天,我们发布了《Krea 2》的开放权重。 欢迎Krea 2 Rawhan和Krea 2 Turbo,一款采用中途训练的未蒸馏型车型......
merve
RT @ben_burtenshaw: Live Stream: Welcome to open source AI Lots of new folk are starting out on their journey with open models. Come join…
中文: RT @ben_burtenshaw:直播:欢迎使用开源人工智能 许多新人正从开放模式开始他们的旅程。加入吧......
merve
very nice compilation, I was clueless on how inference engines pair with data centers, def give it a read
中文: 汇编得非常好,我对推理引擎如何与数据中心配对一无所知,但还是把它读一读
merve
RT @onusoz: gpt 5.5 is not naturally good at modeling and cannot create simplified nice mathematical models completely autonomously I did…
中文: RT @onusoz: gpt 5.5 在建模上并不自然,无法完全自主地创建简化的优秀数学模型 我做了......
merve
built a a cycle tracking app with on-device model (@googlegemma Gemma-4 QAT Q4), your data stays private 🤝 this is powered by LlamaKit (wraps llama.cpp for swift) by @pcuenq 🙌🏻 code is open-source you can check how to build swift apps using on-device AI 🤗 https://twitter.com/mervenoyann/status/2069094061856673794/photo/1
中文: 构建了一个带有设备模型的循环跟踪应用程序(@googlegemma Gemma-4 QAT Q4),您的数据将保持私密状态 🤝 由 @pcuenq 支持 @pcuenq 的 LlamaKit(快速包装 llama.cpp)驱动 代码是开源的,你可以使用设备上的AI 🤗 来查询如何快速构建应用程序
merve
RT @matvelloso: All day using GLM 5.2. Didn't miss much. First open model that passes the bar as a daily driver. Things are not going to be…
中文: RT @matvelloso:全天使用 GLM 5.2。没错过太多。首个通过酒吧作为日常驾驶的开放模式。事情不会是......
merve
over 4 year at HF I developed libs, trained models for releases, made tutorials, gave talks on topics, my focuses varied from agents to vision lately I do on-device no one told me I can't do any of it people ask me here and there what I do, they can't seem to comprehend the…
中文: 在HF的四年多时间里,我开发了文本、经过培训的发布模型、制作教程,并就主题进行了演讲,我的关注点从代理到视觉各不相同 最近我在设备上做 没有人告诉我我什么都做不了 人们在这里和那里问我做什么,他们似乎无法理解......
merve
you don't know but @huggingface is bunch of hobbyists asked to do what they like full-time you can't beat someone who's having fun
中文: 你不知道,但@huggingface 是一群业余爱好者,他们被要求全职做他们喜欢的事情 你无法战胜一个玩得开心的人
merve
one sad thing about vision language models becoming the new LLMs is the model size no one (except for @googlegemma) releases mid-sized vision LMs anymore (Qwen is out already) waiting on @allen_ai new Molmos 🙌🏻
中文: 视觉语言模型成为新LLM的一个令人难过的是模型尺寸 除了@googlegemma之外,没有人再发布中等尺寸的视觉LM了(Qwen已经发布) 在@allen_ai 上等待 new Molmos 🙌🏻
merve
RT @thsottiaux: Reminder that you can use the Codex App, CLI and SDK with any open source model, not just with OpenAI models. https://t.co…
中文: RT @thsottiaux:提醒您可以将 Codex App、CLI 和 SDK 与任何开源模型一起使用,而不仅仅是 OpenAI 模型。
merve
I feel like many personal agent libraries are kind of hard to navigate for coding agents which is so meta partially because these libraries are newer so it doesn't speak directly but then it has to navigate and takes a while
中文: 我觉得对于编码代理来说,许多个人代理库都很难被处理,而这种代码却非常难以驾驭 部分原因在于这些库较新,因此不会直接说话,但需要导航一段时间
merve
RT @andimarafioti: Can a VLM see without a vision encoder? We trained one for $100, inspired by Gemma 4 12B. Latency on an M3 Pro MacBook:…
中文: RT @andimarafioti:没有视觉编码器,VLM 可以查看吗? 我们训练了一款售价100美元,灵感来自Gemma 4 12B。 M3 Pro MacBook 上的延迟:
merve
I just need to hf-mount a @huggingface bucket on my fuji camera
中文: 我只需要在我的fuji相机上安装一个@huggingface桶
merve
RT @Zai_org: GLM-5.2 is free when used with Hugging Face Inference Providers for the next 5 hours: https://huggingface.co/zai-org/GLM-5.2?inference_provider=zai-org&language=python&client=openai&inference_api=true
中文: RT @Zai_org:GLM-5.2 在接下来的 5 小时内可免费使用 Hgging Face Inference Providers:
merve
RT @victormustar: Open source MUST win 🔥 GLM-5.2 is free when used with Hugging Face Inference Providers and for every available provider…
中文: RT @victormustar:开源必须赢🔥 GLM-5.2 可免费与 Hugging 面部推理服务提供商以及每一家可用的服务提供商使用。
merve
I just need remote control for pi sessions @badlogicgames @mitsuhiko and my work life balance can be absolutely cooked
中文: 我只需要远程控制进行小距离训练,@badlogicgames @mitsuhiko,我的工作生活平衡完全可以
merve
RT @LysandreJik: Over 7 days we've had Fable locked and an MIT opus-lvl model is out: @Zai_org GLM 5.2 I've been switching to open models…
中文: RT @LysandreJik:在7天多的时间里,我们锁定了Fable,麻省理工学院的一款opus-lvl型号已推出:@Zai_org GLM 5.2 我一直在转向开放模型......
merve
I'm in SF starting 27th, who should I meet and which event do I join 👀
中文: 我从27岁生日开始参加旧金山,我应该认识谁,参加哪个活动👀
merve
RT @xenovacom: Before Fable 5 was shut down, it pushed Gemma 4 to 255 tok/s on WebGPU. Some didn't believe it was real. Today we're releas…
中文: RT @xenovacom:在《Fable 5”被关闭之前,它在WebGPU上将Gemma 4的Tok/s推送到了255。有些人不相信这是真的。 今天我们是releas......
merve
RT @julien_c: Llama.cpp has a new branding + official website. Run local models today! Now more than ever, open source must win. 🙏 By @a…
中文: RT @julien_c:Llama.cpp 拥有全新的品牌+官方网站。 立即运行本地模型!如今,开源比以往任何时候都更必须取胜。🙏 @a...
merve
one of my absolute favorite accounts on this platform for a good reason
中文: 我在这个平台上最喜欢的账号之一,原因很好
merve
day 2 findings on this pipeline 🥹 > it works, got map@50=0.8028 on road sign detection against human annotations, with only 1.3k examples 🙌🏼 see results below > Liquid rejects way more than Gemma-4 (530 vs 306 in hard document parsing, 1022 vs 116 in easy road sign detection,… https://twitter.com/mervenoyann/status/2067265735940776285/photo/1
中文: 第2天关于这条管道的发现 🥹 已有效,在针对人类注释的路标检测中获得了 Map@50=0.8028,仅有1.3万个实例🙌🏼 如下所示 液体检测方法比Gemma-4(530 vs 306,简单路标检测1022 vs 116)更拒。
merve
RT @ArtificialAnlys: Z ai’s GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index scoring 51 and it s…
中文: RT @ArtificialAnlys:Z ai 的 GLM-5.2 是人工智能分析智能指数上新领先的开放权重模型,评分为 51 分,且评分为 s...
merve
RT @anshuc: Whoa, GLM-5.2 is INSANE for UI/UX with the right prompting. Open models have finally closed the gap. You have to push it, but…
中文: RT @anshuc:Whaa,GLM-5.2 是 INSANE,适用于 UI/UX,具有正确的提示。开放模型终于缩小了差距。 你必须推它,但......
merve
RT @Designarena: BREAKING: GLM-5.2 is now 1st on Design Arena. With an Elo of 1360, GLM-5.2 has jumped ahead of the now unavailable Claude…
中文: RT @Designarena: Breaking: GLM-5.2 现已在 Design Arena 上排名第一。 凭借1360年的运球,GLM-5.2已领先于现已无法上场的克劳德队。
merve
RT @Zai_org: Introducing GLM-5.2: Frontier Intelligence, Open Weights - Significant improvements in coding and agentic tasks - Strong long…
中文: RT @Zai_org:推出 GLM-5.2:前沿智能、开放权重 - 编程和代理任务的显著改进 - 长时间强劲......
merve
GLM-5.2 is comparable to Opus 4.8 🔥🥵 with 1M context > new IS attention reuses one indexer every 4 sparse layers (2.9× per-token FLOPs at 1M > improved MTP layer for spec decoding > flexible thinking-effort levels > day-0 in transformers + vLLM + SGLang, MIT license 🤗 https://twitter.com/mervenoyann/status/2066940184977920183/photo/1
中文: GLM-5.2 与 Opus 4.8 🔥🥵 具有 1M 上下文 新增 IS 关注,每 4 个稀疏层重复使用一个 index 层(每个令牌 FLOP 为 1M 的 FLOP 为 2.9 倍 改进的MTP层用于规格解码 灵活的思维努力水平 在变压器中为第0天 + vLLM + 麻省理工学院的SGLang许可证 🤗
merve
RT @skalskip92: RF-DETR keypoints is finally out preview release: real-time transformer keypoint detection Apache 2.0 71.8 AP on COCO, 9…
中文: RT @skalskip92:RF-DETR 的关键点终于告罅发 预览发布:实时变压器关键点检测 Apache 2.0 71.8 美联社关于 COCO,9 ..
merve
RT @onusoz: nvidia/Qwen3.6-35B-A3B-NVFP4 running in vLLM nightly on my Nvidia GB10 is actually insane 50 tok/s, 4 concurrent generations.…
中文: RT @onusoz:nvidia/Qwen3.6-35B-A3B-NVFP4 在我的英伟达GB10上每晚运行实际上太疯狂了 50 个 tok/s,4 代。......
merve
this is almost 1-1 same as my local setup! I use llama server + Pi + Gemma-4 (interchangeably with Qwen3.6)
中文: 这几乎和我当地的设施一样! 我使用 lama 服务器 + Pi + Gemma-4(与 Qwen3.6 互换使用)
merve
RT @vboykis: new post: how I develop recently using local models. the tooling is now good enough to do agentic workflows and everyone shoul…
中文: RT @vboykis:新文章:我最近如何使用本地模型进行开发。现在,该工具足以完成智能工作流程,每个人都在进行工作。
merve
RT @ben_burtenshaw: we need to get real and move fast about training our own agents. orgs, teams, and individuals all need to be improving…
中文: RT @ben_burtenshaw:我们需要快速提升自身经纪人的训练能力。无论是教练、团队还是个人,都需要不断改进......
merve
this is going to be an eventful week in open-source AI, tune in 👀
中文: 这将是开源人工智能中充满宜人的一周,请收听 👀
merve
my first finding with this pipeline is that it works well but (rarely) there's a false positive tendency, I don't pass bboxes as tokens but rather overlaid masks/bboxes to judges when the larger labelling model indicates there's something in the image but it's vaguely there in…
中文: 我发现这个管道的第一个发现是它效果很好,但(很少)存在假阳性倾向,我不会将盒子作为代币,而是将面具或盒子叠加给法官 当较大的标签模型显示图像中存在某些内容时,它却隐约存在......
merve
I'm testing multiple small VLMs-as-judges, all parts of pipeline are different model families let me know below if you want me to test any other models, these are very convenient https://twitter.com/mervenoyann/status/2066544947440861381/photo/1
中文: 我正在测试多个小型VLM-as-judge,管道的所有部分都是不同的模型家族 如果想让我测试其他型号,请在下方告诉我,这些非常方便
merve
🚨 breaking 🚨 my core researcher friend from mistral just teased me ‼️ le chaton fat is real?? /s
中文: 🚨 打破[EE]我的核心研究员朋友从错误中转出就逗我了‼ 聊天脂肪是真的吗?/s
merve
I gave this another round of thinking and I don't get bashing, people appreciate closed model labs while Mistral released a wide range of models open for free (from Voxtral TTS to LLMs, some with Apache 2.0 license) I guess people just like to bash EU and Mistral comes with it.…
中文: 我又进行了一轮思考,但并不觉得有点不为人接受,人们喜欢封闭式模型实验室,而Mistral则免费发布了多种型号(从Voxtral TTS到LLM,其中一些拥有Apache 2.0许可证) 我猜人们只是喜欢抨击欧盟,而米斯特拉尔也支持它。......
merve
insane weekend honestly
中文: 疯狂的周末
merve
my colleague Onur is an openclaw maintainer and an agent whiz who tests open models religiously on coding agents and shares his findings I suggest to follow him if you're interested!
中文: 我的同事奥努尔是一名开放性维护者,也是一名代理人,他通过编程代理进行宗教测试,并分享他的发现 我建议如果你感兴趣的话,请关注他!
merve
icymi I do photography and I love applying various styles with LoRAs and nano banana on them this scene had to be low poly https://twitter.com/mervenoyann/status/2066111346463138170/photo/1
中文: 我喜欢在LoRA和纳米香蕉上使用各种风格进行摄影 这个场景必须是低聚的
merve
even though I don't like European way of doing things I kinda dig Mistral's approach to not build gigantic sota generalists but rather focus on medium sized models + fine-tuning-as-a-service for domains sovereignty is the way
中文: 尽管我不喜欢欧洲的做事方式,但我会深入探讨米斯特拉尔不是在打造庞大的全体通达人,而是专注于中等规模的模型,而是针对域名进行微调 主权是途径
merve
ensembling is so back?
中文: 回信了吗?
merve
@antirez unlike most people in replies think mistral people really work hard, like I had a former colleague there working 12-15 hours a day, their researchers too they sell models and finetuning-as-a-service to European companies like banks etc because those companies move slow due to…
中文: @antirez 与大多数回复中的人认为的不太一样,误会的人真的很努力,就像我让一位前同事每天工作12到15个小时一样,他们的研究人员也是如此 他们向银行等欧洲公司销售模型和微调即服务,因为这些公司由于......而进展缓慢
merve
RT @ZenMagnets: Alibaba Qwen3.7 slowly fading into irrelevance at the frontier due to proprietary stance. In it's place we have Minimax M3…
中文: RT @ZenMagnets:由于采用专有立场,阿里巴巴Qwen3.7在前沿领域逐渐逐渐变得无关紧要。 在它的地方,我们有 Minimax M3...
merve
RT @mervenoyann: new transformers tutorials just dropped for vision 🔥 🛰️ segmentation on satellite imagery: fine-tune RF-DETR-Seg segment…
中文: RT @mervenoyann:新的变压器教程刚刚因视觉而放弃 🔥 🛰️ 卫星图像细分:精细调谐射频-DETR-Seg 片段......
merve
RT @NielsRogge: Kimi K2.7 Code lands between the big boys Check it out at https://paperswithcode.co/benchmark/mcpatlas?eval=9375 https://twitter.com/NielsRogge/status/2065414135207043339/photo/1
中文: RT @NielsRogge: Kimi K2.7 Code 与大男孩们之间合同 请访问
merve
don't walk, run 🙌🏼 llama-server -hf unsloth/MiniMax-M3-GGUF:UD-Q4_K_M if you're gpu poor like me, start with this 🫡 llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q8_0
中文: 不走,跑🙌🏼 lama-server -hf unsloth/MiniMax-M3-GGUF:UD-Q4_K_M 如果你像我一样贫穷,那就从这个开始 🫡 lama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q8_0
merve
RT @julien_c: Luckily this can never happen https://twitter.com/julien_c/status/2065737492603482421/photo/1
中文: RT @julien_c:幸运的是,这件事永远不会发生
merve
just in time there's Minimax M3 and Kimi K2.7 Code we're blessed by gods of open models takes one line of code with llama-server
中文: 正值时,M3 和 Kimi K2.7 代码 我们受到开放模式之神的祝福 使用 lama-server 获取一行代码
merve
RT @cohere: When you rent your artificial intelligence, you have no control, and no choice. This is why sovereignty and ownership matters.…
中文: RT @cohere:当你租用人工智能时,你别无选择,别无选择。这就是为什么主权和所有权很重要。......
merve
vatansever bir insanım fakat ülkede özellikle son olaylardan sonra (hele demokratik ülkelerde yaşayan insanların) hükümetin yapay zeka çalışmalarına katkı sağlamasını anlamıyorum açıkçası eleştirel düşünebilenlere selam olsun
merve
RT @victormustar: Have a great week end (don't forget to touch grass 🍃) https://twitter.com/victormustar/status/2065482087512133826/photo/1
中文: RT @victormustar:周末过得很棒(别忘了触摸草丛🍃)
merve
new transformers tutorials just dropped for vision 🔥 🛰️ segmentation on satellite imagery: fine-tune RF-DETR-Seg segment buildings 📱 object detection on mobile UI: fine-tune RF-DETR on screenshots runs on toaster, converges fast, give to your agent for your use cases🫡 https://twitter.com/mervenoyann/status/2065446109435072957/photo/1
中文: 新的变压器教程刚刚因视觉而放弃🔥 🛰️ 卫星图像细分:精细调谐射频-DETR-Seg 分段建筑 📱 移动用户界面上的对象检测:对屏幕截图的 RF-DETR 进行微调 使用烤面包机,快速收敛,为使用对象提供使用体验🫡
merve
RT @Tu7uruu: Happy to announce the launch of the Far-Field ASR Leaderboard! 🎉 While many ASR benchmarks focus on clean speech, real-world…
中文: RT @Tu7urou:很高兴宣布推出远场ASR排行榜!🎉 尽管许多ASR基准侧重于简洁的语音,但现实世界却......
merve
RT @julien_c: Explore your @huggingface repos in a whole new way 🔥 Visualize storage, discover outliers, and navigate your repos directly…
中文: RT @julien_c:以一种全新的方式探索你的 @huggingface 仓库 🔥 可视化存储,发现异常值,并直接浏览您的仓库......
merve
our book is out 🔥 @andimarafioti @micuelll @orr_zohar we cover everything from pre/post-training of vision language models to deployment, and even domain-specific applications like document AI or robotics 🫡 printed version will be out soon! 🤗 https://twitter.com/mervenoyann/status/2065358758130102649/photo/1
中文: 我们的书已出版🔥 @andimarafioti @micuell @orr_zohar 我们涵盖从视觉语言模型的预培训到部署,甚至文档AI或机器人等特定领域应用🫡的所有内容 打印版本即将发布!🤗
merve
RT @atomic_chat_hq: Atomic Chat is now on Hugging Face 🤗 We're officially a Local App on the world's biggest AI hub. Run 200,000+ open-wei…
中文: RT @atomic_chat_hq:原子聊天现已上线 🤗 我们正式成为全球最大人工智能中心的本地应用。运行20万以上开放-wei...
merve
DiffusionGemma is great at tweaking to iterate 🔥 fast ⚡️ watch it generate and tweak a website frontend ⤵️ this is simple but imagine the possibilities 🤯 https://twitter.com/mervenoyann/status/2065014467436351502/video/1
中文: DiffusionGemma 非常擅长快速调整迭代 🔥 观看它生成并调整网站前端 ⤵ 这很简单,但可以想象一下可能性🤯
merve
https://twitter.com/mervenoyann/status/2064993937773711477/photo/1
merve
RT @vanstriendaniel: Can @googlegemma DiffusionGemma help fix broken OCR? In theory, denoising tokens in parallel could work better for OC…
中文: RT @vanstriendaniel:@googlegemma DiffusionGemma 能帮助修复坏坏的OCR吗? 理论上,并行去除代币对OC来说可能更有效......
merve
DiffusionGemma is out 🔥 it's compute-bound so 4x faster compared to other Gemma-4 models (1k tok/s on H100) 💨 also great on coding, generate and iterate on any code from 3D generation to front-end ⤵️ https://twitter.com/mervenoyann/status/2064753402064601181/video/1
中文: DiffusionGemma 已退出 🔥 与其他Gemma-4型号(H100上的1k tok/s)相比,其计算速度快了4倍💨 在编程方面也非常出色,可在从3D一代到前端的任何代码上进行生成和迭代⤵️
merve
@getqonto you have the worst UX ever I have used, I constantly hit issues and need to always connect to human agents only for them to hang up
中文: @getqonto 你拥有我用过的最糟糕的用户体验,我经常出现问题,需要始终与人类经纪人建立联系,才能让他们挂断
merve
this didn't age well https://twitter.com/mervenoyann/status/2064431726462480880/photo/1
中文: 这并没有很好地老化
merve
RT @nickfrosst: this model is the opposite of mythos. Its small, cost effective, apache 2.0, and locally deployable. This is the way LLMs…
中文: RT @nickfrost:这个模型与神话恰恰相反。 其小巧、经济高效、Apache 2.0 且可本地部署。这就是LLM的方式......
merve
RT @ClementDelangue: Super excited to announce that @arcee_ai is the first major American AI lab to replace AWS S3 with Hugging Face for AL…
中文: RT @ClementDelangue:非常激动地宣布,@arcee_ai 是美国首个将 AWS S3 替换为 AL 的大型人工智能实验室。
merve
RT @googlegemma: Introducing the Fast Gemma Challenge with Hugging Face Over the next few days, dozens of agents will collaborate to make…
中文: RT @googlegemma:推出快速宝石挑战赛 在接下来的几天里,数十名经纪人将合作制作......
merve
I have accomplished my life goal of bringing my @huggingface teammates to Istanbul @pcuenq @SergioPaniego @ariG23498 @ben_burtenshaw @vanstriendaniel @Tu7uruu https://twitter.com/mervenoyann/status/2064369185564602728/photo/1
中文: 我已经完成了将@huggingface队友带到伊斯坦布尔的人生目标 @pcuenq @SergionaPaniego @ariG23498 @ben_burtenshaw @vanstriendaniel @Tu7uru
merve
RT @googlegemma: Building super fast experiences with Gemma just got easier. Gemma 4 MTP is now officially merged into llama.cpp. Develope…
中文: RT @googlegemma:与Gemma合作打造超快体验变得更加简单。 Gemma 4 MTP 现已正式合并为 lamama.cpp。开发......
merve
RT @ben_burtenshaw: So excited to be opening up OpenEnv to the whole community. It will now be owned by @huggingface , Meta-PyTorch, @refle…
中文: RT @ben_burtenshaw:非常期待向整个社区开放OpenEnv。现在将由 @huggingface、Meta-PyTorch、@refle 拥有......
merve
RT @osanseviero: Gemma 4 MTP just got officially merged into llama.cpp This means you can use Gemma 4 QAT + MTP for a lightweight + super…
中文: RT @osanseviero:Gemma 4 MTP 刚刚正式合并至 lamama.cpp 这意味着您可以使用 Gemma 4 QAT + MTP 进行轻量级 + 超级...
merve
RT @julien_c: Your monthly reminder that HF is much cheaper at scale, for both storage and egress (especially if you use several cloud prov…
中文: RT @julien_c:您的月度提醒,即HF在规模上价格便宜得多,无论是存储还是进步(尤其是如果你使用多个云证明......)
merve
this is how @giffmana signs his emails https://twitter.com/mervenoyann/status/2063262072369041770/photo/1
中文: @giffmana 就是这样签署电子邮件的
merve
RT @victormustar: Before the week ends, let's acknowledge one of the most INSANE week ever for open AI, with 25+ notable open-weight drops…
中文: RT @victormustar:在本周结束前,让我们确认开放式人工智能领域有史以来最疯狂的一周之一,其中开放重量下降了25个以上......
merve
RT @googlegemma: We just dropped Gemma 4 Quantization-Aware Training (QAT) checkpoints on Hugging Face! All Gemma 4 model sizes and their…
中文: RT @googlegemma:我们刚刚在拥抱面上投放了Gemma 4量化感知训练(QAT)检查点! 所有 Gemma 4 型号及其...
merve
I'm working on a bit of a something, here's a spoiler I learned so much from the process about VLM labelling & judging currently adding instance segmentation and adding more infra options https://twitter.com/mervenoyann/status/2062918401845026928/photo/1
中文: 我正在做点事情,这里有个剧透 我从关于VLM标签和示例的流程中学到了很多;评判 目前正在添加实例细分并添加更多基础设施选项
merve
RT @liquidai: Introducing LFM2.5-VL-1.6B-Extract and LFM2.5-VL-450M-Extract: Vision-language models that return structured JSON, not free-f…
中文: RT @ liquidai:引入 LFM2.5-VL-1.6B 提取和 LFM2.5-VL-450M 提取:用于返回结构化 JSON 的视觉语言模型,而不是 free-f...
merve
RT @nathanhabib1011: World models feel like the future... almost... We can still see some weird artifacts. That means we need high-quality…
中文: RT @nathanhabib1011:世界模特们感觉像是未来......几乎......我们仍然能看到一些奇怪的文物。 这意味着我们需要高质量......
merve
RT @googlegemma: Introducing Magenta RealTime 2, a new open model musicians can play as an instrument! Run low-latency, live music synthes…
中文: RT @googlegemma:推出Magenta RealTime 2,一款全新的开放式音乐模特,可作为乐器演奏! 低延迟的现场音乐合成器......
merve
RT @PiotrZelasko: Second big release from us today: Nemotron-3.5-ASR-Streaming! 🌎40 languages ⚡️80ms - 1s controllable latency 🔥240 - 2400…
中文: RT @PiotrZelasko:今日我们发布的第二大版本:Nemotron-3.5-ASR-Streaming! 🌎40 种语言 ⚡️80ms - 1s 可控延迟 🔥240 - 2400...
merve
RT @julien_c: Today I'm launching a new project called SynthTraces 🔥 It is a minimal codebase to generate synthetic coding agent session t…
中文: RT @julien_c:今天我将启动一个名为 SynthTraces 的新项目 🔥 它是一个用于生成合成编码代理会话的最小代码库。
merve
NVIDIA Nemotron Ultra is here 😍 > 55B/550B a hybrid MoE  🦖 with 1M context window > supports MTP speculative decoding 💨 > day-0 supported in transformers sits in the most attractive quadrant per performance/efficiency in AA Index 🔥 https://twitter.com/mervenoyann/status/2062526071203938703/photo/1
中文: NVIDIA Nemotron Ultra 已到来 😍 采用55B/550B混合式MoE【EE1】,带1万个窗口 支持MTP投机解码💨 支持“天”式变压器 在AA指数🔥中,每分表现/效率最具吸引力的象限中
merve
RT @LeRobotHF: Train AI robots without writing a single line of code. 🤖 We just launched LeLab, the official graphical user interface for…
中文: RT @LeRobotHF:无需编写一行代码即可训练AI机器人。🤖 我们刚刚推出了LeLab,这是用于...的官方图形用户界面
merve
just replaced my Qwen3.6 35B 8-bit quant with Gemma 12B bf16 for local coding & Hermes, will report my findings 🙌🏻
中文: 刚刚用Gemma 12B bf16替换了我的Qwen3.6 35B 8位量子点,用于本地编码和编程;Hermes将报告我的发现🙌🏻
merve
RT @drfeifei: https://x.com/i/article/2062244283940544512
merve
RT @rasbt: It's been a while! 4 nice additions to the open-weight local-LLM-on-consumer-hardware ecosystem: https://twitter.com/rasbt/status/2062235700636873082/photo/1
中文: RT @rasbt:已经有一段时间了!开源本地-LLM-on-Consumer-硬件生态系统的4个不错附加功能:
merve
Google dropped Gemma-4 12B, it's a beast 🔥 > unified: audio + image go straight into model, no encoder > multimodal + tool calling > dense 12B with 256K context, comes with assistants for MTP (faster!⚡️) > day-0 in transformers, llama.cpp & MLX > A2.0 🤗 https://twitter.com/mervenoyann/status/2062214149476683900/photo/1
中文: 谷歌放弃了Gemma-4 12B,这真是个大不前的选择 统一:音频+图像直接进入模型,无编码器 加:多式联运+工具调用 高密度12B,带256K上下文,配备适用于MTP的助手(更快!EE1) > 日间:变压器、lamama.cpp & MLX 网址:A2.0 🤗
merve
RT @victormustar: Reminder: every Hugging Face Space is an API your agents can call :) I asked mine to build a website about the flowers o…
中文: RT @victormustar:提醒:每个拥抱的人脸空间都是你的代理可以调用的API :) 我让我建立了一个关于花朵的网站......
merve
RT @hcompany_ai: Computer-use agents are moving from the cloud to your local machine. Fast. When we launched Holo3 two months ago, the pro…
中文: RT @hcompany_ai:计算机使用代理正在从云端迁移到本地机器。快。 两个月前我们推出Holo3时,这位专业人士......
merve
your AI agent thinks you're lame and I'll prove it upload your agent traces (CC/Codex/Pi/Claw) to @huggingface and let this app roast you here's what boss' agent thinks of him @julien_c share yours below https://twitter.com/mervenoyann/status/2061756306281611607/photo/1
中文: 你的人工智能代理认为你很蹩脚,我来证明 将您的代理线索(CC/Codex/Pi/Claw)上传至@huggingface,让这款应用为您烘焙 老板的经纪人对他的看法是:@julien_c 在下方分享您的内容:
merve
everyone's building simple agents meanwhile IBM is building robust enterprise agents in production, and it's open-source they just dropped a blog on HF breaking down how to go beyond LLMs & agents: structured reasoning, tool use, and more to scale AI across enterprise https://twitter.com/mervenoyann/status/2061450307523985469/photo/1
中文: 每个人的建筑都是简单的代理 与此同时,IBM正在生产中构建强大的企业代理,并且开源 他们刚刚删除了一篇关于HF的博客,将其分解如何超越LLMs&代理:结构化推理、工具使用等,以在企业范围内扩展人工智能
merve
RT @ctnzr: Nemotron 3 Ultra: Frontier smart. 5X faster. 30% cheaper. 💚💚💚 https://twitter.com/ctnzr/status/2061308138838729121/photo/1
merve
NVIDIA just dropped Cosmos 3 at GTC 🔥 closest thing to AGI as world model > it can reason, understand AND generate videos, images, actions, text > sota, comes in 16B, 65B, with datasets > diffusers support 🧨 > open license 🤗
中文: 英伟达刚刚在GTC上退出了Cosmos 3 🔥 最接近AGI作为世界模型 它能够推理、理解并生成视频、图像、操作和文字 数据集(Sota)提供16B、65B、数据集 支持扩散器 🧨 开放许可证 🤗
merve
this is super cool
中文: 这太酷
merve
RT @NVIDIAAI: This #CVPR2026 paper from our research team is trending #1 on @HuggingFace 🤗 Meet LocateAnything: a vision-language detectio…
中文: RT @NVIDIAAI:我们研究团队的这篇#CVPR2026论文在@HuggingFace上排名第一🤗 认识 查找一切:视觉语言检测...
merve
RT @fangfu0830: 🔥 We release Gamma-World from @nvidia — a generative multi-agent world model that finally goes beyond 2 players. ⚡ 24 FPS r…
中文: RT @fangfu0830:🔥 我们从 @nvidia 发布 Gamma-World——一个生成式多代理世界模型,最终超越了两名玩家。 ⚡ 24 FPS r...
merve
RT @julien_c: We are starting to be quite bullish about getting in the data infrastructure business. I just cloned 68 TB (while I only hav…
中文: RT @julien_c:我们开始对进入数据基础设施业务持相当乐观的态度。 我刚克隆了68TB(而我只有......
merve
RT @liquidai: Today, we're releasing LFM2.5-8B-A1B, a device-optimized model designed to power real-life applications on phones, laptops, P…
中文: RT @ liquidai:今天,我们将推出LFM2.5-8B-A1B,这是一款专为手机、笔记本电脑和P...上实际应用供电而设计的设备优化型型。
merve
RT @skalskip92: RF-DETR is now available in @huggingface transformers state of the art in both detection and segmentation, outperforming Y…
中文: RT @skalskip92:RF-DETR 现已在 @huggingface 变压器中提供 在检测和细分领域都处于技术状态,表现优于
merve
RF-DETR just landed to @huggingface transformers 🥵🔥 sota real-time detection & segmentation models by @roboflow 💜 > play with our real-time demo > fine-tune the models on your use case with our tutorials (takes a toaster's VRAM) > or just hand them to your agents 😄 https://twitter.com/mervenoyann/status/2059647988373373253/video/1
中文: RF-DETR 刚刚登陆至 @huggingface 变压器 🥵🔥 实时检测与实时检测;由 @roboflow 进行细分模型 💜 玩我们的实时演示 使用我们的教程(请使用烤面包机的VRAM),为您的使用用箱中的型号进行精细调整 或将它们交给您的代理公司 😄
merve
RT @victormustar: cool new release: a tiny open video VLM that understands what happens in videos and when 👀 Marlin-2B (Apache 2.0!) can c…
中文: RT @victormustar:酷炫的新发布:一段微小的开放视频,可了解视频中的情况以及👀 马林-2B(Apache 2.0!)可以......
merve
RT @victormustar: Made a free Pixal3D demo (Tencent's new image-to-3D model) because I like it a lot 🔥 What's interesting: pixel-aligned g…
中文: RT @victormustar:免费制作一个Pixal3D演示版(腾讯全新图像到3D模式),因为我非常喜欢它🔥 有趣的是:像素对齐的 g...
merve
Cohere dropped Command A+ 🔥 > 25B/219B MoE vision language model > supports 48 languages with efficient tokenizer > tool-calling/agentic + 128k context window > transformers day-0 support 🤗 free license 💗 https://twitter.com/mervenoyann/status/2057128432190787643/photo/1
中文: 科赫雷放弃了命令A+ 🔥 25B/219B 视觉语言模型 支持48种语言,支持高效的令牌化 工具调用/代理 + 1.28k 上下文窗口 支持:>变压器 日间支持 🤗 免费许可证 💗
merve
RT @victormustar: it's open source time, with a real leap for world models 🎉 NVIDIA's SANA-WM: a camera-conditioned world model that fits…
中文: RT @victormustar:现在是开源时间,世界模特们确实实现了飞跃🎉 NVIDIA 的 SANA-WM:一款符合相机条件的世界型号,适合......
merve
RT @victormustar: llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀 Qwen3.6-27B dense generation (on A10…
中文: RT @victormustar:支持MTP的 llama.cpp 使本地车型速度足够快,能够作为日常驾驶者使用 🚀 Qwen3.6-27B 密度生成(在 A10 上)
merve
finally faster Qwen3.6 models with MTP support ⚡️ brb updating my Pi & Hermes setup 🤝
中文: 支持MTP的Qwen3.6型号速度更快⚡️ 更新我的Pi & Hermes 设置 🤝
merve
RT @ben_burtenshaw: if you're doing RL on agent use cases, check out this video. agents might seem like the most obvious application of p…
中文: RT @ben_burtenshaw:如果你在代理使用情况下使用RL,请观看此视频。 代理可能看起来像是p...最明显的应用
merve
RT @MaziyarPanahi: Arabic. Japanese. Turkish. Redacting clinical discharge summaries in real-time. 30+ new open-source PII models shipped…
中文: RT @MaziyarPanahi:阿拉伯语。日语。土耳其语。实时编辑临床出院摘要。 30 个以上新的开源 PII 型号已发布...
merve
reason why self-improving personal agents (Claw, Hermes) are hyped is due to how people actually like the idea but they just never try it, so all they do is to yap because it gets engagements quite sad. on the contrast I really find them useful
中文: 自我改进的个人代理人(克劳,爱马仕)之所以被大肆宣传,是因为人们其实对这个想法很喜欢,但他们从不尝试,所以他们所做的只是努力,因为会进行互动 非常难过。从对比上来说,我确实觉得它们很有用
merve
TIL Hermes Agent has optional skills for.. *checks notes* tokenizers and accelerate? 😄 joke aside it also has a peft and trl which can really be useful https://twitter.com/mervenoyann/status/2056102443830677874/photo/1
中文: TIL Hermes Agent 具备可选技能,可选择 . . * 勾选笔记* 标记器并加速?😄 开玩笑说,它还具有一种 peft 和 trl 功能,这确实很有用
merve
I finally got the tattoo @huggingface https://twitter.com/mervenoyann/status/2055302977158607211/photo/1
中文: 我终于得到了纹身 @huggingface
merve
RT @onusoz: People were asking at @clawcon singapore how to setup eg. gemma with OpenClaw, and I realize for some time that there is no eas…
中文: RT @onusoz:人们在@clawcon singapore 询问如何使用 OpenClaw 来设置 eg. gemma,我已意识到一段时间以下没有这个问题......
merve
this week @huggingface crossed 1M datasets 🚀 every open model you love was built on top of them next objective: more open coding session traces on Hub to push coding models even further 🤝 help push the open frontier by uploading your traces! https://twitter.com/mervenoyann/status/2054891459053039938/photo/1
中文: 本周,@huggingface 跨越了 1M 数据集 🚀 你所热爱的每一个开放模型都建立在它们之上 下一个目标:在Hub上进行更多开放编码会话,以进一步推动编码模型 🤝 上传你的痕迹,帮助突破开阔的前沿!
merve
RT @aiDotEngineer: Your Agent Can Now Train Models The argument from @mervenoyann: open source models have caught up. GLM 5.1 is leading t…
中文: RT @aiDotEngineer:您的代理现在可以训练模型 @mervenoyann:开源模型的争论已经追上来。GLM 5.1 处于领先地位......
merve
RT @_lewtun: You can now have an AI researcher running on your laptop 24/7 for free! Running Qwen3-35B-A3B with llama.cpp and a 4-bit qua…
中文: RT @_lewtun:现在你可以免费让一台人工智能研究人员在笔记本电脑上免费运行! 使用 lama.cpp 和 4 位 qua 运行 Qwen3-35B-A3B ...
merve
RT @JulienBlanchon: I'm releasing OpenCS2 a 11TB dataset of around 5000 hours of counter strike gameplay recording. - HD resolution - 1280…
中文: RT @JulienBlanchon:我将发布 OpenCS2 一个包含约 5000 小时反击游戏记录的 11TB 数据集。 - 高清分辨率 - 1280...
merve
look mom I'm on my favorite YT channel this evening @aiDotEngineer 💗 I talked about how @huggingface meets your agent: you can ask your agent to do all ML workflows from training models to label data now https://twitter.com/mervenoyann/status/2054496147914252394/photo/1
中文: 看妈妈,今晚我在我最喜欢的YT频道上 @aiDotEngineer 💗 我谈到了@huggingface 如何与你的经纪人见面:你可以要求你的代理完成从训练模型到标注数据的所有机器学习工作流程,现在
merve
RT @sergeynazarovx: We used to go to a special website, ask strangers for help with programming, and get humiliated in return https://t.co/…
中文: RT @sergeynazarovx:我们过去常常访问一个特别网站,向陌生人求助编程,并会遭受羞辱,
merve
RT @huggingface: We've just hit 1M open datasets on the Hugging Face Hub 🎉 Open models need open data. Today we hit that milestone, togeth…
中文: RT @huggingface:我们刚刚在“拥抱”面部中心(EE0)上点击了100万个开放数据集 开放模型需要开放数据。今天我们达到了那个里程碑,图集......
merve
Meta silently dropped Sapiens2 last week 🔥 a family of high-res models trained on 1B human images > for pose estimation, body-part segmentation, surface normals, pointmaps (sota) > 6 sizes: 0.1B → 5B params (all ViT patch 16) > high-res: 1024×768 and 4K https://twitter.com/mervenoyann/status/2054187884417102319/video/1
中文: Meta上周悄然放弃了Sapiens2 🔥 一个基于1B张人类图像训练的高分辨率模型家族 >用于姿势估计、身体与身体分割、表面正常、点图(sota) 6 种尺寸:0.1B → 5B 参数(所有 ViT 补丁 16) 高空:1024×768 和 4K
merve
this project uses entire @huggingface infra to build agentic medical intelligence 🔥 signup for preview ⤵️
中文: 该项目使用整个 @huggingface infra 来构建特化医疗智能 🔥 注册预览 ⤵️
merve
MiniCPM-V-4.6 is in 🔥 > 1B (SigLIP2-400M + Qwen3.5-0.8B) > beats Qwen3.5-0.8B on AA with 19x fewer tokens > beats other larger small VLMs > deploy to iOS + Android with GGUF and other quants https://twitter.com/mervenoyann/status/2053912404774248895/photo/1
中文: miniCPM-V-4.6 已登录 🔥 1B(SigLIP2-400M + Qwen3.5-0.8B) 在AA上以19倍的代币数量击败Qwen3.5-0.8B 比其他较大的小型VLM更胜一负 与GGUF及其他量子点一起部署到iOS+安卓系统
merve
RT @victormustar: This feature is quite cool to run Hermes Agent locally because: - You can filter on the +60k models compatible with Herme…
中文: RT @victormustar:此功能在本地运行 Hermes Agent 非常酷,因为: - 您可以筛选与 Herme 兼容的 +60k 型号...
merve
🆕 Hugging Face 🤝 Hermes Agent 🔥 > we added Hermes Agent to local apps: run it locally with any compatible GGUF/MLX model > shipped native traces support for Hermes Agent: visualize your Hermes traces directly on the Hub Very soon most agents will run locally and we want to… https://twitter.com/mervenoyann/status/2053857347429151163/photo/1
中文: 🆕 拥抱面容 🤝 爱马仕探员 🔥 我们已将 Hermes Agent 添加到本地应用程序中:使用任何兼容的 GGUF/MLX 型号本地运行 已发货的Hermes Agent原生痕迹支持:直接在Hub上直观显示您的Hermes痕迹 很快大多数代理人员将在当地运营,我们希望......
merve
RT @onusoz: I have a new job! Excited to announce that I will be working with Hugging Face to make local models work great in OpenClaw and…
中文: RT @onusoz:我有一份新工作! 很高兴宣布,我将与Hugging Face合作,让本地模特在OpenClaw和...上大有作为
merve
RT @victormustar: Exciting: local ML is (finally) going mainstream 🔥 - new GGUF uploads on HF nearly doubled in 2 months - smaller models…
中文: RT @victormustar:令人兴奋:本地机器学习(终于)成为主流了 🔥 - HF 上新增 GGUF 上传量在两个月内几乎翻倍 - 更小的模型......
merve
RT @nathanhabib1011: 🦞 Claw-Eval 🦞 🥇 @XiaomiMiMo's MiMo-V2.5-Pro at 1T 🥈 @Zai_org GLM5.1 at 754B 🥉 @XiaomiMiMo MiMo-V2.5 at 310B Congrats…
中文: RT @nathanhabib1011:🦞 爪子-埃瓦尔 🦞 🥇 @XiaomiMiMo 的 MiMo-V2.5-Pro 1T 🥈 @Zai_org GLM5.1 at 754B 🥉 @XiaomiMiMo MiMo-V2.5,310B 恭喜......
merve
Istanbul Open-source AI meet-up @huggingface was 🔥 we had many stories from building in-house models cutting costs to agentic apps 🙌🏼 many thanks @trendyoltech @nsrt_py @anil_ozturkk for hosting us 🤗 https://twitter.com/mervenoyann/status/2053447237364002877/photo/1
中文: 伊斯坦布尔开源人工智能会议 @huggingface 是 🔥 我们从打造内部模型,向代理应用程序降低成本,经历了许多故事🙌🏼 非常感谢@trendyoltech @nsrt_py @anil_ozturkk 为我们提供的接待 🤗
merve
RT @ben_burtenshaw: PSA: If you put your blood, sweat, and tears into a custom model for your use case on OpenAI, make sure you get the wei…
中文: RT @ben_burtenshaw:PSA:如果你在 OpenAI 上为使用案例定制了血、汗和泪,请务必使用 wei...
merve
read our response here https://x.com/mervenoyann/status/2052752660537676198?s=46 I wish journalists can do better in the future wrt avoiding baiting
中文: 请在此处阅读我们的回复: 我希望记者们未来能做得更好,以免被诱骗
merve
RT @AnthropicAI: New Anthropic research: Natural Language Autoencoders. Models like Claude talk in words but think in numbers. The numbers…
中文: RT @AnthropicAI:新人类研究:自然语言自动编码器。 像克劳德这样的模特会用言语说话,但会用数字来思考。数字......
merve
people don't even read articles these days and jump in to conclusions 👀
中文: 如今人们甚至不会阅读文章,而是贸然得出结论👀
merve
RT @adithya_s_k: Excited to release the Ultimate guide to RL environments! Definitions of RL environments differ wildly in the LLM era, so…
中文: RT @adithya_s_k:很高兴发布RL环境终极指南! 在LLM时代,RL环境的定义差异很大,因此......
merve
RT @Tu7uruu: Big announcement for speech AI Benchmarks get gamed. So we added a repellent. The Open ASR Leaderboard now includes private…
中文: RT @Tu7ruu:语音AI发布重要公告 基准会被游戏。于是我们加了个驱人者。 开放的ASR排行榜现在包含私有...
merve
RT @pcuenq: transformers v5.8.0 is here, and it's a biggie 🚀 Three massive model additions: 🐳 DeepSeek-V4: next-gen efficient MoE 🪨 Granit…
中文: RT @pcuenq:变压器 v5.8.0 到来,这很大🚀 三个庞大的模型新增功能: 🐳 DeepSeek-V4:下一代高效 MOE 🪨 格拉尼特......
merve
Gemma 4 just got a massive speed-up with MTP drafters ⚡️ > speculative decoding (up to 3x tokens/sec improvement compared to normal Gemma-4 🔥) > identical reasoning, just faster > day-0 support in transformers, MLX, vLLM > A2.0 licensed 🤗 https://twitter.com/mervenoyann/status/2051702372339003841/photo/1
中文: 杰玛4队刚刚凭借MTP选秀者迅速加速上场⚡ >推测性解码(与正常的Gemma-4相比,可提高3倍的代币/秒量🔥) 相同的推理,速度更快 支持 > 用于变压器、MLX、vLLM 的 Day-0 获得A2.0授权🤗
merve
RT @osanseviero: Excited to introduce Gemma 4 Multi-Token Prediction Drafters⚡️Accelerated inference right in your pockets - Up to a 3x sp…
中文: RT @osanseviero:很高兴能在口袋里直接引入Gemma 4多代币预测绘图员⚡️ - 最高可达3倍 sp...
merve
RT @ben_burtenshaw: Introducing the context course: a free course on doing ML with agent context. You will learn how to train models, opti…
中文: RT @ben_burtenshaw:介绍上下文课程:一门关于使用代理语境进行机器学习的免费课程。 您将学习如何训练模型,选择......
merve
I forked openclaw to build a small local rescue agent that debugs whenever the agent is down in just half an hour my fork was behind 15 commits insane pace
中文: 我把openclaw分叉来构建一个小型本地救援代理,每当代理人员倒下时都会进行调试 在短短半小时内,我的分叉就落后于15个提交 疯狂的步伐
merve
RT @victormustar: honestly Granite 4.1 8b might be the best model to run at this size https://huggingface.co/blog/ibm-granite/granite-4-1
中文: RT @victormustar:老实说,Granite 4.1 8b 可能是这个尺寸的最佳型号
merve
gpt-5.5-extra-high-as-a-kite
中文: gpt 5.5-ext-high-a-kite
merve
RT @0xSero: Weekly best models for your hardware: ~~ 8 to 16gb ~~ Granite models are amazing: [NEW] - https://huggingface.co/ibm-granite/granite-4.1-8b Gemma-E4B…
中文: RT @0xSero:每周最适合您硬件的型号: ~8到16克~~ 花岗岩模型令人惊叹:[新] - 杰玛-E4B...
merve
İstanbul'da buluşalım, konuşmacı ya da katılımcı olmak isterseniz başvurular aşağıda 🙌🏼
merve
RT @nsrt_py: Herkese selamlar! Hugging Face ile ilk etkinliğimiz olan "Open-source AI Meet-up with Hugging Face" 9 Mayıs saat 13:00'da Tren…
merve
RT @Tu7uruu: IBM just dropped TWO new open ASR models with very strong performance! > ~5.3 WER on Open ASR leaderboard (strong accuracy fo…
中文: RT @Tu7ruu:IBM刚刚推出了两款新的开放式ASR机型,性能非常出色! 在Open ASR排行榜上表现出色(准确性极强)
merve
RT @alvarobartt: IBM Granite just released two multilingual embedding models with 97M and 311M parameters 🤏🏻 ModernBERT-based, 200+ langua…
中文: RT @alvarobartt:IBM Granite 刚刚发布了两个多语言嵌入模型,参数为 97M 和 311M [EE] 总部位于现代伯特,200岁以上...
merve
nvidia cooked 😍 nemotron-3-nano-omni all modalities LLM 🔥 • 30B-A3B MoE hybrid Mamba-Transformer • 9x throughput vs other open omni models • native audio (1200s), long video, 100+ page docs in single turn • agentic CUA built in • BF16 / FP8 / NVFP4 https://twitter.com/mervenoyann/status/2049216818150015255/photo/1
中文: nvidia 已烹饪😍 nemotron-3-nano-omni 所有模式 LLM 🔥 • 30B-A3B MoE 混合型 Mamba-Transformer • 与其他开放式全系统模型相比,吞吐量为9倍 • 原生音频(1200年代),长视频,100多页单折 • 内置的代理 CUA • BF16 / FP8 / NVFP4
merve
nvidia cooked 😍 nemotron-3-nano-omni all modalities LLM 🔥 • 30B-A3B MoE hybrid Mamba-Transformer • 9x throughput vs other open omni models • native audio (1200s), long video, 100+ page docs in single turn • agentic CUA built in • BF16 / FP8 / NVFP4
中文: nvidia 已烹煮 😍 - nemotron-3-nano-omni 所有模式 LLM 🔥 • 30B-A3B MoE 混合型 Mamba-Transformer • 与其他开放式全系统模型相比,吞吐量为9倍 • 原生音频(1200年代),长视频,100多页单折 • 内置的代理 CUA • BF16 / FP8 / NVFP4
merve
any-to-any model based on Nemotron 3 Nano 🔥
中文: 基于 Nemotron 3 Nano 🔥 的任何型号
merve
we compile all the best benchmarks and model results so agents can find the best model for fine-tuning and inference for your hardware budget turns out it was a "skill" issue
中文: 我们编制所有最佳基准和模型结果 以便代理为您的硬件预算找到最佳的微调和推理模型 事实证明这是一个“技能”问题
merve
RT @ben_burtenshaw: Humanity's Last Hackathon is NOW OPEN for registration. This is not a normal hackathon. You will be judged on the cont…
中文: RT @ben_burtenshaw:Humanity's Last Hackathon 现已开放注册。 这并非一场普通的黑客马拉松。你将在比赛中受到评判......
merve
RT @LysandreJik: I've been trying to make transformers more agent-friendly: agentic CLI, a skill, doc rewrites, canonical examples. It fe…
中文: RT @LysandreJik:我一直试图让变压器更环保:一种技术技能,文档重写,以规范为例。 这......
merve
beauty of open-sourcing powerful models & datasets 😍
中文: 开源强大模型的美观;数据集 😍
merve
learn how to use, fine-tune, optimize and deploy bleeding edge audio models 🙌🏼
中文: 学习如何使用、微调、优化和部署出血边缘音频模型 🙌🏼
merve
RT @TencentHunyuan: 👋Hi /haɪ/, we're the Tencent Hy /haɪ/ team🐧 Today, we open source Hy3 preview (295B A21B), a leading reasoning and age…
中文: RT @TencentHunyeuan:👋 Hi /haɪ/,我们是腾讯 Hy /haɪ/ 团队🐧 今天,我们开源 Hy3 预览版(295B A21B),这是一个领先的推理和年龄......
merve
stop spending money for your openclaw agent's memory search 💸 use local models on @huggingface with llama.cpp instead I use quantized Embedding Gemma @googlegemma, it can run on anything https://twitter.com/mervenoyann/status/2048724071936880697/photo/1
中文: 停止为你的开放小卡代理的内存搜索花钱 💸 使用 @huggingface 与 lalama.cpp 一起使用本地模型 我使用量子化嵌入Gemma @googlegemma,它可以在任何平台上运行
merve
I have just crossed 10K friends on @huggingface 🤗💗 I try to make myself more and more useful for community and am always happy to be of service 🫡 https://twitter.com/mervenoyann/status/2048680605206856091/photo/1
中文: 我刚刚在@huggingface上结识了10K个朋友🤗💗 我努力让自己对社区越来越有用,并且总是乐于服务🫡
merve
it's only 48 hours: - Qwen3.6-27B - Tencent-Hy3-preview - DeepSeekv4 what's next? 👀
中文: 只有48小时: - Qwen3.6-27B - 腾讯-海伊3-预览 - 深度深度 接下来会发生什么?👀
merve
RT @julien_c: This is where we are right now. And i’m not gonna lie it feels pretty magical 🧚‍♀️ Qwen3.6 27B running inside of Pi coding a…
中文: RT @julien_c:我们目前所处的位置。我不会撒谎,感觉相当神奇🧚 ♀️ 运行在Pi编程内的Qwen3.6 27B...
merve
read this deep dive from Ben on DSv4 release today ⬇️
中文: 今天在DSv4上阅读本的深度阅读⬇️
merve
if you were in a cave, DeepSeek v4 is out, and it's groundbreaking, here's why: it's the first open model to have solved long context: your agentic setup (OpenClaw, coding agents) need many agents, hybrid attention by DSv4 compresses KV-cache, allowing more overhead memory on…
中文: 如果你在洞穴里,DeepSeek v4 已经出局,而且具有开创性,原因就是: 这是首个解决长上下文的开放式模型:您的代理设置(OpenClaw、编码代理)需要多种代理,而 DSv4 的混合注意力则压缩了 KV 缓存,从而实现了更多的管理内存 上......
merve
RT @ben_burtenshaw: deepseek-v4 is out and solves context rot at 1M tokens by taking on attention for the kv cache. It's big at 1T Params…
中文: RT @ben_burtenshaw: deepseek-v4 已经退出,通过关注 kv 缓存,解决 1M 代币的上下文问题。 在1T Params上很大......
merve
DSv4 genuinely shines in 1M context window and peak efficiency to run many agents/users 😍 shortly coming to transformers and we're making sure you get all the peak efficiency 🔥 @art_zucker https://twitter.com/mervenoyann/status/2047568003093471283/photo/1
中文: DSv4 在 1M 环境窗口中真正大放异彩,并实现高效运行,可运行多种代理/用户 😍 即将进入变压器,我们确保获得所有最高能效 [EE] @art_zucker
merve
DeepSeek v4 is out with 1M context window 🥵🔥 > Pro (13B/284B) & Flash (49B/1.6T) > hybrid attention, needs 27% flops & 10% kv cache compared to V3.2 > reasoning effort: non-think, think high, think max > MIT licensed 💗 https://twitter.com/mervenoyann/status/2047547789601595772/photo/1
中文: DeepSeek v4 已关闭,具有 1M 上下文窗口 🥵🔥 >专业(13B/284B)和快速;闪光灯(49B/1.6T) 混合注意力,需要27%的空转;与V3.2相比,缓存为10% 推理能力:不思考,高思考,最大思考 麻省理工学院持牌💗
merve
RT @deepseek_ai: 🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSe…
中文: RT @deepseek_ai:🚀 DeepSeek-V4 预览版正式上线,即开源!欢迎来到具有成本效益的1M环境长度时代。 🔹 深度......
merve
RT @gabriberton: I miss ConvNets Much simpler and more intuitive than transformers Early layers would always converge to the same feature…
中文: RT @gabrierton:我想念ConvNets 比变压器简单得多,也更直观 早期层次总能汇聚到相同的特征上......
merve
OpenAI just released Privacy Filter > multilingual PII redaction with 128k context window 🤯 only 1B params > fine-tunable > redact variety of things: including emails, address, names, secrets (best for platform/agent logs) > transformers & ONNX weights 🤗 https://twitter.com/mervenoyann/status/2046980302002602473/photo/1
中文: OpenAI 刚刚发布了隐私筛选器 附加语言PII编辑,包含128k个上下文窗口🤯,仅支持1B参数 可调和 删减多种内容:包括电子邮件、地址、姓名、秘密(最适合平台/代理日志) 设备:变压器和放大器;ONNX 重量 🤗
merve
RT @Alibaba_Qwen: 🚀 Meet Qwen3.6-27B, our latest dense, open-source model, packing flagship-level coding power! Yes, 27B, and Qwen3.6-27B…
中文: RT @Alibaba_Qwen:🚀 与Qwen3.6-27B会面,这是我们最新的密集开源机型,采用旗舰级编程功能! 是的,27B,Qwen3.6-27B...
merve
RT @stevibe: Which LLMs actually love to think? Tested 7 models on 5 math problems, measured reasoning length. The think winners: both Qw…
中文: RT @stevibe:哪些LLMs真的喜欢思考? 测试了7个关于5个数学问题的模型,测量了推理长度。 获胜的思想家:两者兼有......
merve
RT @googlegemma: What does it take to run 3, 5, or even 10 concurrent instances of Gemma 4 locally? We've open-sourced a demo letting you…
中文: RT @googlegemma:在本地运行3、5甚至10个并发的Gemma 4实例需要什么? 我们开源了一个演示,让你......
merve
RT @akseljoonas: Introducing ml-intern, the agent that just automated the post-training team @huggingface It's an open-source implementati…
中文: RT @akseljoonas:介绍 mls-intern,这位刚刚将训练后团队自动化的代理人 @huggingface 这是一个开源的实现......
merve
bad thing about having your OC agent on local model is having to maintain your mlx/llama-server but also even by then it sometimes goes down and it's not the llama server does anyone have any tips https://twitter.com/mervenoyann/status/2046620360351613187/photo/1
中文: 让你的OC代理使用本地模型,就是必须维护你的mlx/llama-服务器 但即便如此,它有时也会下降,而它并非“骆驼”服务器 有谁有建议吗
merve
RT @ben_burtenshaw: 2 days until we will hold this deep dive workshop on everything RL for agents with some of the best names in the game.…
中文: RT @ben_burtenshaw:距离我们举办本次深潜研讨会前,将为拥有游戏中一些最知名球员的经纪人提供所有RL课程。......
merve
no shade but I hope one day god gives me the confidence of an average random AI strategy/innovation person on LI
中文: 没有遮阳,但希望有一天,上帝能让我在LI上给出一个普通的随机人工智能策略/创新人的自信
merve
kimi k2.6 is out: open source coding sota 🔥 > 32B/1T MoE with 256k context > long horizon coding + better website design > most interesting: agent swarms (300 subagents can do 4k steps) & Claw groups (multiple self improving agents) https://twitter.com/mervenoyann/status/2046254380102373739/photo/1
中文: kimi k2.6 已发布:开源编码 sota 🔥 加法;32B/1T 闺號 带 256k 上下文 > 长地平线编程 + 更完善的网站设计 最有趣的是:代理群(300 个亚中介可以完成 4K 步骤)和 草皮组(多个自我提升代理)
merve
RT @Kimi_Moonshot: Meet Kimi K2.6: Advancing Open-Source Coding 🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench…
中文: RT @Kimi_Moonshot:认识 Kimi K2.6:推进开源编码 🔹 开源SOTA与HLE的工具(54.0)、SWE-Bench Pro(58.6)、SWE-bench...
merve
this model is an underrated gem and the results are very strong 🙌🏼
中文: 这个模型是一颗被低估的宝石,结果非常强劲🙌🏼
merve
RT @LysandreJik: We're opening a Hugging Face office in Tokyo! Our goal: help open-source AI develop in Japan and grow the local communit…
merve
RT @NielsRogge: We've added support for SAM-3 Lite-Text in the Transformers library! 🔥 > replaces the heavy text encoder in SAM-3 with a c…
中文: RT @NielsRogge:我们已在 Transformers 库中添加了对 SAM-3 Lite-Text 的支持!🔥 将 SAM-3 中的重文本编码器替换为 c...
merve
RT @stevibe: Which local models can actually handle tool calling? I built a framework to find out. 15 scenarios. 12 tools. Mocked respons…
中文: RT @stevibe:哪些本地模型实际上可以处理工具调用? 我建立了一个框架来查明事实。 15个场景。12个工具。模拟反应......
merve
RT @onusoz: Who is running local models on GPUs on OpenClaw? I have started benchmarking different models this week. I am working on impro…
中文: RT @onusoz:谁在 OpenClaw 上使用 GPU 运行本地模型? 我本周开始对不同模型进行基准测试。我正在做 impro 的工作......
merve
tried my openclaw intern with various open models recently, currently using Qwen3.6 with Q6_K vibe: GLM-5 and Minimax sounded a bit more witty and friendly whereas Qwen seems to forget the character a bit Q6_K without reasoning is still surprisingly accurate though
中文: 最近,我试用了各种开放式型号的 openclaw 实习生,目前使用 Qwen3.6 和 Q6_K 氛围:GLM-5和Minimax听起来更幽默、更友善,而Qwen似乎有点忘记了这个角色 毫无理由仍然出人意料地准确
merve
if you're using Replit, Antigravity or other vibe-building tools 👋🏻 simply adding @huggingface Skills to your setup gives your agent access to ~3M open models, 500k+ local AI apps and ~1M datasets agent will pick and build with the best model for your use case and hardware https://twitter.com/mervenoyann/status/2045816220679537088/photo/1
中文: 如果你正在使用 Replit、Antigravity 或其他氛围构建工具 👋🏻 只需在设置中添加 @huggingface Skills,即可让您的代理能够访问 ~3M 的开放模型、500k+ 本地人工智能应用程序以及 ~1M 数据集 代理将采用最适合您使用的用例和硬件的型号进行选择和构建
merve
RT @prithivMLmods: HY-World-2.0 Demo is now live on @huggingface Spaces for 3D world reconstruction and simulation with Gradio and Server m…
中文: RT @prithivMLmods:HY-World-2.0 演示现已在 @huggingface Spaces 上实时上线,通过 Gradio 和 Server m.
merve
no shade but it pains me to know that AI isn't replacing hard labor where kids in developing countries are dying in press machines or adults die in mines but rather all the investment is for creative jobs and the companies doing this claim to change the world yeah go off
中文: 没有遮蔽,但让我痛心地知道,人工智能并不能取代那些发展中国家儿童在印刷机上或成年人在矿井中死亡的辛苦劳动,而所有投资都用于创造性工作 而那些声称要改变世界的公司,是的
merve
use GLM-5.1 or MiniMax or Gemma-4 you can't be banned from your servers
中文: 使用 GLM-5.1 或 MiniMax 或 Gemma-4 你的服务器不能被禁止
merve
we met @swyx 🐐 @andimarafioti https://twitter.com/mervenoyann/status/2042004372704424284/photo/1
中文: 我们遇到了 @swyx 🐐 @andimarafioti
merve
fun blog on fine-tuning gemma 4 and my failures and vibe tests incoming 🔜
中文: 关于微调 gemma 4 的有趣博客,以及我的失败与氛围测试
merve
I think it's a horrible idea to ask for licenses to train models, this will reduce number of open-source models and will give big corporations an unfair competitive advantage to train closed-source models which will not be transparent at all and companies will have to sacrifice a…
中文: 我认为要求获得培训模型的许可证是一个糟糕的想法,这将减少开源模型的数量,并为大公司提供不公平的竞争优势,以培训完全不透明的闭源模型,而企业将不得不牺牲这种模式。