All HF Hub posts

NJX-njx
posted an update 2 days ago
I recently open-sourced opensoul, an AI emotional companion product based on openclaw.

On this platform, you can create a "soulmate" that matches your personality and configure it with the skills and tools you want it to have, as well as the platforms it can integrate with (such as Telegram and Discord).
You can even create group chats, invite multiple agents and your friends to chat about recent events, discuss projects together, and so on.

On the one hand, I hope it can be a better companion in daily life through its memory mechanism, its self-feedback and iteration mechanism, and its modeling of users' emotions. On the other hand, I hope its skills, tools, and ability to handle complex task scenarios can help you with your work.

Although the product has taken shape, many areas still need adjustment and optimization. I hope to draw on the strength of the community to get AI emotional companionship right.

This is the project introduction URL: https://opensoul-web.vercel.app
This is the GitHub project URL: https://github.com/NJX-njx/opensoul
@AdinaY @lilianweng @burtenshaw @clem
let's just do it

danielhanchen
posted an update 1 day ago
OzTianlu
posted an update 2 days ago
🔥 UPGRADE in Kai: 30B Scaling! 🔥
NoesisLab/Kai-30B-Instruct
NoesisLab/Kai-3B-Instruct
We are incredibly excited to announce that the Kai-30B-Instruct model and its official Space are now LIVE! 🚀
If you've been following the journey from Kai-0.35B to Kai-3B, you know we're rethinking how models reason. Tired of verbose, slow Chain-of-Thought (CoT) outputs that flood your screen with self-talk? So are we.
Kai-30B-Instruct scales up our Adaptive Dual-Search Distillation (ADS) framework. By bridging classical A* heuristic search with continuous gradient descent, we use an information-theoretic log-barrier to physically prune high-entropy reasoning paths during training.
The result? Pure implicit reasoning. The model executes structured logic, arithmetic carries, and branch selections as a reflex in a single forward pass, with no external scaffolding required.
At 3B, we observed a phase transition where the model achieved "logical crystallization". Now, at 30B, we are giving the ADS regularizer the massive representational capacity it needs to tackle higher-order symbolic abstractions and complex reasoning tasks.
🧪 Test Kai yourself in our new Space:
NoesisLab/Kai-3B-Instruct
📦 Model Weights:
NoesisLab/Kai-30B-Instruct
Bring your hardest math, logic, and coding benchmarks. We invite the community to stress-test the limits of the penalty wall! 🧱💥
  • 1 reply
hannayukhymenko
posted an update 2 days ago
Do you translate your benchmarks from English correctly? 🤔
Turns out, for many languages it is much harder than you can imagine!

Introducing Recovered in Translation 🌍 together with @aalexandrov
ritranslation.insait.ai

Translating benchmarks is a painful process that requires a lot of manual inspection and adjustment. You start by setting up the whole pipeline and adapting it to every format type, including task specifics. Some massive translated benchmarks already exist, but they still contain simple (and sometimes silly) bugs that can hurt evaluations :( We present a novel automated translation framework to help with that!

Eastern and Southern European languages have richer linguistic structures than English, so for benchmarks that rely heavily on grammatical coherence, machine translation risks harming evaluations. We found cases where the grammatical structure of a translated question can leak the answer or mislead the model. Some benchmarks are also simply outdated and need to be retranslated with newer and better models.

We present a framework with novel test-time scaling methods that let you control time and cost investments while reducing the need for human-in-the-loop verification. While working on the Ukrainian-focused MamayLM models, we had to translate 10+ benchmarks in a short span of time. Finding human evaluators is costly and time-consuming, and the same goes for professional translators. With our pipeline we were able to do it in 3 days 🏎️

We hope our findings will help enable stronger multilingual evaluations and developments. We release all produced benchmarks on Hugging Face together with the source code and arXiv paper 🤗

Paper: Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets (2602.22207)
Code: https://github.com/insait-institute/ritranslation
Benchmarks: https://huggingface.co/collections/INSAIT-Institute/multilingual-benchmarks
  • 1 reply
etemiz
posted an update 3 days ago
AHA 2026 scores of Qwen3.5

27B
Huihui abliteration 65%
Heretic abliteration 55%
Normal 50%

35B
Huihui abliteration 64%
@jiaojjjjje abliteration 57%
@LeadFootThrottleCock abliteration 56%
Normal 49%
  • 6 replies
mayafree
posted an update about 6 hours ago
I built a Space that lets you switch between all three Qwen3.5 official collection models in a single interface.

MAYA-AI/QWEN-3_5-CHAT

The architecture is the key part. Instead of using Gradio as the UI, I use it purely as an API engine. FastAPI serves a fully custom HTML/JS frontend that calls /gradio_api/call/chat via SSE streaming. No DOM conflicts, no layout constraints.
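
To illustrate the streaming side of this design, here is a minimal sketch of parsing a server-sent event stream like the one a client would read back from a Gradio call endpoint. It assumes the usual two-step pattern (a POST to the call route returns an event ID, then a GET on that ID streams the events); the event names `generating` and `complete`, the sample payload, and the helper names `parse_sse` and `final_output` are illustrative assumptions, not taken from the Space's code. The post's frontend is JavaScript; Python is used here only to show the parsing logic.

```python
import json


def parse_sse(stream_text):
    """Split raw SSE text into (event_name, data) pairs.

    Simplified: ignores id/retry fields and strips all whitespace
    around values, which is looser than the SSE specification.
    """
    events = []
    event_name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line terminates one event.
            if data_lines:
                events.append((event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    if data_lines:
        # Flush a trailing event that was not followed by a blank line.
        events.append((event_name, "\n".join(data_lines)))
    return events


def final_output(stream_text):
    """Return the JSON payload of the last 'complete' event, or None."""
    result = None
    for name, data in parse_sse(stream_text):
        if name == "complete":  # assumed event name
            result = json.loads(data)
    return result


# Hypothetical stream: one partial chunk, then the finished reply.
sample = (
    "event: generating\n"
    'data: ["Hel"]\n'
    "\n"
    "event: complete\n"
    'data: ["Hello"]\n'
    "\n"
)
print(final_output(sample))
```

A custom frontend would apply the same logic incrementally, updating the chat bubble on each partial event instead of waiting for the final one.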

Four main features: instant model switching with automatic spec adjustment (max tokens, temperature ceiling, Vision availability all update per model), Thinking Mode via /think prefix with collapsible reasoning chain, Vision image upload via base64 conversion, and HF OAuth implemented directly at the FastAPI level.

For model selection: 122B-A10B with Thinking Mode for math, logic, and agents. 27B for writing, translation, and instruction following. 35B-A3B for fast everyday questions.

A few surprises during development: Gradio 6.x removed several parameters quietly, base64 image strings broke gr.Image(type="pil") so I switched to gr.Textbox with backend PIL conversion, and Thinking Mode parsing needed a full rewrite with indexOf instead of regex.
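
The index-based approach to Thinking Mode parsing could look roughly like the sketch below. It is not the Space's actual code: the `<think>` and `</think>` delimiters and the function name `split_thinking` are assumptions, and Python's `str.find` stands in for JavaScript's `indexOf`. One advantage of plain index search over a regex is that a half-streamed response with an unclosed tag is easy to handle explicitly.

```python
def split_thinking(text, open_tag="<think>", close_tag="</think>"):
    """Split model output into (reasoning, answer) using index search.

    The tag names are assumed; adjust them to whatever the model emits.
    """
    start = text.find(open_tag)
    if start == -1:
        # No reasoning block at all: everything is the answer.
        return "", text.strip()

    body_start = start + len(open_tag)
    end = text.find(close_tag, body_start)
    if end == -1:
        # Tag opened but not yet closed (still streaming): treat the
        # rest as reasoning and keep any text before the tag as answer.
        return text[body_start:].strip(), text[:start].strip()

    reasoning = text[body_start:end].strip()
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return reasoning, answer


reasoning, answer = split_thinking("<think>2 + 2 = 4</think>The answer is 4.")
print(reasoning)
print(answer)
```

The reasoning string can then go into a collapsible element while the answer is rendered normally.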

Thanks to the Qwen team for making this possible. Try it out and let me know what you think.

#Qwen3 #Qwen35 #OpenSourceAI #HuggingFace #LLM #ThinkingAI #vidraft #MultimodalAI
prithivMLmods
posted an update about 16 hours ago
QIE-Object-Remover-Bbox Demo removes objects and artifacts from selected regions using bounding box grounding. Built on Qwen-Image-Edit-2509 with Rapid Diffusers acceleration, it delivers fast 4-step inference via the QIE-2509 adapter. 🤗🔥

🔗 Demo Space: prithivMLmods/QIE-Object-Remover-Bbox
🔗 Qwen-Image-Edit-Rapid-AIO: prithivMLmods/Qwen-Image-Edit-Rapid-AIO-V4
🔗 Adapter (LoRA): prithivMLmods/QIE-2509-Object-Remover-Bbox

🔗 Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.
unmodeled-tyler
posted an update about 19 hours ago
Link to Repo: https://github.com/unmodeled-tyler/thought-tracer

I had a great time at Mistral's Hackathon in SF over the weekend! There were a lot of incredibly talented builders there and it was an honor to be a part of it! 😄

I built Thought Tracer, a TUI-based logit lens application for Ministral 3B/8B with optional AI analysis from Mistral Large on the Mistral API.

Thought Tracer lets you see what the model "believes" at each layer until it arrives at its final next-token prediction. The Entropy tab displays entropy through each layer and also provides both token-level and prompt-level hallucination risk.
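
As a rough illustration of the per-layer entropy idea, the sketch below computes Shannon entropy (in bits) of the next-token distribution implied by each layer's logits. In a logit lens setup those logits would come from projecting each layer's hidden state through the model's unembedding; here they are made-up toy values over a four-token vocabulary. The function names are hypothetical, and how Thought Tracer turns entropy into a risk score is not stated in the post, so that step is not modeled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_bits(logits):
    """Shannon entropy, in bits, of the softmax distribution."""
    probs = softmax(logits)
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Toy per-layer logits (assumed values, not real model output).
layer_logits = [
    [0.0, 0.0, 0.0, 0.0],   # early layer: no preference among tokens
    [2.0, 0.0, 0.0, 0.0],   # middle layer: leaning toward token 0
    [10.0, 0.0, 0.0, 0.0],  # final layer: nearly certain of token 0
]

entropies = [entropy_bits(logits) for logits in layer_logits]
for layer, h in enumerate(entropies):
    print(f"layer {layer}: {h:.3f} bits")
```

Entropy that stays high into the late layers means the model never settled on a confident prediction, which is the kind of signal an entropy view makes visible.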

If you have a Mistral API key, the AI analysis section is actually pretty cool because it's returned as rendered markdown in easily understandable language. It provides commentary on how the model likely arrived at its final prediction and also offers diagnostics for model developers. This commentary makes the tool pretty beginner-friendly for anyone interested in exploring AI research tools for the first time.

Check it out if you're interested!
  • 3 replies
ronantakizawa
posted an update 1 day ago
Introducing the WebUI dataset: a compilation of screenshot-to-code pairs from modern websites, detailing the styling, the framework used, and box bounds for all viewports (desktop, mobile, tablet).

A Qwen2.5-VL-7B model fine-tuned on this dataset showed signs of improved performance on web design LLM benchmarks!

#web #ui #datasets

ronantakizawa/webui
  • 2 replies
appvoid
posted an update 1 day ago
Let's keep the momentum going for small models. I just published dot. It's the first pretrained causal model that is trained on math/symbols rather than English. The goal is to get an agnostic few-shot meta-learner that learns from reality itself instead of language.

It's already decent at some tasks, with the next version coming in a few weeks.


appvoid/dot