All HF Hub posts

daavoo 
posted an update 3 days ago
2025: The Year of Agents.
2026: The Year of Local Agents?

Relying on cloud-hosted LLMs is often overkill. While frontier models still lead in complex coding, local models are now more than capable of handling many agentic workflows, with no network round-trips and total privacy.

To help bridge the gap between local inference and usable agents, I’m releasing agent.cpp: https://github.com/mozilla-ai/agent.cpp

It provides minimal, high-performance building blocks for agents in C++, built directly around the awesome llama.cpp ecosystem.
Stop sending your data to a remote API. Start building and running agents on your own hardware.
prithivMLmods 
posted an update 2 days ago
Introducing TRELLIS.2 Text-to-3D. The demo for the TRELLIS.2-4B (Image-to-3D) model is paired with the Z-Image Turbo image generation model to enable Text-to-3D: no input assets are required, a small leap forward for ideation. Optionally, it also keeps default support for Image-to-3D inference from direct image assets. Find the demo and related collections below... 🤗🔥

✨ TRELLIS.2-Text-to-3D [Demo]: prithivMLmods/TRELLIS.2-Text-to-3D
✨ Multimodal Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ Github: https://github.com/PRITHIVSAKTHIUR/TRELLIS.2-Text-to-3D

To learn more, visit the app page or the respective model page!
DawnC 
posted an update about 17 hours ago
PawMatchAI — Smarter, Safer, and More Thoughtful Recommendations 🐕✨

🐾 Recommendation system update — deeper reasoning, safer decisions
Over the past weeks, user feedback led me to rethink how PawMatchAI handles description-based breed recommendations. Instead of only matching surface-level preferences, the system now implements a multi-dimensional semantic reasoning architecture that emphasizes real-life compatibility and risk awareness.

Key technical improvements:
- SBERT-powered semantic understanding with dynamic weight allocation across six constraint dimensions (space, activity, noise, grooming, experience, family)

- Hierarchical constraint management distinguishing critical safety constraints from flexible preferences, with progressive relaxation when needed

- Multi-head scoring system combining semantic matching (15%), lifestyle compatibility (70%), constraint adherence (10%), and confidence calibration (5%)

- Intelligent risk filtering that applies graduated penalties (-10% to -40%) for genuine incompatibilities while preserving user choice (see the sketch below)

The goal: 👉 Not just dogs that sound good on paper, but breeds people will actually thrive with long-term.
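
Roughly, the scoring described above combines as follows. A minimal Python sketch, assuming normalized per-head scores; the function and inputs are hypothetical illustrations, not the actual PawMatchAI code:

```python
# Illustrative sketch only: recombines the weights and penalty range described
# above; names and inputs are hypothetical, not the actual PawMatchAI code.

def combined_score(semantic, lifestyle, constraints, confidence, risk_flags):
    """All head scores are assumed to be normalized to [0, 1]."""
    # Multi-head weighting: semantic 15%, lifestyle 70%,
    # constraint adherence 10%, confidence calibration 5%.
    base = (0.15 * semantic +
            0.70 * lifestyle +
            0.10 * constraints +
            0.05 * confidence)

    # Graduated risk penalties (-10% to -40%) for genuine incompatibilities,
    # capped so a breed is down-ranked rather than hidden outright.
    penalty = min(0.10 * len(risk_flags), 0.40) if risk_flags else 0.0
    return max(base * (1.0 - penalty), 0.0)

# Example: strong lifestyle fit, one flagged incompatibility (e.g. noise).
print(round(combined_score(0.8, 0.9, 0.7, 0.6, ["noise"]), 3))
```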

What's improved?
- 🎯 Clearer separation of must-have safety constraints versus flexible preferences
- 🧠 Bidirectional semantic matching evaluating compatibility from both user and breed perspectives
- 🔍 Context-aware prioritization where critical factors (safety, space, noise) automatically receive higher weighting

What's next?
- 🐕 Expanding behavioral and temperament analysis dimensions
- 🐾 Extension to additional species with transfer learning
- 📱 Mobile-optimized deployment for easier access
- 🧩 Enhanced explainability showing why specific breeds are recommended

👉 Try PawMatchAI: DawnC/PawMatchAI

#AIProduct #SBERT #RecommendationSystems #DeepLearning #MachineLearning #NLP
prithivMLmods 
posted an update 1 day ago
Introducing demos for new SOTA models from AI2: SAGE-MM (Smart Any-Horizon Agents for Long-Video Reasoning) and Molmo-2, an open vision-language model that supports multi-image (QA and pointing) and video (QA, pointing, and tracking). The respective demo-related collections are listed below. 🎃🔥

✨ SAGE-MM [Video-Reasoning]: prithivMLmods/SAGE-MM-Video-Reasoning
✨ Molmo2 [Demo]: prithivMLmods/Molmo2-HF-Demo

🎃 GitHub[SAGE-MM]: https://github.com/PRITHIVSAKTHIUR/SAGE-MM-Video-Reasoning
🎃 GitHub[Molmo2]: https://github.com/PRITHIVSAKTHIUR/Molmo2-HF-Demo
🎃 Multimodal Implementations: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To learn more, visit the app page or the respective model page!
John1604 
posted an update 1 day ago
I'm about to reach my public storage limit. My repository John1604/Kimi-K2-Thinking-q6K-gguf isn't getting enough downloads and consumes nearly 1 TB of storage. While I love Kimi K2's way of thinking, I may have to delete this model, even though it's a truly open-source 1T-parameter LLM comparable to any frontier model. In the AI race, four US companies have 1T+ parameter models: xAI, OpenAI, Google, and Anthropic. China also has four companies with 1T+ parameter models: Alibaba, Kimi, DeepSeek, and GLM. Currently, the two sides are evenly matched; only the American and Chinese teams have LLMs with 1T+ parameters. Let's cheer for them to reach AGI in the next 5 to 10 years. Maybe a 64T-parameter Chinese model will do it: the human-to-cat brain neuron ratio is about 64:1, the same as that jump in model size.
Kseniase 
posted an update about 15 hours ago
From Prompt Engineering to Context Engineering: Main Design Patterns

Earlier on, we relied on clever prompt wording; now, structured, complete context matters more than magic phrasing. The next year is going to be a year of context engineering, which expands beyond prompt engineering. The two complement each other: prompt engineering shapes how we ask, while context engineering shapes what the model knows, sees, and can do.

To keep things clear, here are the main techniques and design patterns in both areas, with some useful resources for further exploration:

▪️ 9 Prompt Engineering Techniques (configuring input text)

1. Zero-shot prompting – giving a single instruction without examples. Relies entirely on pretrained knowledge.

2. Few-shot prompting – adding input–output examples to encourage the model to produce the desired behavior. ⟶ https://arxiv.org/abs/2005.14165

3. Role prompting – assigning a persona or role (e.g. "You are a senior researcher," "Say it as a specialist in healthcare") to shape style and reasoning. ⟶ https://arxiv.org/abs/2403.02756

4. Instruction-based prompting – explicit constraints or guidance, like "think step by step," "use bullet points," "answer in 10 words"

5. Chain-of-Thought (CoT) – encouraging intermediate reasoning traces to improve multi-step reasoning. It can be explicit ("let’s think step by step"), or implicit (demonstrated via examples). ⟶ https://arxiv.org/abs/2201.11903

6. Tree-of-Thought (ToT) – the model explores multiple reasoning paths in parallel, like branches of a tree, instead of following a single chain of thought. ⟶ https://arxiv.org/abs/2305.10601

7. Reasoning–action prompting (ReAct-style) – prompting the model to interleave reasoning steps with explicit actions and observations. It defines action slots and lets the model generate a sequence of "Thought → Action → Observation" steps. ⟶ https://arxiv.org/abs/2210.03629
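
To make these patterns concrete, here is a small illustrative Python sketch combining a few of them (role prompting, few-shot examples, an explicit instruction, and a CoT trigger) into one chat-style message list; the prompts and message format are generic examples, not tied to any particular API:

```python
# Illustrative only: how role prompting (3), few-shot examples (2),
# an explicit instruction (4), and a CoT trigger (5) combine into one prompt.

few_shot_examples = [
    ("Classify sentiment: 'The battery dies in an hour.'", "negative"),
    ("Classify sentiment: 'Setup took thirty seconds.'", "positive"),
]

def build_messages(user_query: str) -> list[dict]:
    messages = [
        # Role prompting: assign a persona to shape style and reasoning.
        {"role": "system",
         "content": "You are a senior researcher. Answer concisely."},
    ]
    # Few-shot prompting: demonstrate the desired input-output behavior.
    for question, answer in few_shot_examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    # Instruction-based prompting plus an explicit chain-of-thought trigger.
    messages.append({"role": "user",
                     "content": f"{user_query}\nLet's think step by step, "
                                "then give the final label on its own line."})
    return messages

print(build_messages("Classify sentiment: 'It works, but the app crashes daily.'"))
```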

Read further ⬇️
Also subscribe to Turing Post: https://www.turingpost.com/subscribe
MonsterMMORPG 
posted an update about 16 hours ago
Wan 2.2 Complete Training Tutorial - Text to Image, Text to Video, Image to Video, Windows & Cloud : https://youtu.be/ocEkhAsPOs4

Wan 2.2 training is now very easy. I have done over 64 unique Wan 2.2 training runs to prepare the best working training configurations for you. The configurations work fully locally on GPUs with as little as 6 GB of VRAM, so you will be able to train your Wan 2.2 image or video generation LoRAs on your Windows computer with ease. Moreover, I have shown how to train on the cloud platforms RunPod and Massed Compute, so even if you have no GPU or want faster training, you can train in the cloud at very low cost, fully privately.

Full step by step tutorial : https://youtu.be/ocEkhAsPOs4

⏱️ Video Chapters:

0:00 Introduction to Wan 2.2 Training & Capabilities
0:56 Installing & Updating Musubi Tuner Locally
2:20 Explanation of Optimized Presets & Research Logic
4:00 Differences Between T2I, T2V, and I2V Configs
5:36 Extracting Files & Running Update Batch File
6:14 Downloading Wan 2.2 Training Models via Script
7:30 Loading Configs: Selecting GPU & VRAM Options
9:33 Using nvitop to Monitor RAM & VRAM Usage
10:28 Preparing Image Dataset & Trigger Words
11:17 Generating Dataset Config & Resolution Logic
12:55 Calculating Epochs & Checkpoint Save Frequency
13:40 Troubleshooting: Fixing Missing VAE Path Error
15:12 VRAM Cache Behavior & Training Speed Analysis
15:51 Trade-offs: Learning Rate vs Resolution vs Epochs
16:29 Installing SwarmUI & Updating ComfyUI Backend
18:13 Importing Latest Presets into SwarmUI
19:25 Downloading Inference Models via Script
20:33 Generating Images with Trained Low Noise LoRA
22:22 Upscaling Workflow for High-Fidelity Results
24:15 Increasing Base Resolution to 1280x1280
27:26 Text-to-Video Generation with Lightning LoRA
30:12 Image-to-Video Generation Workflow & Settings
31:35 Restarting Backend to Clear VRAM for Model Switching
33:45 Fixing RAM Crashes with Cache-None Argument
....
projectlosangeles 
posted an update 1 day ago
🔥 Check out Project Los Angeles' new SOTA searchable MIDI dataset! 🔥

projectlosangeles/Discover-MIDI-Dataset

The dataset features over 6.74M unique searchable MIDIs and is tailored for MIDI music discovery and symbolic music AI!

If you like the dataset, please ❤️

Sincerely,

Alex

Project Los Angeles
Tegridy Code 2025
unmodeled-tyler 
posted an update 2 days ago
NEW MODEL: vanta-research/scout-8b

VANTA Research is excited to share our new model, Scout-8B! This iteration of Scout is based on the RNJ-1 Instruct architecture from Essential AI, and it not only improves on but also expands the capabilities of vanta-research/scout-4b.

Scout is specifically designed for:

Tactical Intelligence Analysis
- Systematic problem decomposition
- Structured reconnaissance approach
- Data-driven assessment methodology

Operational Planning
- Multi-phase operation planning
- Risk assessment and mitigation
- Resource allocation guidance

Technical Assessment
- Architecture evaluation and analysis
- Performance optimization recommendations
- Security perimeter assessment

This model is great for anyone who works in security, IT, or DevOps, or for anyone looking for a unique but functional AI collaborator. Check it out!
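
For a quick first try, here is a minimal inference sketch, assuming the repo ships standard transformers weights with a chat template; check the model card for the recommended prompt format and generation settings:

```python
# Minimal sketch, assuming vanta-research/scout-8b ships standard
# transformers weights with a chat template; see the model card for
# the recommended prompt format and generation settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vanta-research/scout-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Decompose a security perimeter assessment into phases, "
                "with risks and required resources for each phase."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```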
codelion 
posted an update 2 days ago
Introducing PTS Visualizer - an interactive tool for exploring how language models reason!

Visualize pivotal tokens, thought anchors, and reasoning circuits. See which tokens and sentences significantly impact success probability, explore embedding clusters, and trace reasoning step-by-step.

Try it: codelion/pts-visualizer

Explore PTS datasets:
- Qwen3-0.6B: codelion/Qwen3-0.6B-pts
- DeepSeek-R1: codelion/DeepSeek-R1-Distill-Qwen-1.5B-pts

Or upload your own JSONL files!
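
Before uploading, it can help to peek at what a PTS file contains. A small sketch, assuming the Hub datasets expose a default train split; since the post doesn't document the field names, the snippet only inspects the schema rather than guessing column names:

```python
from datasets import load_dataset

# One of the PTS datasets above; assumes a default "train" split.
ds = load_dataset("codelion/Qwen3-0.6B-pts", split="train")
# Or load your own exported run instead (hypothetical filename):
# ds = load_dataset("json", data_files="my_pts_run.jsonl", split="train")

print(ds.column_names)  # discover the actual field names
print(ds[0])            # inspect one pivotal-token / thought-anchor record
```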

GitHub: https://github.com/codelion/pts