MATRIX: Where Your GPT Model Goes to Die

Source: YouTube Video
Creator: Code AI Lab (David Andre)
Status: ✅ Analyzed (transcript captured)

Summary

David Andre makes the case for running uncensored AI models locally — arguing that prolonged use of censored cloud models will “fine-tune you” to their biases. He walks through setting up Super Gemma 4 26B Uncensored GGUF V2 via Ollama and open-sources an auto-research jailbreak loop for any model.

The MATRIX Concept

“If you use the LLM for many years, it will start to fine-tune you. Whatever model you talk to day-to-day, that model will influence you more than you influence that model.”

The “Matrix” is the idea that cloud models (ChatGPT, Claude, Gemini) subtly shape your thinking with their built-in biases — you don’t notice it because it’s gradual, but over time the model trains you, not the other way around.

Legitimate Use Cases for Uncensored Models

Use Case	Why Censored Models Fail
Cybersecurity / Malware analysis	Refuses to describe how attacks work
Pen testing / Red teaming	Can’t advise on exploitation
Political analysis	Models are heavily left-leaning
Fiction / Creative writing	Refuses adult, dark, or violent themes
Journalism / OSINT	Refuses extremist content analysis
Medical / Sexual health	Guardrails block legitimate questions
Mental health journaling	Concerns about “harmful content”
Confidential business docs	Data leaves your machine

How Refusals Actually Work

Refusals are built into the model weights during training, not just added through system prompts. You can’t “trick” them with prompt engineering on the cloud layer, because a cloud request passes through several stages before and after the model itself:

Cloud: Input filters → Hidden system prompt → Fine-tuned model (RLHF) → Output classifier → Policies
Local: Prompt → Model. That’s it. Full control.
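
To make the local "Prompt → Model" path concrete, here is a minimal sketch of sending a prompt straight to a locally running Ollama server over its HTTP API. It assumes Ollama is listening on its default port (11434) and that the model named in the run command later in these notes has already been pulled. The helper names `build_request` and `ask_local` are illustrative, not taken from the video.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model reference copied from the `ollama run` command in these notes.
MODEL = "hf.co/jeong-song/Super-Gemma-4-26B-Uncensored-GGUF-V2"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    # stream=False asks for one JSON object instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str) -> str:
    """Send the prompt to the local model and return its reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask_local("Explain how a buffer overflow works.")` returns the model's reply as a string. Nothing in this path leaves the machine. Any system prompt or chat template comes from the local model configuration, which you control.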

Liberation Methods

Two main approaches to remove refusals:

Abliteration: find the direction in the model's internal activations that drives refusal behavior and remove it from the weights (no retraining needed)
Fine-tuning on uncensored datasets: train on thousands of examples where the model answers freely

The strongest uncensored models combine both: abliterate first, then fine-tune to restore quality.
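
The weight-editing approach above is commonly called abliteration. It can be illustrated with a toy linear-algebra sketch: given a unit "refusal direction" r, a weight matrix W is replaced by W - r(rᵀW), so the layer can no longer write anything along r. This reflects how the technique is usually described, not necessarily how this particular model was produced. The matrix, the direction, and the function names are made-up illustrations; real pipelines estimate r from model activations, which is not shown here.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def ablate_direction(W, r):
    """Return W' = W - r (r^T W): W with its output component along r removed.

    W is a list of rows (d_out x d_in); r is a direction of length d_out.
    """
    r = normalize(r)
    d_out, d_in = len(W), len(W[0])
    # r^T W: how strongly each input column writes along r.
    proj = [sum(r[i] * W[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[W[i][j] - r[i] * proj[j] for j in range(d_in)] for i in range(d_out)]


# Toy 2x2 "weight matrix" and a hypothetical refusal direction (made-up numbers).
W = [[1.0, 2.0],
     [3.0, 4.0]]
refusal_dir = [0.0, 1.0]

W_ablated = ablate_direction(W, refusal_dir)
```

In this toy case the second output row is zeroed and the first row is untouched, so every other direction the layer writes to is preserved. That selectivity is why the edit needs no retraining, and also why a follow-up fine-tune is often used to recover any quality lost.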

The Model: Super Gemma 4 26B Uncensored GGUF V2

Base: Google’s Gemma 4 26B
Creator: Jeong Song (South Korea)
Size: ~16GB download, needs ~20GB VRAM
Run via: ollama run hf.co/jeong-song/Super-Gemma-4-26B-Uncensored-GGUF-V2
Speed: ~200 tok/s on M-series Mac with 128GB RAM, ~40-50 tok/s on 32GB
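
As a rough sanity check on the size figures above, the short calculation below works out how many bits per weight a ~16GB file implies for a 26B-parameter model, and what the same model would need at 16-bit precision. It treats 1GB as 10^9 bytes and ignores file metadata, both of which are simplifying assumptions. The notes do not state the quantization level, so the result is only an implied average.

```python
def bits_per_weight(file_size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, given file size and parameter count."""
    return file_size_gb * 1e9 * 8 / (params_billion * 1e9)


def weights_size_gb(params_billion: float, bits: float) -> float:
    """Approximate size of the weights alone at a given bit width."""
    return params_billion * 1e9 * bits / 8 / 1e9


implied_bits = bits_per_weight(16, 26)   # from the ~16GB and 26B figures
fp16_size = weights_size_gb(26, 16)      # same parameter count at 16 bits
headroom_gb = 20 - 16                    # stated VRAM need minus download size

print(f"implied bits per weight: {implied_bits:.2f}")
print(f"size at 16-bit: {fp16_size:.0f} GB")
print(f"headroom beyond weights: {headroom_gb} GB")
```

The implied average of roughly 5 bits per weight is consistent with a quantized GGUF file. The gap between the download size and the stated VRAM requirement would plausibly go to the context cache and runtime buffers, though the notes do not break this down.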

The Auto-Research Jailbreak Loop

David open-sourced a repo that automates jailbreak discovery for any model:

Agent 1 (Reviewer): Tries prompts with hidden “bad stuff”
Agent 2 (Judge): Evaluates whether the model answered
Runs hundreds/thousands of prompt variations autonomously
Works on ChatGPT, Claude, Gemini, Grok — any model
