What is Heretic?

Heretic is a toolkit for customizing, ablation-testing, and evaluating large language models (LLMs). It supports dense models and MoE/hybrid architectures and provides built-in chat, a benchmark runner, and model testing utilities.

Heretic implements multiple ablation and analysis methods, including directional ablation (Arditi et al., 2024), projected abliteration (Lai, 2025), MPOA (Lai, 2025), experimental SOMA (Piras et al., 2025), and ARA (Weidmann, 2026).
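For readers unfamiliar with directional ablation, the following is a minimal sketch of the general idea as described by Arditi et al. (2024): estimate a direction as the normalized difference of mean activations between two prompt sets, then remove that direction from activations or from weights that write into the residual stream. It is not Heretic's implementation. The function names (contrast_direction, ablate, orthogonalize_weights) and the toy data are hypothetical and chosen only for illustration, and plain Python lists stand in for real tensors.

```python
import math


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def contrast_direction(acts_a, acts_b):
    """Unit difference-of-means direction between two activation sets."""
    mu_a, mu_b = mean(acts_a), mean(acts_b)
    return unit([a - b for a, b in zip(mu_a, mu_b)])


def ablate(x, r):
    """Remove the component of activation x along unit direction r."""
    dot = sum(xi * ri for xi, ri in zip(x, r))
    return [xi - dot * ri for xi, ri in zip(x, r)]


def orthogonalize_weights(W, r):
    """Return W - r r^T W for a matrix W whose output lives in residual space.

    W is a list of rows with shape (d_model, d_in); r has length d_model.
    After this edit, W can no longer write anything along r.
    """
    d_in = len(W[0])
    r_t_w = [sum(r[i] * W[i][j] for i in range(len(r))) for j in range(d_in)]
    return [[W[i][j] - r[i] * r_t_w[j] for j in range(d_in)]
            for i in range(len(W))]


# Toy (made-up) activations for two contrasting prompt sets.
acts_a = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
acts_b = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

r = contrast_direction(acts_a, acts_b)
x_ablated = ablate([5.0, 2.0, 3.0], r)
W_ablated = orthogonalize_weights([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], r)
```

Ablating activations requires intervening at inference time, whereas orthogonalizing the weights bakes the same projection into the model once. Which of these Heretic applies, and to which layers, is not specified here.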

Integrations include Hugging Face model hosting and community repositories on GitHub. Heretic supports installation with pip and runs from the command line, for example heretic qwen/qwen3-4b-instruct-2507. Designed for researchers and engineering teams, Heretic enables reproducible experiments, model introspection, and configurable instruction-following behavior.

Common use cases include ablation studies, benchmarking, model evaluation, and building custom inference workflows for research and production environments.

Heretic user reviews

Based on 1 review, 100.0% of users recommend Heretic, rated highly for quality results.

Liked for

Quality results 1 of 1
All key features 1 of 1
Good integrations 1 of 1

Heretic's key features

  • Customization, ablation testing, and evaluation of large language models
  • Support for dense, MoE, and hybrid model architectures
  • Built-in chat interface, benchmark runner, and model testing utilities
  • Implements multiple ablation/analysis methods (directional ablation, projected abliteration, MPOA, SOMA, ARA)
  • Integrations with Hugging Face and GitHub, command-line usage and pip install (Python 3.10+)

Heretic use cases

  • Run ablation studies on dense and MoE/hybrid LLMs to pinpoint which components drive performance, using Heretic's built-in model introspection and analysis methods, and export reproducible experiment artifacts through the Hugging Face and GitHub integrations.
  • Build and iterate on custom inference workflows and hybrid MoE deployments with Heretic's CLI, pip support, and built-in chat interface: prototype quickly, debug with live model introspection, and version pipelines in GitHub for reproducible production rollouts.
  • Benchmark and compare candidate LLMs across standardized datasets with Heretic's benchmark runner: automate multi-model evaluations and ablation sweeps, generate shareable performance reports, and make data-driven model selection and hyperparameter decisions.

Who is it for?

  • Model developers
  • ML engineers
  • Research scientists
  • Benchmarking teams
  • Data scientists
