Qwen3-1.7B-Coder-Distilled-SFT — GGUF

GGUF quantizations of reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT for local and edge deployment via llama.cpp and compatible runtimes.

Coder teacher → STEM distillation → logical inference SFT → quantized. Structured reasoning in ~1.2GB.

Available Quantizations

| File | Quant | Size | Use Case |
| --- | --- | --- | --- |
| qwen3-1.7b-coder-distilled-sft-f16.gguf | F16 | ~3.8 GB | Full precision reference |
| qwen3-1.7b-coder-distilled-sft-Q8_0.gguf | Q8_0 | ~2.1 GB | Near-lossless, desktop |
| qwen3-1.7b-coder-distilled-sft-Q5_K_M.gguf | Q5_K_M | ~1.4 GB | Balanced quality and size |
| qwen3-1.7b-coder-distilled-sft-Q4_K_M.gguf | Q4_K_M | ~1.2 GB | Mobile, edge, fastest inference |

Recommended: Q5_K_M for desktop, Q4_K_M for mobile/edge.
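
As a rough illustration of choosing a file from the table, the sketch below picks the largest quantization that fits a memory budget. The helper names (`pick_quant`, `gguf_filename`) and the 0.5 GB headroom for the KV cache and runtime overhead are assumptions made for this example, not values from the model card; the sizes are the approximate figures listed above.

```python
# Approximate file sizes in GB, copied from the quantization table.
QUANT_SIZES_GB = {"F16": 3.8, "Q8_0": 2.1, "Q5_K_M": 1.4, "Q4_K_M": 1.2}


def pick_quant(free_memory_gb, headroom_gb=0.5):
    """Return the largest quant whose file fits in the budget, or None.

    headroom_gb is a hypothetical allowance for KV cache and runtime
    overhead; tune it for your context length and runtime.
    """
    budget = free_memory_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return max(fitting)[1] if fitting else None


def gguf_filename(quant):
    """Build the file name used in the table (the F16 file uses a lowercase suffix)."""
    suffix = "f16" if quant == "F16" else quant
    return f"qwen3-1.7b-coder-distilled-sft-{suffix}.gguf"


if __name__ == "__main__":
    choice = pick_quant(free_memory_gb=2.0)
    print(choice, gguf_filename(choice) if choice else "no quant fits")
```

This picks by size alone. The recommendation above (Q5_K_M for desktop, Q4_K_M for mobile/edge) also weighs speed, so a smaller file can still be the better choice when memory is plentiful.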

About the Model

Two-stage build:

Stage 1 — Coder Teacher Distillation: Qwen3-1.7B distilled from Qwen3-Coder-30B-A3B-Instruct on 6,122 STEM CoT samples. Proof-weighted cross-entropy (2.5x → 1.5x on derivation tokens) + KL divergence at T=2.0. The Coder teacher transfers structured decomposition patterns — sequential logic, state tracking, compositional reasoning — through the softmax landscape.

Stage 2 — Logical Inference SFT: Fine-tuned on KonstantinDob/logic_inference_dataset (~54,607 propositional logic pairs, LOGICINFERENCE_e format). The model performs inference first, then concludes. Based on the LogicInference paper by Santiago Ontañón et al. (Google Research).

| Attribute | Value |
| --- | --- |
| Base model | Qwen/Qwen3-1.7B |
| Teacher model | Qwen/Qwen3-Coder-30B-A3B-Instruct |
| Stage 1 data | 6,122 STEM CoT samples |
| Stage 2 data | ~54,607 logical inference pairs |
| Developer | Reaperdoesntrun / Convergent Intelligence LLC: Research Division |

Usage

llama.cpp CLI

./llama-cli -m qwen3-1.7b-coder-distilled-sft-Q4_K_M.gguf \
  -p "### Instruction:\nConsider the premises: If it rains, the ground is wet. It is raining. What can we conclude?\n\n### Response:\n" \
  -n 512 --temp 0.0

llama.cpp Python

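from llama_cpp import Llama

# Load the quantized model; n_ctx bounds the prompt plus generated tokens.
llm = Llama(model_path="qwen3-1.7b-coder-distilled-sft-Q4_K_M.gguf", n_ctx=1024)

# Stage 2 prompt format; temperature 0.0 gives greedy, repeatable decoding.
output = llm(
    "### Instruction:\nIs the following argument valid? All dogs are animals. Some animals are pets. Therefore, all dogs are pets.\n\n### Response:\n",
    max_tokens=512,
    temperature=0.0,
)
# The completion text is in the first choice.
print(output["choices"][0]["text"])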

Ollama

echo 'FROM ./qwen3-1.7b-coder-distilled-sft-Q4_K_M.gguf' > Modelfile
ollama create logic-reasoner -f Modelfile
ollama run logic-reasoner "If all humans are mortal and Socrates is human, what follows?"

LM Studio

Download any GGUF file and load directly in LM Studio.

Prompt Formats

STEM derivation (Stage 1):

Solve the following problem carefully and show a rigorous derivation.

Problem:
[Your problem]

Proof:

Logical inference / instruction-following (Stage 2):

### Instruction:
[Your question or logical inference problem]

### Response:
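
The two formats above can be wrapped in a small helper so the templates are not retyped by hand. This is a minimal sketch: `build_prompt` and the template constants are hypothetical names, and the exact trailing whitespace after `Proof:` is an assumption modeled on the `### Response:\n` ending used in the usage examples.

```python
# Stage 1 (STEM derivation) template; the trailing newline after "Proof:" is assumed.
STEM_TEMPLATE = (
    "Solve the following problem carefully and show a rigorous derivation.\n\n"
    "Problem:\n{body}\n\n"
    "Proof:\n"
)

# Stage 2 (logical inference / instruction-following) template.
INSTRUCT_TEMPLATE = "### Instruction:\n{body}\n\n### Response:\n"

TEMPLATES = {"stem": STEM_TEMPLATE, "instruct": INSTRUCT_TEMPLATE}


def build_prompt(body, style="instruct"):
    """Fill the chosen template with the user's problem or question."""
    if style not in TEMPLATES:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(TEMPLATES)}")
    return TEMPLATES[style].format(body=body.strip())


if __name__ == "__main__":
    print(build_prompt("If all humans are mortal and Socrates is human, what follows?"))
```

The returned string can be passed as the prompt to any of the runtimes listed under Usage.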

Limitations

1.7B model. Structured reasoning with hard capacity limits. Not a code generator despite the Coder teacher. Not a formal proof verifier. Complex multi-step inferences with many quantifiers may exceed capacity. Always verify critical outputs.
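
Since the model is not a formal proof verifier, short propositional arguments can be checked independently by brute force. The sketch below enumerates a truth table to test whether a conclusion follows from a set of premises. `entails` and `implies` are illustrative names written for this example, and the approach covers propositional logic only, so it cannot check quantified arguments such as "all dogs are animals".

```python
from itertools import product


def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b


def entails(premises, conclusion, variables):
    """Return True if every assignment satisfying all premises satisfies the conclusion."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False  # found a counterexample
    return True


# Modus ponens: if it rains, the ground is wet; it is raining; so the ground is wet.
modus_ponens = entails(
    [lambda e: implies(e["rain"], e["wet"]), lambda e: e["rain"]],
    lambda e: e["wet"],
    ["rain", "wet"],
)

# Affirming the consequent: the ground is wet, so it is raining (not valid).
affirming_consequent = entails(
    [lambda e: implies(e["rain"], e["wet"]), lambda e: e["wet"]],
    lambda e: e["rain"],
    ["rain", "wet"],
)

if __name__ == "__main__":
    print(modus_ponens, affirming_consequent)
```

The cost doubles with each added variable, which is acceptable for the small arguments a check like this is meant for.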

Source Model

Full training methodology at: reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT

Mathematical Foundations

This is a GGUF-quantized variant. The mathematical foundations (Discrepancy Calculus, Topological Knowledge Distillation) are documented in the source model's card. The discrepancy operator $Df(x)$ and BV decomposition that inform the training pipeline are preserved through quantization — the structural boundaries detected by DISC during training are baked into the weights, not dependent on precision.

Related Models

| Model | Description |
| --- | --- |
| Qwen3-1.7B-Coder-Distilled | Stage 1 only |
| Qwen3-1.7B-Coder-Distilled-SFT | Full precision source |
| Qwen3-1.7B-Distilled-30B-A3B-SFT-GGUF | Instruct teacher + legal SFT GGUF |

Citation

@misc{colca2026codersftgguf,
  title={Coder-Distilled Logical Inference GGUF: Structured Reasoning for Edge Deployment},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT-GGUF},
  note={Convergent Intelligence LLC: Research Division}
}

Convergent Intelligence LLC: Research Division "Where classical analysis fails to see, we begin."


Convergent Intelligence Portfolio

Part of the Qwen3 Coder Series by Convergent Intelligence LLC: Research Division

Related Models

| Model | Downloads | Format |
| --- | --- | --- |
| Qwen3-1.7B-Coder-Distilled-SFT | 302 | HF |

Top Models from Our Lab

Total Portfolio: 41 models | 2,781 total downloads

Last updated: 2026-03-28 12:49 UTC

DistilQwen Collection

This model is part of the DistilQwen proof-weighted distillation series. Collection: 9 models | 2,788 downloads

Teacher Variant Comparison

| Teacher | Student Size | Strength | Models |
| --- | --- | --- | --- |
| Qwen3-30B-A3B (Instruct) | 1.7B | Instruction following, structured output, legal reasoning | 3 (833 DL) |
| Qwen3-30B-A3B (Thinking) | 0.6B | Extended deliberation, higher-entropy distributions, proof derivation | 3 (779 DL) |
| Qwen3-30B-A3B (Coder) | 1.7B | Structured decomposition, STEM derivation, logical inference | 2 (825 DL) ← this model |

Methodology

The only BF16 collection in the portfolio. While the broader Convergent Intelligence catalog (43 models, 12,000+ downloads) was trained on CPU at FP32 for $24 total compute, the DistilQwen series was trained on H100 at BF16 with a 30B-parameter teacher. Same methodology, premium hardware. This is what happens when you give the pipeline real compute.

All models use proof-weighted knowledge distillation: 55% cross-entropy with decaying proof weights (2.5× → 1.5×), 45% KL divergence at T=2.0. The proof weight amplifies loss on reasoning-critical tokens, forcing the student to allocate capacity to structural understanding rather than surface-level pattern matching.
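
To make the loss mix concrete, here is a minimal pure-Python sketch of a proof-weighted distillation loss built from the stated numbers: 55% cross-entropy, 45% KL at T=2.0, and a proof weight decaying from 2.5 to 1.5. Several details are assumptions, not the documented implementation: the linear shape of the decay, the per-token `is_proof_token` mask, averaging over token count, and the T² scaling of the KL term (a common convention in distillation). All function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def proof_weight(step, total_steps, start=2.5, end=1.5):
    """Decay the proof weight from start to end (linear shape is an assumption)."""
    frac = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + (end - start) * frac


def distillation_loss(student_logits, teacher_logits, targets, is_proof_token,
                      step, total_steps, alpha=0.55, temperature=2.0):
    """alpha * weighted cross-entropy + (1 - alpha) * T^2 * KL(teacher || student)."""
    w_proof = proof_weight(step, total_steps)
    ce_sum, kl_sum = 0.0, 0.0
    for s, t, y, is_proof in zip(student_logits, teacher_logits, targets, is_proof_token):
        weight = w_proof if is_proof else 1.0
        # Hard-label cross-entropy at T=1, amplified on proof tokens.
        ce_sum += -weight * math.log(softmax(s)[y])
        # Soft-label KL between temperature-softened distributions.
        q_teacher = softmax(t, temperature)
        q_student = softmax(s, temperature)
        kl_sum += sum(qt * math.log(qt / qs)
                      for qt, qs in zip(q_teacher, q_student) if qt > 0.0)
    n = len(targets)
    ce = ce_sum / n
    kl = (temperature ** 2) * kl_sum / n
    return alpha * ce + (1.0 - alpha) * kl


if __name__ == "__main__":
    loss = distillation_loss([[0.0, 0.0]], [[2.0, 0.0]], [0], [True],
                             step=0, total_steps=100)
    print(loss)
```

A real training run would compute the same quantities on batched tensors in a deep learning framework; the scalar version is only meant to show how the two terms and the proof weight combine.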

Full methodology: Structure Over Scale (DOI: 10.57967/hf/8165)

Related in this series


Part of the reaperdoesntknow research portfolio — 49 models, 22,598 total downloads | Last refreshed: 2026-03-30 12:05 UTC
