shuoxing/llama3-8b-full-pretrain-wash-c4-2-4m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 193
shuoxing/llama3-8b-full-pretrain-wash-c4-2-1m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 207
shuoxing/llama3-8b-full-pretrain-wash-c4-1-8m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 222
shuoxing/llama3-8b-full-pretrain-wash-c4-1-5m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 250
shuoxing/llama3-8b-full-pretrain-wash-c4-1-2m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 254
shuoxing/llama3-8b-full-pretrain-wash-c4-0-9m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 275
shuoxing/llama3-8b-full-pretrain-wash-c4-0-6m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 284
shuoxing/llama3-8b-full-pretrain-wash-c4-0-3m-sft-bs64 • Text Generation • 8B • Updated 8 days ago • 287
shuoxing/qwen2-5-7b-full-sft-control-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 26 • 2
shuoxing/qwen2-5-7b-full-sft-mix-high-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 26 • 2
shuoxing/qwen2-5-7b-full-sft-mix-mid-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 25 • 2
shuoxing/qwen2-5-7b-full-sft-mix-low-tweet-1m-en-reproduce-bs128 • Text Generation • 333k • Updated Jan 25 • 4
shuoxing/qwen3-4b-full-sft-control-tweet-1m-en-reproduce-bs128 • Text Generation • 196k • Updated Jan 25 • 4
shuoxing/qwen3-4b-full-sft-mix-high-tweet-1m-en-reproduce-bs128 • Text Generation • 196k • Updated Jan 25 • 4
shuoxing/qwen3-4b-full-sft-mix-mid-tweet-1m-en-reproduce-bs128 • Text Generation • 196k • Updated Jan 25 • 1