Llama 3.1 8B

Meta's compact 8-billion-parameter model. It is capable for its size and well suited to fast inference, edge deployment, and cost-sensitive applications.

Pricing

Price per generation: Free

API Integration

Use our OpenAI-compatible API to integrate Llama 3.1 8B into your application.

Install
npm install railwail
JavaScript / TypeScript
import railwail from "railwail";

const rw = railwail("YOUR_API_KEY");

// Simple — just pass a string
const reply = await rw.run("llama-3-1-8b", "Hello! What can you do?");
console.log(reply);

// With message history
const reply2 = await rw.run("llama-3-1-8b", [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing simply." },
]);
console.log(reply2);

// Full response with usage info
const res = await rw.chat("llama-3-1-8b", [
  { role: "user", content: "Hello!" },
], { temperature: 0.7, max_tokens: 500 });
console.log(res.choices[0].message.content);
console.log(res.usage);
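
Because the API is described as OpenAI-compatible, the arguments in the calls above presumably map onto a standard chat-completions request body (model, messages, and optional sampling parameters). The following is a minimal sketch under that assumption. `buildChatRequest` and the interfaces around it are hypothetical names introduced for illustration and are not part of the railwail package.

```typescript
// Hypothetical types and helper for illustration; not part of the railwail package.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatOptions {
  temperature?: number;
  max_tokens?: number;
}

interface ChatRequest extends ChatOptions {
  model: string;
  messages: ChatMessage[];
}

// Accepts either a plain string or a message history, mirroring the two
// input styles shown for rw.run above. A plain string is wrapped as a
// single user message (an assumption about how the shorthand behaves).
function buildChatRequest(
  model: string,
  input: string | ChatMessage[],
  options: ChatOptions = {},
): ChatRequest {
  const messages: ChatMessage[] =
    typeof input === "string" ? [{ role: "user", content: input }] : input;
  return { model, messages, ...options };
}

const request = buildChatRequest("llama-3-1-8b", "Hello!", {
  temperature: 0.7,
  max_tokens: 500,
});
console.log(JSON.stringify(request, null, 2));
```

Building the body separately like this can make it easier to log or validate a request before sending it, whichever client ends up making the call.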

Specifications

Context window: 128,000 tokens
Max output: 8,192 tokens
Provider: Together AI
Category: Text & Chat
Tags: open-source, compact, fast
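
The context window and max output limits listed above bound what can be passed as max_tokens. The sketch below clamps a requested value to those limits. It assumes that prompt and output tokens share the same 128,000-token window, which is typical but not stated on this page, and that the prompt's token count is obtained elsewhere (for example from a tokenizer). `clampMaxTokens` is a hypothetical helper, not part of the railwail package.

```typescript
// Limits taken from the specifications listed above.
const CONTEXT_WINDOW = 128000;
const MAX_OUTPUT = 8192;

// Hypothetical helper: returns the largest usable max_tokens value,
// or 0 if the prompt alone already fills the context window.
// Assumes prompt and output tokens count against the same window.
function clampMaxTokens(promptTokens: number, requested: number): number {
  const remaining = Math.max(0, CONTEXT_WINDOW - promptTokens);
  return Math.max(0, Math.min(requested, MAX_OUTPUT, remaining));
}

console.log(clampMaxTokens(1000, 500));    // 500: request fits as-is
console.log(clampMaxTokens(1000, 20000));  // 8192: capped by max output
console.log(clampMaxTokens(125000, 8192)); // 3000: capped by remaining context
```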

Get started with free credits. No credit card required. Access Llama 3.1 8B and 100+ other models through a single API.