
API Reference

Complete REST API documentation for integrating LangTrain into your applications. All endpoints use the base URL: https://api.langtrain.xyz/v1

- 🔌 RESTful Design: Standard REST patterns with JSON request/response bodies
- 📄 OpenAPI Compatible: Full OpenAPI 3.0 spec available for code generation
- ⚡ Rate Limiting: 1,000 requests/minute on Pro; 10,000 on Enterprise
- 🔔 Webhooks: Real-time event notifications for training and inference

Authentication

All API requests require authentication using your API key. Include your key in the Authorization header as a Bearer token.

Getting Your API Key:
1. Sign in to your LangTrain dashboard
2. Navigate to Settings → API Keys
3. Generate a new API key
4. Store it securely (keys are only shown once)

Security Best Practices:
- Never commit API keys to version control
- Use environment variables for key storage
- Rotate keys regularly
- Monitor usage for suspicious activity
python
# Authentication examples
import requests
import os

# Set your API key as an environment variable
API_KEY = os.getenv('LANGTRAIN_API_KEY')
BASE_URL = 'https://api.langtrain.xyz/v1'

# Headers for all requests
headers = {
    'Authorization': f'Bearer {API_KEY}',
    'Content-Type': 'application/json',
    'User-Agent': 'LangTrain-Python/1.0.0'
}

# Test authentication
response = requests.get(f'{BASE_URL}/user/profile', headers=headers)

if response.status_code == 200:
    print("✅ Authentication successful")
    user_data = response.json()
    print(f"Welcome, {user_data['name']}!")
else:
    print(f"❌ Authentication failed: {response.status_code}")
    print(response.json())

Models API

The Models API allows you to list available models, get model details, and manage custom models. All base models are pre-loaded and ready for fine-tuning.

Available Endpoints:
- GET /v1/hub/ - List all available models
- GET /v1/hub/models/{model_id} - Get model details
- GET /v1/hub/tiers - Get pricing tiers
- GET /v1/hub/featured - Get featured models (see the sketch after the examples below)

Model Categories:
- Language Models: General-purpose LLMs (Llama, Mistral, Qwen)
- Code Models: Specialized for code generation (CodeLlama, DeepSeek)
- Multimodal Models: Vision and text understanding (LLaVA, Phi-3-Vision)
python
# Models API examples

# 1. List all available models
def list_models():
    response = requests.get(f'{BASE_URL}/hub/', headers=headers)
    data = response.json()

    print(f"Found {len(data['models'])} models:")
    for model in data['models']:
        print(f"  - {model['id']}: {model['name']} ({model['params_billions']}B params)")

    return data

# 2. Get specific model details
def get_model_details(model_id):
    response = requests.get(f'{BASE_URL}/hub/models/{model_id}', headers=headers)

    if response.status_code == 200:
        model = response.json()
        return {
            'id': model['id'],
            'name': model['name'],
            'description': model['description'],
            'params_billions': model['params_billions'],
            'context_length': model['context_length'],
            'methods': model['methods'],
            'pricing': model['pricing']
        }
    return None

# 3. Get pricing tiers
def get_pricing_tiers():
    response = requests.get(f'{BASE_URL}/hub/tiers', headers=headers)
    return response.json()

# Usage examples
models = list_models()
llama_details = get_model_details('llama-3.1-8b')
print(f"Llama 3.1 8B context length: {llama_details['context_length']}")

Datasets API

Upload, manage, and version your training datasets with the Datasets API.

Available Endpoints:
- POST /v1/datasets/ - Upload new dataset
- GET /v1/datasets/ - List your datasets
- GET /v1/datasets/{dataset_id} - Get dataset details
- DELETE /v1/datasets/{dataset_id} - Delete dataset (sketched after the examples below)
- POST /v1/datasets/{dataset_id}/validate - Validate dataset format

Supported Formats:
- JSONL: Line-delimited JSON (recommended)
- CSV: Comma-separated values with headers
- Parquet: Apache Parquet for large datasets
- HuggingFace: Direct integration with HF datasets
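
For reference, a minimal JSONL training file can be produced as below. The chat-style messages schema is an assumption borrowed from the chat completions examples later on this page; consult the Fine-tuning docs for the exact fields your dataset should carry.

python
# Writing a minimal JSONL dataset -- one JSON object per line.
# The 'messages' schema here is an assumed example, not a confirmed format.
import json

samples = [
    {"messages": [
        {"role": "user", "content": "Where is my order?"},
        {"role": "assistant", "content": "Could you share your order number?"}
    ]},
    {"messages": [
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Use the 'Forgot password' link on the sign-in page."}
    ]}
]

with open('training_data.jsonl', 'w') as f:
    for sample in samples:
        f.write(json.dumps(sample) + '\n')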
python
# Datasets API examples

# 1. Upload a new dataset
def upload_dataset(file_path, name, description=None):
    with open(file_path, 'rb') as f:
        files = {'file': (file_path, f)}
        data = {
            'name': name,
            'description': description or f'Uploaded {name}',
            'format': 'jsonl'
        }
        # Note: no explicit Content-Type header here -- requests sets
        # the multipart boundary itself for file uploads
        response = requests.post(
            f'{BASE_URL}/datasets/',
            headers={'Authorization': f'Bearer {API_KEY}'},
            files=files,
            data=data
        )
    return response.json()

# 2. List all your datasets
def list_datasets():
    response = requests.get(f'{BASE_URL}/datasets/', headers=headers)
    datasets = response.json()

    for ds in datasets['items']:
        print(f"📁 {ds['name']} ({ds['rows']:,} rows)")
        print(f"   Format: {ds['format']} | Size: {ds['size_mb']:.1f}MB")
    return datasets

# 3. Validate dataset before training
def validate_dataset(dataset_id):
    response = requests.post(
        f'{BASE_URL}/datasets/{dataset_id}/validate',
        headers=headers
    )
    result = response.json()

    if result['valid']:
        print("✅ Dataset is valid and ready for training")
        print(f"   Samples: {result['total_samples']:,}")
        print(f"   Avg tokens: {result['avg_tokens_per_sample']}")
    else:
        print("❌ Validation errors:")
        for error in result['errors']:
            print(f"   - {error}")
    return result

# Usage
dataset = upload_dataset('./training_data.jsonl', 'Customer Support v1')
validate_dataset(dataset['id'])
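
The detail and delete endpoints listed above follow the same request pattern; a brief sketch (treating 204 No Content as a possible success code is an assumption):

python
# Inspect, then delete, a dataset -- same request pattern as above
def delete_dataset(dataset_id):
    detail = requests.get(f'{BASE_URL}/datasets/{dataset_id}', headers=headers).json()
    print(f"Deleting {detail['name']}...")
    response = requests.delete(f'{BASE_URL}/datasets/{dataset_id}', headers=headers)
    return response.status_code in (200, 204)  # 204 as success is assumed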

Fine-tuning API

Start and manage fine-tuning jobs with the Fine-tuning API. Monitor progress, adjust parameters, and deploy your custom models.

Available Endpoints:
- POST /v1/fine-tuning/jobs - Create fine-tuning job
- GET /v1/fine-tuning/jobs - List all jobs
- GET /v1/fine-tuning/jobs/{job_id} - Get job status
- POST /v1/fine-tuning/jobs/{job_id}/cancel - Cancel running job
- GET /v1/fine-tuning/jobs/{job_id}/events - Stream job events

Job Lifecycle:
1. queued - Job submitted, waiting for resources
2. running - Active training in progress
3. completed - Training finished successfully
4. failed - Error occurred (check logs)
5. cancelled - Job was manually cancelled

Supported Fine-tuning Methods:
- LoRA - Parameter-efficient adaptation (r=8-64)
- QLoRA - Quantized LoRA for larger models (4-bit)
- Full - Traditional full parameter training
python
# Fine-tuning API examples
import json  # needed to parse streamed events below

# 1. Create fine-tuning job
def create_finetune_job(model_id, dataset_id, config=None):
    default_config = {
        'method': 'qlora',
        'lora_config': {
            'r': 32,
            'alpha': 64,
            'dropout': 0.05,
            'target_modules': ['q_proj', 'v_proj', 'k_proj', 'o_proj']
        },
        'training_config': {
            'epochs': 3,
            'batch_size': 4,
            'learning_rate': 2e-4,
            'warmup_ratio': 0.1
        }
    }

    payload = {
        'model_id': model_id,
        'dataset_id': dataset_id,
        'config': config or default_config,
    }

    response = requests.post(
        f'{BASE_URL}/fine-tuning/jobs',
        headers=headers,
        json=payload
    )
    return response.json()

# 2. Monitor fine-tuning progress
def get_finetune_status(job_id):
    response = requests.get(
        f'{BASE_URL}/fine-tuning/jobs/{job_id}',
        headers=headers
    )

    if response.status_code == 200:
        job = response.json()
        return {
            'status': job['status'],
            'progress': job.get('progress', 0),
            'current_epoch': job.get('current_epoch', 0),
            'loss': job.get('metrics', {}).get('train_loss'),
            'estimated_completion': job.get('estimated_completion')
        }
    return None

# 3. Stream training events in real-time
def stream_events(job_id):
    response = requests.get(
        f'{BASE_URL}/fine-tuning/jobs/{job_id}/events',
        headers=headers,
        stream=True
    )

    for line in response.iter_lines():
        if line:
            event = json.loads(line)
            print(f"[{event['type']}] {event['message']}")

# Usage
job = create_finetune_job('llama-3.1-8b', 'dataset_abc123')
print(f"Started job {job['id']}, status: {job['status']}")

Inference API

Use the Inference API to generate text with base models or your fine-tuned models. Supports both synchronous and streaming responses.

Available Endpoints:
- POST /v1/completions - Text completion (legacy)
- POST /v1/chat/completions - Chat completion (recommended)
- POST /v1/embeddings - Text embeddings

Generation Parameters:
- temperature: Controls randomness (0.0-2.0, default: 0.7)
- top_p: Nucleus sampling threshold (0.0-1.0, default: 0.9)
- top_k: Top-k sampling (1-100, default: 50)
- max_tokens: Maximum output length (1-4096)
- stream: Enable streaming responses (default: false)
- stop: Stop sequences to end generation

Response Formats:
- Synchronous: Complete response in single request
- Streaming: Server-Sent Events (SSE) for real-time tokens
python
# Inference API examples
import json

# 1. Chat completion (recommended)
def chat_completion(model_id, messages, **kwargs):
    payload = {
        'model': model_id,
        'messages': messages,
        'max_tokens': kwargs.get('max_tokens', 512),
        'temperature': kwargs.get('temperature', 0.7),
        'top_p': kwargs.get('top_p', 0.9),
    }

    response = requests.post(
        f'{BASE_URL}/chat/completions',
        headers=headers,
        json=payload
    )
    return response.json()

# 2. Streaming chat completion
def stream_chat(model_id, messages):
    payload = {
        'model': model_id,
        'messages': messages,
        'stream': True
    }

    response = requests.post(
        f'{BASE_URL}/chat/completions',
        headers=headers,
        json=payload,
        stream=True
    )

    full_response = ""
    for line in response.iter_lines():
        if line.startswith(b'data: '):
            raw = line[len(b'data: '):]
            # The stream ends with a literal [DONE] sentinel, which is
            # not valid JSON -- check for it before parsing
            if raw == b'[DONE]':
                break
            data = json.loads(raw)
            chunk = data['choices'][0]['delta'].get('content', '')
            full_response += chunk
            print(chunk, end='', flush=True)

    return full_response

# 3. Generate embeddings
def get_embeddings(texts, model='text-embedding-3-small'):
    payload = {
        'model': model,
        'input': texts
    }
    response = requests.post(f'{BASE_URL}/embeddings', headers=headers, json=payload)
    return response.json()['data']

# Usage examples
response = chat_completion(
    'llama-3.1-8b',
    [{"role": "user", "content": "Explain quantum computing in simple terms"}],
    max_tokens=200,
    temperature=0.8
)
print(response['choices'][0]['message']['content'])
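
get_embeddings is defined above but never called in the usage examples; one line for completeness, assuming each returned item carries its vector under an embedding key:

python
# Embeddings usage -- the 'embedding' field name is an assumption
vectors = get_embeddings(["hello world", "fine-tuning"])
print(f"{len(vectors)} embeddings, {len(vectors[0]['embedding'])} dimensions each")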

Webhooks

Receive real-time notifications when events occur in your LangTrain account.

Available Endpoints:
- POST /v1/webhooks - Register webhook endpoint
- GET /v1/webhooks - List registered webhooks
- DELETE /v1/webhooks/{webhook_id} - Remove webhook (sketched after the examples below)

Supported Events:
- training.started - Fine-tuning job started
- training.completed - Fine-tuning job finished
- training.failed - Fine-tuning job errored
- model.deployed - Model deployed to production
- usage.threshold - Usage threshold reached

Webhook Payload:
Every delivery includes an X-LangTrain-Signature header, an HMAC-SHA256 of the request body computed with your webhook secret, so you can verify its authenticity (see the Flask receiver below).
python
# Webhook examples

# 1. Register a webhook
def register_webhook(url, events):
    payload = {
        'url': url,
        'events': events,
        'secret': 'your-webhook-secret'
    }

    response = requests.post(
        f'{BASE_URL}/webhooks',
        headers=headers,
        json=payload
    )
    return response.json()

# 2. Webhook receiver (Flask example)
from flask import Flask, request
import hmac
import hashlib

app = Flask(__name__)
WEBHOOK_SECRET = 'your-webhook-secret'

def verify_signature(payload, signature):
    expected = hmac.new(
        WEBHOOK_SECRET.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature)

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    signature = request.headers.get('X-LangTrain-Signature')

    # Reject requests with a missing or invalid signature
    if not signature or not verify_signature(request.data, signature):
        return {'error': 'Invalid signature'}, 401

    event = request.json

    if event['type'] == 'training.completed':
        job_id = event['data']['job_id']
        model_id = event['data']['model_id']
        print(f"✅ Training completed: {job_id} -> {model_id}")
        # Deploy model, send notification, etc.

    return {'received': True}

# Register webhook for training events
webhook = register_webhook(
    'https://yourapp.com/webhook',
    ['training.started', 'training.completed', 'training.failed']
)
print(f"Webhook registered: {webhook['id']}")

Error Handling

The LangTrain API uses standard HTTP status codes and a consistent error response format.

HTTP Status Codes:
- 200 OK - Request succeeded
- 201 Created - Resource created successfully
- 400 Bad Request - Invalid request parameters
- 401 Unauthorized - Invalid or missing API key
- 403 Forbidden - Insufficient permissions
- 404 Not Found - Resource doesn't exist
- 429 Too Many Requests - Rate limit exceeded
- 500 Internal Server Error - Server-side error

Error Response Format:
All errors return a JSON body with an error object containing a code, a message, and optional details.
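
Concretely, an error body looks roughly like this (the specific values are illustrative; rate_limit_exceeded is also the code the retry helper below watches for):

python
# Illustrative error body -- field names follow the format described above
{
    "error": {
        "code": "rate_limit_exceeded",
        "message": "Rate limit of 1,000 requests/minute exceeded",
        "details": {"retry_after_seconds": 12}
    }
}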
python
# Error handling examples

import requests
from requests.exceptions import Timeout, RequestException

class LangTrainError(Exception):
    def __init__(self, code, message, details=None):
        self.code = code
        self.message = message
        self.details = details
        super().__init__(f"[{code}] {message}")

def api_request(method, endpoint, **kwargs):
    """Make API request with proper error handling"""
    url = f'{BASE_URL}{endpoint}'

    try:
        response = requests.request(
            method, url,
            headers=headers,
            timeout=30,
            **kwargs
        )

        # Raise for HTTP errors
        if response.status_code >= 400:
            error = response.json().get('error', {})
            raise LangTrainError(
                code=error.get('code', 'unknown_error'),
                message=error.get('message', 'An error occurred'),
                details=error.get('details')
            )

        return response.json()

    except Timeout:
        raise LangTrainError('timeout', 'Request timed out')
    except RequestException as e:
        raise LangTrainError('connection_error', str(e))

# Usage with retry logic
import time
from functools import wraps

def retry_on_rate_limit(max_retries=3, backoff=1):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except LangTrainError as e:
                    if e.code == 'rate_limit_exceeded' and attempt < max_retries - 1:
                        wait = backoff * (2 ** attempt)
                        print(f"Rate limited. Retrying in {wait}s...")
                        time.sleep(wait)
                    else:
                        raise
        return wrapper
    return decorator

@retry_on_rate_limit(max_retries=3)
def safe_api_call():
    return api_request('GET', '/hub/')

Rate Limits & Quotas

API usage is subject to rate limits based on your subscription tier.

Rate Limits by Tier:
| Tier | Requests/min | Requests/day | Max Concurrent |
|------|-------------|--------------|----------------|
| Free | 60 | 1,000 | 2 |
| Pro | 1,000 | 50,000 | 10 |
| Enterprise | 10,000 | Unlimited | 100 |

Rate Limit Headers:
- X-RateLimit-Limit - Requests allowed per window
- X-RateLimit-Remaining - Requests remaining
- X-RateLimit-Reset - Unix timestamp when limit resets

Best Practices:
- Implement exponential backoff on 429 errors
- Cache responses when possible
- Use webhooks instead of polling
- Batch requests where supported
python
# Rate limit handling
import time
from datetime import datetime

def check_rate_limits(response):
    """Extract and display rate limit info from response headers"""
    limit = response.headers.get('X-RateLimit-Limit')
    remaining = response.headers.get('X-RateLimit-Remaining')
    reset = response.headers.get('X-RateLimit-Reset')

    if remaining and int(remaining) < 10:
        print(f"⚠️ Warning: Only {remaining}/{limit} requests remaining")
        if reset:
            print(f"   Resets at: {datetime.fromtimestamp(int(reset))}")

    return {
        'limit': int(limit) if limit else None,
        'remaining': int(remaining) if remaining else None,
        'reset': int(reset) if reset else None
    }

# Batch requests for efficiency
def batch_inference(prompts, model_id, batch_size=10):
    """Process multiple prompts efficiently"""
    results = []

    for i in range(0, len(prompts), batch_size):
        batch = prompts[i:i + batch_size]

        response = requests.post(
            f'{BASE_URL}/chat/completions/batch',
            headers=headers,
            json={
                'model': model_id,
                'requests': [
                    {'messages': [{'role': 'user', 'content': p}]}
                    for p in batch
                ]
            }
        )

        rate_info = check_rate_limits(response)
        results.extend(response.json()['responses'])

        # Respect rate limits
        if rate_info['remaining'] and rate_info['remaining'] < 5:
            time.sleep(1)

    return results

# Usage
prompts = ["Summarize this:", "Translate this:", "Explain this:"]
results = batch_inference(prompts, 'llama-3.1-8b')
