Prompt Engineering for Developers
A systematic approach to prompts. No magic tricks, just reproducible patterns you can use in production.
TL;DR
- Use templates - never inline strings in production
- Few-shot examples markedly improve output quality
- Chain-of-thought for complex reasoning tasks
- Version your prompts like code
Prompt Templates
The first rule of prompt engineering: never hardcode prompts in your code. Use templates that can be versioned, tested, and reused.
Simple templates with f-strings
For simple use cases, Python's f-strings are fine:
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def create_summary_prompt(text: str, max_words: int = 100) -> str:
    return f"""Summarize the following text in at most {max_words} words.
Focus on the most important points.
Write in Danish.

Text:
{text}

Summary:"""

# Usage
prompt = create_summary_prompt(article_text, max_words=50)
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,
    messages=[{"role": "user", "content": prompt}]
)
```
Structured templates with Jinja2
For complex prompts with conditionals and loops, use Jinja2:
```python
from jinja2 import Template

ANALYSIS_TEMPLATE = Template("""You are a {{ role }}.

{% if context %}
Context:
{{ context }}
{% endif %}

Analyze the following {{ content_type }}:
{{ content }}

{% if output_format %}
Output format: {{ output_format }}
{% endif %}

{% if examples %}
Examples of the desired output:
{% for example in examples %}
- {{ example }}
{% endfor %}
{% endif %}
""")

def create_analysis_prompt(
    content: str,
    role: str = "senior software architect",
    content_type: str = "code",
    context: str | None = None,
    output_format: str | None = None,
    examples: list[str] | None = None
) -> str:
    return ANALYSIS_TEMPLATE.render(
        role=role,
        content=content,
        content_type=content_type,
        context=context,
        output_format=output_format,
        examples=examples
    )
```
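For illustration, here is how the template might be filled in; the `snippet` and the argument values are hypothetical:

```python
# Hypothetical code under review; any string works here
snippet = "def div(a, b):\n    return a / b"

prompt = create_analysis_prompt(
    content=snippet,
    context="Part of a billing module",
    output_format="Bullet list of findings",
    examples=["- Possible ZeroDivisionError when b == 0"]
)
print(prompt)  # optional sections only appear when their arguments are set
```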
Few-Shot Learning
Few-shot prompting is one of the most effective techniques. You give the model examples of the desired input/output, and it learns the pattern.
```python
def classify_with_examples(text: str) -> str:
    prompt = """Classify the following customer inquiries.
Categories: [support, sales, complaint, feedback]

Example 1:
Input: "How do I reset my password?"
Category: support

Example 2:
Input: "I'd like to hear more about the enterprise plan"
Category: sales

Example 3:
Input: "The product didn't work as promised, I want a refund"
Category: complaint

Example 4:
Input: "Love your new feature!"
Category: feedback

Now classify this one:
Input: "{text}"
Category:"""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=20,
        messages=[{"role": "user", "content": prompt.format(text=text)}]
    )
    return response.content[0].text.strip()
```
Dynamic few-shot selection
For best results, choose examples that resemble the current input. Use embeddings to find the most relevant ones:
```python
from sentence_transformers import SentenceTransformer
import numpy as np

class DynamicFewShot:
    def __init__(self, examples: list[dict]):
        self.model = SentenceTransformer('all-MiniLM-L6-v2')
        self.examples = examples
        # Pre-compute embeddings for all example inputs
        self.embeddings = self.model.encode([e['input'] for e in examples])

    def get_relevant_examples(self, query: str, k: int = 3) -> list[dict]:
        query_embedding = self.model.encode([query])[0]

        # Compute cosine similarities
        similarities = np.dot(self.embeddings, query_embedding) / (
            np.linalg.norm(self.embeddings, axis=1) * np.linalg.norm(query_embedding)
        )

        # Get top-k most similar
        top_indices = np.argsort(similarities)[-k:][::-1]
        return [self.examples[i] for i in top_indices]

# Usage
few_shot = DynamicFewShot(training_examples)
relevant = few_shot.get_relevant_examples("Can you help with integration?")
```
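The selected examples still have to be assembled into the final prompt. A minimal sketch, assuming each example dict has `input` and `category` keys like the classifier above:

```python
def build_few_shot_prompt(query: str, examples: list[dict]) -> str:
    # Assumes each example dict has 'input' and 'category' keys
    parts = [
        "Classify the following customer inquiries.",
        "Categories: [support, sales, complaint, feedback]",
        "",
    ]
    for i, ex in enumerate(examples, start=1):
        parts += [
            f"Example {i}:",
            f'Input: "{ex["input"]}"',
            f"Category: {ex['category']}",
            "",
        ]
    parts += ["Now classify this one:", f'Input: "{query}"', "Category:"]
    return "\n".join(parts)

# Combine with the dynamic selector above
query = "Can you help with integration?"
prompt = build_few_shot_prompt(query, few_shot.get_relevant_examples(query))
```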
Chain-of-Thought (CoT)
For complex reasoning tasks, ask the model to think out loud. This improves accuracy considerably, especially for math and logic.
```python
def solve_with_reasoning(problem: str) -> dict:
    prompt = f"""Solve the following problem step by step.

Problem: {problem}

Think through your answer:
1. What are the known variables?
2. Which formulas/principles are relevant?
3. Carry out the calculations step by step
4. Verify your answer

Show all of your reasoning."""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2000,
        messages=[{"role": "user", "content": prompt}]
    )

    # Extract reasoning and final answer
    full_response = response.content[0].text
    return {
        "reasoning": full_response,
        "answer": extract_final_answer(full_response)
    }
```
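`extract_final_answer` is not defined above. One possible sketch, assuming you extend the prompt with an instruction like "End with a line: Answer: <result>"; the fallback is just a heuristic:

```python
import re

def extract_final_answer(text: str) -> str | None:
    # Assumes the prompt asks the model to finish with a line "Answer: <result>"
    match = re.search(r"^Answer:\s*(.+)$", text, re.MULTILINE | re.IGNORECASE)
    if match:
        return match.group(1).strip()
    # Fallback heuristic: treat the last non-empty line as the answer
    lines = [line for line in text.splitlines() if line.strip()]
    return lines[-1].strip() if lines else None
```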
Zero-shot CoT
Sometimes it is enough to simply add "Let's think step by step":
```python
def analyze_step_by_step(code: str) -> str:
    prompt = f"""Analyze this code for potential bugs.

{code}

Let's think step by step:
1. What does the code do?
2. What edge cases could there be?
3. Are there potential runtime errors?
4. What is your conclusion?"""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1500,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text
```
Structured Output
When you need to parse output programmatically, use JSON mode or explicit format instructions:
```python
import json
import re

def extract_entities(text: str) -> dict:
    prompt = f"""Extract entities from the following text.

Text: {text}

Return ONLY valid JSON in this format:
{{
    "people": ["name1", "name2"],
    "organizations": ["org1"],
    "places": ["place1"],
    "dates": ["date1"]
}}

JSON:"""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}]
    )

    # Parse JSON from response
    raw = response.content[0].text
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fallback: pull the outermost {...} block out of surrounding prose
        match = re.search(r'\{.*\}', raw, re.DOTALL)
        if match:
            return json.loads(match.group())
        raise ValueError("Could not parse JSON from response")
```
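With the Anthropic API you can also prefill the assistant turn so the reply starts as raw JSON. A minimal sketch, reusing `prompt` from above; note the prefilled "{" must be prepended before parsing:

```python
# The final assistant message acts as a prefill; the model continues from "{"
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=500,
    messages=[
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": "{"},
    ],
)
data = json.loads("{" + response.content[0].text)
```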
Prompt Versioning
Treat prompts like code. Version them, test them, and deploy them:
```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class PromptVersion:
    version: str
    template: str
    created_at: datetime
    description: str
    test_cases: list[dict]

class PromptRegistry:
    def __init__(self):
        self.prompts: dict[str, list[PromptVersion]] = {}

    def register(
        self,
        name: str,
        template: str,
        version: str,
        description: str = "",
        test_cases: list[dict] | None = None
    ):
        if name not in self.prompts:
            self.prompts[name] = []

        self.prompts[name].append(PromptVersion(
            version=version,
            template=template,
            created_at=datetime.now(),
            description=description,
            test_cases=test_cases or []
        ))

    def get(self, name: str, version: str = "latest") -> str:
        if name not in self.prompts:
            raise KeyError(f"Prompt '{name}' not found")

        versions = self.prompts[name]
        if version == "latest":
            return versions[-1].template

        for v in versions:
            if v.version == version:
                return v.template

        raise KeyError(f"Version '{version}' not found for prompt '{name}'")
```
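A short usage sketch; the prompt name, templates, and version numbers are invented for illustration:

```python
registry = PromptRegistry()
registry.register(
    name="summary",
    template="Summarize in {max_words} words: {text}",
    version="1.0.0",
    description="Initial version"
)
registry.register(
    name="summary",
    template="Summarize in at most {max_words} words. Focus on key points: {text}",
    version="1.1.0",
    description="Sharper instructions"
)

template = registry.get("summary")         # latest (1.1.0)
pinned = registry.get("summary", "1.0.0")  # pin a version for reproducibility
```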
A/B Testing Prompts
Test different prompt variants against each other:
```python
import random
from collections import defaultdict

class PromptExperiment:
    def __init__(self, name: str, variants: dict[str, str]):
        self.name = name
        self.variants = variants
        self.results = defaultdict(list)

    def get_variant(self) -> tuple[str, str]:
        """Returns (variant_name, prompt_template)"""
        variant_name = random.choice(list(self.variants.keys()))
        return variant_name, self.variants[variant_name]

    def log_result(self, variant: str, score: float, metadata: dict | None = None):
        self.results[variant].append({
            "score": score,
            "metadata": metadata or {}
        })

    def get_stats(self) -> dict:
        stats = {}
        for variant, results in self.results.items():
            scores = [r["score"] for r in results]
            stats[variant] = {
                "count": len(scores),
                "mean": sum(scores) / len(scores) if scores else 0,
                "min": min(scores) if scores else 0,
                "max": max(scores) if scores else 0,
            }
        return stats

# Usage
experiment = PromptExperiment("summary_prompt", {
    "concise": "Summarize in 50 words: {text}",
    "detailed": "Give a detailed summary with bullet points: {text}",
    "structured": "Summarize with: 1) Main point 2) Details 3) Conclusion: {text}"
})
```
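One way to drive the experiment; `evaluation_texts` and the `score_summary` metric (human rating, LLM-as-judge, etc.) are hypothetical stand-ins you would supply yourself:

```python
for text in evaluation_texts:  # your own evaluation set
    variant_name, template = experiment.get_variant()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=300,
        messages=[{"role": "user", "content": template.format(text=text)}]
    )
    summary = response.content[0].text
    # score_summary is a hypothetical quality metric you define
    experiment.log_result(variant_name, score_summary(text, summary))

print(experiment.get_stats())
```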
Common Pitfalls
- Vague instructions - be specific about format, length, and tone
- Missing edge case handling - what should the model do if the input is empty?
- No output validation - always check that output matches the expected format (see the sketch after this list)
- Prompt injection - sanitize user input that is interpolated into prompts (see the sketch after this list)
- Token waste - keep prompts concise, but not at the expense of clarity
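A minimal sketch of the last two points, building on `extract_entities` from earlier; the delimiter stripping and `raw_input_text` are illustrative, not a complete defense against injection:

```python
def sanitize_user_input(user_text: str, max_chars: int = 4000) -> str:
    # Illustrative only: strip delimiter-like tokens and cap the length
    cleaned = user_text.replace("```", "").strip()
    return cleaned[:max_chars]

def validate_entities(data: dict) -> dict:
    # Check that the output matches the expected schema before using it
    required = {"people", "organizations", "places", "dates"}
    if not required.issubset(data):
        raise ValueError(f"Missing keys: {required - set(data)}")
    if not all(isinstance(data[key], list) for key in required):
        raise ValueError("All entity fields must be lists")
    return data

# raw_input_text is a hypothetical untrusted input
entities = validate_entities(extract_entities(sanitize_user_input(raw_input_text)))
```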
Tools
These tools can help with prompt development and testing:
- Promptfoo - open-source prompt testing framework
- LangSmith - LangChain's observability platform
- Weights & Biases - experiment tracking for ML
- Helicone - LLM observability and analytics
Next Steps
With these fundamentals you can build robust prompt systems. Check out our Claude API guide to learn about tool use, where you can combine prompts with function calling.