Visual Roadmap β Part 3: Advanced TopicsΒΆ
Fine-tuning decisions, multimodal AI, inference optimization, and AI safety.
11. Fine-tuning Decision TreeΒΆ
flowchart TD
START{Can a good prompt solve it?}
START -->|Yes| PE[Prompt Engineering]
START -->|No| DOCS{Need private/current docs?}
DOCS -->|Yes| RAG_FT[RAG]
DOCS -->|No| STYLE{Need consistent style/format?}
STYLE -->|Yes| FT[Fine-tuning]
STYLE -->|No| PRIV{Data privacy critical?}
PRIV -->|Yes| FT
PRIV -->|No| COST{Inference cost a concern?}
COST -->|Yes| FT_SMALL["Fine-tune smaller model"]
COST -->|No| PE
subgraph Fine-tuning Methods
FULL[Full Fine-tuning >40GB VRAM]
LORA["LoRA / DoRA (8-12GB)"]
QLORA["QLoRA 4-bit (4-6GB)"]
PT["Prompt Tuning (<1GB)"]
end
12. Multimodal AIΒΆ
flowchart TD
subgraph Vision Language
IMG[Image] --> VIT_M[ViT Encoder]
VIT_M --> PROJ[Projection Layer]
TXT[Text Prompt] --> LLM_M[LLM]
PROJ --> LLM_M
LLM_M --> RESP[Text Response]
end
subgraph Image Generation
PROMPT[Text Prompt] --> CLIP_M[CLIP Encoder]
CLIP_M --> UNET["U-Net (Denoising)"]
UNET --> VAE[VAE Decoder]
VAE --> IMAGE[Generated Image]
end
subgraph Models
GPT5[GPT-5.4 Vision]
GEM[Gemini 3.1 Pro]
CL[Claude Sonnet 4.6]
FLUX_M[FLUX 1.1]
SD35[Stable Diffusion 3.5]
end
13. Inference OptimizationΒΆ
flowchart LR
MODEL[Trained Model] --> Q[Quantization]
MODEL --> KV[KV-Cache Optimization]
MODEL --> SPEC[Speculative Decoding]
MODEL --> BATCH[Continuous Batching]
Q --> Q4["4-bit (GPTQ / AWQ)"]
Q --> Q8["8-bit (bitsandbytes)"]
KV --> PAGED[PagedAttention - vLLM]
KV --> PREFIX[Prefix Caching]
SPEC --> DRAFT[Draft Model + Verify]
BATCH --> VLLM[vLLM / SGLang / TGI]
VLLM --> DEPLOY[Production Deployment]
DEPLOY --> GPU["NVIDIA A100/H100"]
DEPLOY --> APPLE[Apple Silicon - MLX]
DEPLOY --> EDGE["Edge - Phi-4 / Qwen3-0.6B"]
14. AI Safety & GuardrailsΒΆ
flowchart TD
INPUT[User Input] --> V["Layer 1: Input Validation"]
V --> PI["Layer 2: Prompt Injection Detection"]
PI --> MOD["Layer 3: Content Moderation"]
MOD --> PII["Layer 4: PII Detection"]
PII --> LLM_S["Layer 5: LLM Processing"]
LLM_S --> OUT["Layer 6: Output Validation"]
OUT --> LOG["Layer 7: Monitoring & Logging"]
LOG --> RESP_S[Safe Response]
subgraph Red Teaming
RT1[Prompt Injection Attacks]
RT2[Jailbreak Attempts]
RT3[Data Extraction Probes]
RT4[Bias & Fairness Testing]
end