Tag
#content-moderation
2 posts tagged content-moderation.
- Defense
Output Filtering Architecture for Production LLMs: Semantic Classifiers, Regex Guards, and LLM-as-Judge
A deep dive into layered output filtering for production LLMs, combining semantic classifiers, regex scrubbing, and LLM-as-judge techniques to catch harmful, policy-violating, and hallucinated content before it reaches users or downstream systems.
- Defense
Output Filtering Architecture for Production LLMs: A Defense Engineer's Blueprint
How to architect a multi-layer output filtering pipeline for production LLMs — covering deterministic guards, ML classifiers, schema validation, and async sequencing patterns to minimize latency while maximizing coverage.