AI safety

Lexey includes built-in safety guardrails that protect both your business and your customers.

Input filtering

Every customer message is screened before reaching the support agent. The input filter blocks:

  • Prompt injection — Attempts to override the agent's instructions.
  • System prompt extraction — Attempts to get the agent to reveal its instructions.
  • Abusive content — Offensive or harmful language.
  • Adversarial inputs — Crafted inputs designed to manipulate the agent.

A blocked message never reaches the support agent; the customer receives a safe, professional refusal instead.
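
As a rough illustration of the flow described above, the sketch below screens a message before the agent is called and returns a refusal when the message is blocked. It is not Lexey's implementation: `screen_input`, `handle_customer_message`, `Verdict`, the refusal text, and the phrase lists are all hypothetical, and the naive phrase matching only stands in for whatever detection the real filter uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical refusal text; the wording Lexey actually sends is not documented here.
REFUSAL = "Sorry, I can't help with that request. Is there anything else I can help with?"


@dataclass
class Verdict:
    blocked: bool
    category: Optional[str] = None


def screen_input(message: str) -> Verdict:
    """Toy stand-in for the input filter: naive phrase matching, for illustration only."""
    text = message.lower()
    # Assumed example phrases for two of the blocked categories.
    rules = {
        "prompt_injection": ("ignore previous instructions", "disregard your instructions"),
        "system_prompt_extraction": ("reveal your system prompt", "print your instructions"),
    }
    for category, phrases in rules.items():
        if any(phrase in text for phrase in phrases):
            return Verdict(True, category)
    return Verdict(False)


def handle_customer_message(message: str, agent: Callable[[str], str]) -> str:
    """Screen first; the agent is only called for messages that pass."""
    verdict = screen_input(message)
    if verdict.blocked:
        return REFUSAL  # the agent never sees the blocked message
    return agent(message)
```

The point of the structure is ordering: the screening step runs before the agent callable is invoked, so a blocked message cannot influence the agent at all.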

Output filtering

Every assistant response is screened after generation. The output filter flags:

  • System prompt leakage — Responses that reveal internal instructions.
  • Hallucinated policies — Made-up policies or information not in the knowledge base.
  • Inappropriate content — Responses that don't meet content standards.
  • Instruction compliance — Responses that indicate a successful prompt injection.
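
To make the idea of flagging concrete, here is a minimal sketch that runs a set of named checks over a generated response and returns the categories that fired. Everything in it is an assumption for illustration: `flag_output`, `make_leak_check`, the category name, and the verbatim-overlap heuristic are hypothetical and are not how Lexey's output filter is known to work.

```python
from typing import Callable, Dict, List

Check = Callable[[str], bool]


def make_leak_check(system_prompt: str, window: int = 40) -> Check:
    """Build a check that fires when the response repeats any `window`-character
    run of the system prompt verbatim (case-insensitive). Toy heuristic only."""
    prompt = system_prompt.lower()
    size = min(window, len(prompt))

    def check(response: str) -> bool:
        if size == 0:
            return False  # nothing to compare against
        text = response.lower()
        return any(prompt[i:i + size] in text for i in range(len(prompt) - size + 1))

    return check


def flag_output(response: str, checks: Dict[str, Check]) -> List[str]:
    """Return the name of every check that fires for this response."""
    return [category for category, check in checks.items() if check(response)]
```

The sketch only reports which categories were flagged; what happens to a flagged response afterwards is not specified on this page, so the example leaves that decision to the caller.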

Web content safety

Content fetched through the management chat's URL import feature is screened by an AI classifier, which performs both URL assessment and content classification. Screening happens before the content reaches the configuration agent or is persisted to the database.
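
The gating described above can be sketched as a function that stores imported content only after both checks pass. This is an illustrative sketch under stated assumptions, not Lexey's code: `assess_url`, `classify_content`, `import_url`, and the `fetch` and `store` callables are hypothetical names, the simple rules stand in for the AI classifier, and running the URL check before fetching is an assumption about ordering.

```python
from typing import Callable
from urllib.parse import urlparse


def assess_url(url: str) -> bool:
    """Toy URL assessment: accept only http(s) URLs that include a hostname."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)


def classify_content(text: str) -> bool:
    """Placeholder for the AI content classifier; returns True when content looks safe."""
    return "ignore previous instructions" not in text.lower()


def import_url(
    url: str,
    fetch: Callable[[str], str],
    store: Callable[[str, str], None],
) -> bool:
    """Fetch and store content only if both screening steps pass."""
    if not assess_url(url):
        return False  # rejected before any fetch (assumed ordering)
    content = fetch(url)
    if not classify_content(content):
        return False  # rejected before it is stored or passed on
    store(url, content)
    return True
```

Passing `fetch` and `store` in as callables keeps the screening logic separate from networking and storage, so the gate can be exercised with fakes.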

© 2026 Lexey·Terms of Service·Privacy Policy