Microsoft Presidio
Presidio ships two main engines, Analyzer and Anonymizer. The Analyzer detects entities such as EMAIL_ADDRESS, CREDIT_CARD, PHONE_NUMBER, and IN_AADHAAR, plus custom patterns; the Anonymizer transforms the detected spans with operators such as replace, redact, mask, hash, or encrypt. It is a widely used open-source option for scrubbing PII from prompts, logs, and training data.
Framework facts
- Category: orchestration
- Language: Python
- License: MIT
- Repository: https://github.com/microsoft/presidio
Install
pip install presidio_analyzer presidio_anonymizer
python -m spacy download en_core_web_lg
Quickstart
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
text = 'Call Priya at +91-98-1234-5678 or priya@example.com.'
analyzer = AnalyzerEngine()
results = analyzer.analyze(text=text, language='en')
print(AnonymizerEngine().anonymize(text=text, analyzer_results=results).text)
Alternatives
- LLM Guard — LLM-specific security scanners
- AWS Comprehend PII
- Scrubadub — pure Python scrubber
- PIIcatcher — structured data focus
Frequently asked questions
Does Presidio detect region-specific PII like Aadhaar or PAN?
Yes, via predefined recognisers and custom patterns. Presidio's supported-entity list includes IN_AADHAAR, IN_PAN, IN_VEHICLE_REGISTRATION and more, and anything missing can be added as a regex-based or deny-list-based recogniser.
Can Presidio run inside an air-gapped network?
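Yes. Presidio runs as a Python library and its default recognisers work locally. The spaCy or transformers NER models it relies on must be downloaded ahead of time and packaged with it, after which everything can run offline.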
Sources
- Presidio — GitHub — accessed 2026-04-20
- Presidio — docs — accessed 2026-04-20