Islamic Primary Source Corpus
668,436 structured, attributed, watermarked documents.
One searchable library.
Every document indexed, cross-referenced, and accessible through Theo.
Quran
القرآن الكريم
6,236
ayat
Complete Quranic text with full Arabic vowelization, byte-verified against the Tanzil.net reference standard.
Hadith
الحديث النبوي
449,415
narrations across 61 collections
Prophetic sayings, actions, and approvals spanning 13 tiers of scholarly authority with structured narrator chains and attributed grades.
Rijal
علم الرجال
197,868
narrator assessments from 20 works
Biographical dictionaries spanning 600 years of Islamic scholarship, with extracted assessments and cross-work consensus grades.
Ilal
علل الحديث
14,917
chain defect analyses from 8 works
Specialist analyses documenting hidden transmission problems not apparent from surface-level grading.
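The byte-level check mentioned for the Quran index can be illustrated with a short Python sketch. It is not the IPSC pipeline: the function names are hypothetical and the sample is a single vowelized letter rather than reference text. It shows why a byte comparison is stricter than a Unicode-normalized one for fully vowelized Arabic, where the same marks can be stored in different orders.

```python
import hashlib
import unicodedata


def byte_identical(candidate: str, reference: str) -> bool:
    """True only if both texts encode to exactly the same UTF-8 bytes."""
    return candidate.encode("utf-8") == reference.encode("utf-8")


def sha256_hex(text: str) -> str:
    """Digest of the UTF-8 bytes, convenient for comparing whole files."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The letter ba carrying shadda and kasra, stored in two different mark orders.
shadda_first = "\u0628\u0651\u0650"
kasra_first = "\u0628\u0650\u0651"

# Unicode normalization treats the two spellings as equivalent...
same_after_nfc = (
    unicodedata.normalize("NFC", shadda_first)
    == unicodedata.normalize("NFC", kasra_first)
)
# ...but a byte-level comparison does not.
same_bytes = byte_identical(shadda_first, kasra_first)
```

Under this sketch, a text counts as verified only when `byte_identical` holds against the reference copy, so even a reordering of vowel marks would be caught.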
61 collections. 449,415 narrations.
From the Kutub al-Sittah through Musnad Aḥmad and 53 additional major, supplementary, and specialist collections.
The Kutub al-Sittah
Ṣaḥīḥ al-Bukhārī
al-Bukhārī (d. 256 AH)
7,135 entries
Ṣaḥīḥ Muslim
Muslim ibn al-Ḥajjāj (d. 261 AH)
7,460 entries
Sunan Abī Dāwūd
Abū Dāwūd (d. 275 AH)
5,275 entries
Jāmiʿ al-Tirmidhī
al-Tirmidhī (d. 279 AH)
3,910 entries
Sunan al-Nasāʾī
al-Nasāʾī (d. 303 AH)
5,769 entries
Sunan Ibn Mājah
Ibn Mājah (d. 273 AH)
4,341 entries
All 61 collections indexed and searchable through Theo — including Muwaṭṭaʾ Mālik, Musnad Aḥmad, Sunan al-Bayhaqī, and 52 additional works spanning tafsīr, sīra, methodology, and fabrication catalogs.
Narrator assessment at scale.
197,868 entries spanning 600 years of Islamic scholarship.
The Rijal index aggregates narrator biographies from 20 classical biographical dictionaries — from al-Bukhārī's al-Tārīkh al-Kabīr through Ibn Ḥajar's Tahdhīb al-Tahdhīb and al-Dhahabī's Siyar Aʿlām al-Nubalāʾ.
Each narrator entry carries extracted assessments — trustworthy, weak, contested — with the source scholar attributed. When multiple works assess the same narrator, Theo surfaces the full assessment history and detects disagreement patterns across centuries of criticism.
Edition 2 added the three recensions of Ibn Maʿīn (7,421 entries). Ibn Maʿīn is the founding figure of hadith criticism, and his assessments predate most other rijal literature.
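The cross-work comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not Theo's implementation: the `Assessment` record, its field names, and the sample rows (placeholder scholars and works) are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    narrator: str  # hypothetical narrator identifier
    scholar: str   # scholar the verdict is attributed to
    work: str      # biographical dictionary it was extracted from
    verdict: str   # "trustworthy", "weak", or "contested"


def summarize(assessments):
    """Group assessments by narrator and flag narrators whose verdicts differ."""
    by_narrator = defaultdict(list)
    for item in assessments:
        by_narrator[item.narrator].append(item)

    summary = {}
    for narrator, history in by_narrator.items():
        verdicts = {item.verdict for item in history}
        summary[narrator] = {
            "history": history,                  # full assessment history
            "disagreement": len(verdicts) > 1,   # more than one distinct verdict
        }
    return summary


# Invented placeholder rows, not corpus data.
sample = [
    Assessment("narrator-001", "Scholar A", "Work X", "trustworthy"),
    Assessment("narrator-001", "Scholar B", "Work Y", "weak"),
    Assessment("narrator-002", "Scholar A", "Work X", "trustworthy"),
    Assessment("narrator-002", "Scholar C", "Work Z", "trustworthy"),
]
report = summarize(sample)
```

Here `report["narrator-001"]["disagreement"]` is true because the two attributed verdicts differ, while `narrator-002` has a consistent history.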
What changed in Edition 2
Released March 2026. Key structural improvements over Edition 1.
+35,774 documents
From 632,662 to 668,436 total documents across all four indexes.
Isnad/matn separation
99.4% of hadith texts structurally parsed into narrator chain and prophetic text.
Attributed grades
57,218 hadith graded with full provenance — named scholar, specific book, methodology.
Structured chains
110,655 documents with ordered narrator arrays, averaging 6.2 narrators per chain.
3 new rijal works
The three recensions of Ibn Maʿīn, the founding critic in hadith science.
Asbāb al-Nuzūl
1,228 Quranic revelation context entries — a first in computational Islamic studies.
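The structural features listed above (isnad/matn separation, ordered narrator arrays, attributed grades) suggest a document shape along the following lines. This Python sketch is only an assumption about how such a record could be modeled: the class names, field names, and placeholder values are hypothetical and are not the IPSC schema.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Grade:
    label: str        # the grade itself
    scholar: str      # named scholar who issued it
    book: str         # specific book the grade comes from
    methodology: str  # how the grade was reached


@dataclass
class HadithDoc:
    isnad: str                     # narrator chain as text
    matn: str                      # prophetic text
    chain: List[str]               # ordered narrator array
    grade: Optional[Grade] = None  # not every document carries a grade


def average_chain_length(docs: List[HadithDoc]) -> float:
    """Mean number of narrators over documents that have a parsed chain."""
    lengths = [len(doc.chain) for doc in docs if doc.chain]
    return mean(lengths) if lengths else 0.0


# Placeholder documents, not corpus data.
docs = [
    HadithDoc("<isnad text>", "<matn text>", ["n1", "n2", "n3", "n4", "n5"]),
    HadithDoc(
        "<isnad text>",
        "<matn text>",
        ["n1", "n2", "n3", "n4", "n5", "n6", "n7"],
        Grade("<grade>", "<scholar>", "<book>", "<methodology>"),
    ),
    HadithDoc("<unparsed text>", "", []),  # no structured chain, so skipped
]
avg = average_chain_length(docs)
```

Keeping the chain as an ordered list, rather than a single string, is what makes per-chain statistics such as the average length straightforward to compute.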
How we built it.
Academic-grade data quality with enterprise-grade integrity protection.
3
Independent sources
18
Enrichment stages
135K+
Validation tests
HMAC
SHA-256 watermark
Every document in the IPSC™ passes through 18 enrichment stages, from base field mapping and IP branding through LLM-assisted isnad separation and structured chain parsing. Three independent source pipelines are cross-validated before any document enters the corpus. A cryptographic HMAC-SHA256 signature is embedded in every document as proof of origin: a valid signature cannot be forged without the secret key.
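How an HMAC-SHA256 signature can serve as proof of origin is easiest to see in code. The sketch below uses Python's standard `hmac` and `hashlib` modules. It is an illustration under assumptions, not the IPSC embedding scheme: the `watermark` field name, the JSON canonicalization, and the sample key and document are all hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(doc: dict) -> bytes:
    """Serialize the document deterministically, excluding the watermark field."""
    body = {k: v for k, v in doc.items() if k != "watermark"}
    text = json.dumps(body, sort_keys=True, ensure_ascii=False,
                      separators=(",", ":"))
    return text.encode("utf-8")


def sign(doc: dict, key: bytes) -> dict:
    """Return a copy of the document carrying an HMAC-SHA256 tag."""
    tag = hmac.new(key, canonical_bytes(doc), hashlib.sha256).hexdigest()
    return {**doc, "watermark": tag}


def verify(doc: dict, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, canonical_bytes(doc), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, doc.get("watermark", ""))


# Hypothetical key and document, for illustration only.
key = b"example-secret-key"
signed = sign({"id": "doc-1", "text": "sample text"}, key)
tampered = {**signed, "text": "altered text"}

ok = verify(signed, key)
ok_after_tamper = verify(tampered, key)
ok_with_wrong_key = verify(signed, b"some-other-key")
```

Only a holder of the key can produce or check a valid tag, so any change to the signed fields makes verification fail. HMAC is a symmetric construction, which means the same secret key is used for both signing and verifying.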
Explore the corpus through Theo
Search 61 collections. Trace narrator chains. Verify grades. All from a single conversation.