LXX Analysis: The Septuagint as a Queryable Corpus
The lxx_query module exposes the full Septuagint (Rahlfs 1935, CenterBLC edition)
as a first-class queryable corpus with 623,693 tokens covering all 39 canonical OT books
plus deuterocanonical books. Each token carries word form, lemma, transliteration, gloss,
Strong's G-number, part of speech, and full morphology.
The LXX is the form of the Scriptures that the NT authors most often quoted. When Paul writes about dikaiosyne (righteousness), he uses vocabulary shaped by the LXX rendering of tsedaqah. This notebook examines how the LXX uses Greek words (their distribution, morphological forms, and book-level patterns) as a bridge between the Hebrew OT and the Greek NT.
Sections:
- LXX as a Corpus (load_lxx_data, query_lxx)
- Key Theological Words in the LXX
- LXX Verb Morphology
- Word Distribution Across LXX Books
- LXX Frequency Tables
- OT to LXX to NT Translation Pipeline (lxx_alignment)
- LXX Translation Consistency (lxx_consistency)
import sys
sys.path.insert(0, '../../../src')
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
pd.set_option('display.max_rows', 60)
pd.set_option('display.max_columns', 20)
pd.set_option('display.width', 120)
print('Ready.')
1. LXX as a Corpus
The first call to load_lxx_data() parses the LXX data and caches it as Parquet.
Subsequent calls read from that cache and return almost immediately. The corpus includes canonical and deuterocanonical books;
use is_deuterocanon=False to filter to canonical OT only.
from bible_grammar import load_lxx_data, query_lxx
df_lxx = load_lxx_data()
print(f'LXX tokens: {len(df_lxx):,}')
print(f'Books: {df_lxx["book_id"].nunique()} (incl. deuterocanon: {df_lxx["lxx_book"].nunique()})')
print(f'Canonical only: {len(df_lxx[~df_lxx["is_deuterocanon"]]):,}')
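The canonical-only count above relies on is_deuterocanon being an ordinary boolean column, so the filter is just a mask. A minimal sketch on a hand-made frame follows; the rows, the book_id values, and the canonical_only helper are illustrative assumptions and are not part of bible_grammar.

```python
import pandas as pd

# Toy stand-in for the frame returned by load_lxx_data(); all rows are invented.
toy = pd.DataFrame({
    "book_id": ["Gen", "Gen", "Isa", "Sir", "Wis"],
    "lemma": ["θεός", "ποιέω", "κύριος", "σοφία", "σοφία"],
    "is_deuterocanon": [False, False, False, True, True],
})


def canonical_only(df: pd.DataFrame) -> pd.DataFrame:
    """Keep tokens whose is_deuterocanon flag is False (hypothetical helper)."""
    return df[~df["is_deuterocanon"]]


canon = canonical_only(toy)
print(len(canon), sorted(canon["book_id"].unique()))
```

The same mask works on the real frame as long as the column holds booleans rather than strings.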
2. Key Theological Words in the LXX
The LXX translators made deliberate lexical choices when rendering Hebrew theological terms. These choices shaped NT vocabulary: διαθήκη (covenant), εἰρήνη (peace), θεός (God), and κύριος (LORD) all draw much of their NT theological weight from their LXX usage.
from bible_grammar import print_lxx_query
# diatheke (G1242) — the LXX word for covenant (translates berit)
# Per-book distribution shows where covenant language concentrates
print_lxx_query(strongs='G1242')
# theos in the LXX Prophets — the most theologically loaded word
print_lxx_query(strongs='G2316', book_group='prophets')
3. LXX Verb Morphology
The LXX translators faced choices about which Greek tense/voice/mood to use when rendering Hebrew verb forms. Study of LXX verb morphology reveals translation techniques and the translators' interpretation of the Hebrew.
from bible_grammar import lxx_verb_stats
# Verb morphology for poieo (G4160, do/make) in the LXX
# Compare with NT usage to see how the LXX shapes NT vocabulary
lxx_verb_stats(strongs='G4160')
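To make concrete what a tense/voice/mood breakdown for one lemma looks like, here is a rough sketch on toy data. It is not the implementation of lxx_verb_stats; the column names voice and mood, the verb_profile helper, and every row are assumptions made for illustration.

```python
import pandas as pd

# Toy morphology rows; values are invented, not taken from the LXX data.
toy = pd.DataFrame({
    "strongs": ["G4160"] * 5 + ["G2316"],
    "tense": ["Aorist", "Aorist", "Future", "Present", "Aorist", None],
    "voice": ["Active", "Active", "Active", "Middle", "Passive", None],
    "mood": ["Indicative", "Indicative", "Indicative", "Participle", "Indicative", None],
})


def verb_profile(df: pd.DataFrame, strongs: str) -> pd.DataFrame:
    """Count tense/voice/mood combinations for one Strong's number (hypothetical)."""
    forms = df[df["strongs"] == strongs]
    counts = forms.groupby(["tense", "voice", "mood"]).size()
    return counts.sort_values(ascending=False).rename("count").reset_index()


profile = verb_profile(toy, "G4160")
print(profile)
```

Running the same grouping on NT tokens would give a table with identical columns, which makes a side-by-side comparison straightforward.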
4. Word Distribution Across LXX Books
Per-book distribution of key theological words reveals where in the canon certain concepts concentrate. The distribution of eirene (peace) closely tracks that of Hebrew shalom.
from bible_grammar import lxx_by_book
# eirene (G1515, peace) — the LXX rendering of shalom
# Canonical OT order shows distribution pattern
eirene = lxx_by_book(strongs='G1515')
print('eirene in the LXX (canonical books):')
eirene[eirene['count'] > 0]
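A per-book table of this kind can be approximated with a groupby followed by a reindex into canonical order, so that books with no hits still appear with a count of zero. The sketch below uses invented rows; CANON_ORDER is a shortened illustrative list and by_book is a hypothetical helper, not the library's lxx_by_book.

```python
import pandas as pd

# Shortened, illustrative canonical order (assumption, not the library's list).
CANON_ORDER = ["Gen", "Exo", "Psa", "Isa"]

# Toy tokens; rows are invented for illustration.
toy = pd.DataFrame({
    "book_id": ["Psa", "Isa", "Isa", "Gen", "Isa", "Psa"],
    "strongs": ["G1515", "G1515", "G1515", "G1515", "G2316", "G1515"],
})


def by_book(df: pd.DataFrame, strongs: str) -> pd.DataFrame:
    """Count occurrences per book, ordered canonically, zero-filled."""
    hits = df[df["strongs"] == strongs]
    counts = hits.groupby("book_id").size().reindex(CANON_ORDER, fill_value=0)
    return counts.rename("count").rename_axis("book_id").reset_index()


table = by_book(toy, "G1515")
print(table[table["count"] > 0])
```

Keeping the zero rows in the returned frame and filtering only at display time, as the cell above does, preserves the canonical ordering for later plotting.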
5. LXX Frequency Tables
Frequency tables over morphological attributes reveal the LXX's verbal preferences. The LXX Prophets tend to favor Aorist and Future for divine speech, a translation technique that reflects the common mapping of the Hebrew perfect to past time and the Hebrew imperfect to future time.
from bible_grammar import lxx_freq_table
# Tense distribution for LXX verbs in the Prophets
lxx_freq_table('tense', part_of_speech='Verb', book_group='prophets')
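As a rough picture of what such a frequency table contains, the sketch below filters a toy frame by keyword arguments and reports counts with percentages. The freq_table helper and its pct column are assumptions for illustration; the real lxx_freq_table may format its output differently.

```python
import pandas as pd

# Toy tokens; rows are invented for illustration.
toy = pd.DataFrame({
    "part_of_speech": ["Verb", "Verb", "Verb", "Verb", "Noun"],
    "tense": ["Aorist", "Aorist", "Future", "Present", None],
})


def freq_table(df: pd.DataFrame, column: str, **filters) -> pd.DataFrame:
    """Frequency table for one column after equality filters (hypothetical)."""
    subset = df
    for key, value in filters.items():
        subset = subset[subset[key] == value]
    counts = subset[column].value_counts()  # missing values are dropped
    pct = (100 * counts / counts.sum()).round(1)
    return pd.DataFrame({"count": counts, "pct": pct})


table = freq_table(toy, "tense", part_of_speech="Verb")
print(table)
```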
6. OT to LXX to NT Translation Pipeline
The complete pipeline for tracing a Hebrew theological term through the LXX into the NT:
- Which Hebrew roots does the LXX render with this Greek word? (lxx_alignment)
- How is the Greek word distributed across the LXX? (lxx_by_book)
- How does the NT use the same Greek word? (query / concordance)
Example: δικαιοσύνη (righteousness), tracing Hebrew tsedeq (H6664) through its LXX rendering into Pauline theology.
from bible_grammar import lxx_alignment
# How does the LXX render shalom (H7965, peace)?
# Uses inline tree alignment — word-level, not statistical
lxx_alignment('H7965')
# ruach (H7307, spirit/wind) — how does the LXX resolve the ambiguity?
lxx_alignment('H7307')
# hesed (H2617, lovingkindness/steadfast love) — no single English word covers its range
# The LXX usually renders it eleos (mercy)
lxx_alignment('H2617')
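The alignment lookups above run from a Hebrew root to its Greek renderings, while the first pipeline question runs the other way, from a Greek word back to its Hebrew sources. Both directions are the same count over a table of aligned pairs. The sketch below assumes such a table with columns heb_strongs and lxx_lemma; those names, the two helpers, and all rows are hypothetical.

```python
import pandas as pd

# Toy word-level alignment pairs; every row is invented for illustration.
pairs = pd.DataFrame({
    "heb_strongs": ["H7307", "H7307", "H7307", "H5315", "H7965"],
    "lxx_lemma": ["πνεῦμα", "πνεῦμα", "ἄνεμος", "ψυχή", "εἰρήνη"],
})


def renderings(df: pd.DataFrame, heb_strongs: str) -> pd.Series:
    """Greek lemmas aligned to one Hebrew root, most frequent first."""
    return df.loc[df["heb_strongs"] == heb_strongs, "lxx_lemma"].value_counts()


def sources(df: pd.DataFrame, lxx_lemma: str) -> pd.Series:
    """Hebrew roots aligned to one Greek lemma, most frequent first."""
    return df.loc[df["lxx_lemma"] == lxx_lemma, "heb_strongs"].value_counts()


print(renderings(pairs, "H7307"))
print(sources(pairs, "πνεῦμα"))
```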
from bible_grammar import lxx_alignment, lxx_by_book, query_lxx
from bible_grammar import concordance
# Complete OT to LXX to NT pipeline for dikaiosyne (righteousness)
# Step 1: which Greek words does the LXX use to render tsedeq (H6664)?
print('Step 1: How LXX renders tsedeq (H6664, righteousness)')
print(lxx_alignment('H6664').head(5).to_string())
print()
# Step 2: LXX distribution for dikaiosyne (G1343)
print('Step 2: dikaiosyne in the LXX')
print(lxx_by_book(strongs='G1343')[lambda df: df['count'] > 0].to_string())
print()
# Step 3: NT usage of dikaiosyne
print('Step 3: dikaiosyne in the NT')
nt_conc = concordance(strongs='G1343')
if not nt_conc.empty:
    print(nt_conc.groupby('book_id').size().sort_values(ascending=False).head(10).to_string())
7. LXX Translation Consistency
How uniformly does the LXX render a given Hebrew root? Using IBM Model 1
word-level alignment, lxx_consistency measures:
- Consistency score (0–100): percentage of aligned tokens using the most common LXX rendering for that book
- Per-book rendering profile: which Greek lemma(s) each book uses
- Cross-book divergences: books that diverge from the corpus-wide primary
High consistency (>90%) means the LXX translators rendered the word uniformly. Low consistency often signals a Hebrew root with a wide semantic range, where the translators made different contextual choices across books or traditions.
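For readers unfamiliar with IBM Model 1, the alignment method named at the start of this section, the following is a minimal textbook-style sketch of its EM training loop on a toy parallel corpus. It is not the code inside lxx_consistency: the corpus, the transliterated placeholder tokens, and both function names are assumptions, and the NULL word of the full model is omitted for brevity.

```python
from collections import defaultdict

# Toy parallel "verses": (Hebrew tokens, Greek tokens). Invented for illustration.
corpus = [
    (["shalom", "rav"], ["eirene", "polle"]),
    (["shalom"], ["eirene"]),
    (["ruach", "rav"], ["pneuma", "polle"]),
]


def ibm_model1(pairs, iterations=10):
    """Estimate t(greek | hebrew) by EM. Simplified: no NULL source word."""
    greek_vocab = {g for _, grk in pairs for g in grk}
    # Uniform starting probabilities for every co-occurring (greek, hebrew) pair.
    t = {(g, h): 1.0 / len(greek_vocab)
         for heb, grk in pairs for g in grk for h in heb}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each Greek token among the Hebrew tokens of its verse.
        for heb, grk in pairs:
            for g in grk:
                norm = sum(t[(g, h)] for h in heb)
                for h in heb:
                    share = t[(g, h)] / norm
                    count[(g, h)] += share
                    total[h] += share
        # M-step: renormalise expected counts per Hebrew token.
        for (g, h), c in count.items():
            t[(g, h)] = c / total[h]
    return t


def best_rendering(t, hebrew):
    """Most probable Greek token for one Hebrew token."""
    candidates = {g: p for (g, h), p in t.items() if h == hebrew}
    return max(candidates, key=candidates.get)


t = ibm_model1(corpus)
print(best_rendering(t, "shalom"), best_rendering(t, "ruach"))
```

The single-word verse pins down one pairing, and EM then propagates that evidence to the ambiguous verses, which is why even this small corpus separates the remaining words.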
from bible_grammar.lxx_consistency import (
    lxx_consistency, print_lxx_consistency, consistency_heatmap
)
# ruach (H7307, spirit/wind) — famously inconsistent across LXX books
# Some books render it pneuma, others anemos (wind), others psyche
print_lxx_consistency('H7307')
# hesed (H2617, lovingkindness/steadfast love)
# The LXX uses eleos (mercy) most often but with notable divergences
print_lxx_consistency('H2617')
# shalom (H7965, peace) — highly consistent: eirene almost always
print_lxx_consistency('H7965')
# Consistency heatmap: shows rendering choices across books visually
from IPython.display import Image
path_chart = consistency_heatmap('H7307')
print(f'Saved: {path_chart}')
Image(str(path_chart))
# Raw DataFrame — per-book rendering profile
df = lxx_consistency('H2617')
df
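A per-book profile of this kind, together with the 0 to 100 consistency score and the divergence flag defined earlier in this section, can be approximated from aligned pairs as sketched below. The column names, the consistency_profile helper, and all rows are assumptions for illustration; the actual DataFrame returned by lxx_consistency may have a different shape.

```python
import pandas as pd

# Toy aligned renderings of one Hebrew root; rows are invented for illustration.
pairs = pd.DataFrame({
    "book_id": ["Gen", "Gen", "Gen", "Job", "Job", "Job", "Job"],
    "lxx_lemma": ["πνεῦμα", "πνεῦμα", "πνεῦμα",
                  "πνεῦμα", "ἄνεμος", "ἄνεμος", "ἄνεμος"],
})


def consistency_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per-book primary rendering, consistency score, and divergence flag."""
    corpus_primary = df["lxx_lemma"].value_counts().idxmax()
    rows = []
    for book, group in df.groupby("book_id", sort=False):
        counts = group["lxx_lemma"].value_counts()
        primary = counts.idxmax()
        rows.append({
            "book_id": book,
            "primary": primary,
            # Share of aligned tokens using the book's most common rendering.
            "consistency": round(100 * counts.max() / counts.sum(), 1),
            # True when the book's primary differs from the corpus-wide primary.
            "diverges": primary != corpus_primary,
        })
    return pd.DataFrame(rows)


profile = consistency_profile(pairs)
print(profile)
```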
# Batch comparison — multiple theologically significant roots
from bible_grammar.lxx_consistency import batch_consistency
roots = ['H7965', 'H2617', 'H7307', 'H1285', 'H6664']  # shalom, hesed, ruach, berit, tsedeq
df_batch = batch_consistency(roots)
df_batch