Hebrew & Greek Verb Stem Overview
A consolidated set of statistics on Hebrew OT verb stems and Greek NT verb morphology.
This notebook replaces the older 02_query_demo.ipynb and 03_statistics.ipynb,
combining their content in one place with corrected output paths.
It demonstrates the query and stats APIs through representative charts and
ad-hoc query examples drawn from both Testaments.
Sections:
- Hebrew Verb Stems — Entire OT (freq table + bar chart)
- Verb Stem Distribution Across Torah Books (grouped bar)
- Niphal Perfect Verbs by Book (bar chart)
- Greek Tense x Voice in the NT (heatmap)
- Greek Aorist Passives — Paul vs. Rest of NT
- Qal Imperatives in the Torah (ad-hoc query example)
- Export to CSV
import sys
sys.path.insert(0, '../../../src')
import pandas as pd
import matplotlib.pyplot as plt
from bible_grammar.query import query, reload
from bible_grammar import stats, charts
reload()
%matplotlib inline
plt.rcParams['figure.dpi'] = 120
print('Ready.')
1. Hebrew Verb Stems — Entire OT
The Qal is by far the most common stem (~43,000 tokens), representing basic active verbal action. The Hiphil (causative) and Piel (intensive) are the most common derived stems. The Pual, Hophal, and Hithpael are significantly less common.
all_stems = stats.freq_table(query(source='TAHOT', part_of_speech='Verb'), 'stem')
all_stems
fig = charts.bar_chart(
    all_stems, x='stem',
    title='Hebrew Verb Stems — Entire OT',
    output_path='../../../output/charts/ot/verbs/ot_verb_stems_total.png'
)
plt.show()
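To make the frequency table above more concrete, here is a minimal pandas sketch of the kind of computation a call like stats.freq_table(df, 'stem') might perform. It is an assumption, not the library's implementation: the helper name freq_table_sketch, the toy rows, and the pct column are all invented for illustration. Only the count column is suggested by how the notebook uses the result later.

```python
import pandas as pd

# Toy token table (invented data): one row per verb token, as the real
# query() result is assumed to be.
tokens = pd.DataFrame({
    "book_id": ["Gen", "Gen", "Gen", "Exo", "Exo", "Lev"],
    "stem":    ["Qal", "Qal", "Hiphil", "Qal", "Piel", "Qal"],
})

def freq_table_sketch(df, column):
    """Count rows per value of `column`, most frequent first (hypothetical helper)."""
    counts = df.groupby(column).size().reset_index(name="count")
    # Percentage of all rows; this column is an illustrative extra.
    counts["pct"] = (counts["count"] / counts["count"].sum() * 100).round(1)
    return counts.sort_values("count", ascending=False).reset_index(drop=True)

stem_counts = freq_table_sketch(tokens, "stem")
print(stem_counts)
```

The result is in long form (one row per stem), which is the shape a bar chart with x='stem' would need.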
2. Verb Stem Distribution Across Torah Books
Each Torah book shows a similar Qal-dominant profile, but with notable variation in the derived stems. Leviticus has proportionally more Piel (ritual/priestly language) while Genesis and Numbers have higher Hiphil usage.
torah_stems = stats.freq_table(
    query(book_group='torah', part_of_speech='Verb'), ['book_id', 'stem']
)
# Keep top 6 stems for readability
top_stems = torah_stems.groupby('stem')['count'].sum().nlargest(6).index
torah_top = torah_stems[torah_stems['stem'].isin(top_stems)]
torah_top.head(20)
fig = charts.grouped_bar(
    torah_top, x='book_id', hue='stem',
    title='Hebrew Verb Stems in Torah Books (top 6 stems)',
    output_path='../../../output/charts/ot/verbs/torah_verb_stems_grouped.png'
)
plt.show()
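Because the Torah books differ in length, claims about a book having "proportionally more" of a stem are easier to check with within-book shares than with raw counts. The sketch below shows one way to normalize a (book_id, stem, count) table with plain pandas. The numbers are invented and the share column is an assumption. It is not part of the stats API.

```python
import pandas as pd

# Invented counts in the same long shape as torah_stems: book_id, stem, count.
counts = pd.DataFrame({
    "book_id": ["Gen", "Gen", "Gen", "Lev", "Lev", "Lev"],
    "stem":    ["Qal", "Hiphil", "Piel", "Qal", "Hiphil", "Piel"],
    "count":   [70, 20, 10, 60, 10, 30],
})

# Divide each count by its book's total so books of different size are comparable.
book_totals = counts.groupby("book_id")["count"].transform("sum")
counts["share"] = counts["count"] / book_totals

# Wide layout: one row per book, one column per stem.
shares = counts.pivot(index="book_id", columns="stem", values="share")
print(shares.round(2))
```

Each row of shares sums to 1, so a difference between two books in a given column reflects the stem profile rather than the size of the book.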
3. Niphal Perfect Verbs by Book
Jeremiah, Isaiah, and Ezekiel lead in Niphal Perfects — prophetic passive declarations of divine action. Leviticus ranks high due to purity-law passives ('it shall be unclean / it is cut off').
nip = stats.niphal_perfects_by_book()
print(f"Total Niphal Perfects: {nip['count'].sum()}")
nip.head(10)
fig = charts.bar_chart(
    nip, x='book_id',
    title='Niphal Perfect Verbs by OT Book (top 20)',
    xlabel='Book', top_n=20,
    output_path='../../../output/charts/ot/verbs/niphal_perfects_by_book.png'
)
plt.show()
4. Greek Tense x Voice in the NT
The NT verb system is dominated by Present Active and Aorist Active forms. The heatmap reveals the relative scarcity of Middle voice in the NT compared to classical Greek, and the near-absence of Perfect Passive constructions.
nt_verbs = stats.greek_verb_forms()
tv = nt_verbs.groupby(['tense', 'voice'])['count'].sum().reset_index()
fig = charts.heatmap(
    tv, index='tense', columns='voice',
    title='Greek NT: Tense x Voice',
    output_path='../../../output/charts/nt/verbs/nt_tense_voice.png'
)
plt.show()
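A heatmap with index='tense' and columns='voice' needs the long-form counts reshaped into a tense-by-voice matrix. How charts.heatmap does this internally is not shown in this notebook. The following is a sketch of the reshaping step using pandas pivot_table, with invented counts.

```python
import pandas as pd

# Invented counts in long form (tense, voice, count), mirroring the shape of `tv`.
tv_toy = pd.DataFrame({
    "tense": ["Present", "Present", "Aorist", "Aorist", "Perfect"],
    "voice": ["Active", "Passive", "Active", "Passive", "Passive"],
    "count": [50, 10, 40, 15, 5],
})

# One row per tense, one column per voice; combinations with no rows become 0.
matrix = tv_toy.pivot_table(
    index="tense", columns="voice", values="count",
    aggfunc="sum", fill_value=0,
)
print(matrix)
```

Filling missing combinations with zero keeps rare or unattested tense and voice pairings visible as empty cells rather than dropping them from the grid.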
5. Greek Aorist Passives — Paul vs. Rest of NT
Aorist Passive forms are often read as instances of the 'divine passive' discussed in theological literature: a passive that describes divine action without naming God explicitly. Paul uses the Aorist Passive frequently in his theological arguments (Romans, Galatians).
paul_ap = query(book_group='pauline', tense='Aorist', voice='Passive')
# Query the whole NT once and reuse the result for the split and the per-book counts
nt_ap = query(testament='NT', tense='Aorist', voice='Passive')
# Rest of NT = NT Aorist Passives in books not returned by the Pauline query
other_ap = nt_ap[~nt_ap['book_id'].isin(paul_ap['book_id'].unique())]
print(f"Aorist Passives — Paul: {len(paul_ap)}, Rest of NT: {len(other_ap)}")
by_book_count = stats.freq_table(nt_ap, 'book_id')
fig = charts.bar_chart(
    by_book_count, x='book_id', top_n=None,
    title='Greek Aorist Passives by NT Book',
    output_path='../../../output/charts/nt/verbs/nt_aorist_passive_by_book.png'
)
plt.show()
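The Paul versus rest-of-NT comparison is a two-way partition of one table by book membership. The sketch below shows that partition on invented rows. The book IDs and the contents of pauline_books are assumptions for illustration and may not match the identifiers used by query().

```python
import pandas as pd

# Invented rows: one per Aorist Passive token. Book IDs are illustrative only.
aorist_passives = pd.DataFrame({
    "book_id": ["Rom", "Rom", "Gal", "Mat", "Mat", "Jhn", "Act"],
})
pauline_books = {"Rom", "Gal"}  # assumed membership list, deliberately incomplete

# A single boolean mask and its negation split the table into two subsets.
is_pauline = aorist_passives["book_id"].isin(pauline_books)
paul = aorist_passives[is_pauline]
rest = aorist_passives[~is_pauline]

# The subsets partition the table: no row is lost or counted twice.
assert len(paul) + len(rest) == len(aorist_passives)
print(f"Paul: {len(paul)}, Rest: {len(rest)}")
```

Deriving both subsets from one mask guarantees that the two totals add up to the whole, which is a useful sanity check before comparing them.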
6. Qal Imperatives in the Torah (Ad-hoc Query Example)
This demonstrates the ad-hoc query() API — any combination of filters can
be applied on the fly. Genesis has by far the most Qal imperatives (divine commands,
patriarchal directives); Leviticus has relatively few (mostly categorical law).
imperatives = query(book_group='torah', stem='Qal', conjugation='Imperative')
print(f"Qal Imperatives in Torah: {len(imperatives)}")
by_book = (
    imperatives.groupby('book_id').size()
    .reset_index(name='count')
    .sort_values('count', ascending=False)
)
by_book
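One way to picture how keyword filters such as stem='Qal' and conjugation='Imperative' could combine is as a chain of equality masks joined with logical AND. The sketch below assumes exactly that. filter_rows is a hypothetical helper on invented data, not the actual query() implementation, which may support richer filters such as book_group.

```python
import pandas as pd

def filter_rows(df, **filters):
    """Keep rows where every named column equals the given value (hypothetical helper)."""
    mask = pd.Series(True, index=df.index)
    for column, value in filters.items():
        mask &= df[column] == value  # each filter narrows the selection further
    return df[mask]

# Invented word-level rows for illustration.
words = pd.DataFrame({
    "book_id":     ["Gen", "Gen", "Exo", "Lev"],
    "stem":        ["Qal", "Qal", "Qal", "Piel"],
    "conjugation": ["Imperative", "Perfect", "Imperative", "Imperative"],
})

hits = filter_rows(words, stem="Qal", conjugation="Imperative")
per_book = hits.groupby("book_id").size().reset_index(name="count")
print(per_book)
```

Calling filter_rows with no keyword arguments returns every row, so each added filter can only shrink the result.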
# Verb stem distribution in a single book
gen_stems = stats.verb_stems_by_book(book='Gen')
fig = charts.bar_chart(
    gen_stems, x='stem',
    title='Hebrew Verb Stems in Genesis',
    xlabel='Stem', top_n=None,
    output_path='../../../output/charts/ot/verbs/genesis_verb_stems.png'
)
plt.show()
7. Export to CSV
# Export Niphal perfect details
nip_detail = query(stem='Niphal', conjugation='Perfect')
nip_detail.to_csv('../../../output/exports/niphal_perfects.csv', index=False)
print(f"Saved {len(nip_detail)} rows to output/exports/niphal_perfects.csv")
# Export all Hebrew and Greek verb data
heb_verbs = query(source='TAHOT', part_of_speech='Verb')
heb_verbs.to_csv('../../../output/exports/hebrew_verbs.csv', index=False)
print(f"Hebrew verbs exported: {len(heb_verbs):,} rows")
grk_verbs = query(source='TAGNT', part_of_speech='Verb')
grk_verbs.to_csv('../../../output/exports/greek_verbs.csv', index=False)
print(f"Greek verbs exported: {len(grk_verbs):,} rows")
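DataFrame.to_csv does not create missing directories, so the exports above fail if output/exports does not yet exist. A small wrapper can create the parent directory first. The helper name export_csv and the demo file name are hypothetical, and the example writes into a temporary directory rather than the project tree.

```python
import tempfile
from pathlib import Path

import pandas as pd

def export_csv(df, path):
    """Write df to path, creating missing parent directories first (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, index=False)
    return path

# Demonstrate in a temporary directory so nothing in the project is touched.
with tempfile.TemporaryDirectory() as tmp:
    rows = pd.DataFrame({"book_id": ["Gen", "Exo"], "stem": ["Niphal", "Niphal"]})
    target = export_csv(rows, Path(tmp) / "output" / "exports" / "demo.csv")
    round_trip = pd.read_csv(target)  # read back to confirm the file is intact
    print(f"Saved {len(round_trip)} rows to {target.name}")
```

Reading the file back is optional, but it gives a quick check that the row count in the printed message matches what was actually written.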