OT Speaker Attribution & Discourse Particles
This notebook identifies who speaks in the Hebrew Bible using MACULA Hebrew subjref
links on speech-verb tokens, and analyzes key Hebrew discourse particles.
Speech verbs tracked: אָמַר (say), דָּבַר (speak), קָרָא (call/proclaim), עָנָה (answer), צָוָה (command), שָׁלַח (send), נָאַם (declare/oracle formula).
Discourse particles analyzed: הִנֵּה (presentative), כִּי (connective/causal), וְ (connective), לָכֵן (consequence), עַתָּה (temporal), גַּם (additive), אַךְ (restrictive).
Sections:
- OT Speaker Attribution (print_speaker_summary)
- Divine Speech by Book (print_divine_speech_by_book)
- Who Speaks in a Book (who_speaks)
- Divine Speech Verse References (divine_speech_verses)
- Generate Speaker Report
- Discourse Particle Tagging
- Particle Summary by Book
- Cross-Book כִּי Comparison
import sys
sys.path.insert(0, '../../../src')
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
pd.set_option('display.max_rows', 60)
pd.set_option('display.max_columns', 20)
pd.set_option('display.width', 120)
print('Ready.')
1. OT Speaker Attribution
The ot_speaker module identifies who speaks in the Hebrew Bible using MACULA
Hebrew subjref links on speech-verb tokens. This answers: what proportion of
each OT book is direct divine speech? Who dominates dialogue in Job, Genesis, Jeremiah?
What speech verbs does YHWH use in Isaiah vs Deuteronomy?
from bible_grammar import print_speaker_summary
# What does YHWH+Elohim say in Isaiah?
print_speaker_summary(['H3068', 'H0430'], books=['Isa'], label='YHWH+Elohim')
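The subjref mechanism described above can be illustrated with a toy table. This is a minimal sketch, not the `ot_speaker` implementation: the column names (`id`, `strongs`, `subjref`), the helper `attribute_speakers`, and the sample rows are all hypothetical, and the real MACULA schema may differ.

```python
import pandas as pd

# Hypothetical token table: each speech verb may point at its subject token
# through a subjref-style link. Column names are assumptions for illustration.
tokens = pd.DataFrame({
    'id':      ['t1', 't2', 't3', 't4', 't5'],
    'strongs': ['H3068', 'H0559', 'H4872', 'H1696', 'H6030'],
    'subjref': [None, 't1', None, 't3', None],
})

# Illustrative subset of speech-verb Strong's numbers (say, speak, answer).
SPEECH_VERBS = {'H0559', 'H1696', 'H6030'}


def attribute_speakers(tokens, speech_verbs):
    """Resolve each speech verb's subject link to the subject's Strong's number."""
    strongs_by_id = tokens.set_index('id')['strongs']
    verbs = tokens[tokens['strongs'].isin(speech_verbs)].copy()
    # Verbs without a subject link stay unattributed (NaN).
    verbs['speaker'] = verbs['subjref'].map(strongs_by_id)
    return verbs[['id', 'strongs', 'speaker']]


result = attribute_speakers(tokens, SPEECH_VERBS)
print(result.to_string(index=False))
```

In this sketch a verb with no subject link is left unattributed rather than guessed, which is one plausible way to handle implicit subjects.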
2. Divine Speech by Book
Per-book divine speech percentage across the entire OT. Lamentations (~31.8%) and Psalms (~25.9%) have the highest ratios; Leviticus (~22.0%) reflects dense legal/priestly speech. Historical narratives tend to have lower percentages.
from bible_grammar import print_divine_speech_by_book
print_divine_speech_by_book(min_count=3)
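One way such a per-book percentage could be computed is sketched below. It assumes the ratio is the share of attributed speech-verb tokens whose speaker is divine; the notebook does not state the exact definition, so treat this as an assumption. The helper `divine_share`, its `min_total` threshold, and the toy rows are hypothetical, and `min_total` is not claimed to behave like `min_count` above.

```python
import pandas as pd

DIVINE = {'H3068', 'H0430'}  # YHWH and Elohim, as used elsewhere in this notebook

# Toy attributed speech tokens (not real counts).
speech = pd.DataFrame({
    'book':    ['Gen', 'Gen', 'Gen', 'Gen', 'Lev', 'Lev'],
    'speaker': ['H3068', 'H0085', 'H0430', 'H3290', 'H3068', 'H4872'],
})


def divine_share(speech, min_total=3):
    """Percentage of speech tokens per book whose speaker is divine."""
    flagged = speech.assign(divine=speech['speaker'].isin(DIVINE))
    grouped = flagged.groupby('book')['divine']
    out = pd.DataFrame({'total': grouped.size(), 'divine': grouped.sum()})
    # Drop books with too few speech tokens for a stable percentage.
    out = out[out['total'] >= min_total].copy()
    out['pct'] = (100 * out['divine'] / out['total']).round(1)
    return out


shares = divine_share(speech, min_total=3)
print(shares)
```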
3. Who Speaks in a Book
Character dialogue breakdown for individual books. Job has a distinctive multi-voice structure: Job dominates (47 speech tokens), with Elihu, God, and the three friends as secondary voices. Genesis dialogue is dominated by YHWH/Elohim and the patriarchs.
from bible_grammar import who_speaks
# Who speaks in Job? — character dialogue breakdown
print('=== Who speaks in Job ===')
print(who_speaks('Job').to_string(index=False))
# Who speaks in Genesis?
print('=== Who speaks in Genesis ===')
print(who_speaks('Gen', top_n=15).to_string(index=False))
4. Divine Speech Verse References
Retrieve all verse references where YHWH speaks in a given book. These can be used for sermon preparation, course illustration, or targeted syntactic study of divine speech patterns.
from bible_grammar import divine_speech_verses
# All refs where YHWH speaks in Jeremiah
refs = divine_speech_verses('Jer')
print(f'Jeremiah: {len(refs)} YHWH speech refs')
for r in refs[:10]:
    print(f' {r}')
5. Generate Speaker Report
Generates a full Markdown report for a given speaker in a given book. Includes top speech verbs, book distribution, and cross-testament context.
from bible_grammar import speaker_report
# Generate full Markdown report for YHWH speech in Isaiah
report = speaker_report(
    ['H3068', 'H0430'], books=['Isa'], label='YHWH+Elohim',
    output_dir='../../../output/reports/ot/lexicon'
)
print(f'Report: {report}')
6. Discourse Particle Tagging
Seven key Hebrew discourse particles, classified by function using MACULA's English gloss:
| Particle | Label | Functions detected |
|---|---|---|
| הִנֵּה | presentative | attention-getter ('behold/look') |
| כִּי | connective | causal / content / adversative / conditional / asseverative / temporal |
| וְ | connective | sequential / adversative / logical / emphatic / temporal |
| לָכֵן | consequence | 'therefore / so' |
| עַתָּה | temporal | discourse 'now' (logical pivot) |
| גַּם | additive | 'also / even' (emphasis) |
| אַךְ | restrictive | 'only / surely / but' |
from bible_grammar import print_discourse_particles, print_particle_summary
# Isaiah 40: three hinne + ki content/causal/temporal clauses
print_discourse_particles('Isa', 40)
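Gloss-based classification of כִּי could look roughly like the following. This is only a sketch of the general idea: the keyword lists and the function `classify_ki` are hypothetical, and the rules the library actually applies to MACULA glosses are not shown in this notebook.

```python
# Hypothetical keyword rules mapping an English gloss to a discourse function.
# Order matters: the first matching rule wins.
KI_RULES = [
    ('causal', ('because', 'for', 'since')),
    ('content', ('that',)),
    ('conditional', ('if',)),
    ('temporal', ('when',)),
    ('adversative', ('but', 'rather')),
    ('asseverative', ('surely', 'indeed')),
]


def classify_ki(gloss):
    """Return a discourse-function label for a gloss, or 'unclassified'."""
    words = gloss.lower().split()
    for function, keywords in KI_RULES:
        if any(word in keywords for word in words):
            return function
    return 'unclassified'


for gloss in ['because', 'that', 'When', 'but rather', '']:
    print(repr(gloss), '->', classify_ki(gloss))
```

A gloss-only rule of this kind cannot separate senses that share an English rendering (for example causal and asseverative "for"), so any real classifier would likely need syntactic context as well.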
7. Particle Summary by Book
In Genesis, כִּי is 55% causal and 29% content. Deuteronomy has more conditional כִּי (legal protasis) and more לָכֵן consequence markers, reflecting its instructional/covenantal genre.
# Genesis ki sense breakdown
print_particle_summary('Gen')
# Deuteronomy: more ki conditional (legal protasis) and laken consequence
print_particle_summary('Deu')
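Percentages like those quoted for Genesis can be derived from raw function counts. The sketch below uses toy numbers, not real data, and assumes a summary table with `discourse_function` and `count` columns (the names used in the cross-book comparison in this notebook); whether `print_particle_summary` computes its figures this way is not confirmed.

```python
import pandas as pd

# Toy counts for one particle in one book (illustrative only).
summary = pd.DataFrame({
    'discourse_function': ['causal', 'content', 'conditional', 'temporal'],
    'count': [11, 6, 2, 1],
})

# Share of each function among all occurrences of the particle.
summary['pct'] = (100 * summary['count'] / summary['count'].sum()).round(1)
print(summary.to_string(index=False))
```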
8. Cross-Book כִּי Comparison
The multi-functional כִּי is one of the most important words in Biblical Hebrew discourse analysis. Its distribution across causal, content, adversative, conditional, and temporal functions varies significantly by genre.
from bible_grammar import discourse_particle_summary
books = ['Gen', 'Deu', 'Isa', 'Psa', 'Job']
frames = []
for b in books:
    df = discourse_particle_summary(b)
    # Keep only the rows for ki (כִּי)
    ki = df[df['particle_label'] == '\u05db\u05bc\u05b4\u05d9'].copy()  # כִּי
    ki['book'] = b
    frames.append(ki)
combined = pd.concat(frames, ignore_index=True)
pivot = combined.pivot_table(
    index='discourse_function', columns='book', values='count',
    aggfunc='sum', fill_value=0  # sum counts; the default aggfunc is the mean
)
print('=== ki function distribution by book ===')
print(pivot.to_string())
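Filtering on a pointed Hebrew string is sensitive to the order of combining marks: the dagesh and hiriq on כִּי can be stored in either order, and the two spellings render identically but compare unequal. If the filter above returns no rows, normalizing both sides before comparing is one thing to try. The helper `same_hebrew` is hypothetical, and whether the dataset needs this step is an assumption.

```python
import unicodedata


def same_hebrew(a, b):
    """Compare two strings after canonical (NFC) normalization."""
    return unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)


ki_dagesh_first = '\u05db\u05bc\u05b4\u05d9'  # kaf, dagesh, hiriq, yod
ki_hiriq_first = '\u05db\u05b4\u05bc\u05d9'   # kaf, hiriq, dagesh, yod

print(ki_dagesh_first == ki_hiriq_first)             # raw comparison
print(same_hebrew(ki_dagesh_first, ki_hiriq_first))  # normalized comparison
```

Normalization reorders combining marks by their canonical combining class, so both spellings collapse to the same sequence before the equality test.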