This section is about converting: www.gutenberg.org/ebooks/10, most likely from its plaintext version: stackoverflow.com/a/43060761/895245 to the same data format as www.kaggle.com/datasets/oswinrh/bible which maps book/chapter/verse to the text:
1 1 1 In the beginning God created the heaven and the earth.
1 1 2 And the earth was without form, and void; and darkness was upon the face of the deep.
One particular annoyance is that the txt version sometimes has multiple verses per line.
We'd likely just want to use a slightly modified version of: stackoverflow.com/a/43060761/895245 that searches for patterns of type:
(\d+):(\d+)
with incremental integers.
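A minimal Python sketch of that idea, not the code from the linked answer. It assumes that every verse starts with a chapter:verse marker, that a marker only counts if it continues the sequence (same chapter with verse + 1, or next chapter with verse 1), and that a 1:1 marker starts a new book. The name parse_verses is made up for illustration. Because the text is sliced between markers rather than read line by line, multiple verses per line are not a problem. Book titles and other front matter between markers are not stripped here and would end up appended to the preceding verse.

```python
import re

# Chapter:verse marker as found in the plaintext, e.g. "1:1" or "12:34".
VERSE_MARKER = re.compile(r"(\d+):(\d+)")


def parse_verses(text):
    """Yield (book, chapter, verse, verse_text) tuples.

    Assumption: a marker is only accepted if it is 1:1 (start of a new
    book) or directly follows the previously accepted marker. Any other
    number pair is treated as part of the verse text.
    """
    book = 0
    current = None  # (book, chapter, verse) of the verse being read
    start = 0       # offset where the text of the current verse begins
    for match in VERSE_MARKER.finditer(text):
        chapter, verse = int(match.group(1)), int(match.group(2))
        if (chapter, verse) == (1, 1):
            new_book = book + 1
        elif current is not None and (chapter, verse) in (
            (current[1], current[2] + 1),  # next verse, same chapter
            (current[1] + 1, 1),           # first verse of next chapter
        ):
            new_book = book
        else:
            continue  # out-of-sequence pair: not a verse marker
        if current is not None:
            # Collapse line breaks and repeated spaces inside the verse.
            yield (*current, " ".join(text[start:match.start()].split()))
        book = new_book
        current = (book, chapter, verse)
        start = match.end()
    if current is not None:
        yield (*current, " ".join(text[start:].split()))
```

Each tuple can then be written in the target format with something like print(book, chapter, verse, verse_text).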