The “em-dashes” (—) come up a lot in online translations of books like the Bible and the Quran.
The normal keyboard “-” and “–” are different from “—”, but Microsoft Office auto-formats “–” to that.
I kinda assumed it was ALL Microsoft Word data that caused training to include that.
I am only now realizing AI stole from even the religious texts and was influenced by them as well.
Well, maybe a little. Em dashes and en dashes are pretty standard (and editorially enforced) in newspapers and academic journals. By length, every religious text is eclipsed by news and journal media on a daily basis.
Each dash has a different use case that all professionally-typeset books adhere to (not just religious texts).
Hyphens are for compound words; en-dashes are for ranges or (in rare cases) to disambiguate multiple levels of hyphens; and em-dashes are for parenthetical dashes (for publishers who don’t use spaced en-dashes instead).
Yes! This is the right answer, and I love em-dashes — the’re an awesome form of punctuation!
I’m a " - " type man, have no time for extra line
My friend, “—” is fewer keystrokes than “ - ”
*They’re
I am only now realizing AI stole from even the religious texts and was influenced by them as well.
As far as I’m aware, religious texts are all public domain. While I hate AI, they should have free access to public domain just like anyone else.
And also, in my understanding, religions are supposed to help the general populace live a more fulfilled life and get to a better end result. So in this case it should be fair to put forth eminent domain (or whatever the text equivalent would be) for both the original texts and the translations.
Not all are; it’s just that many translations are old enough to be public domain. But some, like the English Standard Version of the Bible, aren’t public domain, whereas the Geneva Bible is.
While you’re largely right, it is worth noting that each translation is a distinct work under copyright law, and any translation made after 1929 may still be protected.
And that ignores really young religions, and the copyright status of high-authority extant religions such as Iranian Islam, Mormon and Roman Catholic Christianity, Ron Hubbard’s Scientology or state-atheist communism.
(Whether or not Hubbard, Lenin, Stalin, and Mao count as “religious leaders” is a distinction without a difference in discussion of the copyright status of their works.)
I had always thought it was a hangup from ‘borrowing’ pre-21st-century texts that had been OCR’d
Maybe it’s changed, but my experience with OCR is that it is not great at detecting nuances of punctuation.
Software converts human-typed comments to use fancy quotes, dashes and other punctuation. Even this platform does that with the Markdown extension Fancypants - look at the quotes in your post.
That’s where LLMs get this from.
E.g.: to get an em-dash here:
=> —
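For anyone curious what that kind of conversion looks like, here is a rough sketch in Python. It is not the actual Fancypants code, just an illustration of the idea; the function name `smarten` and the exact rules (three hyphens become an em-dash, two become an en-dash, straight quotes become curly) are my own assumptions for the example.

```python
import re


def smarten(text: str) -> str:
    """Hypothetical helper: swap ASCII punctuation for typographic forms."""
    # Handle the longest hyphen run first, so "---" is not read as "--" plus "-".
    text = text.replace("---", "\u2014")  # em dash
    text = text.replace("--", "\u2013")   # en dash
    # A double quote at the start of the text, or after whitespace or an
    # opening bracket, is treated as an opening quote; the rest are closing.
    text = re.sub(r'(^|[\s(\[{])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")
    # Same rule for single quotes; whatever is left is an apostrophe
    # or a closing quote, which share the same character.
    text = re.sub(r"(^|[\s(\[{])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text
```

Real converters cover many more edge cases (years like '90s, nested quotes, code spans), so treat this only as a picture of why plain typed comments end up full of curly quotes and long dashes.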
I think that highly depends on the client you use:
But I see your point
So-called AI steals everything, including punctuation.