I like Voice Dream Reader. I don't know how the voice compares to Natural Reader, but it does emphasize words and pronounce things differently based on context cues. Those cues are things like periods and commas, though.
I find I stay about as engaged listening to Voice Dream Reader as I do with an audiobook or someone reading aloud, but this could be an effect of having listened to several days' worth of content via it.
Double checking you used "plus" voices and not just "premium" on Natural Reader? Plus still has issues but is much better than premium.
Thanks for the reply. I did use "plus." I also tried the "commercial" preview, and it's a bit better; I may end up compromising with it if I can't find a better solution.
Do you happen to have some samples handy of the types of text you are typically reading? At least a few pages from a few different sources. Try to find samples that are representative of the spectrum of content you read.
I may be able to set you up with an open source solution using Bark Audio, but it's impossible to know without poking at the Bark model and seeing if I can find a spot it works in and you start to get samples that really sound like it understands. (For example, if you use an English Bark voice with a foreign text prompt, even though the Bark TTS model knows the language, the English voice won't be able to speak it, or will have a horrific accent. Because Bark is kind of sort of modeling 'person-asked-to-speak-language-they-don't-know' in a way. Sort of like how GPT might do that if you changed language mid-conversation. Well, pre-RLHF GPT.)
I don't want to make any promises. I have terrible focus, I don't frequent this site often, and I give a 50% chance that I forget about this comment entirely until I suddenly remember I posted this three months from now. Also, while the Bark voices are wonderful (they sound like they understand what they are saying), the Bark audio quality (distortion, static) is not. You can stack another model on top to fix it, but it is annoying.
BUT it just so happens that the most recent source of my lack of focus, to some degree, has been poking at TTS stuff just for fun. Pure amateur hour over here. But the new models are so good they make a lot of stuff easy. And I just happened to see this comment after not visiting this site for weeks.
The best https://play.ht/ voices are maybe comparable, though, if you just want a quick solution. I do actually prefer Bark, if you can ignore the audio quality, but it's super unreliable and fiddly.
Thanks for the offer!
I'm trying to read through a lot of LW and astral codex posts right now. Here are some samples:
https://slatestarcodex.com/2014/12/17/the-toxoplasma-of-rage/
https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators
https://astralcodexten.substack.com/p/janus-simulators
https://www.lesswrong.com/posts/uyBeAN5jPEATMqKkX/lies-told-to-children-1
https://carado.moe/values-complex-not-objective.html
(if you meant audio as well, then for example, the sequences, LW curated podcast, and astral codex ten podcast all have lots of audio of associated text)
I think I'd be able to ignore things like static. I've listened to some decades-old recordings before with no problem.
If you think you'll forget to check this site, we could continue on a platform you use more often. My email is kuiranya (at) proton.me; I could give you my discord (for example) from there.
I'm looking into https://play.ht/ as well :)
For me, the physical act of scanning words takes active focus compared to the analogue in listening, which is automatic. (I don't think my comprehension or engagement is lower when listening, to be clear).
I've tried the naturalreaders.com 'pro' version, but experienced a few issues:
As a result, I think my brain doesn't register this AI-read text as 'something to listen to,' so it takes some active focus to continue listening, and eventually my focus shifts to something else while the audio keeps playing in the background. This does not happen with human-read text.
Anyone who can help me with this might have a high potential impact, since I'd be listening to text for a large portion of my day and am trying to do everything I can to help with alignment.