Emily Thomas

My brain also associates memories with locations, and I've noticed I do have the exact same location-based recall in full VR as I do in real life.
Thinking back, most of the points I can recall from conversations in VR come from moments when I had just moved around the virtual space (so the visuals were different), or when new people came into the conversation (especially people with interesting avatars).
This can probably be refined further.
Which I then went and did, maybe.
You can actually get it down from 187 tokens to only 87 by also removing all the punctuation.
This gave only slightly more loss of accuracy than the other examples, and seemed to preserve the main information. It could be an optional extra.
Compressed version:
don't humans also genuinely original ideas Come read fantasy book either Tolkien clone Song Ice Fire Tolkien professor Anglo Saxon language culture no secret got inspiration Song Ice Fire War Roses dragons Lannister Stark Lancaster York map Westeros Britain minus Scotland upside down Ireland stuck bottom wake sheeple Dullards blend Tolkien slurry shape another Tolkien clone Tolkien level artistic geniuses...
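For concreteness, here's a minimal sketch of the compression step being described: strip punctuation and stopwords, then count tokens. The tiktoken encoding and the stopword list are my assumptions for illustration; the thread doesn't say which tokenizer or stopword list was actually used.

```python
import re
import tiktoken  # pip install tiktoken

# Tiny illustrative stopword set; a real run would use something like
# NLTK's full English stopword list.
STOPWORDS = {
    "a", "an", "the", "and", "or", "but", "if", "of", "to", "in", "on",
    "is", "are", "was", "were", "that", "this", "it", "with", "for",
    "at", "by", "from", "as", "be", "have", "has", "we", "you", "also",
}

def compress(text: str, drop_punctuation: bool = True) -> str:
    """Remove punctuation (optionally) and stopwords from the text."""
    if drop_punctuation:
        text = re.sub(r"[^\w\s]", "", text)
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)

def count_tokens(text: str, encoding: str = "r50k_base") -> int:
    """Token count under a chosen tiktoken encoding (GPT-3 era by default)."""
    return len(tiktoken.get_encoding(encoding).encode(text))

sample = "Don't humans also have genuinely original ideas?"
squeezed = compress(sample)
print(count_tokens(sample), "->", count_tokens(squeezed), repr(squeezed))
```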
Oh, if we're only optimizing for tokens, we can get the Tolkien example down from 187 to 110.
Word stemming adds tokens (while reducing characters). If we only care about tokens, then removing the stopwords was doing all the work.
If we only remove the stopwords and nothing else we get:
don't humans also genuinely original ideas? Come, read fantasy book. either Tolkien clone, Song Ice Fire. Tolkien professor Anglo-Saxon language culture; no secret got inspiration. Song Ice Fire War Roses dragons. Lannister Stark Lancaster York, map Westeros Britain (minus Scotland) upside down-Ireland stuck bottom - wake, sheeple! Dullards blend Tolkien slurry shape another Tolkien-clone. Tolkien-level artistic geniuses blend human experience, history, artistic corpus slurry...
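The "stemming adds tokens" observation is easy to check the same way. A quick sketch (NLTK's PorterStemmer and the r50k_base encoding are my assumptions here, not necessarily what the commenter used):

```python
import tiktoken                      # pip install tiktoken
from nltk.stem import PorterStemmer  # pip install nltk

enc = tiktoken.get_encoding("r50k_base")
stemmer = PorterStemmer()

text = "Tolkien level artistic geniuses blending human experience and history"
stemmed = " ".join(stemmer.stem(w) for w in text.split())

# Stemmed forms like 'experi' are rarer strings, so they tend to split
# into more tokens even though they contain fewer characters.
print(len(text), "chars,", len(enc.encode(text)), "tokens (original)")
print(len(stemmed), "chars,", len(enc.encode(stemmed)), "tokens (stemmed)")
```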
Good idea! Let's make it better!
The main thing that comes to mind: a lot of tokens already come with a space at the start of the word. Would removing the spaces make things worse?
I put 'ButifIwrotewithoutspacesyouwouldprobablystillunderstandme' into OpenAI's tokenizer.
It has 17 tokens.
I added the spaces back in to get 'But if I wrote without spaces you would probably still under stand me'.
It has 13.
Okay, so taking spaces out made it longer instead of shorter, but it's also a short bit of text and could be a fluke. What about your Tolkien example?
Your original Tolkien text has 187 tokens. Your compressed version without spaces has 160.
If you add spaces back in, it only has 132 tokens...
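If anyone wants to reproduce this locally, here's a sketch with tiktoken (again assuming a GPT-3-era encoding; the thread used OpenAI's web tokenizer, and exact counts may differ by encoding):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("r50k_base")

with_spaces = "But if I wrote without spaces you would probably still understand me"
without_spaces = with_spaces.replace(" ", "")

# Most BPE vocabulary entries begin with a leading space, so deleting the
# spaces pushes the tokenizer onto rarer, shorter fragments.
print(len(enc.encode(with_spaces)), "tokens with spaces")
print(len(enc.encode(without_spaces)), "tokens without spaces")
```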
This is something I've been meaning to learn for a while, but haven't known where to start.
Thank you for putting it all together so nicely :)
Some of these are strikingly similar to advice on how to interview users when designing user-friendly software.
I guess it makes sense that there's some crossover.
I like it!
Rule 1: Don't destroy the world