
Bad Horse


Beneath the microscope, you contain galaxies.


Monday musings: Pony typewriters · 2:32am Jun 16th, 2015

Recently I saw this picture on Chatoyance’s page:

… and I thought that perhaps I should explain to you all why our typewriters have 6 keys, plus the spacebar, which you can’t really see in that picture.

I know what you primates are thinking: “Hah! How silly.  Just 6 keys!”  So proud of your hoof-tentacles.

They look like pink, fleshy spiders to me. :trixieshiftright:

Well, think, human:  A human typewriter has over 50 keys, but only 43 that you use regularly.  So each key conveys at most log2(43) = 5.43 bits of information.

Bits

A "bit", you ask? A bit is how much information you get when someone tells you one out of two choices. It's kind of [0] the smallest piece of information you can have. If a typewriter had only 2 keys, and I knew you had to press one of them, showing me which one you pressed would give me 1 bit of information. (You can't give me less information [0] because if a typewriter had only 1 key, and I knew you had to press one of them, showing me which one you pressed would give me no information, because there's only 1 choice.)

If a typewriter had 4 keys arranged in a square, a way to tell me which key somepony typed would be first to tell me if it was on the top or bottom of the keyboard (1 bit), then if it was on the left or right side (another bit), = 2 bits of information.

If a typewriter had 8 keys arranged in a cube (two stacked squares), you could tell me which key somepony pressed by telling me which square it was from (1 bit), then top or bottom (1 bit), then left or right (1 bit). That's 3 bits of information. I think you can see that specifying one thing out of 2^n things gives n bits of information.

If x = 2^n, we say n is the logarithm base 2 of x, or log2(x). That's true even if n isn't an integer, but something like 2.8. [3] "Logarithm base 2" is just an ugly but shorter way of saying "the inverse of the function f(n) = 2^n". (That means you swap the input and the output: f(n) = x means the same as log2(x) = n.)
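The log2 arithmetic above is easy to check in a couple of lines (a minimal sketch; `bits_per_key` is my own name, not anything from the post):

```python
import math

# Choosing 1 key out of n equally likely keys conveys log2(n) bits.
def bits_per_key(n: int) -> float:
    return math.log2(n)

print(bits_per_key(2))   # 1.0 -- one binary choice
print(bits_per_key(4))   # 2.0 -- top/bottom plus left/right
print(bits_per_key(8))   # 3.0 -- the two-squares example
print(bits_per_key(43))  # ~5.43 -- the 43 regularly-used human keys
```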

Everything in the universe is reducible to matter and magic, which are both reducible to energy, which is reducible to information. That's why Equestrian currency is based on the bit. It's priced so that one bit of currency is the value of one bit of new information to Princess Celestia. (This has resulted in inflation over time, as there are fewer and fewer things Princess Celestia doesn't know. [4])

A pony typewriter with 6 keys conveys up to log2(6) = 2.58 bits of information per key.  So it would take only 5.43 / 2.58 = 2.1 times as many pony letters from a 6-letter alphabet to convey the same information.

(Unfortunately, the Equestrian alphabet was invented before typewriters.  So, like your Japanese [1], we had to invent a separate alphabet just for typewriters.)

We could easily give each hoof 4 keys, so that a pony typewriter would have 8 keys.  Each letter would then convey 3 bits of information, and pony script would take only 1.8 times as many letters as English.

Now consider this:  Your typing could be much faster if you never had to reach down to the bottom row of letters.  If you had a finger-keyboard with just 20 letters--2 for each finger and thumb--you might type 50% faster.  And each key would convey log2(20) = 4.3 bits of information. You’d have to type 5.4 / 4.3 = 1.26 times as many keys, but you’d type each key in 2/3 the time.  So you’d type a sentence with the same meaning in 1.26 * 2 / 3 = 0.84 of the time.  You’d type faster with a smaller alphabet and fewer keys!
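The speed claim reduces to three numbers, which you can reproduce directly (a sketch using the figures from the paragraph above):

```python
import math

bits_full = math.log2(43)    # ~5.43 bits per key, full keyboard
bits_small = math.log2(20)   # ~4.32 bits per key, 20-key top-row keyboard

extra_keys = bits_full / bits_small   # ~1.26x as many keystrokes needed
time_per_key = 2 / 3                  # each keystroke takes 2/3 as long
relative_time = extra_keys * time_per_key

print(round(extra_keys, 2), round(relative_time, 2))  # ~1.26, ~0.84: a net win
```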

Now who’s a silly monkey?

But English doesn’t really convey 5.4 bits of information per character.  Clever experiments--showing people sentences cut off in the middle of a word, asking them to guess what the next character was, and measuring their probability of guessing right--showed that English conveys only about 1 bit of information per character.

Could ponies develop an optimized alphabet (one space plus 7 letters, all equally probable), express the same concepts in those 8 characters more compactly than you can with 43 characters, and thus beat humans in speed-typing contests?

Sadly, no.

The entropy of English comes from three easily-separable components:  The frequencies with which letters cluster together, the frequencies of different words, and contextual information, which you can think of as the frequencies with which different words are found near other words.

The first is easily approximated for English using trigrams, series of 3 letters.  I take 3 and not 4 because once you go above 3 you are mostly identifying word frequencies rather than letter cluster frequencies.  A simple experiment shows that, if you consider just the previous 2 letters in an English word, the next letter has an entropy of (conveys) about  2.1 bits per character.  (This estimate is skewed low, since the experimenter stripped out punctuation, but skewed high because it incorporates some word frequency information for short words.)  That means that the amount of information provided by the other components--word frequency and words that go together--is just 1.1 bits per character.  If you re-designed English words so that each trigram were equiprobable, each character would convey 5.4 bits minus the 1.1 bits of redundancy caused by word frequency and context, or 4.3 bits.  Your effective typing speed would improve immediately by a factor of 4.  You should begin this optimization immediately!
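The trigram measurement can be sketched in a few lines (my own toy version, not the experimenter's code; on anything but a large corpus the estimate will come out low, because most trigrams never get a chance to appear):

```python
import math
from collections import Counter

def next_letter_entropy(text: str, order: int = 2) -> float:
    """Entropy (bits) of the next character given the previous `order`
    characters, estimated from n-gram counts. order=2 is the trigram case."""
    text = "".join(c for c in text.lower() if c.isalpha() or c == " ")
    context, joint = Counter(), Counter()
    for i in range(len(text) - order):
        ctx = text[i:i + order]
        context[ctx] += 1
        joint[ctx + text[i + order]] += 1
    total = sum(joint.values())
    # H(next | context) = -sum p(ctx, next) * log2 p(next | ctx)
    return -sum((n / total) * math.log2(n / context[ng[:order]])
                for ng, n in joint.items())
```

Run over a large English corpus with punctuation stripped, as in the experiment described, this quantity is what the post reports as about 2.1 bits per character.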

I’ve been told there’s one human language that’s done this already.  It’s called “Unix.”

I’m not sure I like the sound of that.

Fortunately Princess Celestia foresaw the development of typewriters thousands of years ahead of time, and did just that optimization when standardizing Equestrian spellings, so that most letter clusters have about the same frequency.  Sort of.  Except for the ones we don’t use.  I’ll get into that below.

Suppose we typed this language on a 7-letter-plus-spacebar keyboard.  We still wouldn’t quite get the maximum-possible 3 bits of information, because the space comes always and only between words.

We have about the same number of words in Equestrian as you do in English.  (Admittedly quite a lot more of ours are terms for different types of friendship, magic, or grass.)  Remember from above that, ignoring context, English letters in practice provide 2.1 bits each. “Ignoring context” means how much information your stream of letters gives you after you factor out the redundant information from trigram frequency, but before factoring out the redundancy of always talking about bananas or “football” or whatever it is that your lot usually talks about.  The average number of bits of such information in an English word is 2.1 times the average word length (5), for about 10.5 bits (call it 10) per word.

If ponies achieved the optimal 3 bits per character, then the probability of the space character would be 1/8, meaning words had an average length of 7 letters, meaning each letter conveyed 10 / 7 = 1.43 bits.  That’s inconsistent, which means that as you get near the optimal 3 bits per character, the probability of the space rises above average, keeping you from ever getting to the optimal 3 bits per character.  The best you can ever achieve is found by minimizing the characters needed per word to provide 10 bits, which is

cpw(w) = 10 / [ -7 * (w/(7(w+1))) * log2(w/(7(w+1))) - (1/(w+1)) * log2(1/(w+1)) ]

subject to the constraint that cpw(w) < w.  This turns out to be at w = 3.4 average letters per word, at which point the (maximum possible) bits per character of the pony 7-letter-plus-space alphabet is 2.9.
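A numeric check of that minimization (a sketch: I compute the per-character entropy H(w) of 7 equiprobable letters plus a space, and find the self-consistent point cpw(w) = w by bisection):

```python
import math

def H(w: float) -> float:
    """Bits per character for 7 equiprobable letters plus a space,
    when words average w letters (space probability 1/(w+1))."""
    p_space = 1 / (w + 1)
    p_letter = w / (7 * (w + 1))
    return -(7 * p_letter * math.log2(p_letter)
             + p_space * math.log2(p_space))

def cpw(w: float) -> float:
    """Characters needed per word to carry 10 bits at entropy H(w)."""
    return 10 / H(w)

# The constraint cpw(w) <= w binds at the optimum; bisect for cpw(w) = w.
lo, hi = 2.0, 7.0
for _ in range(60):
    mid = (lo + hi) / 2
    if cpw(mid) > mid:   # still need more characters than a word holds
        lo = mid
    else:
        hi = mid

print(round(lo, 1), round(H(lo), 2))  # ~3.4 letters/word, ~2.94 bits/char
```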

In English, if you had perfect, Unix-style equiprobable letter clusters, each letter would provide 2.1 bits without larger context, but only 1 bit given the entire sentence context.  That means half of the information in a Unixish language is redundant due to word frequency and word adjacency frequency.

That redundancy, however, can never be eliminated.  It is inherent to intelligent thought.  This is not known, but I believe it is so, because words have a Zipf distribution.  That means that the frequency of the nth most-frequent word is c / n for some constant c.

Zipf distributions are common in nature.  City sizes have a (nearly) Zipf distribution.  So do the number of citations that different scientific articles receive.  You can explain this by a simple “accumulative” model, which says that the probability that someone will move to a city, or that someone will cite a paper, is proportional to the number of other people who have moved to that city or cited that paper. (You can even compare this model to the real distribution of paper citations, and find the residual distribution left over that tells you what average fraction of a paper’s citation count is due to its quality rather than to chance! [2])
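The accumulative model is a one-loop simulation (a toy sketch; the entry rate and iteration count are illustrative, not fitted to any real citation data):

```python
import random

random.seed(1)

papers = [1] * 10  # ten seed papers, one citation each
for _ in range(20000):
    if random.random() < 0.01:
        papers.append(1)  # occasionally a brand-new paper gets its first cite
    else:
        # Preferential attachment: cite paper i proportionally to its count.
        i = random.choices(range(len(papers)), weights=papers)[0]
        papers[i] += 1

papers.sort(reverse=True)
# Under a Zipf distribution, rank * count is roughly constant for top ranks.
print([r * papers[r - 1] for r in (1, 2, 4, 8, 16)])
```

A log-log plot of count against rank should come out close to a straight line, which is the signature of the power-law family Zipf belongs to.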

But it is not, I think, possible that Zipf’s law is so universal in nature that our words have a Zipf distribution because the things in the world have a Zipf distribution.  We talk about many things, some of which (like colors) don’t have a Zipf distribution.  Yet our words, taken as a whole, do.

Think carefully about that.

What that means is that our thoughts are governed by the Zipf distribution.  No matter how we divide the world up into concepts, we will revert, through the same accumulative process by which cities and paper citations do, to a Zipf distribution describing the frequency with which we think of different things.  Our thoughts will drift to different concepts at a frequency similar to the frequency with which other people talk about them.  Iterate this for a few years and you converge on the Zipf distribution.

I haven’t checked whether Equestrian words also have a Zipf distribution, but I expect this law of thought is general enough to cover humans and ponies. It should cover aliens from Alpha Centauri, as long as they aren’t specialized like ants, or (more likely) like object classes in computer programs, with each caste having its prescribed thoughts.

So the word frequency component of a language’s redundancy can never be eliminated as long as we are social animals.  I will posit that the contextual redundancy is similar, because it is (I’m guessing) mostly accounted for by the frequency with which word A suggests word B, which would simply have another Zipf distribution for word B based on word A.

That means that, even if ponies achieved a perfect 2.9 bits per character using words that sounded like Unix commands, half of those bits (really, 2.9 * (1.1 / 2.1) = 1.5 bits) would be wasted by the predictability inherent to minds that always converge on Zipf distributions.  So our perfect Pony language would express 1.4 bits per character, a little more than the 1 bit per character of unoptimized English.

But our perfect Pony language would not be very poetic or even pronounceable.  Spoken languages have vowels and consonants, and things get pretty rough if you habitually string more than 1 vowel or 2 consonants in a row.

Also, we still wouldn’t beat a human typing, because to type 2 keys in a row with one finger or hoof takes about as much time as typing 4 keys with alternating fingers or hooves.  Ponies want to type with alternating hooves as much as possible!

American English has, by one count, 24 consonant phonemes and 15 vowel phonemes, which is about average.  By an amazing coincidence, we have exactly the same number, plus the whinny, neigh, and bray, which aren’t phonemes but a kind of verbal punctuation.

In English, most letters contribute only to either consonant or vowel phonemes. This makes it possible to parse a word into phonemes unambiguously even though the same letter may participate in several phonemes.  But if Equestrian assigned 3 letters to vowels and 4 to consonants, we could have only 9 two-letter vowels and 16 two-letter consonants, and we wouldn’t be able to alternate hooves more than half the time.  (Of course we can’t have one-letter phonemes, because with our efficient word spellings it would be impossible to tell whether the second letter was the continuation of the phoneme, or another phoneme after a 1-letter phoneme.)

I think it would be a bit much to ask us to drop 14 phonemes from the language when typing.

That’s why Equestrian phonemes are typed with three letters each, consonants starting with a key on the left, and vowels starting with a key on the right.

With 4 keys on the left and 3 on the right, there would be 4 x 3 x 4 = 48 possible consonants and 3 x 4 x 3 = 36 possible vowels.

We don’t need that many keys!  We have to type 3-letter phonemes anyway, as explained above, and so all that a 7th key would give us is more phonemes we don’t need.

That’s why Equestrian typewriters have 6 keys, 3 on the left and 3 on the right, plus a space bar.  There are 3 x 3 x 3 = 27 left-right-left sequences, and 27 right-left-right sequences.  Since consonants mostly alternate with vowels, this lets us keep alternating hooves as we type most of the time.

So, in fact, over half of all possible 3-letter sequences occur only across phoneme boundaries!  That wrecks our previously perfect probabilities.  See, we’re only getting log2(24) = 4.58 bits per consonant and log2(15) = 3.91 bits per vowel, because we’re only using 24 consonants and 15 vowels.

But that wastage is essential. If we used every possible 3-letter combination nearly equiprobably, it would be impossible to tell when reading when you'd skipped a letter or two and were now reading your phonemes out-of-frame. You might be reading the latest Daring Do novel, glance away for a moment, look back to the page and pick it up starting from a letter that was in the middle of a trigram, and suddenly you'd be reading a very bad pancake recipe. By leaving 1/4 of our trigrams mostly unused, you'll notice pretty quick--in just a couple of words, max--when you've fallen out of frame.

The vowels and consonants are all nearly equiprobable, so each phoneme provides on average at most (24 * 4.58 + 15 * 3.91) / (24 + 15) = 4.32 bits per 3 letters, or 1.44 bits per character.  Because of the irreducible redundancy mentioned above, that only gives us an effective information rate of 1.44 * (1.1 / 2.1) = .75 bits per character.
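Those figures are a quick weighted average (a sketch of the arithmetic only; note that log2(24) works out to about 4.58):

```python
import math

consonants, vowels = 24, 15
bits_c = math.log2(consonants)  # ~4.58 bits per consonant phoneme
bits_v = math.log2(vowels)      # ~3.91 bits per vowel phoneme

# Weight each phoneme class by how many members it has.
per_phoneme = (consonants * bits_c + vowels * bits_v) / (consonants + vowels)
per_letter = per_phoneme / 3           # every phoneme is typed as 3 letters
effective = per_letter * (1.1 / 2.1)   # discount the irreducible redundancy

print(round(per_letter, 2), round(effective, 2))
```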

So to be throat-pronounceable, hoof-typeable, and readable, Equestrian’s maximum-possible typing efficiency is 3/4 that of English.  And that’s why it takes me so long to write a story.


[0] but not really

[1] The Japanese did it 1000 years before inventing typewriters. That's really planning ahead.

[2]  The answer, it turns out, is indistinguishable from zero.  You can derive Rescher’s Law of Diminishing Returns from this result.  (It’s just like Kurzweil’s Law of Accelerating Returns, except it uses the inverse of an exponential function rather than an exponential.)

[3] Explaining what it means to raise something to the power of 2.8 is left as an exercise to the reader, or monokeras.

[4] It's therefore possible to plot the value of the Equestrian bit over time and extrapolate to find where it rises to infinity, which will be the date on which Princess Celestia will know everything. That, according to the priestesses of Zebrica, is when the world will end. Fortunately it's nothing to worry about; current estimates place it over a thousand years in the future, in the year 2013.

Comments ( 46 )

Added complexity: What if each key can be pressed on one of four corners with a tilted hoof? 6x4=24, and that's not counting if you push two keys at once.

Oooor pressing two keys at the same time would result in a different character?

3153221 You know what would make that even faster is if you just cut each key into 4 pieces, so you didn't have to tilt it, you could just strike each of the 4 former corners quickly. It would be like having 4 keys! Except there'd be just one! Cut into 4 parts.

Wait a minute...

Fascinating. Unfortunately, I'm human (or I was the last time I checked) and I don't have that excuse.

Or maybe she's using a stenotype.

And here I thought they were chording keyboards, where each letter corresponds to a combination of two keys pressed at once.

3153239 You still have the limitation of each key being roughly hoof sized. Even with two keys and four-way tilt, you have 4 squared combinations, or 16. Three keys gives you 16+16+16 or 48. (ABC, with AB=16, AC=16, and BC=16.) Four keys gives you... um... Math was always my weak point.

You lost me.

And what about Daring Do author A. K. Yearling's 2-button typewriter?

Ponies and Unix? Only here.

Also, this suggests that if you open the refrigerator some evening to find
derpicdn.net/img/view/2013/4/18/301644__safe_solo_princess+celestia_smile_wat_missing+accessory_hiding_refrigerator_artist-colon-palestorm_peeking.jpeg a guest, there may be no defense at all.

"Why are you in my refrigerator?"
"I want a peanut butter sandwich, and that's where you keep your bread."
"Amusing, but no."
"sudo make me a sandwich."[1]
"Goddamnit, my arms are moving on their own! Arrgh, crunchy or smooth?"

Obviously, ponies know more than one kind of magic.

"You could have prevented this by changing your passwords regularly, and by doing remote logins with ssh instead of telnet."
"Arrgh! I thought telnet was safer because it was deprecated!"
"Silly goose. That's WHY it's deprecated."

[1]Yes, I read xkcd too. Doesn't everyone?

Hap

I kind of skipped to the end. But my lips were going numb, and I'm pretty sure I can't feel my nose right now.

One press charges a meaning, the second press imparts the symbol.

The impact is like adding an umlaut to U visually, but is pronounced as a radically different sound, with contextual sensitivity.

Two presses of the same key generate solely that character.

The spacebar functions as a spacebar and as the meta key function.

Utility marks such as periods, commas, or dashes are generated by combining a keypress with a spacebar press.

Two sequential keystrokes in total are required to produce one character, with exception of meta key functions.

This allows ponies to have busy typewriter sounds in spite of having two hooves.

In a market consisting of pastel ponies, manufacturers will always prioritize proper aesthetic impact over efficiency.

if a typewriter does not sound typewritery then it is not a typewriter!

You damn ponies use too much math.

Where is the favorite button for blog posts? Now you have me thinking about whether your explanation for rare trigrams to detect frame shifts is similar to why the genetic code has three different stop codons rather than just one.

Personally I'm curious how you write in English using an Equestrian keyboard.

It is a good thing Princess Celestia predicted the existence of typewriters, otherwise things might have really sucked for the Equestrians.


I was reading something interesting which was semi-related the other day.

As everyone knows, Chinese has a ridiculously large script space (4,600 common characters), which has the effect of allowing them to write very densely - average word length is a mere 1.5 characters, versus 5.1 in English.

But what is interesting is that Chinese speakers and English speakers read at almost exactly the same speed on average. English speakers read 7-8 characters at a time on average, while Chinese read a bit over 2, meaning that speakers of both languages read about 1.5 words per eye movement. Thus, we would expect readers of the Equestrian language to scan even more characters than English speakers at a time (though alas, they would take much more physical space to transmit the same amount of information).

A related study found that the faster a language was spoken - the more syllables pronounced per second while speaking it - the less information dense said language was on a per-syllable basis, such that a number of major languages studied ended up with very similar rates of data transmission despite having vastly different rates of speech.

The authors suggested that this might indicate that rate of transmission via language is constrained primarily not by language but by cognitive limitations, and that languages tended to push up against the barrier of said limitations, rather than being the cause of said constraints.

If creatures of different species had significantly different cognitive capabilities, it might result in the more intelligent creatures constantly speaking too quickly for the less-intelligent creatures to understand. Likewise, one could potentially crudely measure the cognitive capabilities of a species by their data transmission rate via speech or written text.

If you look at stereotypes about how humans behave, people who speak slowly are often regarded as being slow mentally, while if you stick a bunch of nerds together, I've noticed that their rate of speech seems to increase. This makes me wonder if you could potentially use rate of speech relative to the average for a language as a crude measure of intelligence even of individuals, but I suspect that a lot of people moderate the speed of their speech for others, making this a likely useless metric for individuals.

Two hooves operating seven keys is inefficient.

I'll stick with my fleshy primate spideresque hoof-tentacles and full English language, and use my ten digits on a ten key keyboard (that allows 100 distinct keystrokes), thank you very much.

:trollestia:

I think Chatoyance has written that there's some foot-stops similar to what a piano has integrated into the bottom portion of the machine that alter any given character press.

3153505
I wonder how fast you could type on something like that.

I'd imagine it'd have to be slower than an actual keyboard because you've only got half as many fingers to work for you, and a lot require double actions.

3153575
According to a review I read, someone with a 50 word per minute typing speed on a standard keyboard layout was able to manage about 20-30 wpm after a week or so of practice. So, yes, slower, but that may also be a practice/familiarity thing.

The big selling point of it is the size and its ability to enable multitasking. I first found out about it (or something very similar) several years ago when I saw a clip of someone who had a version of this built into the handlebars of their recumbent bike, and was using it to write a book while cycling all over the United States.

3153304

"sudo make me a sandwich."

password? :trixieshiftright:

It's a bit less efficient, but I had come up with, some time ago, a means to use mechanical registers, similar to an adding machine, to use two keys to address a character matrix. The right and left keys would step the character matrix selector horizontally and vertically, respectively. Space would strike the character, reset the registers, and advance the carriage. The carriage return lever would also reset the registers while advancing the page feed. Each register is independent, so you can press both keys at once to advance both registers. I think I also had some other options. Holding one of the two character matrix keys down at the last press and pressing the space bar would strike the character, but leave the registers set (for double characters, or smarter typing, where your next letter can be advanced from the current one in fewer keypresses). Holding the other character matrix key and pressing spacebar would clear the register, and NOT advance the carriage, so you could quickly reset the register.

It's less efficient, but it DOES let you type with only three keys and a carriage return lever! :twilightsheepish:

Alternately, we know ponies seem to have hoof grip... What if they are like pressable joysticks? Each key could be a 4 way hat switch (or maybe 8 way). You grip the key with Magic Pony Hoof Grip™ and slide it in the direction you wish to type. Then you press. It adds a lot more key combinations. If key selection order comes into play (shifting the right key and then the left versus shifting the left key and then the right), then you get even more key combinations. I think there's about 42 combinations if you have key shift precedence, and 26 without.

You describe 24 consonant phonemes, 15 vowel phonemes, and the neigh, whinny, and bray.

24 + 15 + 3 = 42

Exactly what can be achieved with 2 keys in a 4 way slide and press hat configuration, with key order significance. The two key + space pony typewriter could type in straight phonemes, if you skip punctuation, with the exact layout shown in the show.

If we use English lettering, we have 26 letters and 10 numbers. From 42 combos, that leaves 3 punctuations and 3 horsey sounds. If you repurpose "i" or "l" as a 1, as many typewriters did, you get 4 punctuations. That's enough for the basic four. "." "," "?" "!" If ponies spell out their numbers, they would gain 9 or 10 more possible punctuations.

In all, I'd say the 2 key pony typewriter can handle anything english can throw at it, if the keys are slide selectable, with order significance. :twilightsmile:

Funny how the pony is calling us silly, and yet he's the one whose society independently came up with a design paradigm that interacts best with anatomy they don't have: Devices with multiple buttons you have to press repeatedly. Maybe if you were really smart, you'd have typewriters controlled by two joysticks or something similar. That lets each hoof convey either 3 or 4 bits of information with each movement (depending on if up-up-left and up-left are different positions), without ever leaving the controls. 4 bits of information=16 letters. Even if you guarantee they alternate hooves, two hoof movements is 8 bits (256 possible letter combinations). Or if you don't guarantee alternating hooves (thus using "which one is next" as an extra bit of information), you could type the English language, with simple punctuation, at a rate of one letter per one hoof movement. (You'd need a third joystick for numbers, but we have an extra row on the keyboard for numbers; those are used very infrequently and won't slow your typing much overall).

And you could do that without having to learn a second, constructed written language; that's using the unoptimized written language we already had.

Also, doorknobs.

(You put too much thought into this.)

edit: I did my math wrong the first time.

3153309 3153284 You didn't know AK Yearling was a ham radio operator?

3153481 Write in English? And try to master your bizarre spelling conventions? No, thank you. All messages between here and your world are translated due to Frizzmane's Principle of Narrative Necessity. I also have a unicorn give my stories a pass with Star Stutter's This One Trick. It cleans up the grammar, fixes point-of-view errors, and adds things like characterization, plot, and meaning. All I have to do is keep banging these keys.

3153628

I saw a clip of someone who had a version of this built into the handlebars of their recumbent bike, and was using it to write a book while cycling all over the United States.

Wow. Now that's multitasking. Of course I suppose I could just dictate into a microphone and run it through text-to-speech later.

Have you used one of those?

3153685

but I had come up with, some time ago, a means to use mechanical registers, similar to an adding machine, to use two keys to address a character matrix.

What was it for? Your mechanism sounds like magnetic core memory's address mechanism.

3153813

Maybe if you were really smart, you'd have typewriters controlled by two joysticks or something similar.

We tried that, but we ended up just playing video games all day.

3153928

Your search - Frizzmane's Principle of Narrative Necessity. - did not match any documents.

Suggestions:

Make sure all words are spelled correctly.
Try different keywords.
Try more general keywords.
Try fewer keywords.

Man, I haven't felt this much disappointment since the mods failed my story the first time around for having too short a description. So about three days.

3153949 Old serial entry adders used to contain a mechanical register for each digit the adder supported. Each number key had a catch with a different position, incrementing position with digit. When you pressed the key, it pushed the register as many "ticks" as corresponded to the digit pressed, based on those unique catch positions, per digit.

When you released the key, you advanced the register select to the next position, so that the next keypress would activate the next register, and not the one you just used.

To adapt this to a typewriter, you only need two registers. Pressing a key advances its corresponding register by one place, and if you get to the end, the next keypress resets it to the start. This gives you a matrix where your 4 most common letters are in the first 2x2 square (easily addressed with 2 hoof motions of one or both hooves), then your next most common letters lie along the edges of a 3x3 matrix, offering 5 more characters in 3 hoof motions. Less-used letters or punctuation would be addressed by repeated presses to advance the registers. While uncommon letters or punctuation would be slow to access, there's no real upper limit to the number of accessible special characters that can be addressed; register size is the only limiting factor. The strikers would also work like an adder's: one register raises the strikers, but instead of 8 strikers with identical numbers, you shift the strikers over, and each striker has unique characters. Only one striker is used, activated by the space bar. That's the unfortunate aspect of both designs I presented: to address so many characters with two keys, or even with the hat-switch-style keying (which reduces the number of keypresses to no more than 3 per character), you do end up having to use the space bar to strike every, single, character...

But the concept is sound. With limited keys, you get more characters in a two step address then strike method, more than you can get with active strike.

The exception is the hat switches. If you strike on release, you can still press one switch and then the other, and auto-strike on release from the stored energy of the keypress. Manual adders used this method.

Here is a video of the register matrix in operation. The difference is that one key would cycle the pin in one row, and another key would do the same for a second row of pins. There would only be one striker, but the printing mechanism would have as many characters per arm as the first register, and as many printing arms as there are pins in the second register. The first register would set the vertical offset of the printing head, while the second would set the horizontal offset. The striker would hit the one character selected. You could also display the character on the user-facing side of the printing head, through a small opening.

Both these mechanical adding machines have features that could be adapted to a pony typewriter. I also gotta give props to the people who designed and built these things. They did with cams and gears and ratchets and levers, what we use silicon chips to achieve! Very impressive gadgets!

3153951

If you're not even going to acknowledge the prospect of a design that lets you easily type every phoneme in your language with one movement each of alternating hooves, you might as well just admit that you stole typewriter technology from the dragons and tried to "scale it down" to a pony device. All this transparent bluster about design efficiency is just sad.

3153369
I agree with the Time traveler, very informative.

Also, if I don't see you use this semi-essay in one of your stories, I'll be massively disappointed. Now excuse me as I put ice on my head from trying to intake the influx of random data you've treated me to.... Ow....

3153949
Sadly I have not, though I am rather curious about giving one a try.

3153518
Stop yer greppin'. No one gives a fsck.

3153928
And here I thought you were using Google Translate. It's very magical. You'd like it.

and adds things like characterization, plot, and meaning.

Aha, I always knew story writing was more magic than anything. Now to go steal everyone's spells.

I tried testing Zipf's law with all of the stories currently on fimfiction, and the results don't seem to line up with the theory.

>>> counts = sorted(counts)
>>> counts = numpy.array(counts)
>>> logcounts = numpy.log(counts)
>>> ranks = numpy.arange(len(counts), 0, -1)
>>> logranks = numpy.log(ranks)

>>> scipy.stats.linregress(logranks[-100000:], logcounts[-100000:])
(-1.9936844296435057, 26.22314935623751, -0.98804430486816286, 0.0, 0.00098375009266678639)
>>> scipy.stats.linregress(logranks[-50000:], logcounts[-50000:])
(-1.7566219981637452, 24.053857398919789, -0.98732706868214626, 0.0, 0.0012627406798813407)
>>> scipy.stats.linregress(logranks[-5000:], logcounts[-5000:])
(-1.2214741094693546, 19.821179206824148, -0.99679091514778206, 0.0, 0.0013875190845974127)
>>> scipy.stats.linregress(logranks[-500:], logcounts[-500:])
(-1.0139838287269618, 18.684047372160293, -0.99552643466122315, 0.0, 0.0043124015744977814)
>>> scipy.stats.linregress(logranks[-50:], logcounts[-50:])
(-0.78791233411611017, 17.96469423403088, -0.98605236889475401, 4.8383592307958564e-39, 0.019195652255738899)

# The exponent varies drastically depending on how many ranks you look at, which suggests that the relationship isn't really a single power law.
# The coefficient of correlation is high because, up to the given rank, some power-law term approximates the data well. That doesn't mean that one term approximates the data for all ranks, which is what Zipf's law claims.
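For contrast, here's what a true single-exponent law looks like. This is synthetic data of my own, not fimfiction counts: if the counts obey an exact Zipf law count(rank) = C / rank, the fitted log-log slope stays at -1 no matter how many of the top ranks you regress over.

```python
import numpy as np
from scipy import stats

# Exact Zipf counts with exponent -1 (synthetic, not fimfiction data).
N = 100000
ranks = np.arange(1, N + 1)
counts = 1e7 / ranks

logranks = np.log(ranks)
logcounts = np.log(counts)

# Regress over the same windows as above; the slope never moves.
slopes = {}
for top in (100000, 50000, 5000, 500, 50):
    slope, intercept, r, p, stderr = stats.linregress(
        logranks[:top], logcounts[:top])
    slopes[top] = slope
    print(top, round(slope, 3))  # → -1.0 for every window
```

A slope that drifts from -0.79 to -1.99 as the window grows, like the fimfiction data above, is exactly what an exact power law would *not* do.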

Maybe it's important that the word "the" represents only 4.7% of fimfiction words as opposed to 7% in the Brown Corpus. Or that "of" comes in 5th place here with 1.8% compared to Brown Corpus's 2nd place with 3.5%. Fimfiction also has an incredibly varied vocabulary of over 700,000 words, or we're very creative in our misspellings.

Bad Horse That redundancy, however, can never be eliminated. It is inherent to intelligent thought.

Your intuition I think is right, but only in cases where entropy exceeds memory capacity. Natural language objects tend to have extremely high entropy*, and so learning them (correctly) requires redundancy. Specific short boolean algebraic expressions tend to have low entropy, and so redundancy is not necessary to (correctly) learn them.

* Take the set of all natural language objects and the set of all things they can refer to. I'm referring to the entropy of that set.

I actually have reason to believe that intelligent thought is not possible in our world without redundancy. I've run a few tests on overtraining in image recognition and found that explicitly favoring redundancy actually has a drastic effect (50% vs 15% error rate) on a program's ability to recognize images. I'll be testing this on natural language text soon, and I'm expecting similar results.

I'm not sure how relevant this is to Shannon's experiment, but the ability to predict a missing character in a sentence is a function of (1) the information relayed by that character and (2) the noise introduced by that character. Misspellings, for example, introduce noise. In practice, we might actually relay less than 1 bit of information per character typed.
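As a ballpark, here's a crude context-free entropy estimate (my own sketch, with a made-up sample string). It only counts unigram character frequencies, so it's an upper bound; Shannon's prediction experiments, which exploit context, push English down toward roughly 1 bit per character.

```python
import math
from collections import Counter

def char_entropy(text):
    """Unigram (context-free) entropy in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())

sample = "the quick brown fox jumps over the lazy dog"
print(char_entropy(sample))  # roughly 4 bits/char before context shrinks it
```

The gap between ~4 bits/char (no context) and ~1 bit/char (full context) is the redundancy of the language, and noise like misspellings eats into even that.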

3159969 Thanks for the test and interesting comments! Would you please explain the output of commands like scipy.stats.linregress(logranks[-500:], logcounts[-500:]) ?
What does logranks[-500:] mean, and what do the 5 numbers following it mean?

Your intuition I think is right, but only in cases where entropy exceeds memory capacity. Natural language objects tend to have extremely high entropy*, and so learning them (correctly) requires redundancy.

I wasn't thinking of learning, but of the frequency with which we retrieve different concepts. If the strength of a concept's encoding is proportional to the frequency with which we hear it talked about, which is proportional to the frequency with which other people think about it, any random initial frequency of topics will (I think) eventually converge on a Zipf distribution.
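That convergence idea can be sketched as a toy rich-get-richer process (my own construction, not anything from this thread; alpha and the step count are made up). With probability alpha a speaker raises a brand-new topic; otherwise they repeat a past mention chosen uniformly from the history, which is the same as picking a topic with probability proportional to its current frequency. Simon (1955) showed this kind of process converges to a Zipf-like power law.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.1           # chance of introducing a brand-new topic
history = [0]         # one mention of topic 0 to seed the process
n_topics = 1
for _ in range(50000):
    if rng.random() < alpha:
        history.append(n_topics)      # new topic
        n_topics += 1
    else:
        # Repeating a uniformly chosen past mention = frequency-
        # proportional topic selection.
        history.append(history[rng.integers(len(history))])

# Fit the rank-frequency slope over the top 100 topics.
counts = np.sort(np.bincount(history))[::-1]
top = 100
logranks = np.log(np.arange(1, top + 1))
logcounts = np.log(counts[:top])
slope = stats.linregress(logranks, logcounts).slope
print(round(slope, 2))  # clearly negative, in the Zipf-like range
```

The initial frequencies don't matter much; the positive feedback of frequency-proportional retrieval does the work.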

I actually have reason to believe that intelligent thought is not possible in our world without redundancy.

Categories don't exist without common features, which are a form of redundancy. I think you mean something more specific, but I don't know what.

3160038
Explanation for my Zipf's law test above:

# Zipf's law predicts that the frequency with which a word occurs is correlated with its rank.
# Sort the word counts into ascending order, so the least frequent word comes first (and gets the largest rank number),
>>> counts = numpy.array(sorted(counts))

# and get the rank associated with each word.
# The most frequent word has rank 1, the least frequent word has rank len(counts).
>>> ranks = numpy.arange(len(counts), 0, -1)

# Zipf's law predicts that the frequency of a thing (in this case, word) is proportional to rank^s for some s
# Any power-law relationship y = C * x^s becomes linear after taking the log of both variables.

# The prediction is: log(frequency) = m*log(rank) + b
>>> logranks = numpy.log(ranks)
>>> logcounts = numpy.log(counts)

# Run a linear regression to find m and b using the 100000 most frequently-used words.
>>> slope, intercept, correlation_r, p_zero_slope, error = scipy.stats.linregress(logranks[-100000:], logcounts[-100000:])
# Repeat with the 50000 most frequently-used words.
>>> slope, intercept, correlation_r, p_zero_slope, error = scipy.stats.linregress(logranks[-50000:], logcounts[-50000:])
# Repeat with the 500 most frequently-used words.
>>> slope, intercept, correlation_r, p_zero_slope, error = scipy.stats.linregress(logranks[-500:], logcounts[-500:])

# slope and intercept are m and b in the linear model, and slope should be equal to the exponent in Zipf's law
# correlation_r is the coefficient of correlation
# p_zero_slope is the p-value with the null hypothesis being "the slope is 0"
# error is the standard error of the estimated slope

# The slope should be consistent for all of these if Zipf's law holds. In my tests, it varied from -0.78 to -1.99, becoming more negative as higher-ranked words were added to the regression.

"Redundancy" is overloaded, and I wasn't clear in my previous post what I meant. You're thinking of "redundancy" as in "repeated experiences", whereas I'm thinking of "redundancy" as in "repeated observables". I can explain by example.

Suppose you have an intelligence that can observe 5 boolean inputs per experience and categorize each experience as either A or B. The intelligence is told the following:

(1) 00000 -> A
(2) 11111 -> B
(3) 00100 -> A
(4) 10000 -> A
(5) 10111 -> B
(6) 11111 -> B
(7) 00100 -> A

(6) and (2) are equal, as are (7) and (3). Those are examples of redundant experiences.
The fifth bit is equal to the rounded average value of the first four bits. That's an example of a redundant observable.
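A quick sanity check of that toy table in Python (just re-encoding the rows above): experiences (6) and (7) repeat earlier ones, and the fifth bit of every row equals the rounded average of its first four bits, so it carries no extra information.

```python
# Rows (1)-(7) from the example above.
experiences = [
    ("00000", "A"), ("11111", "B"), ("00100", "A"), ("10000", "A"),
    ("10111", "B"), ("11111", "B"), ("00100", "A"),
]

# Redundant experiences: 7 observed, only 5 distinct.
unique = set(experiences)
print(len(experiences), len(unique))  # → 7 5

# Redundant observable: the fifth bit is predictable from the first four.
for bits, label in unique:
    first_four = [int(b) for b in bits[:4]]
    predicted_fifth = round(sum(first_four) / 4)
    assert predicted_fifth == int(bits[4])
print("fifth bit is redundant")
```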

In cases where redundant observables don't exist, memorization is the only way to make good decisions (assuming no small subset of observables is enough to make good decisions), and memorization is only effective when the intelligence has seen a nontrivial fraction of the possible experiences. This is only feasible when the entropy of experiences is very low, and it's generally not seen as intelligent behavior.

If redundant observables don't exist, then memorization is the best way to make decisions; equivalently, if memorization is not the best way to make decisions, then redundant observables must exist. Assuming there is a good way to make decisions given (some set of) highly entropic experiences, it follows that redundant observables are necessary for making good decisions from highly entropic experiences, since memorization alone is a bad way to make decisions in those cases.

Come for the ponies, stay for the computational linguistics! :derpytongue2:

I've actually had this thought before, and got as far as rough estimates of language density and optimization, but then I saw something shiny and was thankfully stopped before I started doing the math. Now I think I spent almost as much time just READING your math. Gotta love it!

That's a whole lot of math vomit to justify what's still a pretty bad solution to a given problem.

Ponies can write with their mouths (supposedly--we never really find out how good they are at it). It would be many, many times easier to create a machine which ponies interact with through a tablet-like pen in their mouths. It has the added bonus of not just being blatantly better for earth ponies and pegasi, but even more so for unicorns using magic. Remembering and punching combinations on a 7 key keyboard with gigantic keys is hardly an ideal solution to a pony-friendly typing machine.
