Theoretical Physics is Magic

by Smeapancol

First published

In return for showing her the magic of friendship, Luna teaches Twilight to understand physics.

Twilight, ashamed of her inability to understand theoretical physics, enlists Luna as her teacher. Together, they follow an intellectual journey into the nature of space, time, and causality. This is a crossover between Friendship is Magic and the strange world of theoretical physics. There IS a story here, so if you're not interested in the physics part, you might enjoy reading just the start and end of every chapter. If you want to get some idea of the amazing places this is going, try looking over chapter 5.

Please send me a message if you find an error or something that needs clarification.

Thanks to Tzolkine for the cover image!

Day 1 - Abstract Arithmetic

[Best viewed with 'small' size text and 1.5 size spacing.
If the equations seem out of order, reload the page and they might be fixed by then.
For a pdf version of this chapter with better formatting, please click here.]

Unit 1: Space

Lesson 1- Algebra

Day 1- Abstraction

In a neglected corner of the Canterlot Archives, in an alcove at the end of a long hall lined with old books, Twilight Sparkle sat reading by the light of her horn, muttering to herself at the open book in front of her.

“Complex Hilbert space? Holonomy groups? What? Infinitely differentiable Riemannian manifolds? Vector... bundles-what?? Analytic continuation?! This doesn’t make any sense!”

Finally she slammed the book shut in frustration and tossed it on the floor.

“This... is... impossible! I don’t understand. I just don’t understand!” she cried, becoming more and more agitated. “This is nothing like astronomy! How can anyone understand it!? I don’t understand how I can not understand!”

Twilight let her head droop over her forelegs. She sighed and felt tears coming on. But then her ears perked at some commotion from down the hall and she hoped no one overheard her almost go insane again. She heard a familiar bellow.

“MAKE WAY FOR THE PRINCESS OF THE NIGHT! WE ARE LOOKING FOR A BOOK TO READ THIS EVENING!”

Twilight buried her face in her hoof. After all this time and Princess Luna still didn’t understand how to behave in a library! At least there weren’t many ponies around that late.

“GUARDS! STAND UP STRAIGHTER! THAT IS ACCEPTABLE. NOW THEN, THIS ONE HAS TOO MANY PAGES! THIS ONE IS TOO LONG!”

Twilight groaned and wiped her eyes as the voice came closer. She saw the Princess’s glow from around the corner and tried to hide behind another book. However, the page she opened contained a particularly nasty-looking integral. She gulped when she saw it and dropped the book in fright.

“TWILIGHT SPARKLE! HELLO! THAT LOOKS LIKE JUST THE PLACE FOR YOU!”

She tried to collect herself as the Princess approached, but she could still feel herself shaking with frustration at herself.

“WHAT IS IT, TWILIGHT! ARE YOU FEELING...?”

Twilight waited for Luna to finish her sentence until she realized that Luna had no clue what she was feeling. “Princess... the voice, remember?”

“Oh yes, I keep forgetting. That feels like my normal way of talking.”

“What are you even doing here? Aren’t you guardian of the night?”

“Have you been reading here all night? It is dawn now. I just came off duty.”

Twilight groaned. “I guess I have. And nothing to show for it either.”

Luna noticed the book on the floor and lifted it with her telekinesis. “Oh! You are reading Quantum Mechanical Theory, the classic exposition by Atomic Force and Light Speed, two of the greatest physicists of our generation. I did not know you did physics too. Is there anything you do not know, Twilight Sparkle?”

“That’s just it, Princess! I don’t know it. No matter how hard I try, I just can’t follow it!”

“Oh. Well do not feel bad about that. Most ponies could never understand physics no matter how hard they tried!”

“Because I didn’t think there was anything I couldn’t understand! That’s never happened to me with anything before.”

Luna snapped her head to the side, making shadows dance across the room with her dimly glowing mane. “I UNDERSTAND NOW! YOU ARE ASHAMED OF BEING INEPT IN PHYSICS!”

Twilight buried her head under her hooves, certain that her secret was out now. “Princess! Please don’t use the Royal Canterlot voice. Everyone will hear you.”

“How silly of me. I keep forgetting,” Luna said with an elegant laugh.

“I’ve been trying to read this book since before I moved to Ponyville. I don’t understand it! Now I have trouble even looking at the book because the equations give me so much anxiety! I don’t know how to deal with something that I have so much trouble understanding!”

“There there, poor dear! It is a difficult subject. I had trouble with it myself!”

“You mean, you know physics?”

“Are you surprised? Goddesses have to know these things! Plus, there wasn’t a lot to do during my time on the moon, and studying physics was one of the things I could do. When I wasn’t plotting my revenge, of course.”

“Er, yes-of course.” Twilight tried giving Luna her most adorable puppy eyes. “Do you think you might be able to help me get started on this?”

Luna appeared to be in deep thought for a moment and then stomped her forehoof. “VERY WELL, TWILIGHT! IN RETURN FOR SHOWING ME THE MAGIC OF FRIENDSHIP, I DECREE THAT YOU SHALL APPEAR HERE EVERY EVENING TO STUDY PHYSICS.”

“Every night? I don’t think I have time for-”

“You are not turning down MY GENEROSITY are you?”

“Well no but-”

“It’s settled then! Your lessons begin... NOW!”

With a flash of Luna’s dark maroon aura, a large blackboard appeared in the hall, blocking the way out from floor to ceiling.

“Tell me, disciple, how much calculus have you done?”

“I’ve done a bit of calculus. That’s not so hard because it’s just memorization.”

“The reality is more subtle than that. Doing calculus is not just about memorizing rules but also pattern recognition, and that is the hard part. For example, take something like the integral

(1.1)

One could try doing this with integration by parts, but that will not get far. The trick is to notice that it can be written like this.

(1.2)

And now you can see that it is really just a chain rule problem, and you recognize the derivative of .

In this problem it is not at all obvious which rule will actually work to simplify the expression. That is what makes all mathematics challenging!”

Luna finished writing out the integral and turned back to face Twilight, whereupon she gaped in shock. Twilight was on the ground, lying stiff and twitching.

“Oh dear Twilight Sparkle! WHAT’S WRONG?”

“I-integral,” whimpered Twilight.

Luna quickly erased the integral. “I had no idea your problem was so severe! There, it’s gone! See? No integral.”

“All gone?” Twilight murmured, still on the ground.

“All gone!”

Twilight shook her head and stood up again. “Sorry Princess... You see? It makes me too nervous!”

“We had better do something easier. How about addition and multiplication?”

Twilight laughed nervously. “I can do figures in my head! I think I could do something a little more advanced.”

“This is not that kind of arithmetic. We shall think about addition abstractly, without knowing what we are adding. We shall think about it in terms of the properties of addition, not in terms of the result. But do not think of addition as a procedure. Think of it more as a structure.

Now,” said Luna as she began to write on the board again, “addition has four properties: associativity, which allows us to treat a sum of any number of symbols as a single operation; commutativity, which, together with associativity, allows us to treat any finite sum as a homogeneous pile; identity, which says that there is an element 0 which can be added to anything without changing the result; and inverses, which says that everything has an inverse that can be added to it to produce 0.

Definition 1.1 : Addition
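
Written out, the four properties Luna lists might look like the following. This is only a sketch: the letters a, b, c stand for arbitrary elements, and the identity and inverse are written on the right only, an assumption taken from Luna's later remark that the axioms only allow adding 0 on the right.

```latex
\begin{align*}
(a + b) + c &= a + (b + c) && \text{(associativity)} \\
a + b &= b + a && \text{(commutativity)} \\
a + 0 &= a && \text{(identity)} \\
a + (-a) &= 0 && \text{(inverses)}
\end{align*}
```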

Let us warm up by trying to prove that the identity is unique. Let’s say that 0 and $\bar{0}$ are both identity elements. In this case the line over the 0 is just something to distinguish it from the other 0. Use the properties of addition to prove that they are equal. Do you think you can do that for me?”

Twilight grinned. “I think I can handle this one! Well the only thing you can do to start off is add 0 to anything. And then I can just use the reverse of the identity rule to remove the $\bar{0}$.”

Proposition 1.1

“Good! But you made a slight omission. You will notice that the axioms only allow adding 0 on the right. So you ought to have used the commutativity rule to swap the 0 and $\bar{0}$ before you could remove the $\bar{0}$.”

Proposition 1.2
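
A sketch of how the corrected proof might read, assuming the identity rule is stated as a + 0 = a and writing $\bar{0}$ for the second identity element (the zero with a line over it):

```latex
\begin{align*}
\bar{0} &= \bar{0} + 0 && \text{(identity rule for } 0\text{, read right to left)} \\
        &= 0 + \bar{0} && \text{(commutativity)} \\
        &= 0           && \text{(identity rule for } \bar{0}\text{)}
\end{align*}
```

The middle line is the step Luna says was skipped.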

“Hmm. But Princess, wouldn’t there have to be a similar rule for =? I mean I applied the identity rule twice, but the second time I applied it backwards. So by your logic, wouldn’t we have to have both a+0=a and a=a+0 as axioms?”

Luna was momentarily flustered. “Well! My answer is that we are treating the + relation as an abstract operation that is defined only by its properties. Whereas the = relation is actually meaningful, and it stands for equality, which I am assuming you know is by nature reflexive, symmetric, and transitive.”

“I see. That’s a very fine philosophical distinction!”

“ONWARD! I mean, onward. Now, like everything, 0 has an inverse. Prove that -0 and 0 are equal.”

“Once again there aren’t many choices available for what to do.”

Proposition 1.3
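
One way the three steps Luna names could be written out, assuming the identity and inverse rules are a + 0 = a and a + (-a) = 0:

```latex
\begin{align*}
-0 &= (-0) + 0 && \text{(identity)} \\
   &= 0 + (-0) && \text{(commutativity)} \\
   &= 0        && \text{(inverses)}
\end{align*}
```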

“First identity, then commutativity, and then inverses. Good.

Finally, prove that the inverse of each element is unique. Start by assuming -a and $\bar{a}$ are both inverses of a.”

“I think I’m getting the hang of this now.”

Proposition 1.4
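
A possible version of this proof, as a sketch. The symbol $\bar{a}$ for the second inverse is an assumption made by analogy with the barred zero used earlier; the only facts used are the four properties of addition.

```latex
\begin{align*}
\bar{a} &= \bar{a} + 0              && \text{(identity)} \\
        &= \bar{a} + (a + (-a))     && \text{(inverses, for } -a\text{)} \\
        &= (\bar{a} + a) + (-a)     && \text{(associativity)} \\
        &= (a + \bar{a}) + (-a)     && \text{(commutativity)} \\
        &= 0 + (-a)                 && \text{(inverses, for } \bar{a}\text{)} \\
        &= (-a) + 0                 && \text{(commutativity)} \\
        &= -a                       && \text{(identity)}
\end{align*}
```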

“The last thing to prove is that -(-a)=a.”

“Ok.”

Proposition 1.5
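
A sketch of one route to this result, assuming the uniqueness of inverses from the previous proposition. Commutativity appears in the first line, which matches Luna's remark below.

```latex
\begin{align*}
(-a) + a &= a + (-a) && \text{(commutativity)} \\
         &= 0        && \text{(inverses)}
\end{align*}
% So a is an inverse of -a. By definition -(-a) is also an inverse of -a,
% and inverses are unique, hence
\[
-(-a) = a .
\]
```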

“Notice you had to use commutativity again for that one. Now you know everything there is to know about addition!

Now on to multiplication. Multiplication always obeys the rules of associativity and distributivity, but there is not necessarily any assumption about identities and inverses.

Definition 1.2 : Multiplication
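
The rules Luna has just named might be written as follows. This is a sketch with a, b, c arbitrary; the labels "left" and "right" follow the way Twilight uses them on Day 2, where distributing a map over a sum of vectors is called left distributivity.

```latex
\begin{align*}
(a b) c &= a (b c)       && \text{(associativity)} \\
a (b + c) &= a b + a c   && \text{(left distributivity)} \\
(a + b) c &= a c + b c   && \text{(right distributivity)}
\end{align*}
```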

Also, since we do not have commutativity, a multiplication operation needs to be understood as an ordered list rather than a lump, as with addition. This means that multiplication can sometimes only work one way. If a b is defined, this does not mean that b a is always defined.

If there is no identity, there cannot be inverses because the concept of an inverse depends on that of an identity. Now sometimes there is an identity defined for multiplication, but there is not always both a right and a left identity.”

Definition 1.3 : Multiplicative identity

Twilight nodded. “Before, we had to use the fact that the identity is both a right and a left identity to prove that it’s unique.”

“Indeed. When inverses are not assumed, we will often be preoccupied with the question of characterizing inverses and determining when they exist.

Now, something important happens when you multiply by zero. Show me what that is.”

“There is more going on now that we have both multiplication and addition, but I think I got it.

Proposition 1.6

And the proof that 0 a=0 would be the same.”
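
A sketch of a proof that fits Luna's comment below about needing some b to multiply by a. The element b is arbitrary, and only the addition rules and left distributivity are assumed.

```latex
\begin{align*}
a b &= a (b + 0)   && \text{(identity)} \\
    &= a b + a 0   && \text{(left distributivity)}
\end{align*}
% Now add -(ab) to both ends and regroup:
\begin{align*}
0 &= a b + (-(a b))            && \text{(inverses)} \\
  &= (a b + a 0) + (-(a b))    && \text{(the result above)} \\
  &= (a 0 + a b) + (-(a b))    && \text{(commutativity)} \\
  &= a 0 + (a b + (-(a b)))    && \text{(associativity)} \\
  &= a 0 + 0                   && \text{(inverses)} \\
  &= a 0                       && \text{(identity)}
\end{align*}
```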

“This works as long as you can always find some b to multiply by any a. So it always works, for example, when there is an identity because you can just set b=1.

Next think about what happens when you multiply by -1.”

“These all seem like pretty obvious properties of numbers, and I mean I already know that will produce the additive inverse, so why don’t we just skip that one?”

“YOU WILL SOLVE EVERY PROBLEM I GIVE YOU!”

“Eep! Ok, ok! P-princess!”

Proposition 1.7
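
One way the argument could go, assuming a multiplicative identity 1 exists and using the fact that anything times 0 is 0:

```latex
\begin{align*}
a + (-1) a &= 1 a + (-1) a   && \text{(multiplicative identity)} \\
           &= (1 + (-1)) a   && \text{(right distributivity)} \\
           &= 0 a            && \text{(additive inverses)} \\
           &= 0              && \text{(zero times anything is zero)}
\end{align*}
% So (-1)a is an additive inverse of a, and since inverses are unique,
\[
(-1) a = -a .
\]
```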

“Ahem. Good. For that you had to use the property that everything times 0 is 0. And from this we also know that (-1)(-1)=-(-1)=1.

Now we can talk about multiplication with inverses. Try to stay calm because I’m going to use some new symbols!

Definition 1.4 : Multiplicative inverses

This says, if x is nonzero, there is an inverse $x^{-1}$ such that the product is 1. I am just putting the right inverse and left inverse together in one axiom because I do not know of any cases in which we will have only a left inverse or only a right inverse.”

Proposition 1.8

Twilight breathed deeply and partly blocked the view of the blackboard with her hooves, trying to only see part of it at a time. “Got it!” She thought for a moment. “And since inverses commute, the same proof that -(-a)=a would work to prove that $(a^{-1})^{-1}=a$.”

“Right! Do you see why 0 cannot have an inverse?”

Twilight smirked a little. “I think it can have an inverse!

But only if 0=1.”

“Which implies?”

“Err... it implies...”

“Yes, everything is zero.
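
The step Twilight hesitates over can be sketched in two lines. The notation $0^{-1}$ for a hypothetical inverse of zero is an assumption; the rest uses only the multiplicative identity and the rule that zero times anything is zero.

```latex
% If 0 had an inverse, then
\[
1 = 0 \cdot 0^{-1} = 0 ,
\]
% and then, for every element a,
\[
a = 1 a = 0 a = 0 .
\]
```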

The next thing we need to talk about is the concept of closure. That means that there is a definite set of elements, and an operation is defined for every member of the set. And the operation always stays within the set. In other words, for a set S, we might define a function $f: S \times S \to S$, which satisfies the properties of addition or multiplication.

Now I shall mention some objects that are defined by various combinations of the properties we discussed today, but that is just for reference. We will not do anything with them today.

Definition 1.5 : Ring

A set that has both closed addition and closed multiplication with left and right identity is called a ring.

By the way, quick aside. If a set is closed under addition or multiplication for binary sums or products (which is how I defined closure), does this mean it is closed under infinite sums or products?”

“Er, yes?”

“No. Infinite sums or products of a set whose elements are defined to have certain properties need not satisfy those properties. The rational numbers, for example, are closed under addition, but an infinite sum of rational numbers can be irrational.”
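
A standard illustration of Luna's point, offered as an example that is not in the original lesson: every partial sum below is a rational number, yet the limit is the irrational number e.

```latex
\[
\sum_{k=0}^{\infty} \frac{1}{k!}
  = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \cdots
  = e
\]
```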

“I hope we won’t have to do too many infinite sums...”

“Not for a while. But you will learn to love them!

Definition 1.6 : Group

A set which has only closed multiplication with inverses defined on it is a group. Since a group does not have addition, there is no 0 defined in it, so everything has an inverse.”

“But didn’t the axioms you told me earlier say that there’s always a 0?”

“You can think of 0 as existing somewhere out in the void, but just not within the set defining the group. Eventually we will be multiplying by zero though.

Definition 1.7 : Field

A field is closed under multiplication and addition and normally has the additional axiom that multiplication is commutative. If you skip the commutativity axiom, it is a noncommutative field.

I should mention that mathematicians and physicists have a different meaning for the word field. This is the mathematical definition. What physicists call a field, mathematicians would say is a kind of commutative module. So most of the time when we talk about fields we will be talking about something else.”

A nearby potted plant rustled and suddenly Pinkie Pie’s head burst out from beneath it, the plant sitting atop her head. “Like for example, all the fields mentioned in the table of contents for the rest of this whole story!”

Twilight yelped in shock. “What are you talking about?? And how’d you even get in here, Pinkie?”

Pinkie’s head slowly submerged again. “There is no explaining Pinkie Pie!”

“Some day, Pinkie! Science has the answers!” Twilight screamed at the pot.

“...Well that was odd,” said Luna. “Anyway... fields are very familiar objects because they have all the ordinary arithmetical operations defined on them: addition, subtraction, multiplication, and division. In the future, if I talk about numbers, I will usually be referring to the element of some field, so it will be assumed that addition, multiplication, and division are allowed. The rational numbers, the real numbers, and the complex numbers are popular examples. Of course physicists do not bother with rational numbers.”

Twilight chuckled. “Of course physicists only use real numbers. Those are the only numbers that exist in real life!”

Luna scowled. “Ha. Ha. Ha. FOOL! Just for that, I have another problem for you! I suppose you know that $i^2=-1$, but now you must tell me what $i^{-1}$ is.”

Twilight gulped and turned to the board. “No problem, Princess.”

Proposition 1.9

“Quite so. From now on you do not need to do these tedious proofs with every step written out. Now you may simply say, ‘since (-ⅈ)ⅈ=-(-1)=1, then ⅈ and -ⅈ are multiplicative inverses.’”

“Thank you, Princess. That’s a relief!”

“Now there is always a natural sense in which things that can be added together can be multiplied by the integers. You write something like

$$2a = a + a, \qquad 3a = a + a + a,$$

and so on.

Now extend this idea so that we have some set whose elements can be multiplied by the numbers of any field, so we might not just have 2a and 3a, but also products of a with scalars that are not integers. And maybe even (i+7)a, depending on what the field was.”

Twilight pressed her hooves to her temples. “OK, I’m extending the idea in my mind.”

“That idea is basically a vector space.”

“Actually, I think I’ve heard of vectors before. Aren’t they things with both a magnitude and direction?”

“NO!” Luna yelled. “Those are nothing but... lies! Propaganda promulgated by... by rank amateurs!”

Twilight was taken aback by Luna’s reaction. “I’m sorry I just read that-”

“BURN THE BOOK YOU READ IT IN THEN!”

Twilight sat awkwardly for a moment as she waited for Luna’s anger to calm down.

“I’m sorry Princess. Is everything alright?”

“I am sorry, Twilight Sparkle, it is just that you have uncovered a horrible ambiguity in our terminology. I did not wish to have to burden you with this, but what most people call a vector is actually a representation of the rotation group, which also happens to be a vector as well, but there are lots of other vectors that aren’t characterized by having a magnitude and direction, even some with a completely different geometrical interpretation.”

“Er, I see...”

“And there are lots of other vectors that are not geometrical in the slightest.”

“I’ll keep that in mind!”

Definition 1.8 : Vector space

“Right. Now a vector space V over a field F is a set whose elements are called vectors, understandably enough. Vectors have an addition defined on them and a kind of multiplication. However, you cannot multiply a vector with a vector. You can only multiply a vector by an element of its field. This is called scalar multiplication. A vector space is closed under addition and scalar multiplication. That is what a vector space is.

You already know about multiplication and addition, so you already know the most basic properties of vector spaces.”

Twilight recited what she had learned. “There’s a unique identity element 0 that gives 0 when multiplied by any scalar. The field has unique and distinct elements 0, 1, and -1, and it’s got additive and multiplicative inverses. So, for scalar multiplication, there’s a unique left multiplicative identity 1, but not a right multiplicative identity. And of course vectors don’t have multiplicative inverses since they can’t be multiplied.”

“Indeed!”

“Excuse me, but isn’t the symbol 0 ambiguous here? I mean there’s both an additive identity in the vector space and in the field, and they’re logically distinct from one another. If v is a vector and s is a scalar, you could write

In the first equation, the first zero is the zero scalar and the second is the zero vector, and in the second it’s not even clear which zero that could be!”

“There will be no ambiguity because all zeros behave the same, so you never actually need to worry about it. On the other hand, you could think of it another way. There is only one 0 anywhere, and it has the property of being an additive identity for everything and can be multiplied by everything to give itself. When it is used with vectors, it acts like a vector, and when it is used with scalars, it acts like a scalar.”

“But which is the correct way?”

“Oh, Twilight! Either way is just fine.”

Twilight wondered how Luna could throw a tantrum about terminology but have no opinion about a seemingly much more realistic philosophical issue.

Definition 1.9 : Module

“By the way,” continued Luna, “you can also define something like a vector space except over a ring rather than a field, and that is called a module. We won’t need to go into that theory, but I just mentioned it for completeness. Now then!” Luna said with a strong wave of her forehoof, “I’ll see you tomorrow for your next lesson.”

“We’re done? But... we didn’t even do any physics!”

“You’re not ready for physics yet. Perhaps in a few days we may be able to do some of the most elementary physics conceivable!”

“Wait! I have to thank you, Princess! You did make me feel more confident because you made me practice with easy things and you gave me advice on how to think about them. I haven’t read a book that did that before.”

“Some day, you will understand. And Huzzah! I have a disciple now too!” In a negative flash, Princess Luna disappeared into a black hole and took her blackboards with her.

Twilight was pleased at how much she had done without having a panic attack, but then she began to wonder what she had gotten herself into.

Day 2 - Vectors

Unit 1: Space

Lesson 1- Algebra

Day 2- Vectors

Luna had not said where their meetings were to take place, so Twilight hoped she was going to the right place by going to the Canterlot archives again. She did not want to anger Luna by not showing up, as she was never quite sure how the goddess, who had after all spent 1000 years on the moon without interacting with other ponies, would react to anything.

She heard a commotion as soon as she entered. Dreading what she might find, she sped to a gallop and raced towards it. The center of the library was ransacked and a tornado of books whirled above them. Luna was right below it with a number of cowering librarians in front of her.

“THAT IS THE MOST ILLOGICAL THING WE HAVE EVER HEARD! THERE IS BARELY ANY ROOM LEFT FOR NEW SUBJECTS AND THERE IS NOT EVEN A PROPER CLASSIFICATION FOR MAGIC! AND STOP COWERING! STOP FEARING ME OR I WILL TURN YOU ALL INTO LADYBUGS!”

“Princess Luna!” Twilight screamed, hoping she would be heard over the gale.

When Luna looked over and saw her the tornado stopped and all the books piled themselves into very tall stacks that reached near the ceiling.

“Hooray, my new disciple is nigh!” Luna reared with joy before collecting herself and resuming her ordinary stiff, regal demeanor. “These ponies were just explaining the Dewey decimal system to me.”

“Hahaha, yes! I saw, Princess. I think next time if you just ask for the book you want she’ll just get it for you.”

Luna nodded. “Oh! That would probably be more convenient for everyone.”

“I think so! Um, in fact, maybe next time we should meet in Canterlot Castle instead of over here.”

“Yes, why not? TOMORROW WE SHALL MEET... IN CANTERLOT CASTLE!”

“Shh! This is a library!”

“Sorry. Anyway, let us get down to work!” said Luna as she conjured a blackboard once again. A block of chalk appeared before them and Luna broke off a small piece with her telekinesis. “The first thing to learn, if you want to do physics, is Linear Algebra.”

“Um, what about everything we did yesterday?” asked Twilight.

“That was the zeroth thing you need to learn. Linear algebra gives us the right way to add and multiply the kinds of mathematical objects that represent physical quantities.”

“Aren’t physical quantities just numbers that are multiplied in the ordinary way?”

“They can be, but not always. There are some things in physics that are just ordinary numbers, but more generally they are collections of numbers, which are called tensors. Only whole tensors can be multiplied and only according to certain rules.”

“Why is that?”

“It all follows from the concept of space. It has to do with representation theory and the local coordinate invariance of space. I will show you all about that eventually. Anyway, just trust me for now.”

“Very well, Princess.”

“And of course derivatives and integrals are linear operators as well, but they act on function spaces, which are a more abstract kind of vector space.”

Twilight felt like she was already dreading the lesson.

Definition 2.1 : Morphism

“Linear algebra is about vector spaces and about functions on vector spaces. For every kind of object in mathematics that has operations defined on it, like addition and subtraction, it is also possible to define morphisms between different objects. These are functions mapping one object to another which preserve the object’s characteristic operations. Did that make sense?”

“No.”

Definition 2.2 : Linear Map

“Well never mind, because you only need to understand linear maps for now. These are functions between vector spaces that preserve vector space operations. In other words, a function $A: V \to W$ is a linear map between the vector spaces V and W if

$$A(v + w) = A(v) + A(w) \tag{2.1}$$

$$A(a v) = a A(v) \tag{2.2}$$

It is clear, first of all, that linear maps only exist between vector spaces over the same field, since in this equation, a has to be interpreted as an element in the fields over which both V and W are defined.

Now the interesting thing about a linear map is that applying it can be treated as a kind of multiplication. Do you know how you would prove that?”

Twilight went to the board and levitated a bit of chalk. “I’d have to show that function composition obeyed the three properties of multiplication. Left distributivity is given by assumption as the first property of linear maps. Right distributivity would require showing something like this.

$$(A + B)(v) = A(v) + B(v) \tag{2.3}$$

but I do not even know what A+B would mean.”

“Is the sum of two linear maps, as you have defined it, a linear map?”

“If we take equation 2.3 as a definition, then it looks like it satisfies the right properties.”

“So,” said Luna with a satisfied smile, “you have learned that A+B is a linear operator. But is it really a kind of addition?”

“Associativity and commutativity clearly hold because the addition of operators is defined in terms of the addition of vectors, so necessarily these properties carry over.

There’s also an additive identity. That would be the map that sends every vector to 0.”

“Which can be written as 0.”

“Yes... as previously noted. And there are inverses too, because you can just define (-A)(v)=-A(v). So there is a kind of linear operator addition and right multiplication by vectors distributes over it!”
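
Twilight's claim that the sum "satisfies the right properties" can be checked directly. The following is a sketch, assuming the sum of two maps is defined pointwise by (A+B)(v) = A(v) + B(v) and that A and B each preserve vector addition and scalar multiplication; v, w are vectors and a is a scalar.

```latex
\begin{align*}
(A + B)(v + w) &= A(v + w) + B(v + w) \\
               &= A(v) + A(w) + B(v) + B(w) \\
               &= \bigl(A(v) + B(v)\bigr) + \bigl(A(w) + B(w)\bigr) \\
               &= (A + B)(v) + (A + B)(w) \\[1ex]
(A + B)(a v)   &= A(a v) + B(a v) \\
               &= a A(v) + a B(v) \\
               &= a \bigl(A(v) + B(v)\bigr) \\
               &= a (A + B)(v)
\end{align*}
```

So A + B passes both tests for a linear map.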

“You have learned something important just now. You started out by writing an equation about adding functions, which for all you knew was meaningless, and asked, ‘is this true?’ You then showed how to interpret the equation and asserted the reality of what you had discovered. That is one of the fun things about mathematics; anything that is meaningful is real. Isn’t that wonderful?”

“Sometimes I don’t know about your philosophy, Princess.”

“Of course it is wonderful, Twilight Sparkle! In any case, we know about left and right distributivity. What about associativity?”

“I’d have to prove something like this.

$$(A B)(v) = A(B(v)) \tag{2.4}$$

But I don’t see how-oh well it’s obvious. That’s just the definition of function composition, so of course that’s true. It’s also a statement of associativity for multiplication of A, B, and v.”

“Exactly. It has nothing to do with linearity.

So now we do not have to write A(v) anymore. We can just write A v and say that v has been multiplied by a linear map called A, which obeys the properties

$$A(v + w) = A v + A w, \qquad A(a v) = a A v \tag{2.5}$$

They distribute over addition (which is true of all multiplication) and they commute with scalars.”

“This is confusing. We’re looking at the same thing in two different ways at the same time. A function is something you do, but we’re looking at the function as more of a thing now.”

“That is one of the tricks to getting really good at mathematics. Once you can think of something as two different things at the same time, you know you are getting somewhere!”

Twilight was approaching her limit. She held her head and slowly shook it back and forth as she tried to let the philosophy settle in. “Ooohhhhooohhh...”

“Do not let your brain shut down on me now, Twilight Sparkle! We are only getting started!”

“Please Princess... no more of your philosophy today at least.”

“For the rest of the day only proofs. Promise! The issue that will preoccupy us for now is to characterize the linear operators, especially what we can say about their invertibility.

Now, think of an expression that looks like this:

$$a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots \tag{2.6}$$

Definition 2.3 : Linear Combination

This could be finite, or it could go on forever. Each $v_i$ is a vector and each $a_i$ is a scalar that goes with the corresponding $v_i$. I have taken a set S of vectors that includes all the $v_i$, multiplied each one by its own scalar, and then summed them all together. This is called a linear combination of S.

More formally, you should think of a linear combination as being given by a map from a set of vectors to a set of scalars. Expression 2.6 is more of a mnemonic, but there are linear combinations that could not be written in that way.”

“Why not?”

“Because I have written the expression as if S was a countable set, but it might be uncountable.”

“What does that mean?”

Definition 2.4 : Countable and uncountable

“Oh dear, oh dear... well you asked for it! Countable means that a set can be given as a finite or infinite list, like the list describing the sum in expression 2.6. Essentially a set is countable if it can be mapped one-to-one into the natural numbers. A set is uncountable if it cannot.”

“Are you saying that there are degrees of infinity?” asked Twilight skeptically.

Proposition 2.1

“Precisely so. There are sets that are bigger than the natural numbers. Just as an example, I will show you that the range [0,1) of real numbers is uncountable.

The proof is quite simple. It is a proof by contradiction. Let us say we have an infinite list þ of real numbers which, we hypothesize, contains all of them. We can write it out in decimal form like so.

It is always possible to construct a new real number n which is not in the list. Start with the first digit after the decimal point of n. Make sure it is different from the first digit of $þ_1$. Then you know, whatever the rest of the digits are, that it cannot be equal to $þ_1$. Next, make sure the second digit of n is different from the second digit of $þ_2$. You then know that the new number cannot be equal to $þ_1$ or $þ_2$. Do you see what happens?”

“If you continue this procedure all the way down the list, you always end up with a real number that’s not equal to any element of þ!”

“Right! This disproves the hypothesis that þ contains all the real numbers. And since we made no restrictions upon þ, this proves that no list can contain all the real numbers.”
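
The construction can be written down explicitly. This is a sketch under assumed notation: $t_k$ stands for the k-th entry of the list þ, $d_{kj}$ for its j-th decimal digit, and $e_k$ for the k-th digit of the new number n. The choice of the digits 5 and 6 is also an assumption, made only so that n never ends in repeating 0s or 9s.

```latex
\[
t_k = 0.d_{k1}\, d_{k2}\, d_{k3} \ldots , \qquad
n = 0.e_1\, e_2\, e_3 \ldots , \qquad
e_k =
\begin{cases}
5 & \text{if } d_{kk} \neq 5, \\
6 & \text{if } d_{kk} = 5.
\end{cases}
\]
% For every k the k-th digits differ, e_k \neq d_{kk}, so
\[
n \neq t_k \quad \text{for all } k .
\]
```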

Twilight groaned. “I thought you promised no more philosophy!”

Luna looked perplexed. “This is not philosophy. This is just the ordinary real world!”

“Those are numbers, not reality.”

“Close enough! Just as an aside, I will tell you that the integers are countable and so are the rational numbers. You can think about how to prove that yourself if you want, since you do not actually need to know that.”

“Alright.” Twilight hoped she would never have to think about it again.

Definition 2.5 : Spanning set

“Back to linear algebra. The thing to talk about is the span of a set of vectors. Think of an ordered set of vectors $S = (v_1, v_2, v_3, \ldots)$, called the spanning list, and think of all linear combinations of those vectors. This produces a vector space called the span of S, which I will call V.”

“Why is it important that S be ordered?”

“Because now we can specify every element of V as an ordered list of scalars. The ordering defines the mapping. That is why, in three-dimensional space, you may write vectors as

$$(a_1, a_2, a_3)$$

where the $a_i$ are all real numbers.

“This is a very convenient way to think about a vector space. Normally something like this will define almost every vector space-as the span of a set of vectors. Similarly, it will be much easier to think concretely about linear maps. We just have to think about how the linear map acts upon a spanning list.

By the way, I have not proved that V is a vector space, but can you see why that is?

“It’s obvious.”

Definition 2.6 : Linear independence, basis

“Good. Next we need to talk about linear independence. Now suppose I had a linear combination of S with the condition that

$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0$$

Clearly there is always a trivial solution to this equation if $a_i = 0$ for all i. What we are interested in is a nontrivial solution. In other words, for some i, $a_i \ne 0$. If there is such a solution then S is said to be linearly dependent. Otherwise, it is linearly independent. If S is linearly independent and spans a vector space V, then S is called a basis for V.”

Twilight nodded.

Proposition 2.2

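“The nice thing about a basis is that every vector in the space it spans is uniquely given as a linear combination of the basis.

Now assume there are two distinct linear combinations, which I will write as $a_1 v_1 + \cdots + a_n v_n$ and $b_1 v_1 + \cdots + b_n v_n$, which produce the same vector w.

$$w = a_1 v_1 + \cdots + a_n v_n = b_1 v_1 + \cdots + b_n v_n$$

I can just subtract one linear combination from the other to get zero.

$$(a_1 - b_1) v_1 + \cdots + (a_n - b_n) v_n = 0$$

Since the two combinations are distinct, at least one of the $a_i - b_i$ is nonzero, which proves that S is not linearly independent. You see?”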

“Yes.”

Corollary 2.3

“This also proves that if S is a list of vectors and an extra vector w in span(S) is appended to S, then the result will no longer be linearly independent. There will be two ways to produce w: itself, and, because w is in span(S), there must also be some linear combination of S that produces it.”

Twilight nodded. “But I need to develop some intuition about these. Can we discuss a few simple examples?”

“Very well. Hum. Well think about the vector space ℝ. Are the vectors 1 and 2 linearly independent?”

“Of course not, since 0=2*1-2.”

“And clearly, it would be impossible to have a linearly independent set with more than one element in ℝ.”

“Yes. For any two numbers there’s a linear combination that results in zero.”

“What about ℝ²? Try the vectors {1,0}, {0,1}, {1,1} and {2,2}.”

“Those vectors are not linearly independent because the last two point in the same direction.”

“And if you removed the last one?”

“Then you could still form the third vector by summing the other two.”

“And if you removed the third vector from the list?”

“Then the vectors would be linearly independent because they all point in different directions.”
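
[Aside: Twilight’s answers can be checked mechanically. The sketch below is one possible way to do it in Python, not anything Luna wrote on the wall: the helper names `rank` and `is_linearly_independent` are invented for illustration, and the test simply counts how many vectors survive Gaussian elimination, using exact fractions to avoid rounding trouble.]

```python
from fractions import Fraction


def rank(vectors):
    """Number of linearly independent vectors in the list, found by
    Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear this column in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """A list is linearly independent exactly when no vector is redundant."""
    return rank(vectors) == len(vectors)


print(is_linearly_independent([[1], [2]]))                        # the example in ℝ
print(is_linearly_independent([[1, 0], [0, 1], [1, 1], [2, 2]]))  # all four vectors
print(is_linearly_independent([[1, 0], [0, 1]]))                  # after removing two
```

The three calls correspond, in order, to the example in ℝ, the full list of four vectors in ℝ², and the list that remains once {1,1} and {2,2} are removed.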

“I think you are getting the hang of this, Twilight Sparkle. What about ℝ³?”

“It seems like you could have three linearly independent vectors. So {1,0,0}, {0,1,0}, and {0,0,1} are linearly independent, but if you added any other vector to the list, they no longer would be. Hm. I just noticed there may be some other vector spaces hiding in here. Let , , and be these vectors. Suppose I let . Then and on their own form a linearly dependent set. This means that and are both linearly independent, so neither of them can have a nonzero factor in any linear combination that is equal to zero.”

“You are really getting the hang of this! Yes, you can always divide a list of vectors into a vector space which is linearly independent and another which is linearly dependent.”

Twilight smiled. She really did feel like she was understanding! “It seems like you can’t have more linearly independent vectors than the dimension of the space because that’s the number of vectors you need to span that space.”

“Quite so! But actually we haven’t defined the concept of a dimension yet, so you’re jumping ahead.”

Twilight frowned. “But aren’t you already admitting implicitly that this is so by writing all the vectors in ℝ as single numbers, and the vectors in ℝ² as lists of two numbers, and allowing me to write the vectors in ℝ³ as lists of three numbers? Clearly the reason that notation works is that the list represents a linear combination of a spanning set of vectors, and if you weren’t sure you didn’t need any more than the dimension of the space, you would write a longer list!”

Luna narrowed her eyes. “I admit nothing.”

“Oh come on!” Twilight stamped a forehoof.

Luna waved her forehoof. “You have passed my test!”

“What test?” asked Twilight with annoyance.

“I merely reprimanded you to test your intuition. You have enough to proceed.”

“Right... a test.”

Theorem 2.4

“We shall have to prove that intuition now. We desire to show that... ahem... EVERY BASIS OF ANY FINITE-DIMENSIONAL VECTOR SPACE HAS THE SAME SIZE (WHICH SHALL BE THE DEFINITION OF DIMENSION), AND THAT EVERY LINEARLY-INDEPENDENT LIST OF VECTORS OF THE SAME SIZE AS THE DIMENSION MUST SPAN THE ENTIRE SPACE. This will characterize vector spaces.”

“Shhh!” Twilight covered her face with embarrassment.

“Well that is the most important theorem!”

Lemma 2.5

“The first thing to prove is that for every linearly dependent spanning set S, there exists a vector which can be removed from the set without altering the vector space it spans. Let us write

$$\sum_i a_i v_i$$

to denote a linear combination of S. Is that good?”

Twilight nodded.

“You are not going to go crazy on me, are you, Twilight Sparkle?”

Twilight was struck by the irony that Luna would be worried she might go crazy. “What exactly are you iterating over?”

“I just wrote i under the sum symbol to say that we sum over i. We do not worry over how big S is, so you can assume that the sum is over whatever values it needs to be.

“Now, since S is linearly dependent, some such combination is equal to zero without all of the $a_i$ being zero. Let $a_j$ be a nonzero value and write

$$v_j = -\frac{1}{a_j} \sum_{i \ne j} a_i v_i$$

This nearly proves the lemma. As to this notation, there is no good way of saying you want to iterate over some set except skip one element of it. The subscript on the sum says we sum over i such that $i \ne j$. How annoying. Next what do you think you could do?”

Twilight stared at the equations for a second. “We want to show that the span of S with $v_j$ removed is the same as span(S), so to do that we’d have to show that every vector which can be a linear combination of S is also a linear combination of S with $v_j$ removed.”

“Good! Now try to do that.”

“Say that there was a vector w which is a linear combination of S. Let’s write

$$w = \sum_i b_i v_i = b_j v_j + \sum_{i \ne j} b_i v_i$$

Now I can substitute in the expression for $v_j$.

$$w = \sum_{i \ne j} \left( b_i - \frac{b_j a_i}{a_j} \right) v_i$$

and this shows that w can be written without relying on $v_j$ at all. QED, Princess!”

Proposition 2.6

“And look what a complicated-looking equation you’ve written!”

“I did, didn’t I?” Twilight said gleefully.

“Now here comes the proof that everything hinges on. In a vector space V, every finite linearly independent list is smaller than or equal in size to any spanning list of V.

Let S be a linearly-independent list of vectors (though not necessarily one that spans V), and let U be a list of vectors that spans V.

Now, if we remove an element $s_1$ from S and insert it into U to produce $U_1$, we can be sure that $U_1$ is linearly dependent by corollary 2.3. Therefore there is some element of U that can be removed from $U_1$ without changing the span.

Next, remove another element $s_2$ from S and insert it to produce $U_2$. Once again, $U_2$ must be linearly dependent. All of the $s_i$ in $U_2$ are linearly independent, so we cannot safely remove any of them, but there must be some element of U that can be removed from $U_2$ without changing its span.

Continue the process, adding an extra element from S to $U_k$ with each step and removing an element of U. What happens? First suppose that S is finite and |S|>|U|.”

“In that case, the process must terminate at some step n such that $U_n$ has no more elements of U left in it.”

“But we know that $U_{n+1}$ is linearly dependent because of corollary 2.3, even though it contains only elements of S, right?”

“So there’s a contradiction. Therefore, either S and U are both infinite...”

“In which case the process never terminates and nothing is proven.”

“... or S is finite and |S|≤|U|. Which proves the proposition.”

“Right! By the way, what can we prove if S and U are both finite and of the same size?”

“Then the process terminates at some $U_n$ which contains every element of S and no elements of U.”

“Which proves that S is a spanning set! That proves part of theorem 2.4. To prove the rest, suppose U is a finite basis for V. Can S also be a basis?”

“Only if S is the same size as U.”

“If S is a finite basis, what does that say about U?”

“U is also a finite basis of the same size as S.”

“So every finite basis of V must have the same size and every linearly-independent list which is the same size as a basis must also be a basis.”

Definition 2.7 : Dimension

“I hope I can finally talk about the dimension of a vector space now!”

“You can talk about the dimension of a finite-dimensional vector space now,” Luna said sternly. “For shorthand we can write dim(V) for the dimension of V. Now, what if U and S are both infinite bases for V?”

Proposition 2.7

“Then the issue is more subtle.”

“Exactly. So for an infinite-dimensional vector space, it is possible that there could be two bases, one of which is countable and one uncountable. We will see some examples of that eventually.”

“So infinite-dimensional vector spaces don’t really have a very clear dimension!”

“No, the most you can say is that their dimension is infinite, but it does not necessarily settle on any kind of infinity.”

“I think I need more intuition about that.”

“Not now. We won’t need any infinite-dimensional vector spaces for a while!

The important thing is that we now know how to think of a vector as a list. You can define some basis of the vector space, and write vectors as a list that describes a linear combination on that basis.”

“Of course, I already did know that...” Twilight said under her breath.

Definition 2.8 : Image, kernel

“Now let us think about linear maps. Let A:V→W be a linear map, and let V be a finite-dimensional vector space with basis B. I will define A B to refer to the list of vectors that results when each vector in B is multiplied by A. We refer to span(A B) as the image of A. The image is a subspace of W. And the kernel is the set of vectors in V which A sends to zero.”

“Got it.”

“We can write ker(A) for the kernel of A and Img(A) for the image of A. It should be easy to see that the kernel is a vector space.”

“That would follow easily from the definition of left distributivity for a linear map.”

“Now if A B is a basis for the image of A, then necessarily A B is linearly independent, whereas if some nonzero vector in V was sent to 0, then A B could not be linearly independent because there would be some nonzero linear combination of it that is sent to zero, right?”

“Right.”

Proposition 2.8

“So A is only invertible if its kernel is zero-dimensional.”

“Ok.”

“Now let us define the vector lists P and Q such that A P is a basis for the image of A and Q is a basis for the kernel of A. Clearly P is linearly independent because otherwise A P could not be, and clearly P∪Q must be linearly independent because, if not, then there would be some nonzero vector v in span(P) which is also in the kernel of A, so that a nonzero linear combination of A P would be zero, which is impossible for a basis. Also, P∪Q must be a basis of V because, for any w in V, A w is in the image of A and so equals A p for some p in span(P), which puts w-p in the kernel of A and w in span(P∪Q).

Proposition 2.9

This proves that the dimension of the domain of A must be equal to the dimension of the kernel of A plus the dimension of the image of A.”

$$\dim(V) = \dim(\ker(A)) + \dim(\mathrm{Img}(A)) \qquad (2.7)$$

“Linear algebra is becoming simpler by the minute. It’s all very constrained.”

Definition 2.9 : Invertibility

“Indeed. The last topic for today is invertibility. A linear map A is invertible if there is a linear map Z such that Z A=1 and A Z=1. We do not like to worry about cases where there is only a left inverse or only a right inverse, so we say that both have to be inverses of each other. By the normal rules of multiplication, we know the inverse is unique.”

Proposition 2.10

“And we also know that V and W must both have the same dimension, since both A and Z must have a zero-dimensional kernel!” said Twilight with glee.

“Right! This implies that both A and Z are injective and surjective. Injective is the same as saying they both have a trivial kernel, and if either were not surjective, then it could not be that the inverse was injective.

Definition 2.10 : Isomorphic

Now we say that two vector spaces are isomorphic if there is an invertible linear map between them. Two vector spaces which are isomorphic can be treated as the same, because you can think of either one as the other one, just with a map applied to it. Let us show that two finite-dimensional vector spaces are isomorphic if they have the same dimension.

Let A:V→W and Z:W→V. We know that every finite-dimensional vector space has a basis already, right?”

“Yes. That was in 2.6.”

“Oh yes. Now then, let $\{v_1, \ldots, v_n\}$ be a basis for V and let $\{w_1, \ldots, w_n\}$ be a basis for W. Now we can define

$$A v_i = w_i, \qquad Z w_i = v_i$$

Clearly A and Z as defined are both linear, and since we know that every vector has a unique representation as a linear combination of a basis, this definition is not logically contradictory either. A and Z as defined are both inverses of one another, and therefore V and W are isomorphic!”

“This means that there is only one vector space of any dimension for every field!”

“Yes, so like , , and are pretty much what we are talking about, at least for finite vector spaces.”

“Wait a minute... what’s ?”

“Oh that... nothing! You’ll hear all about it later!”

Luna winked and disappeared. Twilight blinked and hoped she would be able to keep herself out of trouble.

Day 3 - Inner and Outer Products

Unit 1: Space

Lesson 1- Algebra

Day 3- Inner and Outer Products

The next morning at dawn Twilight dutifully appeared in the Canterlot Castle courtyard. During peacetime the gates were open and there were a few ponies around, sight-seeing or on business. The sun was not quite up and she could see a tiny dark speck at the top of a tower, keeping watch over the night.

She was tired from not being able to sleep in after her all-night studying session the other day, but she didn’t want to do anything that might make Luna cause more trouble. It was not a terribly great distance to Canterlot but it was not the easiest commute to make every day. However, there was little one could do when a princess demanded one’s presence.

Presently Celestia alighted next to her sister on the tower to cast the spell to make the sun rise. Soon after, Luna gracefully floated down into the courtyard.

“Good morning, disciple Twilight Sparkle. Are you ready?”

“Yes, I mean such a lovely morning... who wouldn’t be ready for mathematics at this time?” said Twilight with a hint of sarcasm.

“Lovely morning? Well I admit it is one of my sister’s better dawns but to tell you the truth I do not think she put as much effort into it as she could have!”

“Er, it was a lovely night too I mean! I could tell that you worked hard at it.”

Luna smiled. “Why thank you! That is right, I did!

Now, today we'll take just a little step closer to geometry. Today I will begin to show you how Linear Algebra can take on a geometrical interpretation. Now the other day you suggested that a vector might be a thing with a magnitude and direction. It's not, of course, but let us think about things which do have a magnitude and direction and think about how our theory of vectors can apply to them.”

“So, like velocities and accelerations.”

“Yes, like those. So what is missing from our theory of vectors?”

“I don’t know.”

“Everything! Our vectors have neither a magnitude nor direction! That is another reason you should never define a vector that way.”

“How can you say they don’t have a magnitude and direction? It seems like they do.”

“What would a magnitude be? At the very least, it would be a function V→ℝ that tells how to come up with a length for each vector. But there is nothing like that in the definition! And as for a direction, just consider this example in ℝ².” Luna summoned a block of chalk, kicked off a piece with her forehoof, and wrote directly upon the inner wall of the Canterlot courtyard.

“You can see that v and w may look like they have nearly the same direction, but now

The result of M applied to them is that they now look like they are in almost opposite directions. Remember that M induces an isomorphism from ℝ² to itself, so the relationship of the vectors before applying it is no more real than after. This just shows that a vector space on its own has no concept of angles.”

“Wait a minute... what’s that?”

“What?”

“That square of numbers you called M!”

“Oh dear... did I forget to tell you about matrices?”

“Well actually I do know what a matrix is, but I want to hear your explanation.”

“Very well!” said Luna, who smiled slightly at the flattery. “We know we can define a linear transformation Z:V→W by what it does to each basis vector in V, right? This is possible because, since linear combinations on a basis define vectors uniquely, a map on a basis uniquely defines a map on V without ever trying to send the same vector to different places.

Say that I define a map Z:V→W by $Z \vec{e}_i = \vec{\omega}_i$. In other words, each $\vec{e}_i$ is sent by Z to the corresponding $\vec{\omega}_i$. The $\vec{e}_i$ together form a basis in V, but the $\vec{\omega}_i$ might not form a basis in W. I can now write Z as

$$Z = \begin{pmatrix} \vec{\omega}_1 & \vec{\omega}_2 & \cdots & \vec{\omega}_n \end{pmatrix} \qquad (3.1)$$

where I have written little arrows on top of the vectors because otherwise this will get confusing.

If V is n-dimensional and W is m-dimensional, the linear map must define n vectors, each with m components. That means there are m×n components in total that are required to specify the linear transformation, you see?”

“Yes.”

Definition 3.1 : Matrix

“So just as you can denote a vector as a list of length n, you can denote a linear transformation by an m×n box, which is called a matrix. Normally this is done with the components of each $\vec{\omega}_i$ written vertically like so, you see?

$$Z = \begin{pmatrix} z_{11} & z_{12} & \cdots & z_{1n} \\ z_{21} & z_{22} & \cdots & z_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ z_{m1} & z_{m2} & \cdots & z_{mn} \end{pmatrix}$$

where $z_{ij}$ means the jth column of the ith row of Z.”

“It seems a little bit confusing,” said Twilight, “that the indices on the components of Z are reversed relative to those on the $\vec{\omega}_i$.”

“It is confusing, but that is the convention. So watch out Twilight Sparkle! Now find the multiplication rule for Z in terms of the components of the vectors in W. You may use $\vec{f}_j$ to represent a basis in W.”

Twilight broke off her own piece of chalk. “I can start by writing each $\vec{\omega}_i$ in terms of its components using the basis $\vec{f}_j$.

$$Z \vec{v} = \sum_i v_i \vec{\omega}_i = \sum_i v_i \sum_j \omega_{ij} \vec{f}_j \qquad (3.2)$$

then I replace the ωs with zs.”

$$Z \vec{v} = \sum_j \left( \sum_i z_{ji} v_i \right) \vec{f}_j \qquad (3.3)$$

Proposition 3.1

“So,” said Twilight, “each row of the matrix defines something like a linear combination on the components of $\vec{v}$. However, the $z_{ij}$ and $v_j$ are numbers, not vectors, so it’s not quite the same.”

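“Correct. Actually, it is not bad at all that I forgot to tell you about matrices yesterday because this is just the sort of operation we need to give vectors magnitudes and directions. I now define the dot product.

$$\vec{v} \cdot \vec{w} = \sum_i v_i w_i \qquad (3.4)$$

You just sum the product of the components of two vectors. Now we can redefine matrix multiplication like so:

$$Z \vec{v} = \begin{pmatrix} \vec{z}_1 \cdot \vec{v} \\ \vec{z}_2 \cdot \vec{v} \\ \vdots \\ \vec{z}_m \cdot \vec{v} \end{pmatrix} \qquad (3.5)$$

where $\vec{z}_i$ is the ith row of Z, so that now each row in the matrix is a kind of vector rather than each column.”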
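
[Aside: the two ways of looking at matrix multiplication, by columns and by rows, can be compared directly. This is a small Python sketch with invented names (`dot`, `mat_vec_by_rows`, `mat_vec_by_columns`) and made-up numbers, assuming a matrix is stored as a list of rows.]

```python
def dot(v, w):
    """Dot product: sum the products of matching components."""
    return sum(a * b for a, b in zip(v, w))


def mat_vec_by_rows(Z, v):
    """Each component of the result is the dot product of one row of Z with v."""
    return [dot(row, v) for row in Z]


def mat_vec_by_columns(Z, v):
    """The result is a linear combination of the columns of Z,
    weighted by the components of v."""
    result = [0] * len(Z)
    for j, v_j in enumerate(v):
        for i in range(len(Z)):
            result[i] += v_j * Z[i][j]
    return result


# A hypothetical 2x3 matrix (a map from a 3-dimensional space to a 2-dimensional one).
Z = [[1, 2, 0],
     [0, 1, 3]]
v = [2, 1, 1]

print(mat_vec_by_rows(Z, v))
print(mat_vec_by_columns(Z, v))
```

Both functions return the same list, which is the point: the row picture and the column picture are two readings of one rule.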

“That’s much less confusing.”

“And it gives us a notion of length too.”

“Yes. If this were a vector in Euclidean space, then the length of a line segment is

$$|\vec{v}| = \sqrt{\vec{v} \cdot \vec{v}} = \sqrt{\sum_i v_i^2} \qquad (3.6)$$

by the Pythagorean theorem.”

“The dot product rule. This rule also gives us a concept of an angle. Think of two unit vectors $\hat{u}$ and $\hat{v}$. This diagram shows that their dot product will be related to the angle between them.

If $\hat{u}$ lies along the first axis and $\hat{v}$ makes an angle θ with it, their dot product would be given by

$$\hat{u} \cdot \hat{v} = 1 \cdot \cos\theta + 0 \cdot \sin\theta = \cos\theta$$

so you see that only the part of $\hat{v}$ which is parallel to $\hat{u}$ contributes to their dot product. And if $\hat{v}$ were orthogonal to $\hat{u}$, then the dot product would be zero. That will become our definition of orthogonality soon.”

Twilight said, “So the angle would be given by

$$\theta = \arccos(\hat{u} \cdot \hat{v}) \qquad (3.7)$$

but does this diagram really give the right idea? You’ve actually drawn it so that one of the vectors is along an axis to make the conclusion obvious, but is that true in general?”

“That is a good question. If neither vector was aligned to an axis, then we could just draw rotated axes to align with either of the vectors and the conclusion would be the same. Of course, that depends on knowing that the dot product is rotation-independent. In the future, we will construct spaces which we know are rotation independent, and then we will find that some algebraic expression has an especially simple form when rotated in alignment with some axis. Since the space is rotation-independent, we will know that it does not actually matter how the expression is aligned, and will make general conclusions from it.

We must now ponder upon how this idea of a dot product can be generalized. We have an idea of the kind of thing we want, but we do not have the general idea yet. The dot product is not a real thing because it depends on the particular basis: two vectors will have a different dot product with one another if you just change the basis of the space. So we need to search for something more objective.”

Twilight nodded. “But wait a minute, is that really necessary? The dot product works, right? Why don’t we just say that one basis is correct and use it? This is all too esoteric. That’s exactly what I had trouble with in physics: there are so many concepts that are not, strictly speaking, necessary at all!”

Definition 3.2 : Operator

“Suppose I define a linear map H:V→V (a linear map from a space to itself is called an operator, by the way) which performs a coordinate transform on V. In other words, it is defined as a map from one basis of V to another. It is therefore invertible, correct?”

“Now, as I was saying, I defined the operator H. What do you get if you want to change coordinates yet preserve the value of the dot product?”

“I would have to multiply both vectors by H. That would be a change in coordinates. But then I would have to also multiply each by $H^{-1}$, since otherwise the dot product would not be preserved.”

“Right. You can think of the $H\vec{v}$ and $H\vec{w}$ as being the new vectors and the two $H^{-1}$s as somehow modifying the dot product itself. In order to write this expression in a more understandable way, let us think about something simpler. Suppose we have an invertible operator N and we define a new operator M by

$$(N \vec{v}) \cdot \vec{w} = \vec{v} \cdot (M \vec{w})$$

“Wait a minute. Are you sure there must actually be an operator M which can fit in there?”

“Yes. N behaves just like a change of basis on $\vec{v}$, so there must be some change of basis on $\vec{w}$ that produces the same result in the end. Your job is to find the relationship between M and N.”

Twilight was perplexed. “I’m really not sure how I’d go about this.”

“Let me give you another way of thinking about it. You know how to multiply a vector by an operator on the left. Imagine now that you can multiply a vector on the right instead, and you want to figure out how to do it.”

Twilight stared at the wall for a moment, and then she smiled. “I have it! Right multiplication by a matrix can be defined by the associativity rule. In other words,

$$(\vec{v} N) \cdot \vec{w} = \vec{v} \cdot (N \vec{w})$$

“Go on!” said Luna.

“I just have to expand that expression out and see if I can find an equivalent

$$\vec{v} \cdot (N \vec{w}) = (v_1 \vec{e}_1 + v_2 \vec{e}_2 + \cdots) \cdot \big( (N_{11} w_1 + N_{12} w_2 + \cdots)\, \vec{e}_1 + (N_{21} w_1 + N_{22} w_2 + \cdots)\, \vec{e}_2 + \cdots \big) \qquad (3.8)$$

“Oh dear, this is going to get messy...” Twilight stopped. She could feel her four ankles quiver. “Somehow I’ll have to factor out $\vec{w}$, and then what’s leftover will be $\vec{v} N$.”

Luna whispered in her ear. “Stay calm and do this problem for me. I promise I will show you something wonderful when you are done!”

Twilight blushed and wondered what Luna meant. However, her words had brought some of her confidence back. “Alright...” whimpered Twilight.

“Try writing it as a summation,” Luna suggested.

With trepidation, Twilight wrote on the wall again.

$$\vec{v} \cdot (N \vec{w}) = \sum_i v_i \vec{e}_i \cdot \sum_j \sum_k N_{jk} w_k \vec{e}_j = \sum_k \sum_i \sum_j v_i N_{jk}\, (\vec{e}_i \cdot \vec{e}_j)\, w_k \qquad (3.9)$$

“There, you see!” said Luna. “You factored that whole thing out and it was as easy as moving a summation sign to the left. Now remember, by the rules of the dot product, $\vec{e}_i \cdot \vec{e}_j = 0$ unless i=j.”

Twilight closed her eyes and tried to think about what Luna said. This would mean that she could set j=i in the summand and remove the sum over j! She wrote

$$\vec{v} \cdot (N \vec{w}) = \sum_k \left( \sum_i v_i N_{ik} \right) w_k \qquad (3.10)$$

“That is it!” whispered Luna.

“But now what?”

“Are you sure you want me to tell you?”

Twilight’s heart was pounding.

$$(\vec{v} N)_k = \sum_i v_i N_{ik} \qquad (3.11)$$

“Bravo!” said Luna. “You have the right multiplication rule!”

“It’s just like left multiplication... except that the sum is over the columns of the matrix rather than the rows.”

“Exactly. Good disciple!”
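
[Aside: the right multiplication rule is easy to test with numbers. The Python sketch below uses invented names (`left_multiply`, `right_multiply`, `transpose`) and a made-up 2×2 matrix, and it assumes the ordinary dot product in which distinct basis vectors have a dot product of zero.]

```python
def dot(v, w):
    """Ordinary dot product of two component lists."""
    return sum(a * b for a, b in zip(v, w))


def left_multiply(N, v):
    """(N v)_i = sum over j of N[i][j] * v[j]."""
    return [sum(N[i][j] * v[j] for j in range(len(v))) for i in range(len(N))]


def right_multiply(v, N):
    """(v N)_k = sum over i of v[i] * N[i][k]."""
    return [sum(v[i] * N[i][k] for i in range(len(v))) for k in range(len(N[0]))]


def transpose(N):
    """Swap rows and columns."""
    return [list(column) for column in zip(*N)]


# Hypothetical sample values.
N = [[1, 2],
     [3, 4]]
v = [1, 1]
w = [2, 0]

# The associativity rule that defines right multiplication.
print(dot(right_multiply(v, N), w), dot(v, left_multiply(N, w)))
# Right multiplication by N is left multiplication by the transpose of N.
print(right_multiply(v, N), left_multiply(transpose(N), v))
```

Each printed pair agrees, so multiplying on the right by a matrix does the same job as multiplying on the left by its transpose.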

Twilight smiled in spite of herself. “And what was the wonderful thing you were going to show me?”

Luna nodded mysteriously. “I will show you.

Definition 3.3 : Abstract index notation

You had been writing vectors something like $\sum_i v_i \vec{e}_i$ for your proof. This notation is very redundant. You know that a vector is a sum of components over a basis. All that matters is the components. So, get rid of the $\sum$ and get rid of the $\vec{e}_i$. Just write a vector as $v_i$. The i is now a free index. I have used free indices earlier today, but now they have a special meaning. It means spatial vector.

We have also been writing something like $\sum_j M_{ij} v_j$ to denote the product of M with $\vec{v}$. But we know how to matrix multiply, so there is no need to write it out so explicitly. Instead, write $M_{ij} v_j$ and assume that the repeated index j must be summed over. The expression has one free index, which makes it a spatial vector. Then we just write a linear operator as $M_{ij}$, and the two free indices indicate that it is a linear operator.”

“Isn’t it technically a matrix rather than a linear operator?” asked Twilight. “In writing each object with an index aren’t we implicitly denoting the object as an array with individual parts? I thought we always wanted to be careful not to do anything that is basis dependent.”

“Historically, you are correct. But the notation does not need to be interpreted in this way. It is true that we do not wish to do anything that depends on a particular basis and in fact we do not even know if every vector space has a basis. However, we can always think of the indices as simply denoting the kind of object and the kind of multiplication rather than an actual index over an array.”

“I see.”

“I want you to try to prove the same thing you just did using the abstract index notation. Try to show how right-multiplication of linear operators works.” Luna then wrote

$$(N_{ij} v_j)\, w_i = v_j\, (M_{ji} w_i) \qquad (3.12)$$

Twilight squinted at the expression for a moment. “Well now there’s nothing to prove! The way that the matrix multiplication is written makes it obvious! N and M are just transposes of one another.”

“What have you learned, Twilight Sparkle?”

“I went through all that horrible algebra with summations when I could have just done... nothing!”

“Can you draw a more generally applicable lesson from this?”

Twilight hung her head. “I won’t discount the power of abstraction. I’ll try to learn how to use the best mental tools.”

“That is right, Twilight Sparkle!”

Luna turned back to the wall and with a wave of her hoof said, “Now to return to the generalization of the dot product. As we decided,

$$v_i w_i = (H_{jk} v_k) \left( H^{-1}_{ij} H^{-1}_{il} \right) (H_{lm} w_m)$$

I have written this in abstract index notation now.

Changing the basis results in two new vectors, $H_{jk} v_k$ and $H_{lm} w_m$, as well as a new operator $H^{-1}_{ij} H^{-1}_{il}$, which looks like the product of an operator with its own transpose. So we can think of the more general form of the dot product, which we shall call the inner product. It will work something like that.

Let’s define

$$g_{ij} = H^{-1}_{ki} H^{-1}_{kj} \qquad (3.13)$$

This is a linear map which is a square of two linear maps. You will now prove some properties of $g_{ij}$. You may use whichever notation you find most convenient.

Proposition 3.2

The first property is symmetry. You must show that $v_i g_{ij} w_j = w_i g_{ij} v_j$.”

Twilight wrote

$$v_i g_{ij} w_j = v_i H^{-1}_{ki} H^{-1}_{kj} w_j = w_j H^{-1}_{kj} H^{-1}_{ki} v_i = w_i H^{-1}_{ki} H^{-1}_{kj} v_j = w_i g_{ij} v_j \qquad (3.14)$$

and said, “For one of the steps, I had to rename some of the repeated indices. Of course that’s fine because the letter on the indices is meaningless.”

Proposition 3.3

“The second property,” said Luna, “is positive definiteness. That means $v_i g_{ij} v_j \ge 0$, and it’s only zero when $v_i = 0$.”

Twilight said, “That one is easy. You can clearly see that $v_i g_{ij} v_j$ is a dot product of a vector with itself.

$$v_i g_{ij} v_j = (H^{-1}_{ki} v_i)(H^{-1}_{kj} v_j) \qquad (3.15)$$

The dot product of a vector with itself is a sum of squares, which we know will never be negative, at least for real numbers. And since $H^{-1}$ is invertible, it is only zero when $v_i = 0$.”
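
[Aside: a numerical check of what Twilight has just shown. This Python sketch is an illustration under stated assumptions, not part of the lesson: the helper names and the particular 2×2 change of basis H are invented, and exact fractions are used so that equality tests are reliable. It builds g from the inverse of H, then confirms that g is symmetric and that the inner product of the transformed vectors equals the original dot product.]

```python
from fractions import Fraction


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def mat_vec(A, v):
    return [dot(row, v) for row in A]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(column) for column in zip(*A)]


def inverse_2x2(A):
    """Inverse of a 2x2 matrix, assuming its determinant is nonzero."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def inner(v, g, w):
    """Generalized inner product: v_i g_ij w_j."""
    return dot(v, mat_vec(g, w))


H = [[2, 1],
     [1, 1]]                            # hypothetical change of basis
H_inv = inverse_2x2(H)
g = mat_mul(transpose(H_inv), H_inv)    # g_ij = (H^-1)_ki (H^-1)_kj

v, w = [1, 2], [3, -1]
new_v, new_w = mat_vec(H, v), mat_vec(H, w)

print(g == transpose(g))                  # symmetry
print(dot(v, w), inner(new_v, g, new_w))  # the dot product is preserved
print(inner(new_v, g, new_v))             # a sum of squares, never negative
```

With these sample values g works out to the matrix with rows (2, -3) and (-3, 5), and both the old dot product and the new inner product of v with w equal 1.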

Definition 3.4 : Inner-product space

“Yes, and we will only bother to think about real vector spaces for now, so that is the most general result we need. However, eventually we shall want to generalize the concept. A real inner-product space is a vector space V over ℝ which comes with a symmetric, positive definite bilinear map M:V×V→ℝ. If $\vec{v}$ and $\vec{w}$ are in V then define

$$\vec{v} \cdot \vec{w} = M(\vec{v}, \vec{w}) = v_i g_{ij} w_j \qquad (3.16)$$

Definition 3.5 : Quadratic form

and the operator g defines the map M. The positive-definiteness of g implies that it is invertible. This is what I shall mean when I use the dot notation from now on. The operator g which defines the inner product is called a positive definite quadratic form.

Definition 3.6 : Length, unit vector

And finally we have an objective way to talk about lengths and angles. The length of a vector $\vec{v}$ is written $|\vec{v}|$ and given by $\sqrt{\vec{v} \cdot \vec{v}}$. A vector whose length is 1 is a unit vector.

Definition 3.7 : Parallel, orthogonal

Two unit vectors $\hat{u}$ and $\hat{v}$ are orthogonal if $\hat{u} \cdot \hat{v} = 0$ and parallel if $|\hat{u} \cdot \hat{v}| = 1$.

Definition 3.8 : Orthonormal basis

Finally, these concepts allow us to construct a basis for our space that more closely resembles the familiar one from geometry. In geometry, you have the x, y, and z axes, and they are normally perpendicular to one another. And advancing along the x-coordinate by one unit is equivalent to a distance of one unit. Now we know how to make something like that because we know how to require that each vector in a basis is perpendicular to all the others and has a length of one.”

Twilight began to fume with frustration when she heard that. “That is so obvious it doesn’t make sense to even consider other ways of doing it! Why do we have to go through so much math just to end up constructing the most obvious thing ever??” wailed Twilight.

“That is just a cultural bias. A few thousand years ago, when mathematics was getting started, nobody had even heard of orthonormal bases!”

Twilight groaned and let her head droop to the floor.

“Now now,” Luna said sweetly, “What was the lesson you just told me you had learned a few minutes ago?”

Twilight sighed. “I will learn the tools because they will make things clearer in the end.”

“That is right! Do not forget so quickly, Twilight Sparkle.

Proposition 3.4

Now I want you to prove that an orthonormal set of vectors is linearly independent. This should be an easy one!”

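Twilight thought for a moment and tried to find the perfect way to state the proof. “If two vectors, $\vec{v}$ and $\vec{w}$, are orthogonal, then

$$(a \vec{v} + b \vec{w}) \cdot (a \vec{v} + b \vec{w}) = a^2\, \vec{v} \cdot \vec{v} + ab\, \vec{v} \cdot \vec{w} + ab\, \vec{w} \cdot \vec{v} + b^2\, \vec{w} \cdot \vec{w}$$

Since the inner product is linear, it distributes over addition. The terms $ab\, \vec{v} \cdot \vec{w}$ and $ab\, \vec{w} \cdot \vec{v}$ are zero because the vectors are orthogonal. That leaves $a^2\, \vec{v} \cdot \vec{v} + b^2\, \vec{w} \cdot \vec{w}$. If the linear combination $a \vec{v} + b \vec{w}$ were zero, this expression would have to be zero too. Because the inner product is positive definite, the only way that this expression could be zero is if both a and b are zero. In other words, two orthogonal vectors are linearly independent. The same argument works for any number of mutually orthogonal vectors, so an orthonormal set of vectors must be linearly independent.”

“Quite correct, Twilight Sparkle.”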

Proposition 3.5

“The last thing we shall prove this pleasant but somewhat garish morning is that an orthonormal basis always exists. To do that we shall have to think a little bit about projection operators.

By a projection of $\vec{v}$ onto $\hat{u}$, I mean that one finds the component of $\vec{v}$ in the direction of $\hat{u}$. I will say that $\hat{u}$ is a unit vector. This will make the problem easier. What do you think the formula for that is?”

“From your diagram, it seems as if $\vec{v} \cdot \hat{u}$ ought to be the length of the result we want, and $\hat{u}$ is in the right direction. So I think the answer is $(\vec{v} \cdot \hat{u})\, \hat{u}$.”

Definition 3.9 : Projection

“That is right, and that is how a projection shall be defined.

Now, is the projection a linear operation on $\vec{v}$?”

“It is. It is easy to see that it would distribute over addition and commute with scalars.”

“Let us call the operator $P_{\hat{u}}$. Can you write a formula for this operator acting on $\vec{v}$?”

“I think I can using the abstract index notation.

$$(P_{\hat{u}} \vec{v})_i = u_i\, (u_k g_{kj} v_j) \qquad (3.17)$$

That’s an inner product, which is a scalar, multiplied by a vector. The two operations together work out like a linear map.”

Definition 3.10 : Outer product

“Yes. Now let me define another kind of product. The outer product is a way of multiplying two vectors in an inner-product space to form a linear map. Here is the definition.

$$(\vec{v} \otimes \vec{w})_{ij} = v_i w_k g_{kj}$$

I put the g matrix in there because now, when you multiply by a vector, multiplication by w works like a proper inner product. Otherwise the result would not be coordinate-independent. And now we can write $P_{\hat{u}}$ as

$$P_{\hat{u}} = \hat{u} \otimes \hat{u} \qquad (3.18)$$

Make sense?”

Twilight nodded. “Yes, but if you can multiply two vectors to produce a linear map, shouldn’t you we able to multiply more of them together to produce a new kind of object?”

“Very good question! You can indeed do that. However, we shall save the more general theory for another day. My next question is, suppose you have a unit vector and you wish to project a vector so that it is perpendicular to rather than parallel? Show me the operator for that.”

“Well I want an operator which gives if and are perpendicular and gives 0 if they’re parallel. That should be enough to define the operator.” She then wrote it out on the stone floor.

“That should be enough to define how the operator works,” she said. “Now if I just wrote something like

(3.19)

Then this has the desired properties. It just subtracts the parallel part from .”

“Indeed. Suppose that you had several unit vectors , , and so on which are all orthogonal, and you wish to project a vector so that it is perpendicular to all of them?”

“That’s not a difficult generalization at all. That would just be

(3.20)

“Can you use that in the proof that every inner product space has an orthonormal basis?”

“That seems quite feasible now. Given a basis , I just define

(3.21)

and so on. The vectors will form an orthonormal basis. And of course I’ve written normalize to indicate that the vector must be normalized by dividing it by its own norm.”

“Exactly! You have discovered what is called the Gram-Schmidt process. There are some caveats to this proof. Because the proof is recursive, it only proves that a vector space with a countable basis has an orthonormal basis. However, that is good enough for our purposes.”
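The recursion Twilight describes can be tried out numerically. The following is only a minimal sketch, assuming ordinary Euclidean coordinates (so the metric g is the identity) and a finite basis; the names `dot`, `normalize`, and `gram_schmidt` are illustrative and do not come from the lesson.

```python
import math

def dot(u, v):
    """Euclidean inner product (assumes g is the identity matrix)."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Divide a vector by its own norm."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def gram_schmidt(basis):
    """Turn a list of linearly independent vectors into an orthonormal list."""
    ortho = []
    for b in basis:
        # Subtract from b its projection onto every unit vector found so far,
        # which leaves the part of b perpendicular to all of them.
        r = list(b)
        for e in ortho:
            c = dot(e, b)
            r = [ri - c * ei for ri, ei in zip(r, e)]
        ortho.append(normalize(r))
    return ortho

basis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
e = gram_schmidt(basis)
```

Every pair of distinct vectors in `e` should have inner product zero (up to rounding), and every vector should have norm one.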

“So,” said Twilight, “we have finally reached the point where we can do ninth-grade geometry.”

“That is right!” said Luna, not appearing to have caught Twilight’s sarcasm, “and to think all those poor young ponies think they’re doing math every day when they really don’t know what they’re doing at all!”

Twilight sighed to herself. “Well the books I tried to read did say a lot about operators, and scalars, and matrices. So I suppose I must be learning something.”

“Of course you are,” Luna said with a flick of her mane. “But now I must be off. Farewell, Twilight Sparkle.” Luna then bolted aloft.

“Wait!” yelled Twilight. She at least wanted to thank Luna for being such a dedicated teacher and putting up with her complaints. But it was too late. Luna was merely a speck in the sky and she soon disappeared behind the mountain.

On her way back home, Twilight mused. Luna could be a frustrating pony, but Twilight had enjoyed the morning in spite of herself. She had hoped to ask Luna more about how the material they had studied today actually related to physics, but Luna had left so abruptly that she hadn’t had the chance. It seemed like Luna was trying to be friendly with her, but was not comfortable speaking about topics other than math. She wondered if she might be able to do more to bring Luna out of her shell next time.

Day 4 - Rotations

Unit 1: Space

Lesson 1- Algebra

Day 4- Rotations

The next day Twilight found Luna sitting quietly just inside the gate to the Canterlot gardens.

“Good morning, Princess.”

“Good morning, Twilight Sparkle. Today we shall talk more about transformations.”

Twilight grinned. “You mean like how you transformed into Nightmare Moon?”

Luna pouted slightly. “No, foolish Twilight Sparkle! I mean something much more interesting: coordinate transformations!”

Twilight could see that she had said the wrong thing. “I’m sorry, Princess! Just a joke, haha!”

“Hmf! Very well.” Luna began to lead her to the Canterlot wall, where a green pegasus colt with a sponge was cleaning their work from yesterday.

Luna waited for him to clear off enough room on the wall for them to begin work. Then her horn glowed a moment and the pegasus disappeared with a zap.

“Um,” said Twilight. “Where did he go?”

“Oh, I just sent him to somewhere else where there is a mess.”

“I don’t think you should just teleport people away without warning.”

“Why not? He’s my employee, so I can do whatever I want with him! Plus, it’s more efficient that way.”

“But that might terrify him! Don’t you think that could be disconcerting?”

Luna grumbled. “Well maybe. But I didn’t know what to say to him!”

Twilight sighed. “Well maybe you could say, ‘Thank you, and if you wouldn’t mind just find something else to do until we’re done here.’”

Luna shook her head. “No no no! That sounds simply awful!”

“But-”

Definition 4.1 : Coordinate Transform

“No no no, I said. Anyway, to transform the coordinates is to define a new set of coordinates and transform everything so that it is described in the new set of coordinates instead of the old ones. For a vector space, that just means a change of basis. We know that there is one linear map for every basis map and that a map from one basis to another is invertible. Therefore we define a coordinate transform as an invertible operator.”

“Now look, you did it again!” said Twilight, now a little irritated. “You took something that is meaningful in our intuition, in this case a coordinate transform, mentioned some properties that it has, and then redefined it in terms of those properties, thus turning it into something meaningless.”

Luna considered that for a moment. “Philosophically, we start by using our intuition. Intuition has rules that make it work, but they are not conscious. So we have to search for rules that predict our intuition. Once we have those rules, then we rely on the rules, not our intuition.”

“To change coordinates by a transform T, you multiply a vector by the operator, yes?”

“Yes.”

“But what do you do to change the coordinates of a linear map?”

“I’m not sure.”

“Well let us think about it. If $\mathbf{w} = B\mathbf{v}$, then a change of coordinates requires that

$$T\mathbf{w} = B'\,T\mathbf{v} \tag{4.1}$$

where $B'$ is the coordinate transformed version of B. To make this work, we need that $B'T = TB$, or in other words that $B' = TBT^{-1}$. This is how you do a coordinate transform on a linear map. Understand?”
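A quick numerical check of that rule may help. This is only a sketch under assumed values: the 2×2 matrices `T` and `B` and the vector `v` are made up for illustration, and the helper names are hypothetical. It confirms that transforming the output of B gives the same answer as applying the transformed map to the transformed input.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    """Apply a matrix to a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inverse2(T):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = T
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

T = [[2.0, 1.0], [1.0, 1.0]]    # an invertible coordinate transform (assumed example)
B = [[0.0, -1.0], [3.0, 4.0]]   # an arbitrary linear map (assumed example)
v = [5.0, -2.0]

B_new = matmul(matmul(T, B), inverse2(T))   # T B T^-1

lhs = matvec(T, matvec(B, v))        # transform the result of B acting on v
rhs = matvec(B_new, matvec(T, v))    # transformed map acting on transformed v
```

Here `lhs` and `rhs` agree, which is exactly the requirement that led to the rule.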

“I understand, Princess!” Twilight said.

“This is another thing I want you to be able to think about in two different ways. You may have already realized this, but a linear map is a vector. It can be added to other maps and multiplied by scalars, so it must be. An operator B in the vector space V is a vector in the space V×V.”

“What does the cross mean, Princess?”

“That is the Cartesian product. The Cartesian product operates on sets which have similar operations defined on them and it allows those operations to distribute over the product.”

“Ummmm... what?”

“I mean, suppose A and B are both sets with addition defined on them, and suppose A has members $a_i$ and B has members $b_j$. Then $(a_i, b_j) \in A \times B$ for all i and j. You see? Each element of A×B just takes one element from A and one element from B. Addition is then defined by $(a_1, b_1) + (a_2, b_2) = (a_1 + a_2, b_1 + b_2)$.”

“So V×V is another vector space?”

“Correct. Although it’s not the case that the Cartesian product of two things is always the same kind of thing. For example, the Cartesian product of two fields is a ring.”

“Ok!”

“Now look again at the expression $TBT^{-1}$. It looks like a linear map multiplied on the right and left by other linear maps so as to apply a coordinate transform, yes?”

Twilight nodded.

“But look at the following equation. What does it suggest to you?”

$$T(aB + bC)T^{-1} = a\,TBT^{-1} + b\,TCT^{-1} \tag{4.2}$$

“It looks a lot like the rule defining a linear map.”

“Actually, it looks exactly like the rule defining a linear map! This proves that T can be considered a linear operator on V×V, where multiplication by a vector B is defined as a coordinate transform on the operator B. So when you see $TBT^{-1}$, I want you to think, this is a coordinate transform, but it is also the operator T acting on the vector B.”

“Got it, Princess!”

“Now I will show that if V is an inner product space, then V×V is also an inner product space. It’s really pretty clear now we want to be able to write something that is equivalent to . You know how to apply g to C by performing a coordinate change to get . Next we would have to do something like a dot product. We would want simply to sum over all the corresponding components of the matrices for B and C. It turns out you can do that this way:

(4.3)

Definition 4.2 : Trace
where the Tr stands for the trace. That means sum along the diagonal of a matrix, so

$$\mathrm{Tr}(A) = \sum_i A_{ii} \tag{4.4}$$

I suppose we shall prove that expression 4.3 is an inner product. How would you do that?”

Twilight replied, “The expression is linear, so that is one property. The other property is symmetry. To make it easier, I can write expression 4.3 without the gs and just say that we’re in an orthonormal basis.”

Luna nodded. “Good start. Next try abstract index notation.”

Twilight didn’t think she needed a hint but she didn’t say anything.

(4.5)

“Well that’s clearly symmetric,” she said.
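Twilight's simplified version of the operator inner product, written in an orthonormal basis so that the g matrices drop out, is easy to test. The sketch below assumes that simplified form, the trace of the transpose of B times C; the function name `op_inner` and the example matrices are illustrative only.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    """Sum along the diagonal of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def op_inner(B, C):
    """Inner product of two operators in an orthonormal basis: Tr(B^T C)."""
    return trace(matmul(transpose(B), C))

B = [[1, 2], [3, 4]]
C = [[0, -1], [5, 2]]

# The same number falls out of summing products of corresponding components.
componentwise = sum(B[i][j] * C[i][j] for i in range(2) for j in range(2))
```

Swapping the arguments leaves the value unchanged, and the inner product of a nonzero operator with itself is positive, as an inner product requires.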

“Yes, and what is the lesson? There are many more vectors sitting around than just the spatial vectors and you should never listen to any ignoramus who says a vector is something with a magnitude and direction!”

Twilight could not help but roll her eyes.

“Now,” Luna continued, “we shall think about a coordinate transform O that preserves the inner product. Since the inner product defines the concept of an angle between two vectors, this is the same as saying that O must preserve the angle between any two vectors it is applied to, right?”

“Yes.”

Definition 4.3 : Orthogonal Transformation

“The O stands for orthogonal transformation, which is what this is. That means that , or that O and g commute with one another. These transforms are very special! Orthogonal transformations (and their generalizations) form the basis for all physics! So pay attention!” Luna stomped loudly.

Twilight started at the sharp sound of Luna’s palladium shoes against the stone. “I’m listening!”

“Start with a general inner product expression, and then perform a coordinate transform.

(4.6)

Since g commutes with O, it follows that

(4.7)

This uses the fact that g is invertible.”

Theorem 4.1

“So,” said Twilight, “a transformation that preserves the inner product is the inverse of its transpose. Or, in other words, to invert such an operator, merely take the transpose.”

“Exactly! It is also the case that the product of orthogonal transformations is an orthogonal transformation.

Proposition 4.2

$$(O_1 O_2)^T (O_1 O_2) = O_2^T O_1^T O_1 O_2 = O_2^T O_2 = 1 \tag{4.8}$$

Since 1 is an orthogonal transformation, this makes orthogonal transformations a group, something you may recall we briefly mentioned on day 1. We shall have more to say about that later!”

Luna sniffed a little. “Do you remember yesterday when we came up with a formula for a projection operator perpendicular to a unit vector? It was $1 - \mathbf{u} \otimes \mathbf{u}$.”

Twilight nodded.

“You must imagine a similar operator which performs a flip, which we will write as . It should invert vectors parallel to and leave other vectors the same.”

“Yes, Princess! I can do that! See, here it is already.”

Definition 4.4 : Vector Flip

“Yes! That’s a kind of flip. You just go twice as far as the projection and you end up with an inversion. This kind of flip I’ll call a vector flip because it’s defined as a flip over a single vector.

Definition 4.5 : Flip

A more general idea of a flip is an orthogonal transformation that is its own inverse. It is easy to see that has this property. The flip inverts one component of the vector, so the same flip again will revert it to how it was. It also has a symmetric matrix, so it is an orthogonal transformation. Can you think of any other operations with this property?”

“A 180 degree rotation?”

“That is true! Of course, a 180 degree rotation is the product of two perpendicular flips, correct?”

“Is it, Princess?”

“Yes! Give it a try with this picture. First flip across (1,0), and then flip across (0,1). See?”

Twilight played with the picture. “I understand the example, Princess, but is this true in general?”

“What do you think, Twilight Sparkle?”

“Come to think of it, that reminds me of a question I had yesterday. Each of the two flips picks out a particular vector, but the final result, the 180° rotation, does not. The only thing that is specially picked out is the plane of rotation. So as long as both flips were in the same plane and were perpendicular to one another, the final result would have to be the same 180° rotation. We can have the flips aligned to the axes without limiting our general knowledge of the resulting rotation.”

“Yes! Exactly that! You are really learning now, Twilight Sparkle. That was a very good symmetry argument.”
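Twilight's symmetry argument can also be checked by brute force. The sketch below assumes the vector flip has the form of the identity minus twice the outer product of a unit vector with itself, in ordinary planar coordinates; `flip` and `perpendicular_flips` are hypothetical helper names.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def flip(u):
    """Matrix of (identity - 2 * outer product of u with itself) for a unit vector u."""
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def perpendicular_flips(angle):
    """Flip over a unit vector at the given angle, then over the vector perpendicular to it."""
    u = [math.cos(angle), math.sin(angle)]
    w = [-math.sin(angle), math.cos(angle)]
    return matmul(flip(w), flip(u))

# Whatever direction the first flip vector points, the product should be
# the same 180 degree rotation, i.e. minus the identity on the plane.
results = [perpendicular_flips(a) for a in (0.0, 0.3, 1.0, 2.5)]
```

Each entry of `results` comes out as minus the identity up to rounding, independent of the angle chosen.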

Twilight blushed a little at the praise.

Luna continued. “I will call operators like a one-dimensional flip because it only flips one vector. A 180° rotation is a two-dimensional flip. You could have flips for any number of dimensions. Are there any other possibilities for one-dimensional flips other than ?”

“The identity operator could be considered a one-dimensional flip.”

“Right. There are no other possibilities for a one-dimensional flip,” Luna continued. “This can be seen by thinking of flips as operators acting on a given one-dimensional subspace. A one-dimensional real vector space is just the real numbers, and you know that -1 and 1 are the only numbers which are their own inverses.

Thinking about flips more generally, suppose that there were a general flip f, and let’s say that . Then since by definition , it must be that . Now therefore

(4.9)

There are two possible ways to interpret this. If and are linearly independent, then f acts on the space they span as . On the other hand, if and are not linearly independent, then . In these cases, f acts on the one-dimensional subspace spanned by as either the identity operator or . What fact have we proved here?”

“Oh, oh! I know, Princess! You’ve proved that every flip operator can have a vector-flip factored out of it!”

“Why is that important?”

“Because that observation can be applied recursively on f! Every flip f can be written , where is some vector on which f flips and is a flip that acts on the subspace perpendicular to . So you can write any flip as a product of perpendicular vector flips!”

“Right! Of course that only applies to countable dimensional inner product spaces.”

Twilight nodded eagerly. The proof had really made her excited because she had actually seen where it was going before Luna had finished. She was getting smarter! The technique would never have occurred to her just a few days ago.

“And since each one dimensional flip means either multiplication by 1 or -1,” continued Luna, “you can just think of a flip in terms of two perpendicular subspaces: those which are multiplied by 1 and those which are multiplied by -1.”

Twilight nodded.

“Now let’s think about one-dimensional flips that are not orthogonal. We shall first flip about and then about .” Luna wrote on the board.

(4.10)

“Notice what happens when you multiply two outer products together. Verify that step using abstract index notation!” Luna said curtly.

“Very well, Princess.

(4.11)

“No problem! Ehehehe!” Twilight laughed nervously.

“Now clearly, if the two flip vectors were orthogonal, then the second term in formula 4.10 would be zero. This is an important sort of object, so let us decide what it is.”

“Yes, let’s! I want to know what it is, Princess!”

“We shall! Since only two vectors define the operation, we should only need to think about the plane defined by the two vectors and ignore any perpendicular vectors, correct?”

Twilight nodded. “It’s easy to see that if you multiply by a vector that is perpendicular to both and , then the vector will not be altered.”

“So we only need to think about two dimensions to understand what the sequential flip does. This diagram shows the result of flipping about . The purple vectors are and . The pink vectors show the result of performing the two flips on .”

“It’s a rotation like before, only now we can make any angle!” observed Twilight.

Definition 4.6 : Planar Rotation

“Quite so. Two nonorthogonal vector flips produce a rotation in the plane spanned by the two flips. I will define a planar rotation as a product of two vector flips. It must be an orthogonal transformation because it is a product of orthogonal transformations, but this is also easy to derive.”
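The claim is easy to probe numerically in two dimensions. This is a sketch only, assuming the vector flip is the identity minus twice the outer product of a unit vector with itself; the helper names and the sample angles are made up. It also shows the angle doubling: flip vectors separated by an angle phi give a rotation by 2·phi.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def flip(u):
    """Matrix of (identity - 2 * outer product of u with itself) for a unit vector u."""
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def rotation(theta):
    """Standard counterclockwise rotation matrix in the plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def two_flips(alpha, phi):
    """Flip over a unit vector at angle alpha, then over one at angle alpha + phi."""
    u = [math.cos(alpha), math.sin(alpha)]
    v = [math.cos(alpha + phi), math.sin(alpha + phi)]
    return matmul(flip(v), flip(u))

phi = 0.3
R_a = two_flips(0.0, phi)   # first flip vector along the x axis
R_b = two_flips(0.7, phi)   # same separation, different starting direction
target = rotation(2 * phi)
```

Both `R_a` and `R_b` match `target`, so only the angle between the flip vectors matters, not where they point within the plane.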

On cue, Twilight was already writing on the wall.

Proposition 4.3

(4.12)

“A seemingly more difficult problem is what you get when you multiply two rotations. Consider two rotations and . Now clearly if the plane spanned by and is perpendicular to the plane spanned by and (by which I mean every vector in the one is perpendicular to every vector in the other), then clearly and commute with one another, so there is no way to simplify that case further.”

“Wait a minute,” said Twilight. “How is it possible for every vector in one plane to be perpendicular to every vector in another plane? I can’t imagine how that’s possible.”

“Sometimes it is better not to try to imagine things. That can become unnecessarily confusing.”

“But they are both subspaces, right? So they would both have to include the zero vector, right? But they could not have any other vectors in common.”

“Yes, but that is alright because the zero vector is perpendicular to itself. But no, they can have no other vectors in common, since any other vector would not be perpendicular to itself.”

“But my question is, how is that possible? I can’t imagine two planes that intersect on only one point!”

“I see what you are asking, Twilight Sparkle! Two planes must intersect on a line, right?”

“Yes.”

“Wrong! That is only in three dimensions.”

“But I thought today we were doing geometry. How can there be a plane in four dimensions?”

“Just as easily as in three dimensions. Maybe even easier! In four or more dimensions, planes can intersect at a point.”

“I just don’t see how that’s possible.”

“You are trying to visualize it. Well you cannot. Sadly, your brain is not built for that, Twilight Sparkle! That is just why you must not take your intuition too seriously. This is a case where it is wrong.

Now, what if and take place in the same plane? A rotation in two given dimensions is defined by an angle, and two rotations in the same plane should result in a rotation whose angle is the sum of their angles. That should be clear if you imagine it, but let us prove it more formally now. Do you have an idea how you would do that?”

“Ummmm, ok. Oh, oh! I know the answer! If all four vectors are in the same plane, I can write . Then, I can do something like

(4.13)

“You have got the right idea,” said Luna. “You may fill the rest in later if you want. The important bit is this: two rotations in a plane commute and a planar rotation is invariant under planar rotations in its own plane. This is exactly what you learned earlier when you were flipping that picture.

Finally we are ready for the case in which the two rotations are neither perpendicular nor parallel! Here’s the trick. If the two planes span a 3-dimensional space, they must have a single 1-dimensional subspace in common, correct?”

“Yes,” said Twilight.

“Now,” continued Luna, “Because you can rotate a planar rotation in its own plane without changing the operator at all, or in other words, because a planar rotation depends on a plane and an angle, but not on any specific vector, we may now consider that and have been rotated such that is a vector in that shared subspace. That is what I have done in this diagram.


You see? First rotate from to and then from to . Now find out what happens.”

Twilight wrote on the board.

Proposition 4.4

(4.14)

“Why,” she said, “the result is another planar rotation!”

“That is right! But remember that the angles from to and to are only half the angles of the rotations. It is a different way to think about rotations than you might be used to.

We know now, though, that products of vector flips are always either general flips, products of planar rotations, or products of planar rotations and a flip. We are not yet in a position to prove this, but this also exhausts all the orthogonal transformations.”

“That formula 4.10 is a little bit inconvenient,” said Twilight. “I mean since the angle of the rotation is twice the angle between the two vectors. Isn’t there a formula for a rotation in terms of two vectors that rotates one vector directly into another?”

Definition 4.7 : Symmetric operator, asymmetric operator

“Yes,” said Luna. “We suppose we could! That might even be a good lesson. Well the first idea to introduce is that every operator can be split into symmetric and asymmetric parts. This is very easy.

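$$A = \tfrac{1}{2}(A + A^T) + \tfrac{1}{2}(A - A^T) \tag{4.15}$$

The first part is the symmetric part and the second is the asymmetric part. You can see this because if you take the transpose of the symmetric part, it comes out the same, and if you take the transpose of the asymmetric part, it comes out opposite.

$$(A + A^T)^T = A + A^T, \qquad (A - A^T)^T = -(A - A^T) \tag{4.16}$$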

Symmetric and asymmetric operators have the interesting property that each retains its symmetry or asymmetry under orthogonal transformations. You can see this pretty simply with the orthogonal transformation O.

(4.17)

You see? The symmetric part stays symmetric and the asymmetric part stays asymmetric!”
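A small numerical experiment illustrates both steps: splitting an operator into its two parts, and watching each part keep its character under an orthogonal transformation. This is a sketch with assumed example values; it takes the transformed operator to be O A Oᵀ, and the names `split` and `conjugate` are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def split(A):
    """Return the symmetric and asymmetric parts of A, which sum back to A."""
    n = len(A)
    sym = [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]
    asym = [[(A[i][j] - A[j][i]) / 2 for j in range(n)] for i in range(n)]
    return sym, asym

def conjugate(O, A):
    """Transform the operator A by the orthogonal matrix O: O A O^T."""
    return matmul(matmul(O, A), transpose(O))

A = [[1.0, 2.0], [3.0, 4.0]]
sym, asym = split(A)

theta = 0.4
O = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]   # a planar rotation, hence orthogonal

sym_t = conjugate(O, sym)
asym_t = conjugate(O, asym)
```

After the transformation, `sym_t` still equals its own transpose and `asym_t` still equals minus its own transpose, up to rounding.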

Twilight nodded.

“If you were a mathematician, you would say that this shows symmetric and asymmetric operators each form a representation of the group of orthogonal matrices.”

“Ok...”

“Some day you’ll think that’s terribly profound. Now I want you to separate equation 4.10 into symmetric and asymmetric parts.”

Twilight wrote on the board.

(4.18)

“The first two terms are the symmetric part, and the last is the asymmetric part,” she said.

“Let us say this represents a rotation about the angle θ. And let us define a new vector which differs from by the angle θ. The problem is now to somehow rewrite expression in terms of and .

Look at the asymmetric term first. We know that is equal to cos(θ/2). Now look at the operator . This is actually a great opportunity to think of operators as vectors. What is the length of that operator, conceived as a vector?”

Twilight wrote on the wall.

“The trace of an outer product of vectors is their inner product,” she said. “And there will be a factor of 2 in there which I can take out of the square root. So therefore,

“So!” said Luna, “just ignore the factor of for now. You can think of the asymmetric term as being like . Now remember the trig identity

If sin(θ/2) is a lot like , then we should expect that sin(θ) should be just like . That means we can replace the term with .

And what about the other term? Let me just suggest something to get you thinking a little more. A rotation matrix in two dimensions looks like this.

You can verify that that works when you get home. So we’re trying to find something that looks kind of like that, only it can work along some arbitrary two-dimensional subspace of some higher-dimensional space. Make sense?”

“I guess so.”

“Now we already found a part that looks like the sines in that matrix. We need to find something that works like the cosines, right?”

“Ok...”

“So let us think of the second term in expression 4.18 as being like 1-cos(θ). That way, when it adds to the identity at the left side of 4.18, it might work out to be a nice cos(θ). Now recall the trig identity

So the second term-the really complicated one-we can think of as being both like 1-cos(θ) and also like . But that means we could also think of a similar expression as being something like . What could we do to it to turn it into something more like 1-cos(θ), except written in terms of and ?”

Twilight said, “Err... this isn’t making a lot of sense. I mean how-”

“Twilight Sparkle, you must have faith in your teacher! Now follow my lead,” Luna said sweetly.

Twilight creased her brow and sighed a little. “Alright, Princess. Well if that term is like then I could multiply by because the denominator is equal to 1-cos(θ) and the numerator is equal to .”

“That’s right! Exactly what I was thinking. So now let us write the new expression

(4.19)

and there we have it!”

“Really?” said Twilight. “I don’t think you’ve proved that at all!”

“You can prove it is right by confirming that it works the same as , but that will not be a terribly enlightening exercise.”

Twilight nodded. “Er, ok. It was a very strange process we used to arrive at 4.19 and it’s actually not a very nice-looking formula in comparison to the other one. Maybe it makes more sense to think of rotations in terms of half the rotation angle, since they multiply so nicely that way.”

“Very good! But deriving the new formula was a good exercise because you learned a valuable lesson. I did not actually derive that expression. I simply made a bunch of extremely suspicious leaps of fancy and constructed a formula based on that. Then I verified that it was the correct formula.”

“What exactly is the lesson?”

“That you can do that.”

“Ohhhh.”

“Does this equation look scary to you?”

“A little, Princess.”

“But not terrifying.”

“No.”

“Why not?”

“Because-I understand what the equation means and what it’s for.”

“Right. The fact that it looks complicated is incidental for you now. It’s a tool that you know how to use, and I gave you some analogies to help make it seem familiar. A bit of mathematical mythology.”

“Yes, you did, Princess. Thank you!”

“That’s enough for this morning, I think,” said Luna after a slightly awkward pause. “Therefore, I will see you tomorrow.” Her horn began to glow as she summoned up a black hole.

“Wait!” said Twilight.

“What is it?” asked Luna.

“Well I was wondering if we could talk about something else for a moment.”

“Why, what else would we ever wish to talk about?”

“I was just wondering how things were going with you.”

“Going?” Luna seemed confused for a moment. “Well, I just started reading this book on algebraic geometry...”

“No!” yelped Twilight before catching herself and calming down. “No... I mean, how are you feeling? Did you have a good night and morning?”

Luna looked confused for a minute. “Good morning... good night... Hmm. Yes. Yes, I think I did. Well! Thank you, Twilight Sparkle! I’m glad we had this talk!”

“No! That wasn’t-” but Luna had already disappeared in a flash of gravitons.

“... a real talk.” Twilight stomped her hoof. “She is just impossible!”

Day 5 - Polynomials

Unit 1: Space

Lesson 1- Algebra

Day 5- Polynomials

The next day, Twilight went looking for Luna in the Canterlot gardens.

“Princess? Are you here?”

“Under here, disciple!”

Twilight found Luna in repose under an enormous palm frond, apparently enjoying the scents of nature. Little flowers grew all around her. Light filtered in between the leaves of the frond but it was still dim. It was just like a little tent. Time stood still inside.

“Good morning,” said Twilight as she crouched under the palm frond. “This is a beautiful spot.”

“Yes. This is our secret hideout. Celestia does not know about it,” said Luna. “Today we must do a little detour.”

“I think a detour would require us to be actually on track in the first place,” said Twilight.

“Why, whatever do you mean?”

“I mean, we haven’t done any actual physics yet. We’ve just been wandering aimlessly through arcane mathematics.”

“Yesterday we did rotations, and those are the most basic, most important concept in geometry, which is the first principle of physics. You should not complain. If you jump into physics too quickly, you will learn nothing deeply. You must trust your tutor to lead you along the right track.”

“Very well, Princess. I trust you.” Twilight might have argued a little more, but it was such a beautiful day.

“Very good. Now then.” At that moment Luna conjured a notebook of graph paper. Her horn glowed and symbols began appearing on the paper.

(5.1)

“That’s a polynomial,” she said.

“I know,” said Twilight.

“Do you know what it means to factor a polynomial?” asked Luna.

Twilight tried to copy Luna’s spell. Soon her own symbols began appearing on the graph paper. “That means you write a polynomial as a product of linear factors like this

$$p(x) = c \prod_i (x - \lambda_i) \tag{5.2}$$

where c is an overall factor and the $\lambda_i$ are the roots of the polynomial. That’s where p(x)=0.”

Luna gave Twilight a look that seemed like a combination of disgust and pity. “You really do not know this material at all, do you?” she said.

“But I thought-”

“I strongly doubt you were, Twilight Sparkle! For one thing, is the polynomial defined over the real numbers or the complex numbers, or something else entirely?”

“Well I didn’t-”

“And is the polynomial finite or infinite?”

“I didn’t realize-”

Theorem 5.1

“Indeed. The correct answer is to say that a finite polynomial can be factored this way if it is over an algebraically closed field:

(5.3)

Twilight frowned in frustration. She could never win with Luna. “What, pray tell, is an algebraically closed field?”

“An algebraically closed field means that every non-constant finite polynomial over the field has a root in the field. Not all polynomials in every field can be factored. For example, just think of the polynomial

$$p(x) = x^2 + 1 \tag{5.4}$$

“Its solutions are ⅈ and -ⅈ,” said Twilight.

Luna shook her head. “No no no. Do not skip ahead! We have not said yet what field this polynomial is defined over.”

“Well what is it then?” said Twilight impatiently.

Theorem 5.2

“If this polynomial were defined over the complex numbers, then yes, its solutions would be ⅈ and -ⅈ. If it were defined over the real numbers, it would have no solutions. The fundamental theorem of algebra says that the complex numbers are, in fact, an algebraically closed field. We may get to a proof of it eventually, but for now, you should just trust me on it.”

Twilight nodded.

“The first thing to prove about factoring polynomials is to relate a root to a factor. Suppose a polynomial p(x) has a linear factor x-λ, or, in other words,

$$p(x) = (x - \lambda)\,q(x) \tag{5.5}$$

Clearly λ is a root of p(x).”

Lemma 5.3

Luna continued, “But what about the converse? Suppose λ is a root of p(x), or, in other words and by definition, p(λ)=0. Can a linear polynomial always be factored out? To prove this, we need a tedious little algebraic lemma.” Luna wrote

$$x^n - \lambda^n = (x - \lambda)\sum_{k=0}^{n-1} x^{n-1-k}\lambda^k \tag{5.6}$$

“To prove this, write it as a summand and then expand like so:”

(5.7)

“The next part is some annoying summand algebra.” Luna wrote

(5.8)

“To do the next step, notice that the second part of every term in this summand is the opposite of the first part of the following term. This means that the entire summand cancels out except for the first part of the first term and the second part of the last term. We end up with

(5.9)
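The lemma can be spot-checked with exact integer arithmetic. This is a sketch that assumes the lemma is the familiar identity factoring x − λ out of xⁿ − λⁿ, with the sum written in descending powers of x; the function name `lemma_rhs` is illustrative.

```python
def lemma_rhs(x, lam, n):
    """(x - lam) times the sum of x^(n-1-k) * lam^k for k = 0 .. n-1."""
    return (x - lam) * sum(x ** (n - 1 - k) * lam ** k for k in range(n))

# Compare against x^n - lam^n over a small grid of integer values.
mismatches = [
    (x, lam, n)
    for x in range(-3, 4)
    for lam in range(-3, 4)
    for n in range(1, 7)
    if x ** n - lam ** n != lemma_rhs(x, lam, n)
]
```

The list `mismatches` stays empty, since every cross term in the expanded sum cancels against its neighbour and only the two end terms survive.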

Twilight nodded, trying to follow along.

Proposition 5.4

“Now that the lemma is done, we can get on to the main result. Let it be given that λ is a root of p(x), which means that p(λ)=0. Then we can write

(5.10)

Lemma 5.3 says that a linear polynomial can be factored out of each of these terms. We can write expression 5.10 as

(5.11)

In other words, if a polynomial has a root, then a linear polynomial can be factored out of it. The details of the remaining polynomial do not matter, other than to note that its degree must be less than that of p(x). The reason this is important is that proposition 5.4 can be applied recursively, and we can know that the resulting polynomial is simpler with each application.
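In practice, factoring a known root out of a polynomial can be done by synthetic division. The sketch below is one common way to do it, not necessarily the construction used in the proof; it assumes coefficients are listed from the highest power down, and `divide_by_root` is a hypothetical name.

```python
def divide_by_root(coeffs, lam):
    """Divide p(x) by (x - lam).

    coeffs lists the coefficients of p from the highest power down.
    Returns (quotient coefficients, remainder); the remainder equals p(lam),
    so it is zero exactly when lam is a root.
    """
    out = []
    acc = 0
    for c in coeffs:
        acc = acc * lam + c
        out.append(acc)
    remainder = out.pop()
    return out, remainder

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
p = [1, -6, 11, -6]
q, r = divide_by_root(p, 2)        # factor out the root 2
q2, r2 = divide_by_root(q, 1)      # apply the step again to the quotient
```

Each application lowers the degree by one, which is why the step can be repeated until only a linear polynomial is left.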

Proposition 5.5

If the polynomial is finite, and if the polynomial is defined over an algebraically closed field, we can continue to factor out linear polynomials until only a linear polynomial is left. The linear polynomial that is left at the end might have an overall scalar factor that can be factored out as well. Hence

(5.12)

“Just what I wrote earlier,” said Twilight with some irritation.

“This is true for every algebraically-closed field. For fields that are not algebraically closed, there is no general rule about how polynomials factor. For example, the rational numbers are not an algebraically closed field, and it is easy to construct polynomials in it that cannot be factored at all. For example, $x^2 - 2$ is a rational polynomial but it has no roots among the rationals.

Theorem 5.6

However, there is a rule for factoring polynomials over the real numbers. The polynomial will factor into something like this:

(5.13)

In this expression, there are two kinds of roots. Real roots and complex pairs of roots and .

Proposition 5.7

If a complex number is a root of a real polynomial, so is its complex conjugate. Suppose a polynomial has real coefficients and it has a root λ. Then

(5.14)

which proves that the complex conjugate of λ is a root as well. If λ is a real number then this tells us nothing, but if it is complex it gives us a totally different complex root!

“And that actually holds for infinite polynomials too,” said Twilight.

“Why yes, I suppose it does! Since the fundamental theorem of algebra proves to us that we can factor polynomials into linear polynomial factors, we can write

(5.15)

But the product of the two factors belonging to a conjugate pair of roots is actually a quadratic real polynomial.
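As a short check of that claim, write the complex conjugate of λ with an overbar (a notation assumed here) and expand the product of the two factors belonging to a conjugate pair:

```latex
\begin{aligned}
(x-\lambda)(x-\bar{\lambda})
  &= x^{2} - (\lambda+\bar{\lambda})\,x + \lambda\bar{\lambda} \\
  &= x^{2} - 2\,\mathrm{Re}(\lambda)\,x + |\lambda|^{2}.
\end{aligned}
```

Both coefficients are real numbers, so the pair contributes a real quadratic factor even though neither linear factor is real on its own.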

Proposition 5.8

Now we can finally get expression 5.13. First, factor out all the real roots. Then factor out the pairs of complex roots as quadratic polynomials. Then factor out any remaining overall scalar factor and you’re done!”

“...And why do I need to know any of this again?”

“You will understand very soon, Twilight Sparkle. Possibly tomorrow. But one thing I can tell you now is that we will find a correspondence between invertible linear operators and polynomials. We will be able to use this to show that every rotation operator should, in some sense, factor like a real polynomial. This will complete the proof that

“Alright Princess. Very well then.”

“Now, what about factoring infinite polynomials? I have already pointed out that the proofs I gave do not work, but do you know of any infinite polynomials which would provide a counterexample?”

“No.”

“I can give you a good one. Consider the infinite polynomial

(5.16)

Does this polynomial have any roots?”

“I don’t know, Princess.”

“No, it does not. You can see this by observing that

(5.17)

“Solve for f and you get f=1/(1-x). Of course that only works when the value of f is finite. That is, when |x|<1. Otherwise, f has no finite value because no term in the series is smaller in magnitude than the previous.”

“Of course,” said Twilight, barely following.

“But you can see that the expression is never zero for |x|<1. Thus, f is either infinite or it is a finite expression that is never zero. That means no roots.”
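The argument can be checked numerically. The sketch below assumes the series in question is the geometric series 1 + x + x² + ..., which is what the surrounding reasoning suggests. The function name `partial_sum` is hypothetical. Inside the unit interval the partial sums settle on 1/(1 - x), which is never zero; outside it they keep growing.

```python
def partial_sum(x, n_terms):
    """Sum of 1 + x + x^2 + ... + x^(n_terms - 1)."""
    total, power = 0.0, 1.0
    for _ in range(n_terms):
        total += power
        power *= x
    return total


for x in (0.5, -0.5, 0.9):
    # The two printed values should agree closely when |x| < 1.
    print(x, partial_sum(x, 500), 1 / (1 - x))

# For x = 2 the partial sums grow without bound instead of converging.
print(partial_sum(2, 10), partial_sum(2, 20))
```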

“Alright then,” said Twilight to sum up. “A polynomial factors differently depending on whether it is finite or infinite and whether it is real or complex.”

“Quite so,” said Luna. “Well that’s the most important part of today’s lesson. I do not think we have enough time to start something new, so perhaps we should conclude with something fun!”

“Perhaps instead I could leave early? I mean there’s really a lot of work I should be...” Twilight paused when she saw Luna’s crestfallen expression. “I mean... I suppose I have time for a bit of fun...”

“Oh excellent. We don’t often have time to do any pure math around here.”

“Isn’t that the only thing we’ve done so far?”

“I have already told you that everything we have ever done will be critical to your understanding of physics. But now I think we will do just a bit of pure mathematics. I’m sure you know the quadratic formula of course, but do you know the formulas to solve cubic and quartic polynomials? I will show you the derivations.”

“Um, yes that sounds like great fun, Princess.”

“I’m glad you think so! Let’s start with a linear polynomial. How would you solve something like this?”

(5.18)

“Um, that one’s kind of obvious, Princess. I think I could do that one when I was a filly. It’s just

Theorem 5.9

(5.19)

“Yes well... we just included that for completeness. Now on to the quadratic polynomial.”

“Also one I already know,” said Twilight.

(5.20)

“Let me show you a trick to simplify first. We shall normalize the polynomial by dividing by c. This removes the factor on x². Then the polynomial becomes

(5.21)

where and . We will need that trick in the rest of the derivations, so get used to it!”

“...Alright,” said Twilight. “Then I just complete the square

(5.22)

Theorem 5.10

and then I can solve for x again.

(5.23)

There are two solutions because the square root could be positive or negative.”

“Quite right, but I must object to your use of the ± symbol. This symbol really only makes sense when you are working with real numbers. A positive real number has two square roots, one positive and one negative, and a negative real number has two square roots, one positive imaginary and one negative imaginary. However, for complex numbers generally, these two roots get mixed up and there isn’t one that you can say is objectively the positive or negative one. It makes more sense to write something like this:”

(5.24)

“What do you mean that the roots get mixed up?”

“Hmm. Imagine for a moment,” said Luna, “a complex number p≡(cos(θ)+ⅈ sin(θ)) |p|, where |p| is a real number. The parameter θ rotates p around a circle centered at zero. What do you think happens to the square roots of p as it rotates around?”

“I don’t know. Maybe there is a way to find an expression for the square root of p?”

“Here’s a hint. Use the half-angle formulas!”

“Oh, I think I see. I can just make the replacements cos(θ)=cos²(θ/2)-sin²(θ/2) and sin(θ)=2 sin(θ/2) cos(θ/2) and I get a nice perfect square.” Twilight wrote

(5.25)

“So,” Luna interrupted, “the square roots of p are

(5.26)

where √|p| is the positive square root of |p|. Since |p| is real, there is, objectively, one positive root so there isn’t a problem in this case. We can get both roots by observing that since θ and θ+2π are equivalent, you could interpret the half angle as either θ/2 itself or as θ/2+π.”

“Or just by writing ±,” grumbled Twilight under her breath.

Twilight continued. “So this expression says that as you rotate p around a circle, the square roots of p rotate at half the rate.”

“Indeed. Good observation, Twilight Sparkle! This means that if p goes all the way around the circle, the square roots of p have only rotated half way around. Thus, what was originally the positive square root of p has become the negative one and vice versa.”
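This swapping of the roots can be watched happening. The following sketch is an illustration under its own assumptions (unit modulus, a hypothetical function name `track_square_root`, and 360 small steps): it follows one square root of p continuously while p makes a full trip around the unit circle, always keeping whichever of the two roots is closer to the previous one.

```python
import cmath
import math


def track_square_root(steps=360):
    """Follow one square root of p = exp(i*theta) as theta runs from 0 to 2*pi."""
    root = 1 + 0j  # start from the positive square root of p = 1
    for k in range(1, steps + 1):
        p = cmath.exp(1j * 2 * math.pi * k / steps)
        candidate = cmath.sqrt(p)
        # Of the two roots +candidate and -candidate, keep the one that
        # continues smoothly from the previous step.
        if abs(candidate - root) > abs(-candidate - root):
            candidate = -candidate
        root = candidate
    return root


print(track_square_root())  # close to -1, although p has returned to 1
```

Going around a second time would bring the tracked root back to where it started.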

“I never noticed that before, Princess!”

“This is why there is no objectively positive or negative square root of a complex number. It is arbitrary to specify one or the other. There is a similar relation for higher roots. For example, there are three cube roots for a complex number, arranged in an equilateral triangle. If you rotate a complex number around the origin, its cube roots rotate a third of the way around. Its four fourth roots rotate one fourth of the way around and so on. ...Oh dear.”

“What is it, Princess?”

“I just realized that you probably will want to know about these properties of complex numbers for doing physics! That means this isn’t a total digression.”

Twilight rolled her eyes a little. “Don’t worry Princess. I think I can handle it.”

“Alright, but some day we must do some really, really pure mathematics. But now I want to show you another trick. Going back to the polynomial

(5.27)

try making the substitution

(5.28)

This gives

(5.29)

Thus, we have eliminated the linear term from the polynomial and are left with something that can easily be solved. You see?

(5.30)

To get the more general solution, simply replace back in the definitions of , and . There is a similar trick for all higher-order polynomials.”

Twilight nodded.

“Next, finally we shall do something new. The cubic equation! I write it already normalized.

(5.31)

Now, with the replacement

(5.32)

we get a simpler cubic with the quadratic term eliminated.

(5.33)

What does this do for us? Look at the condition that is imposed on the solution of this equation by removing the quadratic term. Let us write a cubic polynomial that has been factored into its three roots , , and . Then if we expand that out like this

(5.34)

you can see that if the quadratic term (which I’ve highlighted in pink) is to be zero, the solutions to the polynomial must all sum to zero. What can we do with this knowledge?”
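For reference, the expansion can be sketched as follows, with the three roots called r_1, r_2, and r_3 (names assumed here for illustration):

```latex
\begin{aligned}
(x-r_1)(x-r_2)(x-r_3)
  &= x^{3} - (r_1+r_2+r_3)\,x^{2}
     + (r_1 r_2 + r_1 r_3 + r_2 r_3)\,x - r_1 r_2 r_3 .
\end{aligned}
```

The coefficient of the quadratic term is minus the sum of the roots, so it vanishes exactly when the roots sum to zero.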

“I don’t know. Effectively it means there are only two solutions to find because the third is determined by the others.”

Definition 5.1 : Ansatz

“Right, and there is a clever trick to take advantage of that. Now if you wanted two independent numbers whose sum was 2p, you could introduce a number q and just arbitrarily say that the numbers you wanted are p+q and p-q. You could write every possible pair of numbers this way, so you can always declare that they will come in that form. When you declare that you will write everything in a particular form without loss of generality, this is called an ansatz. The next step in the derivation of the cubic is to come up with the right ansatz.

Now, take the expression ∛p, where p is a complex number. The cube root of p actually stands for three complex numbers which form an equilateral triangle centered at the origin.”

“So they will sum to zero.”
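Twilight's remark is easy to test. In this minimal sketch the helper name `cube_roots` and the sample value of p are assumptions made for illustration; the three roots are built by dividing the angle of p by three and stepping around in thirds of a full turn.

```python
import cmath
import math


def cube_roots(p):
    """Return the three complex cube roots of p."""
    modulus, theta = cmath.polar(p)
    return [cmath.rect(modulus ** (1 / 3), (theta + 2 * math.pi * k) / 3)
            for k in range(3)]


roots = cube_roots(2 + 3j)
print(roots)
print(abs(sum(roots)))  # essentially zero: the triangle is centered at the origin
```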

“Indeed. You are thinking along the right lines. The trick is to write x as ∛p+∛q, the sum of two cube roots, and to show that this is an ansatz for three complex numbers whose sum is zero.”

Newly sensitized to the subtleties of complex roots, Twilight asked, “what does that expression mean exactly? Is each cube root added to each of the others?”

Luna leaned very close to Twilight’s face. “Good question. You are really starting to understand what you are dealing with. But if you added each root of p to each of q, you would have nine numbers, and we only need three. Instead, you only add one cube root of p to a cube root of q and then a different cube root of p to a different cube root of q until you have run out.”

“So we can definitely say that the sum of all the numbers represented by ∛p+∛q is zero, since each set of cube roots sums to zero,” said Twilight. “But how do we choose which ones to add?”

“Don’t worry about that question for now. I will first prove that the ansatz works. Let us say that we want to make the complex numbers u, v, and -u-v out of , and let us write the cube roots of p and q as and , where 0≤i≤2. We want to be able to say that

(5.35)

We know that the cube roots of p and q form equilateral triangles centered at the origin, so we can say that they are numbered either clockwise or counterclockwise. If the labels of both sets of roots go in the same direction, then v is constrained by the value of u. It is just a 120° rotation of u, either clockwise or counterclockwise. If that is the case, then just represents another equilateral triangle centered about the origin, which does not help us one bit. Therefore, we must say that the roots of p and q are labeled in opposite directions. Let us just arbitrarily say that q goes counterclockwise and p goes clockwise. Then we can write

(5.36)

You can think of and (u,v) each as spanning sets of the complex plane viewed as a two-dimensional vector space. The transformation between them is the matrix

(5.37)

This is an invertible matrix, which means that for any and you can always find a u and v and vice versa.”
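The invertibility claim can be checked directly. The sketch below uses assumed notation: ω is the primitive cube root of unity, p_0 and q_0 are the starting cube roots, the roots of p are numbered clockwise and those of q counterclockwise.

```latex
\begin{aligned}
\omega &= e^{2\pi i/3}, \qquad p_k = \bar{\omega}^{\,k} p_0, \qquad q_k = \omega^{k} q_0, \\
\begin{pmatrix} u \\ v \end{pmatrix}
  &= \begin{pmatrix} p_0 + q_0 \\ p_1 + q_1 \end{pmatrix}
   = \begin{pmatrix} 1 & 1 \\ \bar{\omega} & \omega \end{pmatrix}
     \begin{pmatrix} p_0 \\ q_0 \end{pmatrix}, \\
\det \begin{pmatrix} 1 & 1 \\ \bar{\omega} & \omega \end{pmatrix}
  &= \omega - \bar{\omega} = i\sqrt{3} \neq 0, \\
p_2 + q_2 &= \omega p_0 + \bar{\omega} q_0
   = -(1+\bar{\omega})\,p_0 - (1+\omega)\,q_0 = -u - v .
\end{aligned}
```

The determinant is nonzero, so the matrix can be inverted, and the third sum is automatically -u-v because 1+ω+ω̄=0.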

“Thus proving the ansatz!” concluded Twilight, “But which two roots do we add?”

“Actually, it does not matter. Since the cube roots are arbitrary, it does not matter which root of p and q you begin with. You must ultimately get the same answer either way.”

“What is the proof of that?”

“It already is proven because the roots are arbitrary. Necessarily it cannot matter. If you want to convince yourself of that, you can try adding cube roots together on your own time.”

“Very well, Princess.”

“We are almost done with the cubic now,” continued Luna. “We substitute the new expression in for x and get

(5.38)

In order for this to be true, it must be that both terms on the left are zero. Do you see why?”

“Because and are arbitrary. If the two terms had to sum to zero without both being zero individually, this would put a condition on and . We need to be able to write and as separate expressions of p and q.”

“That is correct. So that gives the two equations

(5.39)

which simplify to

(5.40)

or

(5.41)

These are quadratic equations. In fact, they are both the same quadratic equation, but with a different variable. However, if you let p be equal to one root, you find that q must be equal to the other root. So we can then write

(5.42)

as long as you pick the same root in both expressions. Then the solution to the cubic is

(5.43)

where

(5.44)
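The whole recipe can be put together as a short numerical sketch. It assumes the simplified cubic is written as x³ + a·x + b = 0; the coefficient names a and b and the function name `solve_depressed_cubic` are hypothetical and may not match the symbols in the equations above. One cube root u is taken from a root of the quadratic, and its partner v is chosen so that u·v = -a/3, which makes the two sets of cube roots step around in opposite directions.

```python
import cmath


def solve_depressed_cubic(a, b):
    """Return the three roots of x^3 + a*x + b = 0."""
    omega = complex(-0.5, 3 ** 0.5 / 2)  # primitive cube root of unity
    disc = cmath.sqrt(b * b / 4 + a ** 3 / 27)
    # u^3 is a root of t^2 + b*t - a^3/27 = 0; take the one of larger
    # magnitude so the division below is well behaved.
    u_cubed = -b / 2 + disc
    if abs(u_cubed) < abs(-b / 2 - disc):
        u_cubed = -b / 2 - disc
    if u_cubed == 0:
        return [0j, 0j, 0j]  # only happens when a == b == 0
    u = u_cubed ** (1 / 3)  # any one of the three cube roots will do
    roots = []
    for k in range(3):
        u_k = u * omega ** k
        v_k = -a / (3 * u_k)  # partner cube root, rotating the opposite way
        roots.append(u_k + v_k)
    return roots


# Hypothetical example: x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
for r in solve_depressed_cubic(-7, 6):
    print(r)
```

Starting from a different cube root u only changes the order in which the three solutions appear.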

“That was lovely, Princess.”

“Thank you, Twilight Sparkle,” Luna actually smiled at that. “Now on to the quartic equation. The quartic can be simplified by some similar steps to the ones we used with the cubic, so we shall start out with

(5.45)

Next, add to both sides of the equation to complete the square on one side.

(5.46)

That was a good trick, but here is where the real trick comes in. We want to be able to make both sides of the equation into perfect squares. To do this, change the left side of the equation to . If you expand that out, you can see that this is equivalent to adding . So,

(5.47)

If the right side is a perfect square in x, it must be that

(5.48)

or

(5.49)

This is a cubic equation in Ξ! Something you just learned to do. It does not matter which solution you choose because each is able to form a perfect square out of expression 5.47 and each gives the same solutions. The expression then becomes

(5.50)

Here you have to choose which square roots you want to use. You may choose whichever you like and the result will still work, but notice that in expression 5.48 you already made a choice about the products of the roots. So make sure your choice is consistent with that!

Theorem 5.11

Next, take the square root of this expression and get a quadratic in x:

(5.51)

Here we make a choice that is not arbitrary. You could have chosen either square root and each will give different answers. I have added the factor to represent this choice. Any solution you can find when is 1 must also give solutions when is factor that represents the choice you made when you took the square root. The solutions whose solutions are

(5.52)

where is the choice of solution

(5.53)

and Ξ is a solution to the cubic equation given before.”
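The steps above can be sketched numerically. The version below is one common way of organizing the method and makes several assumptions: the simplified quartic is written as x⁴ + a·x² + b·x + c = 0, the unknown m stands for the total quantity added inside the square on the left, and the function names `cubic_roots` and `solve_depressed_quartic` are hypothetical. The symbols may not line up one for one with the expressions in the lesson.

```python
import cmath


def cubic_roots(a2, a1, a0):
    """Roots of t^3 + a2*t^2 + a1*t + a0 = 0, found by the cube-root method."""
    shift = a2 / 3
    p = a1 - a2 * a2 / 3
    q = 2 * a2 ** 3 / 27 - a2 * a1 / 3 + a0
    disc = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    u_cubed = -q / 2 + disc
    if abs(u_cubed) < abs(-q / 2 - disc):
        u_cubed = -q / 2 - disc
    if u_cubed == 0:
        return [-shift] * 3
    u = u_cubed ** (1 / 3)
    omega = complex(-0.5, 3 ** 0.5 / 2)
    return [u * omega ** k - p / (3 * u * omega ** k) - shift for k in range(3)]


def solve_depressed_quartic(a, b, c):
    """Roots of x^4 + a*x^2 + b*x + c = 0."""
    # (x^2 + m)^2 = (2m - a) x^2 - b x + (m^2 - c) is a perfect square in x
    # when 8m^3 - 4a m^2 - 8c m + (4ac - b^2) = 0.
    candidates = cubic_roots(-a / 2, -c, (4 * a * c - b * b) / 8)
    # Any root works in principle; prefer the one that keeps 2m - a away from zero.
    m = max(candidates, key=lambda t: abs(2 * t - a))
    s = cmath.sqrt(2 * m - a)
    if s == 0:
        return [0j] * 4  # only happens when a == b == c == 0
    roots = []
    for sign in (1, -1):  # the non-arbitrary choice of square root
        # x^2 + m = sign * (s*x - b/(2s)) is a quadratic in x.
        lin = -sign * s
        const = m + sign * b / (2 * s)
        d = cmath.sqrt(lin * lin - 4 * const)
        roots += [(-lin + d) / 2, (-lin - d) / 2]
    return roots


# Hypothetical example: x^4 - 5x^2 + 4 = (x^2 - 1)(x^2 - 4)
for r in solve_depressed_quartic(-5, 0, 4):
    print(r)
```

Each choice of sign yields one quadratic, and the two quadratics together supply all four solutions.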

Twilight nodded and wondered how long this progression would continue before she would be set free.

“Now as to the quintic equation-

“Um, Princess. Don’t you think we’re a little overtime already?”

“Yes we are. Too bad, because the quintic equation is where it really starts to get interesting.”

“It does?”

“It turns out that there is no general solution to quintic polynomials that can be written using only sums, products, and roots. To prove that requires going into some very interesting and advanced mathematics.”

“Really?”

“Unfortunately, to understand this requires much more than what we can go into tonight. Or ever!”

“Oh...” said Twilight, half disappointed and half relieved.

“Well. I suppose I could tell you a little bit. Basically, you look at the group of permutations of the roots of the polynomial and then look at its normal subgroups. If you can factor out an Abelian normal subgroup, this corresponds to taking the nth root of an expression. If you can factor the group down to a final Abelian group, then you must be able to solve the corresponding polynomials with nth roots. As it happens, the fifth-order polynomial does not have the right kind of permutation group. Isn’t that wonderful?”

“Uhhh...” said Twilight.

“Now, off with you, disciple!”

Twilight left in a bit of a daze.