## Prerequisite Skill sets for Artificial Intelligence Development. PART I

There is hardly any theory which is more elementary than linear algebra, inspite of the fact that generations of

Data scientist and  Mathematicians have obscured its simplicity by preposterous calculations with matrices

-Jean Dieudonnie French Mathematician

Mathematics requires a small dose, not of genius but of an imaginative freedom which in a larger dose would be insanity

– August K Rodgers

It’ll be about linear algebra, which as a lot of you know is one of these subjects that’s required knowledge

for just about any technical education, but it’s also I have noticed generally poorly understood by IT professionals

taking it for the first time. An Data Scientist might go through a class and learn how to compute lost of things, like determinant,

matrices multiplication or cross product which use the determinant or eigenvalues, but they might come out

without really understanding why matrix multiplication is defined the way that it is, why cross product has

anything to do with the determinant or what an eigenvalues really represent.

In this blog I have tried to help all those data scientist to re look of Linear algebra in altogether in a different

perspective in terms of Analytical Geometry or 3 Dimensional Spatial Geometry and help them understand how

this can be easily put to use in Computer vision , Image processing, Speech recognition, and Robotics.

Fields and Vector Spaces As has been indicated in the preface, our first aim is to re-work

the linear algebra presented in our AAS course, generalising from R to an arbitrary field as domain

from which coefficients are to be taken. Fields are treated in the companion course, Rings and

Arithmetic, but to make this course and these notes self-contained we begin with a definition of

what we mean by a field.

#### Axioms for Fields

A field is a set F with distinguished elements 0, 1, with a unary operation −, and with two binary operations +

and × satisfying the axioms below (the ‘axioms of arithmetic’).

Conventionally, for the image of (a, b) under the function + : F × F → F we write a + b; for the image of (a, b)

under the function × : F × F → F we write a b; and x − y means x + (−y), x + y z means x + (y z).

The axioms of arithmetic

(1) a + (b + c) = (a + b) + c        [+ is associative]

(2) a + b = b + a                   [+ is commutative]

(3) a +0= a

(4) a + (−a) = 0

(5) a (b c) = (a b) c                 [× is associative]

(6) ab = ba                               [× is commutative]

(7) a 1 = a

(8) a 6 = 0 ⇒ ∃ x ∈ F : a x = 1

(9) a (b + c) = a b + a c                            [× distributes over +]

(10) 0 ≠ 1

Note 1: All axioms are understood to have ∀ a, b, . . . ∈ F in front.

Note 2: See my Notes on Rings and Arithmetic for more discussion.

Note 3: Examples are Q, R, C, Z 2 or more generally Z p , the field of integers modulo p for any prime number p.

## Vector Spaces

Let F be a field. A vector space over F is a set V with distinguished element 0, with a unary operation −,

with a binary operation +, and with a function × : F × V → V satisfying the axioms below.

Conventionally: for the image of (a, b) under the function + : V × V → V

we write a + b; for a + (−b) we write a − b; and for the image of (α, v)

under the function × : F × V → V we write α v .

## Vector space axioms

Note: All axioms are understood to have appropriate quantifiers ∀ u, v, . . . ∈ V
and/or ∀ α, β, . . . ∈ F in front.

### EXAMPLES:

1. the polynomial ring F [x]  is a vector space over F ;
2. if K is a field and F a sub-field then K is a vector space over F ;

Exercise 1. Let X be any set, F any field. Define

Exactly as for rings, or for vector spaces over R, one can prove important “trivialities”.

Of course, if we could not prove them then we would add further axioms until we

had captured all properties that we require, or at least expect, of the algebra of vectors.

But the fact is that this set of axioms, feeble though it seems, is enough. For example:

PROPOSITION: Let V be a vector space over the field F . For any v ∈ V and any α ∈ F we have

Proof. For v ∈ V , from Field Axiom (3) and Vector Space Axiom (7) we have
0 v = (0 + 0) v = 0 v + 0 v.

Then adding −(0 v) to both sides of this equation and using Vector Space Axioms (4)
on the left and (1), (4), (3) on the right, we get that 0 = 0 v , as required for (i). The
reader is invited to give a proof of (ii).

For (iii), suppose that α v = 0 and α 6 = 0: our task is to show that v = 0. By Field
Axioms (8) and (6) there exists β ∈ F such that β α = 1. Then

v = 1 v = (β α) v = β (α v) = β 0

by Vector Space Axioms (8) and (5). But β 0 = 0 by (ii), and so v = 0, as required.
Clause (iv), like Clause (ii), is offered as an exercise:

## Subspaces

Let V be a vector space over a field F . A subspace of V is a subset U such that
(1) 0 ∈ U and u + v ∈ U whenever u, v ∈ U
(2) if u ∈ U , α ∈ F then α u ∈ U . and −u ∈ U whenever u ∈ U ;

Note 1: Condition (1) says that U is an additive subgroup of V . Condition (2) is closure under

multiplication by scalars.

## QUOTIENT SPACES

Suppose that U <= V where V is a vector space over a field F . Define the quotient
space V /U as follows:

set := {x + U | x ∈ V }   [additive cosets]

0 := U

additive inverse: −(x + U ) := (−x) + U

addition: (x + U ) + (y + U ) := (x + y) + U

Note: The notion of quotient space is closely analogous with the notion of quotient
of a group by a normal subgroup or of a ring by an ideal. It is not in the Part A syllabus,
nor will it play a large part in this course. Nevertheless, it is an important and useful
construct which is well worth becoming familiar with.

## Dimension

Although technically new, the following ideas and results translate so simply from
the case of vector spaces over R to vector spaces over an arbitrary field F that I propose
simply to list headers for revision:
(1) spanning sets;
linear dependence and independence;
bases;
(2) dimension;

(4) any linearly independent set may be extended (usually in many ways) to a basis;
(5) intersection U ∩ W of subspaces; sum U + W ;

(6) dim (U + W ) = dim U + dim W − dim (U ∩ W );

## LINEAR TRANSFORMATION

T 0 = 0,
T (−x) = −T x,
T (x + y) = T x + T y,
and T (λ x) = λ (T x)
for all x, y ∈ V and all λ ∈ F .

Note 1: The definition is couched in terms which are intended to emphasise that
what should be required of a linear transformation  is
that it preserves all the ingredients, 0, +, − and multiplication by scalars, that go into

linear if and only if T (α x + β y) = α T x + β T y  for all x, y ∈ V and all α, β ∈ F .

Note 2:

For a linear transformation T : V → W we define the kernel or null-space by Ker T := {x ∈ V | Tx = 0}.

## RANK NULLITY THEOREM:

nullity T + rank T = dim V

Note: The Rank-Nullity Theorem is a version of the First Isomorphism Theorem
for vector spaces, which states that

## DIRECT SUMS AND PROJECTION OPERATORS

The vector space V is said to be the direct sum of its subspaces U and W , and we
write V = U ⊕ W , if V = U + W and U ∩ W = {0}.

#### Lemma: Let U , W be subspaces of V . Then V = U ⊕ W if and only if for every v ∈ V there exist unique vectors u ∈ U and w ∈ W such that v = u + w.

Proof. Suppose first that for every v ∈ V there exist unique vectors u ∈ U and

w ∈ W such that v = u + w . Certainly then V = U + W and what we need to prove

is that U ∩ W = {0}. So let x ∈ U ∩ W . Then x = x + 0 with x ∈ U and 0 ∈ W .

Equally, x = 0 + x with 0 ∈ U and x ∈ W . But the expression x = u + w with u ∈ U

and w ∈ W is, by assumption, unique, and it follows that x = 0. Thus U ∩ W = {0}, as required.

Now suppose that V = U ⊕ W . If v ∈ V then since V = U + W there certainly exist vectors u ∈ U

and w ∈ W such that v = u + w . The point at issue therefore is: are u, w uniquely determined by v?

Suppose that u + w = u ′ + w ′ , where u, u ′ ∈ U and w, w ′ ∈ W . Then u − u ′ = w ′ − w .

This vector lies both in U and in W .

By assumption, U ∩ W = {0} and so u − u ′ = w ′ − w = 0. Thus u = u ′ and w = w ′ ,

so the decomposition of a vector v as u + w with u ∈ U and w ∈ W is unique, as required.

Note: What we are discussing is sometimes (but rarely) called the “internal” direct

sum to distinguish it from the natural construction which starts from two vector spaces

the direct product of groups or of rings. This is (equally rarely) called the “external”

then V is naturally isomorphic with the external direct sum of U and W .

We come now to projection operators. Suppose that V = U ⊕ W .

Define P : V → V as follows.

For v ∈ V write v = u + w where u ∈ U and w ∈ W and then define

P v := u. Strictly P depends on the ordered pair (U, W ) of summands of V , but to

keep things simple we will not build this dependence into the notation.

#### OBSERVATIONS:

(1) P is well-defined;
(2) P is linear;
(3) Im P = U , Ker P = W ;

Proofs. That P is well-defined is an immediate consequence of the existence and
uniqueness of the decomposition v = u + w with u ∈ U , v ∈ V .

For (3) it is clear from the definition that Im P ⊆ U ; but if u ∈ U then u = P u,

and therefore Im P = U . Similarly, it is clear that W ⊆ Ker P ;

but if v ∈ Ker P then

v = 0 + w for some w ∈ W , and therefore Ker P = W .

Finally, if v ∈ V and we write v = u + w with u ∈ U and w ∈ W then

#### Terminology: the operator (linear transformation) P is called the projection of V onto U along W.

Note 2. If P is the projection onto U along W then I − P , where I : V → V is the identity transformation,

is the projection onto W along U .

Note 3. If P is the projection onto U along W then u ∈ U if and only if P u = u. The fact that if u ∈ U then

P u = u is immediate from the definition of P , while if P u = u then obviously u ∈ Im P = U .

Our next aim is to characterise projection operators algebraically. It turns out that Observation (4) above is the key:

## THEOREM : A linear transformation is a projection operator if and only if it is idempotent.

Proof: We have seen already that every projection is idempotent, so the problem is to prove that an idempotent

linear transformation is a projection operator. Suppose that P : V → V is linear and idempotent. Define

To finish the proof we need to convince ourselves that P is the projection onto U along W .

For v ∈ V write v = u + w where u ∈ U and w ∈ W . Since u ∈ U there must exist x ∈ V such that
u = P x. Then

and therefore P is indeed the projection onto U along W , as we predicted.

The next result is a theorem which turns out to be very useful in many situations.
It is particularly important in applications of linear algebra in Quantum Mechanics.

## THEOREM :

Proof : Suppose first that P T = T P . If u ∈ U , so that P u = u, then T u = T P u = P T u ∈ U , so T U 6 U .

Now suppose conversely that U and W are T -invariant. Let v ∈ V and write

v = u + w with u ∈ U and w ∈ W . Then

P T v = P T (u + w) = P (T u + T w) = T u ,

since T u ∈ U and T w ∈ W . Also,

T P v = T P (u + w) = T u .

Thus P T v = T P v for all v ∈ V and therefore P T = T P , as asserted.

Note 1: If k = 2 then this reduces to exactly the same concept as we have just been
studying.

Note 2:

satisfy

U 1 ∩ U 2 = U 1 ∩ U 3 = U 2 ∩ U 3 = {0}

and yet it is clearly not true that

${F}^{2}$

is their direct sum.

EXAMPLE : If P is a projection then {P, I − P } is a partition of the identity.

PROOF:

## LINEAR FUNCTIONS AND DUAL SPACES

Note: It is a matter of important routine to check that the vector space axioms are

satisfied. It is also important that, when invited to define the dual space of a vector

space, you specify not only the set, but also the operations which make that set into a vector space.

## THEOREM :

To finish the proof we must show that they form a basis of V ′ .

To see that they are independent suppose that

Let V be a vector space over the field F . For a subset X of V the annihilator is defined by

## Let V be a finite-dimensional vector space over a field F and let U be a subspace. Then

Proof:

To finish this study of dual spaces we examine the second dual, that is, the dual of
the dual. It turns out that if V is a finite-dimensional vector space then the second dual
V ′′ can be naturally identified with V itself.

## QUANTUM COMPUTING BOOSTS DEEP LEARNING TO TOP LEVEL

“This is the beginning of the quantum age of computing and the latest advancement is towards building a universal quantum computer.

“A universal quantum computer, once built, will represent one of the greatest milestones in the history of information technology and has the potential to solve certain problems we couldn’t solve, and will never be able to solve, with today’s classical computers.”

#### quantum computing is likely to have the biggest impact on industries that are data-rich and time-sensitive. Machine Learning is the main application for quantum computing.

a quantum computer has quantum bits or qubits, which work in a particularly intriguing way. Where a bit can store either a zero or a 1, a qubit can store a zero, a one, both zero and one, or an infinite number of values in between—and be in multiple states (store multiple values) at the same time! If that sounds confusing, think back to light being a particle and a wave at the same time, Schrödinger’s cat being alive and dead, or a car being a bicycle and a bus.

A quantum computer exponentially expands the vocabulary of binary code words by using two spooky principles of quantum physics, namely ‘entanglement’ and ‘superposition’. Qubits can store a 0, a 1, or an arbitrary combination of 0 and 1 at the same time. Multiple qubits can be made in superposition states that have strictly no classical analogue, but constitute legitimate digital code in a quantum computer.

In a quantum computer, each quantum bit of information–or qubit–can therefore be unfixed, a mere probability; this in turn means that in some mysterious way, a qubit can have the value of one or zero simultaneously, a phenomenon called superposition.

And just as a quantum computer can store multiple values at once, so it can process them simultaneously. Instead of working in series, it can work in parallel, doing multiple operations at once.

In theory, a quantum computer could solve in less than a minute problems that it would take a classical computer millennia to solve.

To date, most quantum computers have been more or less successful science experiments. None have harnessed more than 12 qubits, and the problems the machines have solved have been trivial. Quantum computers have been complicated, finicky machines, employing delicate lasers, vacuum pumps, and other exotic machinery to shepherd their qubits.

The world’s fastest supercomputer, China’s Sunway TaihuLight, runs at 93 petaflops (93 quadrillion flops, or around 1017) – but it relies on 10 million processing cores and uses massive amounts of energy.

In comparison, even a small 30-qubit universal quantum computer could, theoretically, run at the equivalent of a classical computer operating at 10 teraflops (10 trillion flops, or 1012

##### A QUANTUM COMPUTER IS “NOT JUST A ‘FASTER’ COMPUTER, BUT THEY ARE THE EQUIVALENT OF A JET PLANE TO A BICYCLE.”

IBM says “With a quantum computer built of just 50 qubits, none of today’s TOP500 supercomputers could successfully emulate it, reflecting the tremendous potential of this technology,” .

A screenshot of IBM’s the author attempting to quantum compute. (courtesy : Silicon)

QUANTUM COMPUTING TIMELINE

The significant advance, by a team at the University of New South Wales (UNSW) in Sydney appears today in the international journal Nature.

“What we have is a game changer,” said team leader Andrew Dzurak, Scientia Professor and Director of the Australian National Fabrication Facility at UNSW.

“We’ve demonstrated a two-qubit logic gate – the central building block of a quantum computer – and, significantly, done it in silicon. Because we use essentially the same device technology as existing computer chips, we believe it will be much easier to manufacture a full-scale processor chip than for any of the leading designs, which rely on more exotic technologies.

“This makes the building of a quantum computer much more feasible, since it is based on the same manufacturing technology as today’s computer industry,” he added.

## The benefits of quantum computing

quantum computers could be used for discovering new drugs, securing the internet, modeling the economy, or potentially even building far more powerful artificial intelligence systems—all sorts of exceedingly complicated tasks.

A functional quantum computer will provide much faster computation in a number of key areas, including: searching large databases, solving complicated sets of equations, and modelling atomic systems such as biological molecules and drugs. This means they’ll be enormously useful for finance and healthcare industries, and for government, security and defence organisations.

For example, they could be used to identify and develop new medicines by greatly accelerating the computer-aided design of pharmaceutical compounds (and minimizing lengthy trial and error testing); develop new, lighter and stronger materials spanning consumer electronics to aircraft; and achieve much faster information searching through large databases.

Functional quantum computers will also open the door for new types of computational applications and solutions that are probably too premature to even conceive.

The fusion of quantum computing and machine learning has become a booming research area. Can it possibly live up to its high expectations?

“These advantages that you end up seeing, they’re modest; they’re not exponential, but they are quadratic,” said Nathan Wiebe, a quantum-computing researcher at Microsoft Research. “Given a big enough and fast enough quantum computer, we could revolutionize many areas of machine learning.” And in the course of using the systems, computer scientists might solve the theoretical puzzle of whether they are inherently faster, and for what.

A recent Fortune essay states, “Companies like Microsoft, Google and IBM are making rapid breakthroughs that could make quantum computing viable in less than 10 years.”

Very soon you will find hardware and software with fundamentally new integrated circuits that store and process quantum information. Many important computational problems will only be solved by building quantum computers. Quantum computing has been picking up the momentum, and there are many startups and scholars discussing quantum machine learning. A basic knowledge of quantum two-level computation ought to be acquired.

I am sure that Quantum Information and Quantum Computation will play a crucial role in defining where the course of our technology is headed.

“For quantum computing to take traction and blossom, we must enable the world to use and to learn it,” said Gambetta. “This period is for the world of scientists and industry to focus on getting quantum-ready.”