Basic numerics and plotting
A pylab-style namespace in the notebook
# For interactive plot, please use
# %pylab notebook
%pylab inline
%pylab is deprecated, use %matplotlib inline and import the required libraries.
Populating the interactive namespace from numpy and matplotlib
Commands beginning with % are IPython ‘magic’ commands. This one sets up the matplotlib backend and imports numpy and matplotlib into the global namespace. We will do this in all of our notebooks for simplicity. It is considered bad form because it puts a large number of numpy functions into the global namespace (and, as the message above notes, %pylab is now deprecated), but for exploratory work it is very convenient.
To avoid importing everything into the global namespace, one would use instead
#%matplotlib notebook
import numpy as np
import matplotlib as mpl
#mpl.rcParams['figure.dpi']= 150
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
which will make all of the numpy functions, such as array() and sin(), available as np.array(), np.sin(), and the plotting functions such as plot() and xlabel() as plt.plot() and plt.xlabel().
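As a minimal sketch of the explicit style (the variable names angles and values are ours, chosen only for illustration), every numpy function is reached through the np. prefix:

```python
import numpy as np

# With the explicit import, every numpy name carries the np. prefix.
angles = np.array([0.0, np.pi / 2, np.pi])
values = np.sin(angles)   # element-wise sine of the three angles
print(values)
```

Nothing is added to the global namespace except np itself, so names such as sin or array cannot silently shadow your own variables.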
Numpy Arrays
A numpy array stores a multidimensional grid of values of a single fixed type. The type of the elements is described by a dtype, which is a more refined typing system than Python provides. Arrays are efficient maps from indices (i,j) to values and have minimal memory overhead.
Arrays are mutable: their contents can be changed after they are created. However, the size and dtype of an array cannot be changed efficiently once it is created, because doing so requires a copy.
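A short sketch of this distinction, using the explicit np. namespace (the names a, b and c are illustrative): changing an element happens in place, while changing the dtype or the size hands back a new array.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
a[0] = 10.0                  # contents change in place, same array

b = a.astype(np.int64)       # a different dtype means a new array (a copy)
c = np.append(a, 4.0)        # "growing" also builds a new, longer array

print(a.dtype, b.dtype)
print(a.size, c.size)
# Neither b nor c shares memory with a: both are copies.
print(np.shares_memory(a, b), np.shares_memory(a, c))
```

This is why appending to an array inside a loop is slow: every append copies the whole array.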
Arrays are good for:
Representing matrices and vectors (linear algebra)
Storing grids of numbers (plotting, numerical analysis)
Storing data series (data analysis)
Getting/changing slices (regular subarrays)
Arrays are not good for:
Applications that require growing/shrinking the size.
Heterogeneous objects.
Non-rectangular data.
Arrays are 0-indexed.
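To illustrate 0-based indexing and slices (regular subarrays), here is a small sketch with a hypothetical 3x4 array called grid. Note that basic slices are views onto the original data, so assigning to a slice changes the original array.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns, values 0..11

first_row = grid[0]                  # index 0 is the first row
last_column = grid[:, -1]            # every row, last column
block = grid[1:3, 0:2]               # rows 1 and 2, columns 0 and 1

block[:] = 0                         # a slice is a view: this edits grid too
print(grid)
```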
# A vector is an array with 1 index
a = array([1/sqrt(2), 0, 1/sqrt(2)])
a
array([0.70710678, 0. , 0.70710678])
a.shape
(3,)
a.dtype
dtype('float64')
a.size
3
# We access the elements using [ ]
a[0]
0.7071067811865475
a[0] = a[0]*2
a
array([1.41421356, 0. , 0.70710678])
We create a 2D array (that is, a matrix) by passing the array() function a list of lists of numbers in the right shape.
# A matrix is an array with 2 indices
B = array( [[ 1, 0, 0],
[ 0, 0, 1],
[ 0, 1, 0]] )
B
array([[1, 0, 0],
[0, 0, 1],
[0, 1, 0]])
B.shape
(3, 3)
B.dtype
dtype('int64')
B.size
9
B[0,0]
1
Warning! There is also a type called ‘matrix’ instead of ‘array’ in numpy. It exists specifically for 2-index arrays, but numpy no longer recommends it and may remove it in the future because it leads to bugs. Never use matrix(), only array().
Basic Linear Algebra
There are two basic kinds of multiplication of numpy arrays:
Element-wise multiplication: a*b multiplies arrays of the same shape element by element.
Dot product: a@b forms a dot product of two vectors or a matrix product of two rectangular matrices.
Mathematically, for vectors,
\[ a \cdot b = \sum_i a_i b_i, \]
while for 2D arrays (matrices),
\[ (AB)_{ij} = \sum_k A_{ik} B_{kj}. \]
a = array([1, 1]) / sqrt(2)
b = array([1, -1]) / sqrt(2)
a
array([0.70710678, 0.70710678])
b
array([ 0.70710678, -0.70710678])
a*a
array([0.5, 0.5])
a@a
0.9999999999999998
# Compute the length of a
norm(a)
0.9999999999999999
sqrt(a@a)
0.9999999999999999
a*b
array([ 0.5, -0.5])
a@b
-2.2371143170757382e-17
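To make the two products concrete, the sketch below compares @ with the explicit sums it stands for. The arrays u, v, M and N are made-up examples, and the Python loops are there only to spell out the definition; they are not how one should compute products in practice.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

elementwise = u * v                                      # [4, 10, 18]
dot_manual = sum(u[i] * v[i] for i in range(len(u)))     # 4 + 10 + 18 = 32

M = np.array([[1.0, 2.0], [3.0, 4.0]])
N = np.array([[0.0, 1.0], [1.0, 0.0]])

# Matrix product written out as a sum over the shared index k.
MN_manual = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        for k in range(2):
            MN_manual[i, j] += M[i, k] * N[k, j]

print(u @ v, dot_manual)
print(np.allclose(M @ N, MN_manual))
```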
There are many, many more functions for doing linear algebra operations numerically provided by numpy and scipy. We will use some of them as we go.
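As a small taste, here is a sketch using three functions from np.linalg on the permutation matrix B defined earlier (rebuilt here with floats so the block stands on its own). The right-hand side rhs is an arbitrary example.

```python
import numpy as np

B = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
rhs = np.array([1.0, 2.0, 3.0])

x = np.linalg.solve(B, rhs)           # solve B @ x = rhs
eigenvalues = np.linalg.eigvalsh(B)   # B is symmetric, so eigvalsh applies
det_B = np.linalg.det(B)

print(x)              # B swaps the last two entries, so x is [1, 3, 2]
print(eigenvalues)    # approximately -1, 1, 1 (ascending order)
print(det_B)          # approximately -1
```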
Basic Plotting
The second primary use of numpy arrays is to hold grids of numbers for analyzing and plotting. In this case, we use two long 1D arrays of length N, one holding the x values and one holding the y values of the points to plot.
Let’s plot a sine wave:
# create an equally spaced array of 100 numbers
# from -2pi to 2pi
x = np.linspace(-2*np.pi, 2*np.pi, 100)
# evaluate a function at each point in x and create a
# corresponding array
y = 0.5*np.sin(x)
figure()
plot(x,y)
grid()
xlabel(r'$x$')
ylabel(r'$0.5 \sin(x)$')
show()
A few comments:
sin is a ‘ufunc’: the call sin(x) automatically acts element by element on an array of any shape.
Matplotlib supports \(\LaTeX\)-style mathematical expressions in any text that it renders; just enclose them in $ signs. We use a raw string (r"...") so that the backslashes are passed on to the \(\LaTeX\) interpreter intact rather than being interpreted as escape characters.
The ‘notebook’ backend (enabled by using %pylab notebook instead of %pylab inline at the beginning of the notebook) provides basic interactive plotting inside the notebook. Try it out. Other backends provide somewhat better interaction if you set up Jupyter on your local computer.
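The ufunc behaviour mentioned in the first comment can be checked directly. In this sketch (variable names are illustrative), the same np.sin call accepts a scalar, a vector and a matrix, and the result has the same shape as the input:

```python
import numpy as np

scalar = np.sin(0.0)
vector = np.sin(np.array([0.0, np.pi / 2]))
matrix = np.sin(np.zeros((2, 3)))

print(np.shape(scalar), vector.shape, matrix.shape)
print(isinstance(np.sin, np.ufunc))   # sin really is a ufunc object
```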
Speed
L = range(1000)
%timeit [i**2 for i in L]  # The leading % marks an IPython magic; it is handled by Jupyter, not by Python.
186 µs ± 787 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
a = arange(1000)
%timeit a**2
571 ns ± 2.12 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
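In the run above, the array expression a**2 is a few hundred times faster than the list comprehension (571 ns versus 186 µs). Outside a notebook, where %timeit is not available, a similar comparison can be sketched with the standard-library timeit module. The loop count below is an arbitrary choice, and the timings depend on the machine.

```python
import timeit
import numpy as np

L = range(1000)
a = np.arange(1000)

loops = 1000   # arbitrary number of repetitions for each measurement
t_list = timeit.timeit(lambda: [i**2 for i in L], number=loops) / loops
t_array = timeit.timeit(lambda: a**2, number=loops) / loops

print(f"list comprehension: {t_list * 1e6:.2f} µs per loop")
print(f"numpy array:        {t_array * 1e6:.2f} µs per loop")

# Both approaches compute the same numbers.
same = [i**2 for i in L] == (a**2).tolist()
print(same)
```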
mpl.rcParams['text.usetex'] = True  # Render all plot text with LaTeX (requires a working LaTeX installation).
mpl.rcParams['font.family']='Times New Roman'
figure()
plot(a, a**2, '--')
title(r'This is a test.')
xlabel(r'$a$')
ylabel(r'$a^2$', rotation=0)  # Do not rotate the y axis label.
show()
Some Common Arrays
arange(2,10,2)
array([2, 4, 6, 8])
# compact notation for the previous
# r_ creates a *row* vector with the contents of the slice
# notice the [ ] rather than ( )
r_[2:10:2]
array([2, 4, 6, 8])
linspace(2,10,5)
array([ 2., 4., 6., 8., 10.])
# compact notation for the previous
r_[2:10:5j]
array([ 2., 4., 6., 8., 10.])
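The two forms of r_ can be checked against arange and linspace. This sketch uses the explicit np. namespace; the last line shows one further thing r_ can do, namely join scalars and slices into a single 1D array.

```python
import numpy as np

step_form = np.r_[2:10:2]        # real step: same as np.arange(2, 10, 2)
count_form = np.r_[2:10:5j]      # imaginary "step" 5j: 5 points, endpoint included
joined = np.r_[0, 2:10:2, 100]   # scalars and slices concatenated together

print(np.array_equal(step_form, np.arange(2, 10, 2)))
print(np.allclose(count_form, np.linspace(2, 10, 5)))
print(joined)
```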
ones((2,2))
array([[1., 1.],
[1., 1.]])
zeros((3,1))
array([[0.],
[0.],
[0.]])
eye(3)
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
diag([1,2,3])
array([[1, 0, 0],
[0, 2, 0],
[0, 0, 3]])
np.random.rand(2,2)
array([[0.78137026, 0.16848933],
[0.84995137, 0.75593457]])
np.random.exponential(scale=3,size=(3,3))
array([[ 9.03449174, 12.47286547, 1.66588726],
[ 1.74199391, 3.28102502, 0.21717107],
[ 5.2603578 , 1.60556311, 5.19330279]])
Array DTypes
Numpy has its own set of data types for the elements of arrays. These dtypes are more specific than ‘int’ or ‘str’ so that you can control the number of bytes used.
Integers: int16 (‘i2’), int32 (‘i4’), int64 (‘i8’), …
Unsigned: uint32 (‘u4’), uint64 (‘u8’), …
Float: float16 (‘f2’), float32 (‘f4’), float64 (‘f8’), …
Boolean: bool
Fixed Length Strings: ‘S1’, ‘S2’, …
array([1,0]).dtype
dtype('int64')
array([1.,0]).dtype
dtype('float64')
float32
numpy.float32
dtype('i4')
dtype('int32')
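To see how the dtype controls memory use, this sketch allocates three arrays of 1000 elements (the names x64, x32 and flags are illustrative) and inspects the bytes per element and in total:

```python
import numpy as np

x64 = np.zeros(1000, dtype='f8')    # float64: 8 bytes per element
x32 = np.zeros(1000, dtype='f4')    # float32: 4 bytes per element
flags = np.zeros(1000, dtype=bool)  # bool: 1 byte per element

print(x64.itemsize, x64.nbytes)
print(x32.itemsize, x32.nbytes)
print(flags.itemsize, flags.nbytes)

# The short codes and the full names refer to the same dtypes.
print(np.dtype('i4') == np.int32, np.dtype('f8') == np.float64)
```

Halving the element size halves the memory, at the cost of range and precision.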
Basic Operations and Visualization
x = linspace(-pi, pi, 100)
y1 = sin(x)
y2 = exp(x/pi)
figure()
plot(x,y1, label=r'$\sin(x)$')
plot(x,y2, label=r'$e^{x/\pi}$')
legend(loc='upper left')
show()
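For comparison, here is a sketch of the same figure written without %pylab, using the explicit np and plt namespaces and matplotlib's figure and axes objects. The x axis label is our addition.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-np.pi, np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label=r'$\sin(x)$')
ax.plot(x, np.exp(x / np.pi), label=r'$e^{x/\pi}$')
ax.set_xlabel(r'$x$')            # label added for this sketch
ax.legend(loc='upper left')
```

In a notebook with the inline backend the figure is displayed when the cell finishes; when running as a script, call plt.show() at the end.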