Day 19 numpy module\pandas module\matplotlib module


numpy module

The numpy library has two roles:

  1. Unlike the built-in list, it provides array operations and array arithmetic, as well as statistical distributions and simple mathematical models
  2. It is fast, even faster than the simple operations built into Python, which makes it a dependency of pandas, sklearn and other modules. Array operations in advanced frameworks such as TensorFlow and PyTorch are also very similar to numpy's
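
To make the first point concrete, here is a minimal sketch contrasting a built-in list with an ndarray. The variable names are illustrative and not part of the original notes.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# Multiplying a list repeats it; multiplying an array scales each element
list_times_two = py_list * 2   # [1, 2, 3, 1, 2, 3]
arr_times_two = np_arr * 2     # array([2, 4, 6])

# Adding lists concatenates them; adding arrays sums element by element
list_sum = py_list + py_list   # [1, 2, 3, 1, 2, 3]
arr_sum = np_arr + np_arr      # array([2, 4, 6])

print(list_times_two, arr_times_two.tolist())
print(list_sum, arr_sum.tolist())
```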

Create a matrix

A matrix is a numpy ndarray object. To create one, pass a list to the np.array() function.

import numpy as np

# Create a one-dimensional ndarray object
arr = np.array([1, 2, 3])

# Create a two-dimensional ndarray object
print(np.array([[1, 2, 3], [4, 5, 6]]))

# Create a three-dimensional ndarray object
print(np.array([[[ 2.10025514,  0.12015895,  0.61720311],
        [ 0.30017032, -0.35224985, -1.1425182 ],
        [-0.34934272, -0.20889423,  0.58662319]],

       [[ 0.83898341,  0.93110208,  0.28558733],
        [ 0.88514116, -0.75439794,  1.25286816],
        [ 0.51292982, -0.29809284,  0.48851815]],

       [[-0.07557171,  1.13162939,  1.51981682],
        [ 2.18557541, -1.39649634, -1.44411381],
        [-0.50446586,  0.16003707,  0.87616892]]]))

Get the number of rows and columns of a matrix

arr = np.array([[1, 2, 3], [4, 5, 6]])

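# Get the tuple made up of the matrix's row and column counts
print(arr.shape)

# Get the number of rows of the matrix
print(arr.shape[0])

# Get the number of columns of the matrix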
print(arr.shape[1])

Slice a matrix

Slicing a matrix is similar to slicing a list, except that it involves both rows and columns. In both dimensions, indexing starts from 0 and a slice includes its start but excludes its end.

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

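# Get all elements
print(arr[:, :])

# Get all elements of the first row
print(arr[:1, :])
print(arr[0,:])
print(arr[0, [0, 1, 2, 3]])

# Get all elements of the first column
print(arr[:, :1])
print(arr[:,0])
print(arr[(0, 1, 2), 0])

# Get the element in the first row and first column
print(arr[0, 0])

# Get the elements greater than 5; returns a one-dimensional array
print(arr[arr > 5])

# How selecting by operator works: arr > 5 produces a boolean matrix
print(arr > 5)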
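
As a small sketch of the "include the start, exclude the end" rule on both axes, the example below slices a block out of the middle of the same matrix. It also shows one detail the examples above rely on: a slice keeps the dimension, while an integer index drops it. The names `block`, `row_2d` and `row_1d` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Rows 0 and 1 (row 2 is excluded), columns 1 and 2 (column 3 is excluded)
block = arr[0:2, 1:3]
print(block.tolist())   # [[2, 3], [6, 7]]

# A slice keeps the dimension, an integer index drops it
row_2d = arr[:1, :]     # shape (1, 4)
row_1d = arr[0, :]      # shape (4,)
print(row_2d.shape, row_1d.shape)
```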

Matrix element replacement

Replacing matrix elements is similar to replacing list elements. A matrix is also a mutable type, so replacing elements modifies the original matrix. The examples below therefore use the .copy() method and demonstrate replacement on copies.

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Select all elements of the first row and set them to 0
arr1 = arr.copy()
arr1[0, :] = 0

# Select all elements greater than 5 and set them to 0
arr2 = arr.copy()
arr2[arr > 5] = 0

# Set the whole matrix to zero
arr3 = arr.copy()
arr3[:, :] = 0
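
The reason for calling .copy() can be sketched as follows: a basic slice of an ndarray is a view that shares memory with the original, so writing through it changes the original too. This is an illustrative example, not part of the original notes.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# A basic slice is a view: it shares memory with the original matrix
view = arr[0, :]
view[0] = 100
print(arr[0, 0])      # 100, the original changed too

# .copy() allocates new memory, so the original is left alone
copied = arr.copy()
copied[0, 0] = -1
print(arr[0, 0])      # still 100
```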

Merging matrices

arr1 = np.array([[1, 2], [3, 4], [5, 6]])
arr2 = np.array([[7, 8], [9, 10], [11, 12]])

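# Merge the rows of two matrices (join them side by side). With hstack(), the matrices must have the same number of rows; the h in hstack stands for horizontal
print(np.hstack((arr1, arr2)))
# Merge two matrices; axis=1 merges the rows of the two matrices
print(np.concatenate((arr1, arr2), axis=1))

# Merge the columns of two matrices (stack one on top of the other). With vstack(), the matrices must have the same number of columns; the v in vstack stands for vertical
print(np.vstack((arr1, arr2)))

# Merge two matrices; axis=0 merges the columns of the two matrices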
print(np.concatenate((arr1, arr2), axis=0))
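
A quick way to keep the two directions apart is to look at the resulting shapes. The sketch below uses the same two matrices; the result names are illustrative.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4], [5, 6]])      # shape (3, 2)
arr2 = np.array([[7, 8], [9, 10], [11, 12]])   # shape (3, 2)

side_by_side = np.hstack((arr1, arr2))   # columns are appended
stacked = np.vstack((arr1, arr2))        # rows are appended

print(side_by_side.shape)   # (3, 4)
print(stacked.shape)        # (6, 2)

# concatenate with the matching axis gives the same result
same_h = np.array_equal(side_by_side, np.concatenate((arr1, arr2), axis=1))
same_v = np.array_equal(stacked, np.concatenate((arr1, arr2), axis=0))
print(same_h, same_v)       # True True
```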

Create matrices with functions

arange

# Build an ndarray of 0 to 9
print(np.arange(10))

# Build an ndarray of 1 to 4
print(np.arange(1, 5))

# Build an ndarray of 1 to 19 with a step of 2
print(np.arange(1, 20, 2))

linspace/logspace

# Build an arithmetic sequence that includes both endpoints: 5 numbers from 0 to 20
print(np.linspace(0, 20, 5))

# Build a geometric sequence: 5 numbers from 10**0 to 10**20
print(np.logspace(0, 20, 5))
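
The relationship between these functions can be checked directly. In this sketch, logspace is compared with 10 raised to the corresponding linspace values, and arange is shown excluding its stop value; the variable names are illustrative.

```python
import numpy as np

lin = np.linspace(0, 20, 5)
print(lin.tolist())                   # [0.0, 5.0, 10.0, 15.0, 20.0]

# logspace(start, stop, num) is 10 raised to linspace(start, stop, num)
log = np.logspace(0, 20, 5)
matches = np.allclose(log, 10.0 ** lin)
print(matches)                        # True

# arange, by contrast, excludes the stop value
stepped = np.arange(0, 20, 5)
print(stepped.tolist())               # [0, 5, 10, 15]
```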

zeros/ones/eye/empty

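np.zeros((3,4))     # Build a 3*4 matrix of all zeros
np.ones((3,4))      # Build a 3*4 matrix of all ones
np.eye(3)           # Build a 3*3 identity matrix
np.empty((4,4))     # Build a 4*4 uninitialized matrix; its elements are whatever was already in memory, not random numbers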

fromstring/fromfunction


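# Build an ndarray from the ASCII code of each character in a string.
# np.fromstring in binary mode is deprecated, so np.frombuffer is used on the encoded bytes instead
s = 'abcdef'
# np.int8 means each character is read as one 8-bit integer
print(np.frombuffer(s.encode('ascii'), dtype=np.int8))

def func(i, j):
    """i is the row index of the matrix and j is the column index."""
    return i*j

# Compute each element's value from its row and column index (starting from 0), building a 3*4 matrix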
print(np.fromfunction(func, (3, 4)))
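
One detail worth knowing, sketched below: np.fromfunction calls the function once with whole index arrays rather than once per element, and the indices are floats unless dtype is given. The names `result` and `result_int` are illustrative.

```python
import numpy as np

def func(i, j):
    # i and j arrive as whole index arrays, not single numbers
    return i * j

result = np.fromfunction(func, (3, 4))
print(result.dtype)         # float64 by default

# dtype=int passes integer indices instead
result_int = np.fromfunction(func, (3, 4), dtype=int)
print(result_int.tolist())  # [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 4, 6]]
```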

Matrix operations

# Operators: + - * / % **n

arr1 = np.array([[1, 2], [3, 4], [5, 6]])
arr2 = np.array([[7, 8], [9, 10], [11, 12]])

arr1+arr2       # Add the elements at corresponding positions of the two matrices
arr1**2         # Square each element of arr1
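
All of these operators work element by element on matrices of the same shape. A short sketch with the same two matrices, printing the results as nested lists:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4], [5, 6]])
arr2 = np.array([[7, 8], [9, 10], [11, 12]])

added = arr1 + arr2        # element-wise sum
multiplied = arr1 * arr2   # element-wise product, not a matrix product
squared = arr1 ** 2        # each element squared
remainder = arr2 % arr1    # element-wise remainder

print(added.tolist())       # [[8, 10], [12, 14], [16, 18]]
print(multiplied.tolist())  # [[7, 16], [27, 40], [55, 72]]
print(squared.tolist())     # [[1, 4], [9, 16], [25, 36]]
print(remainder.tolist())   # [[0, 0], [0, 2], [1, 0]]
```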

Dot product of matrices

The dot product of two matrices requires the number of columns of the first matrix to equal the number of rows of the second matrix.

arr1 = np.array([[1, 2, 3],  [4, 5, 6]])
arr2 = np.array([[7, 8], [9, 10], [11, 12]])

arr3=arr1.dot(arr2)

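# arr3 is a matrix with two rows and two columns. The dot product multiplies each row of the first matrix
# by each column of the second matrix: the paired elements are multiplied and summed to give the matching element of arr3
# arr3[0,0] = arr1[0,0]*arr2[0,0] + arr1[0,1]*arr2[1,0] + arr1[0,2]*arr2[2,0]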
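
The row-times-column rule can be worked through by hand for one element and compared with the library result. This is an illustrative sketch; `manual` and `same` are names introduced here.

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]])          # shape (2, 3)
arr2 = np.array([[7, 8], [9, 10], [11, 12]])     # shape (3, 2)

arr3 = arr1.dot(arr2)                            # shape (2, 2)

# Element [0, 0]: row 0 of arr1 times column 0 of arr2, summed
manual = sum(int(arr1[0, k]) * int(arr2[k, 0]) for k in range(3))
print(manual)          # 1*7 + 2*9 + 3*11 = 58
print(arr3.tolist())   # [[58, 64], [139, 154]]

# The @ operator gives the same result as .dot() for 2-D arrays
same = np.array_equal(arr3, arr1 @ arr2)
print(same)            # True
```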

Transpose of a matrix

The transpose of a matrix is equivalent to swapping the rows and columns of the matrix.

arr = np.array([[1, 2, 3],  [4, 5, 6]])

print(arr.transpose())
print(arr.T)

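Matrix inverse

Only a square matrix (one with the same number of rows and columns) can be inverted, and its determinant must also be nonzero.

# This matrix has determinant -3, so it is invertible
arr = np.array([[1, 2, 3],  [4, 5, 6], [7, 8, 10]])

np.linalg.inv(arr)      # Inverse of the matrix

# The inverse of the identity matrix is the identity matrix itself
arr1=np.eye(3)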
np.linalg.inv(arr1)
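
A computed inverse can be sanity-checked by multiplying it back against the original matrix. The sketch below does that for a small 2*2 matrix chosen for illustration, and also shows that being square is not enough: a square matrix whose rows are multiples of each other has rank lower than its size and no inverse.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])   # determinant is -2, so invertible
a_inv = np.linalg.inv(a)

# A matrix times its inverse gives the identity (up to rounding error)
is_identity = np.allclose(a @ a_inv, np.eye(2))
print(is_identity)   # True

# Square but singular: the second row is twice the first
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
rank = int(np.linalg.matrix_rank(singular))
print(rank)          # 1, less than 2, so no inverse exists
```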

Other operations on matrices

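Maximum and minimum values

arr = np.array([[1, 2, 3],  [4, 5, 6], [7, 8, 9]])

arr.max()           # Get the maximum of all elements
arr.min()           # Get the minimum of all elements
arr.max(axis=0)     # Get the maximum of each column; one value per column
arr.max(axis=1)     # Get the maximum of each row; one value per row
arr.argmax(axis=1)  # Get the column index of the largest element in each row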
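
Because the matrix above is sorted, every row maximum sits in the last column, which hides what the axis argument does. The sketch below uses a shuffled matrix, chosen here for illustration, so the per-column and per-row results differ.

```python
import numpy as np

arr = np.array([[1, 9, 3], [7, 5, 6], [4, 8, 2]])

col_max = arr.max(axis=0)         # one value per column
row_max = arr.max(axis=1)         # one value per row
row_argmax = arr.argmax(axis=1)   # column index of the maximum in each row

print(col_max.tolist())      # [7, 9, 6]
print(row_max.tolist())      # [9, 7, 8]
print(row_argmax.tolist())   # [1, 0, 1]

# Without axis, argmax returns a position in the flattened array;
# np.unravel_index converts it back to (row, column)
flat_pos = int(arr.argmax())
row, col = (int(x) for x in np.unravel_index(flat_pos, arr.shape))
print(flat_pos, (row, col))  # 1 (0, 1)
```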

Average value

arr = np.array([[1, 2, 3],  [4, 5, 6], [7, 8, 9]])

arr.mean()          # Get the average of all elements of the matrix
arr.mean(axis=0)    # Get the average of each column of the matrix
arr.mean(axis=1)    # Get the average of each row of the matrix

pandas module

pandas is the core module for data analysis in Python. It mainly provides five features:

  1. Supports reading and writing files and other sources, including databases, html, json, pickle, csv, sas, stata, hdf, etc.
  2. Supports single-table operations such as adding, deleting, updating and querying, slicing, higher-order functions, grouping and aggregation, and conversion to and from dictionaries and lists
  3. Supports multi-table splicing and merging operations
  4. Supports simple drawing operations
  5. Supports simple statistical analysis operations

Series

A Series can only hold one-dimensional arrays.

import numpy as np
import pandas as pd

arr = np.array([1, 2, 3, 4, np.nan])
s = pd.Series(arr)

# An index is added, and the elements of the one-dimensional array all become float
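
The index and the float element type can be inspected directly. This is a minimal sketch using the same data; the attribute and method calls (`dtype`, `index`, `isna`, `mean`) are standard pandas, while the printed checks are added here for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3, 4, np.nan])
s = pd.Series(arr)

print(s.dtype)               # float64, because NaN is a float
print(list(s.index))         # [0, 1, 2, 3, 4], the default integer index

missing = int(s.isna().sum())
average = float(s.mean())    # NaN is skipped by default
print(missing, average)      # 1 2.5
```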

DataFrame