Pandas study notes (2)

(1) Pandas works with the following three data structures:

Series

DataFrame

Panel

These data structures are built on top of NumPy arrays, which makes them fast. The best way to think about them is that each higher-dimensional structure is a container for the lower-dimensional one. For example, a DataFrame is a container of Series, and a Panel is a container of DataFrames.

Data structure   Dimensions   Description
Series           1            1D labeled homogeneous array; its size is immutable.
DataFrame        2            General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3            General 3D labeled, size-mutable array.
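The container relationship can be checked directly. The following is a minimal sketch, assuming a small made-up DataFrame (the column names "a" and "b" are only for illustration): selecting one column yields a Series, and that Series exposes its values as a NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# Selecting a single column of a DataFrame returns a Series.
col = df["a"]
print(isinstance(col, pd.Series))

# The values of a Series can be obtained as a NumPy array.
values = col.to_numpy()
print(isinstance(values, np.ndarray))

# A Series has one dimension, a DataFrame has two.
print(col.ndim, df.ndim)
```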

(2) Series

A Series is a one-dimensional array structure with homogeneous data. For example, the following Series is a collection of integers: 10, 23, 56, ...

Key points

  • Homogeneous data
  • The size is immutable
  • The data values are mutable
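These points can be tried out on the integers from the example. This is only a small sketch; the index labels "a", "b", "c" are assumptions added for illustration. It shows that the Series has a single dtype, that a value can be overwritten in place, and that an operation such as drop returns a new Series instead of shrinking the original.

```python
import pandas as pd

# The integers from the example above, with made-up labels.
s = pd.Series([10, 23, 56], index=["a", "b", "c"])

# One dtype applies to every element of the Series.
print(s.dtype)

# Values are mutable: overwrite the element labeled "b".
s["b"] = 99
print(s.tolist())  # [10, 99, 56]

# drop() returns a new, shorter Series; the original keeps its length.
shorter = s.drop("a")
print(len(s), len(shorter))  # 3 2
```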

(3) DataFrame

A DataFrame is a two-dimensional array with heterogeneous data. For example:

Name    age         gender   grade
Maxsu   25          Male     4.45
Katie   34          Female   2.78
fault   46          Female   3.9
Lia     (missing)   Female   4.6

The table above represents the data of an organization's sales team together with each member's overall performance rating. The data is arranged in rows and columns: each column represents an attribute, and each row represents a person.

Column data types

The data types of the four columns in the DataFrame above are as follows:

Column   Type
Name     string
age      integer
gender   string
grade    floating point
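The column types can be inspected with the dtypes attribute. The sketch below assumes only the first two rows of the example table (Maxsu and Katie), since those are complete. How string columns are reported depends on the pandas version, so no output is shown for them.

```python
import pandas as pd

# First two rows of the example table; the other rows are left out here.
df = pd.DataFrame({
    "Name": ["Maxsu", "Katie"],
    "age": [25, 34],
    "gender": ["Male", "Female"],
    "grade": [4.45, 2.78],
})

# One dtype per column: an integer dtype for age, a float dtype for grade,
# and a string-like dtype (version dependent) for Name and gender.
print(df.dtypes)

# Two rows (people) and four columns (attributes).
print(df.shape)  # (2, 4)
```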

Key points

  • Heterogeneous data
  • The size is mutable
  • The data is mutable
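Both kinds of mutability can be sketched in a few lines. The "passed" column and the 3.0 threshold below are hypothetical and serve only to show a column being added; the names and grades are the first two rows of the example table.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Maxsu", "Katie"], "grade": [4.45, 2.78]})

# Size is mutable: assigning to a new column name adds a column in place.
# The 3.0 threshold is an assumption made up for this example.
df["passed"] = df["grade"] >= 3.0
print(df.shape)  # (2, 3)
print(df["passed"].tolist())  # [True, False]

# Data is mutable: overwrite a single cell by row label and column name.
df.loc[1, "grade"] = 3.10
print(df.loc[1, "grade"])  # 3.1
```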

(4) Panel

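A Panel is a three-dimensional data structure with heterogeneous data. It is hard to draw a Panel graphically, but it can be described as a container of DataFrames. Note that Panel was deprecated in pandas 0.20 and removed in 0.25, so it is only available in older versions; three-dimensional data is now usually stored in a DataFrame with a MultiIndex.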

Key points

  • Heterogeneous data
  • The size is mutable
  • The data is mutable
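Since recent pandas releases no longer include Panel, the "container of DataFrames" idea can be approximated with a DataFrame that has a two-level index. This is a sketch of that substitute, not of the Panel API itself; the item names and values are made up for illustration.

```python
import pandas as pd

# Two small DataFrames that would have been the items of a Panel.
frames = {
    "item1": pd.DataFrame({"x": [1, 2], "y": [3, 4]}),
    "item2": pd.DataFrame({"x": [5, 6], "y": [7, 8]}),
}

# Stack them into one DataFrame; the dict keys become the outer index level.
stacked = pd.concat(frames, names=["item", "row"])
print(stacked.index.nlevels)  # 2

# Selecting an outer key gives back the inner DataFrame.
print(stacked.loc["item2"])

# A single cell is addressed by (item, row) and column.
print(stacked.loc[("item2", 1), "y"])  # 8
```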