Pandas study notes (2)
(1) Pandas works with the following three data structures
Series
DataFrame
Panel
These data structures are built on top of NumPy arrays, which makes them fast. A helpful way to think about them is that each higher-dimensional structure is a container for the lower-dimensional one: a DataFrame is a container for Series, and a Panel is a container for DataFrames.
Data structure | Dimensions | Description
Series | 1 | 1D labeled homogeneous array; size is immutable.
DataFrame | 2 | General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel | 3 | General 3D labeled, size-mutable array.
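The container idea above can be sketched in a few lines. This is a minimal illustration rather than a definitive recipe; the index labels "a" and "b" and the variable names are assumptions made for the example.

```python
import pandas as pd

# Two Series sharing the same index labels (labels are illustrative).
ages = pd.Series([25, 34], index=["a", "b"], name="age")
grades = pd.Series([4.45, 2.78], index=["a", "b"], name="grade")

# Placing the Series side by side yields a DataFrame with one column per Series.
df = pd.concat([ages, grades], axis=1)

print(ages.ndim)        # dimensionality of a Series
print(df.ndim)          # dimensionality of a DataFrame
print(type(df["age"]))  # selecting a single column gives back a Series
```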
(2) Series
A Series is a one-dimensional array-like structure holding homogeneous data. For example, the following Series is a collection of integers: 10, 23, 56, ...
Key points
- Homogeneous data
- Size is immutable
- Data values are mutable
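As a small sketch of these points, the integers from the example above can be placed in a Series. The variable name `s` and the replacement value 99 are assumptions for illustration only.

```python
import pandas as pd

# Build a Series from the integers used in the example above.
s = pd.Series([10, 23, 56])

print(s.dtype)  # a single integer dtype shared by every element
print(len(s))   # number of elements

# Values can be overwritten in place; the length stays the same.
s.iloc[1] = 99
print(s.tolist())
print(len(s))
```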
(3) DataFrame
A DataFrame is a two-dimensional array with heterogeneous data. For example:
Name | Age | Gender | Grade
Maxsu | 25 | Male | 4.45
Katie | 34 | Female | 2.78
fault | 46 | Female | 3.9
Lia | | Female | 4.6
The table above represents data for an organization's sales team, including an overall performance rating. Data is arranged in rows and columns: each column represents an attribute, and each row represents a person.
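A minimal sketch of how such a table might be built, using only the first two rows of the table above. The capitalized column names and the variable name `df` are choices made for this example.

```python
import pandas as pd

# Each dictionary key becomes a column (an attribute);
# each position across the lists becomes a row (a person).
df = pd.DataFrame(
    {
        "Name": ["Maxsu", "Katie"],
        "Age": [25, 34],
        "Gender": ["Male", "Female"],
        "Grade": [4.45, 2.78],
    }
)

print(df.shape)     # (number of rows, number of columns)
print(df.loc[0])    # the first row: all attributes of one person
print(df["Grade"])  # one column: a single attribute for every person
```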
Column data types
The data types of the four columns in the dataframe above are as follows:
Column | Type
Name | string
Age | integer
Gender | string
Grade | floating point
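The per-column types can be inspected with the `dtypes` attribute. The sketch below rebuilds a two-row frame from the table's first rows. How string columns are reported (as `object` or as a dedicated string dtype) depends on the installed pandas version, so no exact output is shown.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Maxsu", "Katie"],
        "Age": [25, 34],
        "Gender": ["Male", "Female"],
        "Grade": [4.45, 2.78],
    }
)

# One dtype per column; columns in the same frame may have different dtypes.
print(df.dtypes)

# Type checks that do not depend on the exact dtype name.
print(pd.api.types.is_integer_dtype(df["Age"]))
print(pd.api.types.is_float_dtype(df["Grade"]))
print(pd.api.types.is_string_dtype(df["Name"]))
```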
Key points
- Heterogeneous data
- Size is mutable
- Data values are mutable
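The two mutability points can be sketched as follows. The `Passed` column, the 3.0 threshold, and the new grade value are hypothetical and exist only to illustrate the operations.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Maxsu", "Katie"], "Grade": [4.45, 2.78]})

# Size is mutable: a new column can be inserted into the existing frame.
# The 3.0 threshold is a made-up value for this example.
df["Passed"] = df["Grade"] >= 3.0

# Values are mutable: a single cell can be overwritten.
df.loc[1, "Grade"] = 3.10

print(df)
```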
(4) Panel
A Panel is a three-dimensional data structure with heterogeneous data. Panels are difficult to represent graphically, but a Panel can be described as a container for DataFrames. Note that Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so it is not available in current releases.
Key points
- Heterogeneous data
- Size is mutable
- Data values are mutable
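Because `Panel` was removed in pandas 0.25, the "container of DataFrames" idea cannot be run with the original class on recent versions. One possible stand-in, shown here only as a sketch, is to stack several DataFrames under an outer key with `pd.concat`, which produces a DataFrame with a two-level index. The quarter labels and the Q2 grades are made-up values, not data from these notes.

```python
import pandas as pd

# Two frames with the same shape (the Q2 values are illustrative only).
q1 = pd.DataFrame({"Grade": [4.45, 2.78]}, index=["Maxsu", "Katie"])
q2 = pd.DataFrame({"Grade": [4.10, 3.20]}, index=["Maxsu", "Katie"])

# Passing a dict to pd.concat uses its keys as an outer index level,
# so the result behaves like a labeled collection of DataFrames.
stacked = pd.concat({"Q1": q1, "Q2": q2}, names=["Quarter", "Name"])

print(stacked)
print(stacked.loc["Q2"])  # selects the rows that came from one contained frame
```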