pandas: Create Pivot Table#
Package Import#
import pandas as pd
import numpy as np
Dataset Import#
The dataset used in this notebook is from Kaggle - Pokemon.
data = pd.read_csv('data/Pokemon.csv')
data
Show code cell output
Hide code cell output
# | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
795 | 719 | Diancie | Rock | Fairy | 600 | 50 | 100 | 150 | 100 | 150 | 50 | 6 | True |
796 | 719 | DiancieMega Diancie | Rock | Fairy | 700 | 50 | 160 | 110 | 160 | 110 | 110 | 6 | True |
797 | 720 | HoopaHoopa Confined | Psychic | Ghost | 600 | 80 | 110 | 60 | 150 | 130 | 70 | 6 | True |
798 | 720 | HoopaHoopa Unbound | Psychic | Dark | 680 | 80 | 160 | 60 | 170 | 130 | 80 | 6 | True |
799 | 721 | Volcanion | Fire | Water | 600 | 80 | 110 | 120 | 130 | 90 | 70 | 6 | True |
800 rows × 13 columns
Create a pivot table#
data.pivot_table(index='Type 1', columns='Legendary', values='Attack', aggfunc='mean')
Show code cell output
Hide code cell output
Legendary | False | True |
---|---|---|
Type 1 | ||
Bug | 70.971014 | NaN |
Dark | 86.862069 | 110.500000 |
Dragon | 103.400000 | 126.666667 |
Electric | 66.125000 | 98.750000 |
Fairy | 57.187500 | 131.000000 |
Fighting | 96.777778 | NaN |
Fire | 82.191489 | 109.000000 |
Flying | 50.000000 | 107.500000 |
Ghost | 71.366667 | 110.000000 |
Grass | 72.119403 | 97.666667 |
Ground | 88.000000 | 150.000000 |
Ice | 73.227273 | 67.500000 |
Normal | 72.083333 | 140.000000 |
Poison | 74.678571 | NaN |
Psychic | 54.953488 | 122.142857 |
Rock | 89.925000 | 122.250000 |
Steel | 92.086957 | 96.250000 |
Water | 72.777778 | 111.250000 |
The advantage of pivot table compared to using groupby + unstack is the margins
parameter, which adds a row/column that gives the totals (or other aggregate functions) for each row/column:
data.pivot_table(index='Type 1', columns='Legendary', values='Attack', aggfunc='mean', margins=True)
Show code cell output
Hide code cell output
Legendary | False | True | All |
---|---|---|---|
Type 1 | |||
Bug | 70.971014 | NaN | 70.971014 |
Dark | 86.862069 | 110.500000 | 88.387097 |
Dragon | 103.400000 | 126.666667 | 112.125000 |
Electric | 66.125000 | 98.750000 | 69.090909 |
Fairy | 57.187500 | 131.000000 | 61.529412 |
Fighting | 96.777778 | NaN | 96.777778 |
Fire | 82.191489 | 109.000000 | 84.769231 |
Flying | 50.000000 | 107.500000 | 78.750000 |
Ghost | 71.366667 | 110.000000 | 73.781250 |
Grass | 72.119403 | 97.666667 | 73.214286 |
Ground | 88.000000 | 150.000000 | 95.750000 |
Ice | 73.227273 | 67.500000 | 72.750000 |
Normal | 72.083333 | 140.000000 | 73.469388 |
Poison | 74.678571 | NaN | 74.678571 |
Psychic | 54.953488 | 122.142857 | 71.456140 |
Rock | 89.925000 | 122.250000 | 92.863636 |
Steel | 92.086957 | 96.250000 | 92.703704 |
Water | 72.777778 | 111.250000 | 74.151786 |
All | 75.669388 | 116.676923 | 79.001250 |