pandas: Pandas Basics Exploration#
Loading data into Pandas DataFrame#
The dataset is from Kaggle - Pokemon.
import pandas as pd
data = pd.read_csv('data/Pokemon.csv')
data.shape, data.ndim
Show code cell output
Hide code cell output
((800, 13), 2)
Get headers#
data.columns
Show code cell output
Hide code cell output
Index(['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense',
'Sp. Atk', 'Sp. Def', 'Speed', 'Generation', 'Legendary'],
dtype='object')
First and last rows#
data.head(1), data.tail(1)
Show code cell output
Hide code cell output
( # Name Type 1 Type 2 Total HP Attack Defense Sp. Atk Sp. Def \
0 1 Bulbasaur Grass Poison 318 45 49 49 65 65
Speed Generation Legendary
0 45 1 False ,
# Name Type 1 Type 2 Total HP Attack Defense Sp. Atk \
799 721 Volcanion Fire Water 600 80 110 120 130
Sp. Def Speed Generation Legendary
799 90 70 6 True )
High level description of the data#
data.describe()
Show code cell output
Hide code cell output
| # | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | |
|---|---|---|---|---|---|---|---|---|---|
| count | 800.000000 | 800.00000 | 800.000000 | 800.000000 | 800.000000 | 800.000000 | 800.000000 | 800.000000 | 800.00000 |
| mean | 362.813750 | 435.10250 | 69.258750 | 79.001250 | 73.842500 | 72.820000 | 71.902500 | 68.277500 | 3.32375 |
| std | 208.343798 | 119.96304 | 25.534669 | 32.457366 | 31.183501 | 32.722294 | 27.828916 | 29.060474 | 1.66129 |
| min | 1.000000 | 180.00000 | 1.000000 | 5.000000 | 5.000000 | 10.000000 | 20.000000 | 5.000000 | 1.00000 |
| 25% | 184.750000 | 330.00000 | 50.000000 | 55.000000 | 50.000000 | 49.750000 | 50.000000 | 45.000000 | 2.00000 |
| 50% | 364.500000 | 450.00000 | 65.000000 | 75.000000 | 70.000000 | 65.000000 | 70.000000 | 65.000000 | 3.00000 |
| 75% | 539.250000 | 515.00000 | 80.000000 | 100.000000 | 90.000000 | 95.000000 | 90.000000 | 90.000000 | 5.00000 |
| max | 721.000000 | 780.00000 | 255.000000 | 190.000000 | 230.000000 | 194.000000 | 230.000000 | 180.000000 | 6.00000 |
Get specific columns#
multiple_columns = data[['Name', 'HP', 'Legendary']]
print(multiple_columns.head())
Show code cell output
Hide code cell output
Name HP Legendary
0 Bulbasaur 45 False
1 Ivysaur 60 False
2 Venusaur 80 False
3 VenusaurMega Venusaur 80 False
4 Charmander 39 False
Get specific rows#
# Get the first row, which contains the feature names
middle_row = data.iloc[0]
print(middle_row)
Show code cell output
Hide code cell output
# 1
Name Bulbasaur
Type 1 Grass
Type 2 Poison
Total 318
HP 45
Attack 49
Defense 49
Sp. Atk 65
Sp. Def 65
Speed 45
Generation 1
Legendary False
Name: 0, dtype: object
Get specific coordinates item in the DataFrame#
# Get the name of the first 3 rows
item = data.iloc[[0, 1, 2], 1]
print(item)
Show code cell output
Hide code cell output
0 Bulbasaur
1 Ivysaur
2 Venusaur
Name: Name, dtype: object
Remove a column#
Here, if you do not use the
columnsparameter, you need to specify the axis (0 for rows and 1 for columns).
# Here, 'Total' column is the sum of HP,Attack,Defense, SP. Atk, SP. Def, Speed.
dropped_data = data.drop('Total', axis=1)
dropped_data.head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
Alternatively, you can use the
columnsparameter to specify the column(s) to drop without needing to specify the axis.
dropped_data = data.drop(columns='Total')
dropped_data.head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
inplace=Truemodifies the original DataFrame without needing to reassign it.
data.drop(columns='Total', inplace=True)
data.head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
Add a column#
# So, in order to add the 'Total' column again, we do the summation
data['Total'] = data.iloc[:, 4:10].sum(axis=1)
data.head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False | 318 |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False | 405 |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False | 525 |
iloc and loc in Pandas#
iloc#
# Read single row
data.iloc[3]
Show code cell output
Hide code cell output
# 3
Name VenusaurMega Venusaur
Type 1 Grass
Type 2 Poison
HP 80
Attack 100
Defense 123
Sp. Atk 122
Sp. Def 120
Speed 80
Generation 1
Legendary False
Total 625
Name: 3, dtype: object
# Read multiple rows
data.iloc[[3,6]]
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False | 625 |
| 6 | 6 | Charizard | Fire | Flying | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False | 534 |
# Read a range of rows
data.iloc[3:7]
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False | 625 |
| 4 | 4 | Charmander | Fire | NaN | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False | 309 |
| 5 | 5 | Charmeleon | Fire | NaN | 58 | 64 | 58 | 80 | 65 | 80 | 1 | False | 405 |
| 6 | 6 | Charizard | Fire | Flying | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False | 534 |
loc#
The loc property in Pandas is used to access a group of rows and columns by labels or a boolean array. Unlike iloc, which uses integer-based indexing, loc uses the actual labels of the index and columns. This makes it very useful for selecting data based on meaningful row or column names.
data.loc[data['Legendary'] == True].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 156 | 144 | Articuno | Ice | Flying | 90 | 85 | 100 | 95 | 125 | 85 | 1 | True | 580 |
| 157 | 145 | Zapdos | Electric | Flying | 90 | 90 | 85 | 125 | 90 | 100 | 1 | True | 580 |
| 158 | 146 | Moltres | Fire | Flying | 90 | 100 | 90 | 125 | 85 | 90 | 1 | True | 580 |
| 162 | 150 | Mewtwo | Psychic | NaN | 106 | 110 | 90 | 154 | 90 | 130 | 1 | True | 680 |
| 163 | 150 | MewtwoMega Mewtwo X | Psychic | Fighting | 106 | 190 | 100 | 154 | 100 | 130 | 1 | True | 780 |
If you have a condition to filter rows, then you must pass the columns that you want to select after the comma as a sequence. e.g., [condition, ['col1', 'col2']]
data.loc[data['Attack'] == 49, ['Name']]
Show code cell output
Hide code cell output
| Name | |
|---|---|
| 0 | Bulbasaur |
| 166 | Chikorita |
| 506 | Finneon |
Filtering data#
We can filter with multiple conditions at ease.
data.loc[(data['Type 1'] == 'Grass') & (data['Type 2'] == 'Poison')].head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False | 318 |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False | 405 |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False | 525 |
data.loc[(data['Type 1'] == 'Grass') | (data['Type 2'] == 'Poison')][0::10].loc[data['HP'] > 50].head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 45 | Vileplume | Grass | Poison | 75 | 80 | 85 | 110 | 90 | 50 | 1 | False | 490 |
| 101 | 94 | Gengar | Ghost | Poison | 60 | 65 | 60 | 130 | 75 | 110 | 1 | False | 500 |
| 197 | 182 | Bellossom | Grass | NaN | 75 | 80 | 95 | 90 | 100 | 50 | 2 | False | 490 |
Regex Filtering#
import re
Explicit filtering#
data.loc[data['Name'].str.contains('Mega')].head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False | 625 |
| 7 | 6 | CharizardMega Charizard X | Fire | Dragon | 78 | 130 | 111 | 130 | 85 | 100 | 1 | False | 634 |
| 8 | 6 | CharizardMega Charizard Y | Fire | Flying | 78 | 104 | 78 | 159 | 115 | 100 | 1 | False | 634 |
Regex filtering, case sensitive by default#
data.loc[data['Type 1'].str.contains('Fire|grass', regex=True)].head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 4 | Charmander | Fire | NaN | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False | 309 |
| 5 | 5 | Charmeleon | Fire | NaN | 58 | 64 | 58 | 80 | 65 | 80 | 1 | False | 405 |
| 6 | 6 | Charizard | Fire | Flying | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False | 534 |
Regex filtering, case insensitive#
data.loc[data['Type 1'].str.contains('Fire|grass', flags=re.IGNORECASE, regex=True)].head(10)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False | 318 |
| 1 | 2 | Ivysaur | Grass | Poison | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False | 405 |
| 2 | 3 | Venusaur | Grass | Poison | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False | 525 |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False | 625 |
| 4 | 4 | Charmander | Fire | NaN | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False | 309 |
| 5 | 5 | Charmeleon | Fire | NaN | 58 | 64 | 58 | 80 | 65 | 80 | 1 | False | 405 |
| 6 | 6 | Charizard | Fire | Flying | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False | 534 |
| 7 | 6 | CharizardMega Charizard X | Fire | Dragon | 78 | 130 | 111 | 130 | 85 | 100 | 1 | False | 634 |
| 8 | 6 | CharizardMega Charizard Y | Fire | Flying | 78 | 104 | 78 | 159 | 115 | 100 | 1 | False | 634 |
| 42 | 37 | Vulpix | Fire | NaN | 38 | 41 | 40 | 50 | 65 | 65 | 1 | False | 299 |
Using regex#
data.loc[data['Name'].str.contains('pi[a-z]*', flags=re.IGNORECASE, regex=True)].head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 | 10 | Caterpie | Bug | NaN | 45 | 30 | 35 | 20 | 20 | 45 | 1 | False | 195 |
| 20 | 16 | Pidgey | Normal | Flying | 40 | 45 | 40 | 35 | 35 | 56 | 1 | False | 251 |
| 21 | 17 | Pidgeotto | Normal | Flying | 63 | 60 | 55 | 50 | 50 | 71 | 1 | False | 349 |
data.loc[data['Name'].str.contains('^pi[a-z]*', flags=re.IGNORECASE, regex=True)].head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 16 | Pidgey | Normal | Flying | 40 | 45 | 40 | 35 | 35 | 56 | 1 | False | 251 |
| 21 | 17 | Pidgeotto | Normal | Flying | 63 | 60 | 55 | 50 | 50 | 71 | 1 | False | 349 |
| 22 | 18 | Pidgeot | Normal | Flying | 83 | 80 | 75 | 70 | 70 | 101 | 1 | False | 479 |
data.loc[data['Name'].str.contains('pi.*', flags=re.IGNORECASE, regex=True)].head(10)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 | 10 | Caterpie | Bug | NaN | 45 | 30 | 35 | 20 | 20 | 45 | 1 | False | 195 |
| 20 | 16 | Pidgey | Normal | Flying | 40 | 45 | 40 | 35 | 35 | 56 | 1 | False | 251 |
| 21 | 17 | Pidgeotto | Normal | Flying | 63 | 60 | 55 | 50 | 50 | 71 | 1 | False | 349 |
| 22 | 18 | Pidgeot | Normal | Flying | 83 | 80 | 75 | 70 | 70 | 101 | 1 | False | 479 |
| 23 | 18 | PidgeotMega Pidgeot | Normal | Flying | 83 | 80 | 80 | 135 | 80 | 121 | 1 | False | 579 |
| 30 | 25 | Pikachu | Electric | NaN | 35 | 55 | 40 | 50 | 50 | 90 | 1 | False | 320 |
| 42 | 37 | Vulpix | Fire | NaN | 38 | 41 | 40 | 50 | 65 | 65 | 1 | False | 299 |
| 76 | 70 | Weepinbell | Grass | Poison | 65 | 90 | 50 | 85 | 45 | 55 | 1 | False | 390 |
| 84 | 78 | Rapidash | Fire | NaN | 65 | 100 | 70 | 80 | 80 | 105 | 1 | False | 500 |
| 136 | 127 | Pinsir | Bug | NaN | 65 | 125 | 100 | 55 | 70 | 85 | 1 | False | 500 |
Sorting#
Sort by 1 column, ascending#
data.sort_values('Type 1').head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 600 | 540 | Sewaddle | Bug | Grass | 45 | 53 | 70 | 40 | 60 | 42 | 5 | False | 310 |
| 136 | 127 | Pinsir | Bug | NaN | 65 | 125 | 100 | 55 | 70 | 85 | 1 | False | 500 |
| 457 | 412 | Burmy | Bug | NaN | 40 | 29 | 45 | 29 | 45 | 36 | 4 | False | 224 |
Sort by 1 column, descending#
data.sort_values('Type 1', ascending=False).head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 371 | 339 | Barboach | Water | Ground | 50 | 48 | 43 | 46 | 41 | 60 | 3 | False | 288 |
| 97 | 90 | Shellder | Water | NaN | 30 | 65 | 100 | 45 | 25 | 40 | 1 | False | 305 |
| 240 | 222 | Corsola | Water | Rock | 55 | 55 | 85 | 65 | 85 | 35 | 2 | False | 380 |
Sort by multiple columns, 1 = ascending, 0 = descending#
data.sort_values(['Type 1', 'HP'], ascending=[1, 0]).head(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 520 | 469 | Yanmega | Bug | Flying | 86 | 76 | 86 | 116 | 56 | 95 | 4 | False | 515 |
| 698 | 637 | Volcarona | Bug | Fire | 85 | 60 | 65 | 135 | 105 | 100 | 5 | False | 550 |
| 231 | 214 | Heracross | Bug | Fighting | 80 | 125 | 75 | 40 | 95 | 85 | 2 | False | 500 |
Reset index#
In the cell below, you can see the index is not consecutive (the leftmost column). We can reset it with the reset_index() method. By default, it adds the old index as a new column. If you do not want that, use the drop=True parameter.
new_data = data.loc[(data['Type 1'] == 'Grass') | (data['Type 2'] == 'Poison')][0::10].loc[data['HP'] > 50].head(3)
new_data
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 45 | Vileplume | Grass | Poison | 75 | 80 | 85 | 110 | 90 | 50 | 1 | False | 490 |
| 101 | 94 | Gengar | Ghost | Poison | 60 | 65 | 60 | 130 | 75 | 110 | 1 | False | 500 |
| 197 | 182 | Bellossom | Grass | NaN | 75 | 80 | 95 | 90 | 100 | 50 | 2 | False | 490 |
new_data.reset_index(drop=True, inplace=True)
new_data
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | Total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 45 | Vileplume | Grass | Poison | 75 | 80 | 85 | 110 | 90 | 50 | 1 | False | 490 |
| 1 | 94 | Gengar | Ghost | Poison | 60 | 65 | 60 | 130 | 75 | 110 | 1 | False | 500 |
| 2 | 182 | Bellossom | Grass | NaN | 75 | 80 | 95 | 90 | 100 | 50 | 2 | False | 490 |
Group by/Aggregate using aggregate function#
Aggregate functions like mean(), sum(), min(), and max() in pandas only work with numerical data. Non-numeric columns (such as strings or booleans) cannot be averaged or summed, so you must select only numeric columns when using these functions to avoid errors and get meaningful results.
Group by 1 column#
# Select only numeric columns for aggregation
numeric_cols = data.select_dtypes(include='number').columns
data.groupby(['Type 1'])[numeric_cols].mean().sort_values('Type 1').head(10)
Show code cell output
Hide code cell output
| # | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Total | |
|---|---|---|---|---|---|---|---|---|---|
| Type 1 | |||||||||
| Bug | 334.492754 | 56.884058 | 70.971014 | 70.724638 | 53.869565 | 64.797101 | 61.681159 | 3.217391 | 378.927536 |
| Dark | 461.354839 | 66.806452 | 88.387097 | 70.225806 | 74.645161 | 69.516129 | 76.161290 | 4.032258 | 445.741935 |
| Dragon | 474.375000 | 83.312500 | 112.125000 | 86.375000 | 96.843750 | 88.843750 | 83.031250 | 3.875000 | 550.531250 |
| Electric | 363.500000 | 59.795455 | 69.090909 | 66.295455 | 90.022727 | 73.704545 | 84.500000 | 3.272727 | 443.409091 |
| Fairy | 449.529412 | 74.117647 | 61.529412 | 65.705882 | 78.529412 | 84.705882 | 48.588235 | 4.117647 | 413.176471 |
| Fighting | 363.851852 | 69.851852 | 96.777778 | 65.925926 | 53.111111 | 64.703704 | 66.074074 | 3.370370 | 416.444444 |
| Fire | 327.403846 | 69.903846 | 84.769231 | 67.769231 | 88.980769 | 72.211538 | 74.442308 | 3.211538 | 458.076923 |
| Flying | 677.750000 | 70.750000 | 78.750000 | 66.250000 | 94.250000 | 72.500000 | 102.500000 | 5.500000 | 485.000000 |
| Ghost | 486.500000 | 64.437500 | 73.781250 | 81.187500 | 79.343750 | 76.468750 | 64.343750 | 4.187500 | 439.562500 |
| Grass | 344.871429 | 67.271429 | 73.214286 | 70.800000 | 77.500000 | 70.428571 | 61.928571 | 3.357143 | 421.142857 |
Group by hierarchical columns#
data.groupby(['Type 1', 'Type 2'])[numeric_cols].count().head()
Show code cell output
Hide code cell output
| # | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Total | ||
|---|---|---|---|---|---|---|---|---|---|---|
| Type 1 | Type 2 | |||||||||
| Bug | Electric | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| Fighting | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | |
| Fire | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | |
| Flying | 14 | 14 | 14 | 14 | 14 | 14 | 14 | 14 | 14 | |
| Ghost | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Quizzes#
1. Import the pandas library as pd.#
import pandas as pd
2. Read CSV into df#
df = pd.read_csv('data/Pokemon.csv')
3. Show first 5 rows#
df.head(5)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
4. Show last 3 rows#
df.tail(3)
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 797 | 720 | HoopaHoopa Confined | Psychic | Ghost | 600 | 80 | 110 | 60 | 150 | 130 | 70 | 6 | True |
| 798 | 720 | HoopaHoopa Unbound | Psychic | Dark | 680 | 80 | 160 | 60 | 170 | 130 | 80 | 6 | True |
| 799 | 721 | Volcanion | Fire | Water | 600 | 80 | 110 | 120 | 130 | 90 | 70 | 6 | True |
5. Show all column names#
df.columns
Show code cell output
Hide code cell output
Index(['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense',
'Sp. Atk', 'Sp. Def', 'Speed', 'Generation', 'Legendary'],
dtype='object')
6. Select rows index 10 to 15#
df.iloc[10:16]
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 8 | Wartortle | Water | NaN | 405 | 59 | 63 | 80 | 65 | 80 | 58 | 1 | False |
| 11 | 9 | Blastoise | Water | NaN | 530 | 79 | 83 | 100 | 85 | 105 | 78 | 1 | False |
| 12 | 9 | BlastoiseMega Blastoise | Water | NaN | 630 | 79 | 103 | 120 | 135 | 115 | 78 | 1 | False |
| 13 | 10 | Caterpie | Bug | NaN | 195 | 45 | 30 | 35 | 20 | 20 | 45 | 1 | False |
| 14 | 11 | Metapod | Bug | NaN | 205 | 50 | 20 | 55 | 25 | 25 | 30 | 1 | False |
| 15 | 12 | Butterfree | Bug | Flying | 395 | 60 | 45 | 50 | 90 | 80 | 70 | 1 | False |
7. Select row index 50#
df.iloc[50]
Show code cell output
Hide code cell output
# 45
Name Vileplume
Type 1 Grass
Type 2 Poison
Total 490
HP 75
Attack 80
Defense 85
Sp. Atk 110
Sp. Def 90
Speed 50
Generation 1
Legendary False
Name: 50, dtype: object
8. Filter rows Type 1 == Grass#
df.loc[df['Type 1'] == 'Grass'].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 48 | 43 | Oddish | Grass | Poison | 320 | 45 | 50 | 55 | 75 | 65 | 30 | 1 | False |
9. Filter rows HP > 100#
df.loc[df['HP'] > 100].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 44 | 39 | Jigglypuff | Normal | Fairy | 270 | 115 | 45 | 20 | 45 | 25 | 20 | 1 | False |
| 45 | 40 | Wigglytuff | Normal | Fairy | 435 | 140 | 70 | 45 | 85 | 50 | 45 | 1 | False |
| 96 | 89 | Muk | Poison | NaN | 500 | 105 | 105 | 75 | 65 | 100 | 50 | 1 | False |
| 120 | 112 | Rhydon | Ground | Rock | 485 | 105 | 130 | 120 | 45 | 45 | 40 | 1 | False |
| 121 | 113 | Chansey | Normal | NaN | 450 | 250 | 5 | 5 | 35 | 105 | 50 | 1 | False |
10. Filter rows Name contains chu#
df.loc[df['Name'].str.contains('chu')].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 30 | 25 | Pikachu | Electric | NaN | 320 | 35 | 55 | 40 | 50 | 50 | 90 | 1 | False |
| 31 | 26 | Raichu | Electric | NaN | 485 | 60 | 90 | 55 | 90 | 80 | 110 | 1 | False |
| 186 | 172 | Pichu | Electric | NaN | 205 | 20 | 40 | 15 | 35 | 35 | 60 | 2 | False |
| 257 | 238 | Smoochum | Ice | Psychic | 305 | 45 | 30 | 15 | 85 | 65 | 65 | 2 | False |
11. Sort by Attack descending#
df.sort_values('Attack', ascending=False).head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 163 | 150 | MewtwoMega Mewtwo X | Psychic | Fighting | 780 | 106 | 190 | 100 | 154 | 100 | 130 | 1 | True |
| 232 | 214 | HeracrossMega Heracross | Bug | Fighting | 600 | 80 | 185 | 115 | 40 | 105 | 75 | 2 | False |
| 429 | 386 | DeoxysAttack Forme | Psychic | NaN | 600 | 50 | 180 | 20 | 180 | 20 | 150 | 3 | True |
| 426 | 384 | RayquazaMega Rayquaza | Dragon | Flying | 780 | 105 | 180 | 100 | 180 | 100 | 115 | 3 | True |
| 424 | 383 | GroudonPrimal Groudon | Ground | Fire | 770 | 100 | 180 | 160 | 150 | 90 | 90 | 3 | True |
12. Reset index#
new_df = df.reset_index(drop=True)
new_df.head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
13. Select columns Name, Type 1, HP#
df[['Name', 'Type 1', 'HP']].head()
Show code cell output
Hide code cell output
| Name | Type 1 | HP | |
|---|---|---|---|
| 0 | Bulbasaur | Grass | 45 |
| 1 | Ivysaur | Grass | 60 |
| 2 | Venusaur | Grass | 80 |
| 3 | VenusaurMega Venusaur | Grass | 80 |
| 4 | Charmander | Fire | 39 |
14. Filter Type 1 == Grass and Type 2 == Poison#
df.loc[(df['Type 1'] == 'Grass') & (df['Type 2'] == 'Poison')].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 48 | 43 | Oddish | Grass | Poison | 320 | 45 | 50 | 55 | 75 | 65 | 30 | 1 | False |
15. Mean HP of Legendary#
df.loc[df['Legendary'] == True]['HP'].mean()
Show code cell output
Hide code cell output
np.float64(92.73846153846154)
16. Rows with even indices#
even_indices_df = df.iloc[::2]
even_indices_df.head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
| 6 | 6 | Charizard | Fire | Flying | 534 | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False |
| 8 | 6 | CharizardMega Charizard Y | Fire | Flying | 634 | 78 | 104 | 78 | 159 | 115 | 100 | 1 | False |
17. Save even indices to filtered_pokemon.csv#
even_indices_df = df.iloc[::2]
even_indices_df.to_csv('data/filtered_pokemon.csv', index=True)
even_indices_df = pd.read_csv('data/filtered_pokemon.csv', index_col=0)
even_indices_df.head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
| 6 | 6 | Charizard | Fire | Flying | 534 | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False |
| 8 | 6 | CharizardMega Charizard Y | Fire | Flying | 634 | 78 | 104 | 78 | 159 | 115 | 100 | 1 | False |
18. First 10 rows Name, Type 1, Attack#
df[['Name', 'Type 1', 'Attack']].head(10)
Show code cell output
Hide code cell output
| Name | Type 1 | Attack | |
|---|---|---|---|
| 0 | Bulbasaur | Grass | 49 |
| 1 | Ivysaur | Grass | 62 |
| 2 | Venusaur | Grass | 82 |
| 3 | VenusaurMega Venusaur | Grass | 100 |
| 4 | Charmander | Fire | 52 |
| 5 | Charmeleon | Fire | 64 |
| 6 | Charizard | Fire | 84 |
| 7 | CharizardMega Charizard X | Fire | 130 |
| 8 | CharizardMega Charizard Y | Fire | 104 |
| 9 | Squirtle | Water | 48 |
19. Max Defense#
df['Defense'].max()
Show code cell output
Hide code cell output
np.int64(230)
20. Rows Speed above mean#
mean_speed = df['Speed'].mean()
new_df = df.loc[df['Speed'] > mean_speed]
print('mean speed: ', mean_speed)
new_df.head()
Show code cell output
Hide code cell output
mean speed: 68.2775
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 5 | 5 | Charmeleon | Fire | NaN | 405 | 58 | 64 | 58 | 80 | 65 | 80 | 1 | False |
| 6 | 6 | Charizard | Fire | Flying | 534 | 78 | 84 | 78 | 109 | 85 | 100 | 1 | False |
| 7 | 6 | CharizardMega Charizard X | Fire | Dragon | 634 | 78 | 130 | 111 | 130 | 85 | 100 | 1 | False |
21. First 3 rows Name and Attack#
df[['Name', 'Attack']].iloc[:3]
Show code cell output
Hide code cell output
| Name | Attack | |
|---|---|---|
| 0 | Bulbasaur | 49 |
| 1 | Ivysaur | 62 |
| 2 | Venusaur | 82 |
22. Count per Type 1#
df.groupby(['Type 1'])['#'].count().sort_index()
Show code cell output
Hide code cell output
Type 1
Bug 69
Dark 31
Dragon 32
Electric 44
Fairy 17
Fighting 27
Fire 52
Flying 4
Ghost 32
Grass 70
Ground 32
Ice 24
Normal 98
Poison 28
Psychic 57
Rock 44
Steel 27
Water 112
Name: #, dtype: int64
23. Rows Generation 3 or 5#
df.loc[(df['Generation'] == 3) | (df['Generation'] == 5)].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 272 | 252 | Treecko | Grass | NaN | 310 | 40 | 45 | 35 | 65 | 55 | 70 | 3 | False |
| 273 | 253 | Grovyle | Grass | NaN | 405 | 50 | 65 | 45 | 85 | 65 | 95 | 3 | False |
| 274 | 254 | Sceptile | Grass | NaN | 530 | 70 | 85 | 65 | 105 | 85 | 120 | 3 | False |
| 275 | 254 | SceptileMega Sceptile | Grass | Dragon | 630 | 70 | 110 | 75 | 145 | 85 | 145 | 3 | False |
| 276 | 255 | Torchic | Fire | NaN | 310 | 45 | 60 | 40 | 70 | 50 | 45 | 3 | False |
24. Replace Fire with Flame in Type 1#
df_copy = df.copy()
df_copy['Type 1'] = df_copy['Type 1'].str.replace('Fire', 'Flame')
df_copy.head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Flame | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |
25. Rows Name startswith P#
df.loc[df['Name'].str.startswith('P')].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 16 | Pidgey | Normal | Flying | 251 | 40 | 45 | 40 | 35 | 35 | 56 | 1 | False |
| 21 | 17 | Pidgeotto | Normal | Flying | 349 | 63 | 60 | 55 | 50 | 50 | 71 | 1 | False |
| 22 | 18 | Pidgeot | Normal | Flying | 479 | 83 | 80 | 75 | 70 | 70 | 101 | 1 | False |
| 23 | 18 | PidgeotMega Pidgeot | Normal | Flying | 579 | 83 | 80 | 80 | 135 | 80 | 121 | 1 | False |
| 30 | 25 | Pikachu | Electric | NaN | 320 | 35 | 55 | 40 | 50 | 50 | 90 | 1 | False |
26. Mean Attack per Type 1#
df.groupby('Type 1')['Attack'].mean().sort_index()
Show code cell output
Hide code cell output
Type 1
Bug 70.971014
Dark 88.387097
Dragon 112.125000
Electric 69.090909
Fairy 61.529412
Fighting 96.777778
Fire 84.769231
Flying 78.750000
Ghost 73.781250
Grass 73.214286
Ground 95.750000
Ice 72.750000
Normal 73.469388
Poison 74.678571
Psychic 71.456140
Rock 92.863636
Steel 92.703704
Water 74.151786
Name: Attack, dtype: float64
27. Sort by Type 1 asc, Attack desc#
df.sort_values(['Type 1', 'Attack'], ascending=[1, 0]).head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 232 | 214 | HeracrossMega Heracross | Bug | Fighting | 600 | 80 | 185 | 115 | 40 | 105 | 75 | 2 | False |
| 137 | 127 | PinsirMega Pinsir | Bug | Flying | 600 | 65 | 155 | 120 | 65 | 90 | 105 | 1 | False |
| 19 | 15 | BeedrillMega Beedrill | Bug | Poison | 495 | 65 | 150 | 40 | 15 | 80 | 145 | 1 | False |
| 229 | 212 | ScizorMega Scizor | Bug | Steel | 600 | 70 | 150 | 140 | 65 | 100 | 75 | 2 | False |
| 650 | 589 | Escavalier | Bug | Steel | 495 | 70 | 135 | 105 | 60 | 105 | 20 | 5 | False |
28. Mean Defense per Type 1#
df.groupby('Type 1')['Defense'].mean().sort_index()
Show code cell output
Hide code cell output
Type 1
Bug 70.724638
Dark 70.225806
Dragon 86.375000
Electric 66.295455
Fairy 65.705882
Fighting 65.925926
Fire 67.769231
Flying 66.250000
Ghost 81.187500
Grass 70.800000
Ground 84.843750
Ice 71.416667
Normal 59.846939
Poison 68.821429
Psychic 67.684211
Rock 100.795455
Steel 126.370370
Water 72.946429
Name: Defense, dtype: float64
29. Legendary with Attack > 100#
df.loc[(df['Legendary'] == True) & (df['Attack'] > 100)].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 162 | 150 | Mewtwo | Psychic | NaN | 680 | 106 | 110 | 90 | 154 | 90 | 130 | 1 | True |
| 163 | 150 | MewtwoMega Mewtwo X | Psychic | Fighting | 780 | 106 | 190 | 100 | 154 | 100 | 130 | 1 | True |
| 164 | 150 | MewtwoMega Mewtwo Y | Psychic | NaN | 780 | 106 | 150 | 70 | 194 | 120 | 140 | 1 | True |
| 263 | 244 | Entei | Fire | NaN | 580 | 115 | 115 | 85 | 90 | 75 | 100 | 2 | True |
| 270 | 250 | Ho-oh | Fire | Flying | 680 | 106 | 130 | 90 | 110 | 154 | 90 | 2 | True |
30. New DataFrame Name and Total renamed Overall#
new_df = df[['Name', 'Total']].rename(columns={'Total': 'Overall'})
new_df.head()
Show code cell output
Hide code cell output
| Name | Overall | |
|---|---|---|
| 0 | Bulbasaur | 318 |
| 1 | Ivysaur | 405 |
| 2 | Venusaur | 525 |
| 3 | VenusaurMega Venusaur | 625 |
| 4 | Charmander | 309 |
31. Filter Type 1 == Water and Attack >= 80#
df.loc[(df['Type 1'] == 'Water') & (df['Attack'] >= 80)].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11 | 9 | Blastoise | Water | NaN | 530 | 79 | 83 | 100 | 85 | 105 | 78 | 1 | False |
| 12 | 9 | BlastoiseMega Blastoise | Water | NaN | 630 | 79 | 103 | 120 | 135 | 115 | 78 | 1 | False |
| 60 | 55 | Golduck | Water | NaN | 500 | 80 | 82 | 78 | 95 | 80 | 85 | 1 | False |
| 67 | 62 | Poliwrath | Water | Fighting | 510 | 90 | 95 | 95 | 70 | 90 | 70 | 1 | False |
| 98 | 91 | Cloyster | Water | Ice | 525 | 50 | 95 | 180 | 85 | 45 | 70 | 1 | False |
32. Generation 4 Name and Defense#
df.loc[df['Generation'] == 4][['Name', 'Defense']].head()
Show code cell output
Hide code cell output
| Name | Defense | |
|---|---|---|
| 432 | Turtwig | 64 |
| 433 | Grotle | 85 |
| 434 | Torterra | 105 |
| 435 | Chimchar | 44 |
| 436 | Monferno | 52 |
33. Mean Total per (Type 1, Type 2)#
df.groupby(['Type 1', 'Type 2'])['Total'].mean().head(10)
Show code cell output
Hide code cell output
Type 1 Type 2
Bug Electric 395.500000
Fighting 550.000000
Fire 455.000000
Flying 419.500000
Ghost 236.000000
Grass 384.000000
Ground 345.000000
Poison 347.916667
Rock 435.000000
Steel 509.714286
Name: Total, dtype: float64
34. Filter Total between 400 and 500#
df.loc[(df['Total'] >= 400) & (df['Total'] <= 500)].head()
Show code cell output
Hide code cell output
| # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 5 | 5 | Charmeleon | Fire | NaN | 405 | 58 | 64 | 58 | 80 | 65 | 80 | 1 | False |
| 10 | 8 | Wartortle | Water | NaN | 405 | 59 | 63 | 80 | 65 | 80 | 58 | 1 | False |
| 19 | 15 | BeedrillMega Beedrill | Bug | Poison | 495 | 65 | 150 | 40 | 15 | 80 | 145 | 1 | False |
| 22 | 18 | Pidgeot | Normal | Flying | 479 | 83 | 80 | 75 | 70 | 70 | 101 | 1 | False |