Data Set Information

Column    Range      Mean     Median
#         1-721      -        -
Name      -          -        -
HP        1-255      69.26    65
Atk       5-190      79.00    75
Def       5-230      73.84    70
Sp.Atk    10-194     72.82    65
Sp.Def    20-230     71.90    70
Speed     5-180      68.28    65
Total     180-780    435.10   450

Type 1 and Type 2 values: Bug, Dark, Dragon, Electric, Fairy, Fire, Fighting, Flying, Grass, Ghost, Ground, Ice, Normal, Water, Poison, Psychic, Rock, Steel, None (only available for Type 2).

Preliminary analysis of the Pokemon Data Set revealed that Pokemon with only one type had an empty (null) Type 2 field. To prevent errors due to null fields, these values were changed to the string 'None'. Next, Pandas DataFrame methods were used to get a rough idea of how the stats are distributed. Unsurprisingly, 'Total' is far larger than the other stats (its mean is about six times theirs, since it is the sum of all six); for that reason, it will not be graphed with the other stats, in order to prevent 'squeezing'. Finally, for a more visual representation of the stats, a boxplot was generated. Note that Type 1 and Type 2 are henceforth referred to as Primary Type and Secondary Type.
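The cleaning and summary steps described above might look roughly like the sketch below. The three-row frame is a stand-in for the real data set, and the variable names (df, stat_cols, summary) are assumptions for illustration, not the names used in the original analysis.

```python
import pandas as pd

# Three-row stand-in for the data set (the real one has 721 rows).
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander", "Squirtle"],
    "Type 1": ["Grass", "Fire", "Water"],
    "Type 2": ["Poison", None, None],
    "HP": [45, 39, 44],
    "Atk": [49, 52, 48],
    "Def": [49, 43, 65],
    "Sp.Atk": [65, 60, 50],
    "Sp.Def": [65, 50, 64],
    "Speed": [45, 65, 43],
})
stat_cols = ["HP", "Atk", "Def", "Sp.Atk", "Sp.Def", "Speed"]

# 'Total' is the sum of the six individual stats.
df["Total"] = df[stat_cols].sum(axis=1)

# Replace null secondary types with the literal string 'None'.
df["Type 2"] = df["Type 2"].fillna("None")

# Range (min/max), mean and median for every stat column.
summary = df[stat_cols + ["Total"]].agg(["min", "max", "mean", "median"])
print(summary)
```

A boxplot of the individual stats, leaving 'Total' out to avoid squeezing, could then be drawn with df[stat_cols].plot.box(), which requires matplotlib.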


Type Distribution


Primary Type Frequency

Secondary Type Frequency

By using a bar chart and the Pandas DataFrame.count() method, the type frequency distribution can be observed. Since types are categorical and have no natural order, a Pareto chart is used: it sorts the types by descending frequency.
The following trends can be observed from the charts:
 1. Most types with high frequency in the Primary Type chart have low frequency in the Secondary Type chart,
 2. Nearly half (48.25%) of Pokemon have a single type,
 3. Secondary Type distribution has much less variance than Primary Type distribution (if Secondary Type = None is disregarded).
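As a rough sketch of how such frequencies can be computed, the snippet below uses Series.value_counts() rather than the DataFrame.count() call mentioned above, because it returns the counts already sorted in descending order. The six rows are made-up illustration data, and the variable names are assumptions.

```python
import pandas as pd

# Made-up rows for illustration; not taken from the real data set.
types = pd.DataFrame({
    "Type 1": ["Grass", "Fire", "Water", "Bug", "Bug", "Normal"],
    "Type 2": ["Poison", "None", "None", "Poison", "Flying", "Flying"],
})

# Frequencies sorted from most to least common (Pareto ordering).
primary_freq = types["Type 1"].value_counts()
secondary_freq = types["Type 2"].value_counts()

# Share of Pokemon with a single type (Secondary Type == 'None').
single_type_share = (types["Type 2"] == "None").mean()

# Variance of the frequencies, disregarding 'None' for the Secondary Type.
primary_var = primary_freq.var()
secondary_var = secondary_freq.drop("None").var()

print(primary_freq)
print(f"single-type share: {single_type_share:.2%}")
```

Calling primary_freq.plot.bar() would draw the corresponding bar chart, assuming matplotlib is installed.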


Type Combinations

A heatmap of the Primary Type/Secondary Type combinations allows us to see which combinations are most prevalent, and which do not exist. These combinations are order-specific; therefore Bug/Poison and Poison/Bug are considered different. This specificity raises the question: is there a difference between Primary Type and Secondary Type?
Forum discussions suggest that most people believe Primary Type plays a larger role only in a Pokemon's appearance. However, further analysis is needed.
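One way to build the table behind such a heatmap is a cross-tabulation, sketched below with pandas.crosstab. The rows are made-up illustration data, and the name combos is an assumption.

```python
import pandas as pd

# Made-up rows for illustration; not taken from the real data set.
types = pd.DataFrame({
    "Type 1": ["Bug", "Bug", "Poison", "Grass", "Fire"],
    "Type 2": ["Poison", "Poison", "Bug", "Poison", "None"],
})

# Rows are Primary Types, columns are Secondary Types, cells are counts.
# Order matters: Bug/Poison and Poison/Bug land in different cells.
combos = pd.crosstab(types["Type 1"], types["Type 2"])
print(combos)
```

Cells equal to zero mark combinations that do not occur in the data. The resulting table can be handed to a plotting function such as seaborn.heatmap(combos) if seaborn is available.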

Stat Probability Density Functions


Stat Density Functions Based on Primary Type and Secondary Type, selectable per type (bug, dark, dragon, electric, fairy, fighting, fire, flying, ghost, grass, ground, ice, normal, poison, psychic, rock, steel, water)

By superimposing the probability density functions for each stat, based on whether the selected type is a Primary Type or a Secondary Type, we can find out if Primary Type is more important than Secondary Type. For most types and most stats, both density functions have similar shapes. While there definitely are exceptions (e.g. Psychic Sp. Atk), the density functions of the same type resemble each other a lot more than density functions from different types. Therefore I hypothesize that Primary Type does not have a stronger influence on stats than Secondary Type. Referring to the bar charts above, we see that for most types there are many more Pokemon with the type as a Primary Type than as a Secondary Type. Therefore, the Primary Type density function is much less sensitive to outliers. Moreover, since the Primary Type density function takes into account Pokemon with a single type, it is more 'pure', that is to say, it is less influenced by other types.
Pay attention to the graph axis scales: they change depending on the type.
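The comparison above can be sketched as follows. The original does not say how the density functions were estimated, so this example assumes a Gaussian kernel density estimate with Scott's rule-of-thumb bandwidth, written out in NumPy. The helper gaussian_kde, the sample rows, and the grid are all hypothetical.

```python
import numpy as np
import pandas as pd

def gaussian_kde(samples, grid, bandwidth=None):
    """Evaluate a 1-D Gaussian kernel density estimate on `grid`."""
    samples = np.asarray(samples, dtype=float)
    if bandwidth is None:
        # Scott's rule of thumb for 1-D data (an assumption of this sketch).
        bandwidth = samples.std(ddof=1) * len(samples) ** (-1 / 5)
    # One Gaussian bump per sample, averaged at every grid point.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth

# Made-up rows for illustration; not taken from the real data set.
df = pd.DataFrame({
    "Type 1": ["Psychic", "Psychic", "Psychic", "Water", "Water", "Fire"],
    "Type 2": ["None", "Fairy", "None", "Psychic", "Psychic", "Psychic"],
    "Sp.Atk": [105, 65, 120, 100, 40, 74],
})

selected = "Psychic"
# The grid is padded well beyond the stat range so the tails are captured.
grid = np.linspace(-100, 300, 801)

primary = df.loc[df["Type 1"] == selected, "Sp.Atk"]
secondary = df.loc[df["Type 2"] == selected, "Sp.Atk"]

primary_pdf = gaussian_kde(primary, grid)
secondary_pdf = gaussian_kde(secondary, grid)
```

Plotting primary_pdf and secondary_pdf against grid on the same axes gives the superimposed curves. With only a handful of samples, as is the case for rare Secondary Types, a single extreme value shifts the curve noticeably, which is the outlier sensitivity discussed above.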