1. DataFrame and Series Basics - Selecting Rows and Columns

A DataFrame is similar to a Python dictionary. Let us look at the dictionary below. Its keys are like column names and its values are like the row data, so we can look up a value by its key in the same way that we access the rows of a column by calling the column name.

IN[1]

person = {
    "first" : "Prathap",
    "last" : "Dominicsavio",
    "email" : "prathapdom@gmail.com"
}
print(person["email"])

OUT[1]

To understand this more deeply, let us define a dictionary of lists. When we print the value of a key, we now receive a list of data. DataFrames are similar to a dictionary of lists, but with more flexibility and more functions to operate on them.

IN[2]

people = {
    "first" : ["Prathap", "Johny"],
    "last" : ["Dominicsavio", "Deep"],
    "email" : ["prathapdom@gmail.com", "johnydeep@gamil.com"]
}
print(people["email"])

OUT[2]

Now let us convert this dictionary of lists into a DataFrame and see how similar the two are.

IN[3]

import pandas as pd

df = pd.DataFrame(people)
print(df)

OUT[3]

In the first column of the output you will find the rows numbered from 0. This is called the index, and we will look at it in detail in upcoming posts.
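To make the index a little more concrete, here is a minimal sketch that rebuilds the same `people` dictionary and inspects `df.index`. It assumes no index was passed when the DataFrame was created, which is the case in the cell above.

```python
import pandas as pd

people = {
    "first": ["Prathap", "Johny"],
    "last": ["Dominicsavio", "Deep"],
    "email": ["prathapdom@gmail.com", "johnydeep@gamil.com"],
}
df = pd.DataFrame(people)

# With no index supplied, pandas creates a default integer index starting at 0.
print(df.index)

# The index labels can be listed like any other sequence.
index_labels = list(df.index)
print(index_labels)
```

The labels here are simply the row positions, which is why the first column of the printed DataFrame counts up from 0.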

IN[4]

print(df['email'])
print(type(df['email']))

OUT[4]

If you access a single column of the DataFrame and check its type, you will find that it is a Series. So a DataFrame can be thought of as a collection of Series objects, one per column, that all share the same index.
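The relationship between a DataFrame and its Series can be checked directly. The sketch below reuses the same `people` data; the variable names `email`, `is_series` and `same_index` are only for illustration.

```python
import pandas as pd

people = {
    "first": ["Prathap", "Johny"],
    "last": ["Dominicsavio", "Deep"],
    "email": ["prathapdom@gmail.com", "johnydeep@gamil.com"],
}
df = pd.DataFrame(people)

# Selecting one column with single brackets returns a Series.
email = df["email"]
is_series = isinstance(email, pd.Series)

# The Series keeps the column name and shares the DataFrame's index.
same_index = email.index.equals(df.index)

print(is_series)
print(email.name)
print(same_index)
```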

IN[5]

print(df[['first', 'last']])

OUT[5]
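Passing a list of column names inside the brackets gives back a DataFrame rather than a Series. A small sketch of that difference, using the same `people` data (the names `names` and `one_column_frame` are hypothetical):

```python
import pandas as pd

people = {
    "first": ["Prathap", "Johny"],
    "last": ["Dominicsavio", "Deep"],
    "email": ["prathapdom@gmail.com", "johnydeep@gamil.com"],
}
df = pd.DataFrame(people)

# A list of column names returns a DataFrame holding just those columns.
names = df[["first", "last"]]

# This holds even when the list contains a single name.
one_column_frame = df[["email"]]

print(type(names))
print(names.shape)  # (rows, columns)
print(type(one_column_frame))
```

So the inner pair of brackets is just an ordinary Python list, and the result keeps the column order given in that list.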

IN[6]

print(df.columns)

OUT[6]

IN[7]

print(df.iloc[[0, 1]])

OUT[7]

IN[8]

print(df.iloc[[0, 1], 2])

OUT[8]
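`iloc` selects by integer position, with rows given first and columns second. The sketch below walks through a few variations on the same `people` data; it assumes the columns are in the order `first`, `last`, `email`, so position 2 is the email column.

```python
import pandas as pd

people = {
    "first": ["Prathap", "Johny"],
    "last": ["Dominicsavio", "Deep"],
    "email": ["prathapdom@gmail.com", "johnydeep@gamil.com"],
}
df = pd.DataFrame(people)

# A single row position returns that row as a Series.
first_row = df.iloc[0]

# Rows 0 and 1, column at position 2 ("email"): a Series.
emails = df.iloc[[0, 1], 2]

# Lists for both rows and columns return a DataFrame.
block = df.iloc[[0, 1], [0, 1]]

print(first_row)
print(emails)
print(block)
```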

IN[9]

print(df.loc[[0, 1], 'email'])
print(df.loc[[0, 1], ['email', 'last']])

OUT[9]
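With the default index, `loc` and `iloc` look almost the same because the row labels happen to equal the row positions. The sketch below makes the difference visible by giving the rows the hypothetical labels `"a"` and `"b"`. These labels are an assumption for illustration only and are not part of the original data.

```python
import pandas as pd

people = {
    "first": ["Prathap", "Johny"],
    "last": ["Dominicsavio", "Deep"],
    "email": ["prathapdom@gmail.com", "johnydeep@gamil.com"],
}
# Hypothetical row labels, used only to separate labels from positions.
df = pd.DataFrame(people, index=["a", "b"])

# loc selects by label: row "b", column "email".
by_label = df.loc["b", "email"]

# iloc selects by position: second row, third column.
by_position = df.iloc[1, 2]

# loc also accepts lists of labels and keeps the requested column order.
subset = df.loc[["a", "b"], ["email", "last"]]

print(by_label)
print(by_position)
print(subset)
```

Both lookups reach the same cell, one through its names and the other through its coordinates.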

IN[10]

df = pd.read_csv('C://Users//Prathap Dominicsavio//PycharmProjects//Python-Pandas//Asserts//Pokemon//pokemon_data.csv')
print(df['Legendary'].value_counts())

OUT[10]
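If you do not have the CSV file at hand, `value_counts` can be tried on a small in-memory DataFrame. The rows below are a made-up stand-in for the Pokemon data, assuming only that it has a boolean `Legendary` column as the call above suggests.

```python
import pandas as pd

# Hypothetical sample rows, not taken from pokemon_data.csv.
sample = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Charmander", "Squirtle", "Mewtwo"],
        "Legendary": [False, False, False, True],
    }
)

# value_counts tallies how often each distinct value appears in the column,
# sorted from most to least frequent.
counts = sample["Legendary"].value_counts()
print(counts)

# normalize=True returns proportions instead of raw counts.
shares = sample["Legendary"].value_counts(normalize=True)
print(shares)
```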