Data Science with Python and Pandas

Why the Python Ecosystem and Pandas for Data Science?

One of the main goals of Python has always been to ease the learning curve while remaining intuitive and powerful. The language is open source, which has helped scientific packages thrive, including:

TensorFlow
Scikit-Learn
NumPy
Keras
SciPy
Pandas

The ecosystem has strong support from both large companies and individual contributors. The gentle learning curve allows scientists from many fields to enter the Data Science world.

Pandas sits on top of Python and NumPy and simplifies data manipulation. Pandas offers a wide range of functionality, such as:

import and export of various formats
data wrangling
data cleaning
text processing
time series and much more

All this makes Pandas/Python a natural choice for learning and mastering Data Science.
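To make the list above more concrete, here is a minimal sketch that touches import, text processing, data cleaning, wrangling, time series, and export. The sample data and the column names (date, city, sales) are hypothetical and exist only for illustration; the data is inlined so that no external file is needed.

```python
import io

import pandas as pd

# Hypothetical sample data, inlined so the example needs no external file.
raw_csv = io.StringIO(
    "date,city,sales\n"
    "2024-01-01, london ,100\n"
    "2024-01-02,paris,\n"
    "2024-01-03, london ,250\n"
)

# Import: parse the CSV and convert the 'date' column to datetimes.
df = pd.read_csv(raw_csv, parse_dates=["date"])

# Text processing: trim whitespace and title-case the city names.
df["city"] = df["city"].str.strip().str.title()

# Data cleaning: replace the missing sales value with 0.
df["sales"] = df["sales"].fillna(0)

# Data wrangling: total sales per city.
total_by_city = df.groupby("city")["sales"].sum()

# Time series: index by date and select everything from January 2 onward.
jan_2_onward = df.set_index("date").loc["2024-01-02":]

# Export: serialize the cleaned table back to CSV text.
csv_text = df.to_csv(index=False)
```

Each step is a single line, which is a large part of why Pandas is popular for exploratory work.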

History of Pandas and Python

Python was created in the late 1980s by Guido van Rossum. Guido's initial idea was to create a language that is close to plain English, powerful, open to everyone, and suitable for everyday tasks.

Here is the Hello world! example in Python:

print('Hello, world!')

Decades later, the language ranks among the most used, loved, and wanted languages.

Pandas was started in 2008 and became open source in 2009. The main idea behind Pandas was to:

be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.

Source: About Pandas

One of the most common code samples, the Hello world! of Pandas, is as simple as:

import pandas as pd

pd.read_csv("foo.csv")

This single line (plus a few more) gives you a shortcut to:

reshaping and pivoting
slicing, fancy indexing, and subsetting
data alignment
data cleaning
data analysis
data mining
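As a rough illustration of what those few extra lines might look like, the sketch below shows subsetting, pivoting, label-based slicing, and data alignment. The DataFrame and its column names (year, product, revenue) are hypothetical stand-ins for whatever a real CSV file would contain.

```python
import pandas as pd

# Hypothetical data standing in for the contents of a CSV file.
df = pd.DataFrame(
    {
        "year": [2022, 2022, 2023, 2023],
        "product": ["A", "B", "A", "B"],
        "revenue": [10, 20, 30, 40],
    }
)

# Subsetting: keep only the rows with revenue above 15 (boolean indexing).
high_revenue = df[df["revenue"] > 15]

# Reshaping and pivoting: one row per year, one column per product.
wide = df.pivot(index="year", columns="product", values="revenue")

# Slicing: label-based selection of a single cell.
revenue_a_2023 = wide.loc[2023, "A"]

# Data alignment: arithmetic matches values by index label, not by position.
s1 = pd.Series([1, 2], index=["x", "y"])
s2 = pd.Series([10, 20], index=["y", "x"])
aligned_sum = s1 + s2  # x: 1 + 20, y: 2 + 10
```

Note that the two Series list their labels in a different order, yet the sum still pairs x with x and y with y.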
