Data Science Challenge 1: Data Styling

Table and data visualization is an important part of data science. This challenge focuses on highlighting data in a pandas DataFrame.

The challenge has three parts, one for each skill level.

Suppose you have a DataFrame built like this:

import pandas as pd
import numpy as np

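np.random.seed(24)  # fixed seed so the random values are reproducible
# Column A holds 1.0 through 10.0; columns B-E hold standard normal samples
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
               axis=1)
# Insert two missing values: row 3 of column D and row 0 of column C
df.iloc[3, 3] = np.nan
df.iloc[0, 2] = np.nan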

Its first five rows look like this:

   A          B          C          D          E
 1.0   1.329212        NaN  -0.316280  -0.990810
 2.0  -1.070816  -1.438713   0.564417   0.295722
 3.0  -1.626404   0.219565   0.678805   1.889273
 4.0   0.961538   0.104011        NaN   0.850229
 5.0   1.453425   1.057737   0.165562   0.515018

The three parts, in order of difficulty, are:

  • Beginner - B1
  • Advanced - A1
  • Master - M1

Highlight negative values

Can you highlight negative values in red, as in the table below?

           A          B          C          D          E
0   1.000000   1.329212        nan  -0.316280  -0.990810
1   2.000000  -1.070816  -1.438713   0.564417   0.295722
2   3.000000  -1.626404   0.219565   0.678805   1.889273
3   4.000000   0.961538   0.104011        nan   0.850229
4   5.000000   1.453425   1.057737   0.165562   0.515018
5   6.000000  -1.336936   0.562861   1.392855  -0.063328
6   7.000000   0.121668   1.207603  -0.002040   1.627796
7   8.000000   0.354493   1.037528  -0.385684   0.519818
8   9.000000   1.686583  -1.325963   1.428984  -2.089354
9  10.000000  -0.129820   0.631523  -0.586538   0.290720
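One possible approach to this part, offered as a minimal sketch rather than the intended solution: pass a function to `Styler.apply` that returns one CSS string per cell. The helper names `color_negative_red` and `highlight_negatives` and the small `sample` frame are assumptions made for illustration; only the sample values come from the table above.

```python
import numpy as np
import pandas as pd


def color_negative_red(column):
    """Return one CSS string per cell: red text for negatives, empty otherwise."""
    # A comparison with NaN is False, so missing values stay unstyled.
    return ['color: red' if value < 0 else '' for value in column]


def highlight_negatives(frame):
    """Return a Styler with negative values shown in red (hypothetical helper)."""
    # With axis=0, Styler.apply calls the function once per column.
    return frame.style.apply(color_negative_red, axis=0)


# Small illustrative frame using the first two rows of columns A, B and C.
sample = pd.DataFrame({
    'A': [1.0, 2.0],
    'B': [1.329212, -1.070816],
    'C': [np.nan, -1.438713],
})
```

In a notebook, evaluating `highlight_negatives(df)` as the last expression of a cell should render the styled table. Note that `DataFrame.style` needs the optional jinja2 dependency.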

Highlight columns by name

Can you highlight each column based on its name? Sample color map:

{'A':'cyan', 'B':'lightblue', 'C':'lightyellow', 'D':'salmon', 'E':'lightgreen'}

           A          B          C          D          E
0   1.000000   1.329212        nan  -0.316280  -0.990810
1   2.000000  -1.070816  -1.438713   0.564417   0.295722
2   3.000000  -1.626404   0.219565   0.678805   1.889273
3   4.000000   0.961538   0.104011        nan   0.850229
4   5.000000   1.453425   1.057737   0.165562   0.515018
5   6.000000  -1.336936   0.562861   1.392855  -0.063328
6   7.000000   0.121668   1.207603  -0.002040   1.627796
7   8.000000   0.354493   1.037528  -0.385684   0.519818
8   9.000000   1.686583  -1.325963   1.428984  -2.089354
9  10.000000  -0.129820   0.631523  -0.586538   0.290720
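A hedged sketch of one way to color columns by name: when `Styler.apply` runs with `axis=0`, each column arrives as a Series whose `name` is the column label, so the label can be looked up in the color map. The helper names `color_by_column` and `highlight_columns` are assumptions, and using `background-color` (rather than text color) is a guess at what the challenge intends.

```python
import pandas as pd

# Color map taken from the challenge text.
column_colors = {'A': 'cyan', 'B': 'lightblue', 'C': 'lightyellow',
                 'D': 'salmon', 'E': 'lightgreen'}


def color_by_column(column, colors=column_colors):
    """Return the same background CSS for every cell of a column."""
    # column.name is the column label; labels missing from the map get no style.
    color = colors.get(column.name)
    css = f'background-color: {color}' if color else ''
    return [css] * len(column)


def highlight_columns(frame):
    """Return a Styler with each column shaded by its name (hypothetical helper)."""
    return frame.style.apply(color_by_column, axis=0)
```

Calling `highlight_columns(df)` in a notebook should shade each of the five columns with its mapped color.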

New column: sum of A and B, styled as a percentage split

Can you add a new column that holds the sum of A and B?

Then style it so that each cell shows the split between the contributions of columns A and B as a percentage, as in the table below:

    A  B      B/A %  total
0  12  3  25.000000     15
1   6  4  66.666667     10
2   8  1  12.500000      9
3  15  7  46.666667     22
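The sketch below is one interpretation of this part, not a confirmed solution. It rebuilds the table above, then paints the `total` cell with a two-color CSS gradient that switches color at A's share of the total. The frame name `scores`, the helper names, the two colors, and the choice to split at A's share are all assumptions.

```python
import pandas as pd

# Values for A and B copied from the table above.
scores = pd.DataFrame({'A': [12, 6, 8, 15], 'B': [3, 4, 1, 7]})
scores['B/A %'] = scores['B'] / scores['A'] * 100
scores['total'] = scores['A'] + scores['B']


def split_total(row, color_a='lightblue', color_b='salmon'):
    """Return per-cell CSS for one row; only the 'total' cell is styled."""
    # Share of the total contributed by column A, in percent.
    # Assumes total is non-zero.
    share_a = row['A'] / row['total'] * 100
    gradient = (f'background: linear-gradient(90deg, '
                f'{color_a} {share_a:.1f}%, {color_b} {share_a:.1f}%)')
    return [gradient if label == 'total' else '' for label in row.index]


def style_split(frame):
    """Return a Styler with the split drawn in the total column (hypothetical helper)."""
    # With axis=1, Styler.apply calls the function once per row.
    return frame.style.apply(split_total, axis=1)
```

Giving both color stops the same position produces a hard edge rather than a fade, so in the first row the left 80% of the `total` cell is one color and the remaining 20% is the other.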