Data Science Challenge 1: Data Styling
Tables and data visualization are an important part of data science. This challenge focuses on highlighting data in a pandas DataFrame.
The challenge has three parts, one for each skill level.
Suppose you have a DataFrame like this:
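import pandas as pd
import numpy as np
# Fix the seed so the random values are reproducible
np.random.seed(24)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
# Append four columns of standard-normal noise named B to E
df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
               axis=1)
# Insert two missing values
df.iloc[3, 3] = np.nan
df.iloc[0, 2] = np.nan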
The first five rows look like this:
A | B | C | D | E |
---|---|---|---|---|
1.0 | 1.329212 | NaN | -0.316280 | -0.990810 |
2.0 | -1.070816 | -1.438713 | 0.564417 | 0.295722 |
3.0 | -1.626404 | 0.219565 | 0.678805 | 1.889273 |
4.0 | 0.961538 | 0.104011 | NaN | 0.850229 |
5.0 | 1.453425 | 1.057737 | 0.165562 | 0.515018 |
There are 3 different challenges depending on difficulty level:
- Beginner - B1
- Advanced - A1
- Master - M1
Highlight negative values
Can you highlight negative values in red? (as shown below)
  | A | B | C | D | E |
---|---|---|---|---|---|
0 | 1.000000 | 1.329212 | nan | -0.316280 | -0.990810 |
1 | 2.000000 | -1.070816 | -1.438713 | 0.564417 | 0.295722 |
2 | 3.000000 | -1.626404 | 0.219565 | 0.678805 | 1.889273 |
3 | 4.000000 | 0.961538 | 0.104011 | nan | 0.850229 |
4 | 5.000000 | 1.453425 | 1.057737 | 0.165562 | 0.515018 |
5 | 6.000000 | -1.336936 | 0.562861 | 1.392855 | -0.063328 |
6 | 7.000000 | 0.121668 | 1.207603 | -0.002040 | 1.627796 |
7 | 8.000000 | 0.354493 | 1.037528 | -0.385684 | 0.519818 |
8 | 9.000000 | 1.686583 | -1.325963 | 1.428984 | -2.089354 |
9 | 10.000000 | -0.129820 | 0.631523 | -0.586538 | 0.290720 |
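One possible approach, offered as a minimal sketch rather than the intended solution: pandas exposes a `Styler` through `df.style`, and `Styler.apply` calls a function once per column and expects one CSS string back per cell. The helper names `color_negative_red` and `style_negatives` are made up for this illustration.

```python
import numpy as np
import pandas as pd


def color_negative_red(column):
    """Return one CSS string per cell: red text for negatives, no style otherwise."""
    # NaN < 0 evaluates to False, so missing values are left unstyled
    return ['color: red' if value < 0 else '' for value in column]


def style_negatives(frame):
    """Build a Styler that renders negative values in red (hypothetical helper)."""
    # Styler.apply runs the function column by column (axis=0 is the default)
    return frame.style.apply(color_negative_red)
```

In a notebook, evaluating `style_negatives(df)` on the DataFrame built above should render the table with the negative cells in red. Rendering a `Styler` requires the `jinja2` package to be installed.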
Highlight columns by name
Can you highlight columns based on the column name? Sample color map:
{'A':'cyan', 'B':'lightblue', 'C':'lightyellow', 'D':'salmon', 'E':'lightgreen'}
  | A | B | C | D | E |
---|---|---|---|---|---|
0 | 1.000000 | 1.329212 | nan | -0.316280 | -0.990810 |
1 | 2.000000 | -1.070816 | -1.438713 | 0.564417 | 0.295722 |
2 | 3.000000 | -1.626404 | 0.219565 | 0.678805 | 1.889273 |
3 | 4.000000 | 0.961538 | 0.104011 | nan | 0.850229 |
4 | 5.000000 | 1.453425 | 1.057737 | 0.165562 | 0.515018 |
5 | 6.000000 | -1.336936 | 0.562861 | 1.392855 | -0.063328 |
6 | 7.000000 | 0.121668 | 1.207603 | -0.002040 | 1.627796 |
7 | 8.000000 | 0.354493 | 1.037528 | -0.385684 | 0.519818 |
8 | 9.000000 | 1.686583 | -1.325963 | 1.428984 | -2.089354 |
9 | 10.000000 | -0.129820 | 0.631523 | -0.586538 | 0.290720 |
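A sketch of one way to do this, assuming the color should be applied as a cell background: when `Styler.apply` works column by column, each column arrives as a Series whose `.name` is the column label, so the label can be looked up in the color map. The names `COLUMN_COLORS`, `color_by_column_name` and `style_columns` are hypothetical.

```python
import pandas as pd

# Sample color map taken from the challenge text
COLUMN_COLORS = {'A': 'cyan', 'B': 'lightblue', 'C': 'lightyellow',
                 'D': 'salmon', 'E': 'lightgreen'}


def color_by_column_name(column, colors=COLUMN_COLORS):
    """Return the same background CSS for every cell of the column."""
    # Columns missing from the map get an empty style string
    color = colors.get(column.name)
    css = f'background-color: {color}' if color else ''
    return [css] * len(column)


def style_columns(frame, colors=COLUMN_COLORS):
    """Build a Styler that colors each column by its name (hypothetical helper)."""
    # Extra keyword arguments to Styler.apply are forwarded to the function
    return frame.style.apply(color_by_column_name, colors=colors)
```

Calling `style_columns(df)` in a notebook should give every column its own background color. Columns that are not in the map stay unstyled.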
New column: sum of A and B, styled to show the percentage split
Can you add a new column that holds the sum of A and B?
Then style it to show how the total splits between the values from columns A and B, based on their percentages, as shown below:
  | A | B | B/A % | total |
---|---|---|---|---|
0 | 12 | 3 | 25.000000 | 15 |
1 | 6 | 4 | 66.666667 | 10 |
2 | 8 | 1 | 12.500000 | 9 |
3 | 15 | 7 | 46.666667 | 22 |
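The sketch below is one interpretation, not a definitive solution. It rebuilds the four rows of the table, computes `B/A %` as `B / A * 100` (which matches the values shown) and `total` as `A + B`. For the styling, it assumes the split is drawn as a two-color CSS gradient in the `total` cell, with the boundary placed at A's share of the total. The colors, the gradient idea and the names `df2` and `split_gradient` are assumptions made for illustration.

```python
import pandas as pd

# Values copied from the table above
df2 = pd.DataFrame({'A': [12, 6, 8, 15], 'B': [3, 4, 1, 7]})
df2['B/A %'] = df2['B'] / df2['A'] * 100
df2['total'] = df2['A'] + df2['B']


def split_gradient(frame, first_color='lightblue', second_color='salmon'):
    """Return a frame of CSS strings for Styler.apply(..., axis=None)."""
    # Start with no styling anywhere
    styles = pd.DataFrame('', index=frame.index, columns=frame.columns)
    # Share of A in the total, in percent (assumed meaning of "split")
    share_a = (frame['A'] / frame['total'] * 100).round(1)
    # Hard color stop at A's share: left part for A, right part for B
    styles['total'] = [
        f'background: linear-gradient(90deg, {first_color} {p}%, {second_color} {p}%)'
        for p in share_a
    ]
    return styles


styles = split_gradient(df2)
```

In a notebook, `df2.style.apply(split_gradient, axis=None)` should render the `total` column with a bar split in proportion to A and B. With `axis=None`, `Styler.apply` passes the whole DataFrame to the function and expects a DataFrame of CSS strings with the same index and columns.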