We can use the following syntax to margin on a single axis column or row in Pandas:
(1) Margin only on rows
df.pivot_table(index='foo', columns='bar', values='baz', margins=True).iloc[:, :-1]
(2) Margin only on columns
df.pivot_table(index='foo', columns='bar', values='baz', margins=True).iloc[:-1, :]
In the examples above we do margin on both axes but return only a given total by removing results with function .iloc[:-1, :]
.
Example
Suppose we have a DataFrame like:
import pandas as pd
df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two', 'two'],
'bar': ['A', 'B', 'B', 'A', 'B', 'C'],
'baz': [1, 2, 3, 4, 5, 6],
'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
with data:
foo | bar | baz | zoo | |
---|---|---|---|---|
0 | one | A | 1 | x |
1 | one | B | 2 | y |
2 | one | B | 3 | z |
3 | two | A | 4 | q |
4 | two | B | 5 | w |
To get totals or margins per column or rows we need to pass argument - margins=True
:
df.pivot_table(index='foo', columns='bar', values='baz', margins=True)
result:
bar | A | B | C | All |
---|---|---|---|---|
foo | ||||
one | 1.0 | 2.0 | 3.0 | 2.0 |
two | 4.0 | 5.0 | 6.0 | 5.0 |
All | 2.5 | 3.5 | 4.5 | 3.5 |
1. Margin only on rows
To margin only on rows we can do:
df.pivot_table(index='foo', columns='bar', values='baz', margins=True).iloc[:, :-1]
Which results into new row with totals:
bar | A | B | C |
---|---|---|---|
foo | |||
one | 1.0 | 2.0 | 3.0 |
two | 4.0 | 5.0 | 6.0 |
All | 2.5 | 3.5 | 4.5 |
2. Margin only on columns
If we prefer to get totals only on column level we can use:
df.pivot_table(index='foo', columns='bar', values='baz', margins=True).iloc[:, :-1]
Which results into total as a new column:
bar | A | B | C | All |
---|---|---|---|---|
foo | ||||
one | 1.0 | 2.0 | 3.0 | 2.0 |
two | 4.0 | 5.0 | 6.0 | 5.0 |
Conclusion
By default we can only control whether or not to have total columns by passing margins
:
If margins=True, special All columns and rows will be added with partial group aggregates across the categories on the rows and columns.
We can also change the name of the Total columns by: margins_name
:
Name of the row / column that will contain the totals when margins is True.