If you're working with textual data in a Pandas DataFrame and want to find all words written in uppercase, there are several simple ways to do it using Python.

A "capital word" here means a word where every letter is uppercase (like JAVA or PYTHON).

This is useful when cleaning data, detecting acronyms, or filtering for entries that stand out in text data.

Example DataFrame

Here’s a sample DataFrame we’ll work with:

import pandas as pd

data = {
    'country': ['JAPAN', 'INDIA', 'CHINA', 'FRANCE'],
    'users': [30, 15, 25, 3],
    'city': ['TOKYO', 'Delhi', 'Beijing', 'PARIS']
}

df = pd.DataFrame(data)
df

This results in:

   country  users     city
0    JAPAN     30    TOKYO
1    INDIA     15    Delhi
2    CHINA     25  Beijing
3   FRANCE      3    PARIS

1. Use str.isupper() for Simple Matching

The easiest way is to convert each element to a string and check if it is uppercase:

import pandas as pd

caps = []

for col in df.columns:
    for val in df[col]:
        if str(val).isupper():
            caps.append(val)

print(caps)

Output:

['JAPAN', 'INDIA', 'CHINA', 'FRANCE', 'TOKYO', 'PARIS']

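This method checks whether the string version of each cell is all uppercase. Numeric cells such as 30 are skipped because str.isupper() returns False when a string contains no letters at all.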
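How str.isupper() handles mixed content is easy to verify with a short sketch. The sample values below are made up for illustration and are not part of the DataFrame above. They show that digits and spaces are ignored as long as at least one letter is present:

```python
# Hypothetical sample values chosen to illustrate str.isupper() edge cases
samples = ['JAVA', 'Java', 'ABC123', 'NEW YORK', '123', '']

# isupper() is True when every cased character is uppercase
# and the string contains at least one cased character
results = {s: s.isupper() for s in samples}

for value, flag in results.items():
    print(repr(value), flag)
```

Here 'ABC123' and 'NEW YORK' both count as uppercase, while '123' and the empty string do not. If values like these should be excluded, a stricter test such as a regex may be a better fit.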

2. Use a Regex Pattern

If you prefer regular expressions, you can match uppercase words using a pattern like r'^[A-Z]+$':

import re

caps = []

for col in df.columns:
    for val in df[col]:
        if re.match(r'^[A-Z]+$', str(val)):
            caps.append(val)

print(caps)

Output:

['JAPAN', 'INDIA', 'CHINA', 'FRANCE', 'TOKYO', 'PARIS']

This matches values made up only of the uppercase letters A to Z, so numbers, mixed-case strings, and values containing digits or spaces are excluded.
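To make the difference from method 1 concrete, the sketch below runs both checks on a few hypothetical values (not taken from the DataFrame above). It uses re.fullmatch, which requires the whole string to match, so the ^ and $ anchors are not needed:

```python
import re

# Hypothetical sample values, not part of the article's DataFrame
samples = ['JAVA', 'ABC123', 'NEW YORK', 'Java']

# Pattern for strings made only of ASCII uppercase letters
pattern = re.compile(r'[A-Z]+')

# Regex check: the entire string must consist of A-Z
letters_only = [s for s in samples if pattern.fullmatch(s)]

# isupper() check: non-letter characters are ignored
isupper_hits = [s for s in samples if s.isupper()]

print(letters_only)   # ['JAVA']
print(isupper_hits)   # ['JAVA', 'ABC123', 'NEW YORK']
```

Note that [A-Z] covers ASCII letters only, so an accented uppercase word would pass isupper() but not this pattern.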

3. Apply Across Entire DataFrame

You can also use DataFrame.map() (available from pandas 2.1; older versions call it applymap()) to test every cell at once and collect the uppercase words:

caps = df.map(lambda x: str(x).isupper())

uppercase_values = df[caps].stack().tolist()
print(uppercase_values)

This returns the same uppercase words, ordered row by row rather than column by column:

['JAPAN', 'TOKYO', 'INDIA', 'CHINA', 'FRANCE', 'PARIS']
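The same idea can be written so that it runs on both older and newer pandas releases. The following is a minimal sketch, assuming the sample DataFrame from the start of the article. The helper name is_upper is made up for illustration, and the explicit dropna() is a precaution in case stack() keeps the NaN placeholders left by the mask:

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['JAPAN', 'INDIA', 'CHINA', 'FRANCE'],
    'users': [30, 15, 25, 3],
    'city': ['TOKYO', 'Delhi', 'Beijing', 'PARIS'],
})

def is_upper(value):
    """Hypothetical helper: True if the cell's string form is uppercase."""
    return str(value).isupper()

# DataFrame.map was added in pandas 2.1; older releases use applymap
elementwise = df.map if hasattr(df, 'map') else df.applymap
mask = elementwise(is_upper)

# Cells where the mask is False become NaN; dropna() removes them
# explicitly instead of relying on stack() to do it
uppercase_values = df[mask].stack().dropna().tolist()
print(uppercase_values)
```

Selecting with a boolean DataFrame keeps the original shape and fills non-matching cells with NaN, which is why the values are stacked into a single Series before being converted to a list.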

4. Extract Words Starting with Capital Letters

We can also extract words that start with a capital letter and continue in lowercase, using a regex:

import re

caps = []

for col in df.columns:
    for val in df[col]:
        if re.match(r'^[A-Z][a-z]+$', str(val)):
            caps.append(val)

print(caps)

Output:

['Delhi', 'Beijing']
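The patterns above test a whole cell at a time. When a cell holds several words, one possible approach is Series.str.findall with word boundaries (\b), which returns every matching word per row. The sketch below uses a hypothetical free-text column that is not part of the sample DataFrame:

```python
import pandas as pd

# Hypothetical free-text column, not part of the article's DataFrame
notes = pd.Series([
    'Visit TOKYO and Osaka',
    'NASA partners with ESA',
    'no caps here',
])

# \b marks word boundaries, so each match is a whole word
upper_words = notes.str.findall(r'\b[A-Z]+\b')        # all-uppercase words
title_words = notes.str.findall(r'\b[A-Z][a-z]+\b')   # capitalized words

print(upper_words.tolist())   # [['TOKYO'], ['NASA', 'ESA'], []]
print(title_words.tolist())   # [['Visit', 'Osaka'], [], []]
```

Each row yields a list, which is empty when nothing matches. Single-letter words such as "A" or "I" would also match the first pattern; using [A-Z]{2,} instead is one way to skip them.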

Resources