To validate IP addresses in a Pandas DataFrame, we can use:
- the `pd.Series.apply()` method
- a custom function or a regex
Here are the 2 ways to validate IP addresses in Pandas:
(1) validate with regex
(2) custom validation function
Suppose we work with a DataFrame that has a single column `ip` holding three addresses: 192.168.0.1, 192.256.0.1 and 192.168.0.2.
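A minimal sketch of building such a sample DataFrame. The column name `ip` and the three values are taken from the outputs shown later in this post; the variable name `df` is an assumption.

```python
import pandas as pd

# Sample data: the second address has an out-of-range octet (256).
df = pd.DataFrame({"ip": ["192.168.0.1", "192.256.0.1", "192.168.0.2"]})

print(df)
```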
validate with regex
To validate IP addresses with regex, we are free to choose how strict the validation should be:
- strict regex for IP validation (each octet must be in the range 0-255):
r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
- basic IP validation (checks only the format of four dot-separated groups of 1 to 3 digits):
r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"
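In Pandas, the validation is applied with the `str.contains` method on the `ip` column, passing the regex as the pattern. Because both patterns are anchored with `^` and `$`, the whole value must match.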
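The step above can be sketched as follows. This is an illustration, not necessarily the original code: the names `basic`, `strict` and `octet` are assumptions, and the strict pattern is written with non-capturing groups `(?:...)` because `str.contains` emits a warning when the pattern contains capturing groups.

```python
import pandas as pd

df = pd.DataFrame({"ip": ["192.168.0.1", "192.256.0.1", "192.168.0.2"]})

# Basic pattern: four dot-separated groups of 1-3 digits, no range check.
basic = r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"

# Strict pattern: every octet must be in the range 0-255.
octet = r"(?:[0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])"
strict = r"^(?:" + octet + r"\.){3}" + octet + r"$"

# str.contains returns a Boolean Series, one value per row.
basic_ok = df["ip"].str.contains(basic, regex=True)
strict_ok = df["ip"].str.contains(strict, regex=True)

print(basic_ok)
print(strict_ok)
```

The basic pattern accepts 192.256.0.1 because it only looks at the shape of the string, while the strict pattern rejects it.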
With the basic validation we get:
0 True
1 True
2 True
Name: ip, dtype: bool
Using the strict validation we get the correct result, because 192.256.0.1 contains an octet above 255:
0 True
1 False
2 True
Name: ip, dtype: bool
validate with custom function
Alternatively we can use a custom function to validate IP addresses in Pandas.
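One possible version of such a function is sketched below. The name `is_valid_ipv4` and its exact rules are assumptions: it splits on dots and checks that each part is a plain number from 0 to 255 without leading zeros, which mirrors the strict regex.

```python
import pandas as pd


def is_valid_ipv4(value):
    """Return True if value looks like a dotted-quad IPv4 address (octets 0-255)."""
    parts = str(value).split(".")
    if len(parts) != 4:
        return False
    # Each part must be ASCII digits, have no leading zeros, and be at most 255.
    return all(
        p.isascii() and p.isdigit() and str(int(p)) == p and int(p) <= 255
        for p in parts
    )


df = pd.DataFrame({"ip": ["192.168.0.1", "192.256.0.1", "192.168.0.2"]})

# Apply the function to every value of the 'ip' column.
df["valid_ip"] = df["ip"].apply(is_valid_ipv4)

print(df)
```

A plain function is easier to extend than a regex, for example to reject reserved ranges or to log why a value failed.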
Applying the function to the `ip` column with `apply()` and assigning the result creates a new column `valid_ip` in the DataFrame. This Boolean column indicates whether each IP address is valid:
| | ip | valid_ip |
|---|---|---|
| 0 | 192.168.0.1 | True |
| 1 | 192.256.0.1 | False |
| 2 | 192.168.0.2 | True |