To validate IP addresses in a Pandas DataFrame, we can use:
- the `pd.Series.apply()` method
- a custom function or a regex
Here are two ways to validate IP addresses in Pandas:
(1) validate with regex
df['ip'].str.contains(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$")
(2) custom validation function
def validate_ip(ip):
    try:
        parts = ip.split('.')
        # exactly four parts, each an integer between 0 and 255
        return len(parts) == 4 and all(0 <= int(part) < 256 for part in parts)
    except (ValueError, AttributeError, TypeError):
        # non-numeric part, or a value that is not a string (e.g. NaN)
        return False
df['valid_ip'] = df['ip'].apply(validate_ip)
Suppose we work with a small sample DataFrame like:
import pandas as pd
data = {'ip': ['192.168.0.1', '192.256.0.1', '192.168.0.2']}
df = pd.DataFrame(data)
validate with regex
To validate IP addresses with regex, we can choose how strict the validation should be:
- strict regex for IP validation (each part must be in the range 0-255):
r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
- basic IP validation (four groups of 1 to 3 digits):
r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"
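The difference between the two patterns can be checked outside Pandas with Python's built-in `re` module. This is only a minimal sketch: the names `BASIC`, `STRICT`, `samples` and `results` are made up for the illustration, and the sample strings are not taken from the DataFrame used in this article.

```python
import re

# The two patterns from the list above, compiled once.
BASIC = re.compile(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$")
STRICT = re.compile(
    r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}"
    r"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
)

# Hypothetical sample values, chosen to show where the patterns disagree.
samples = ["192.168.0.1", "192.256.0.1", "999.999.999.999", "1.2.3"]

# Map each sample to (passes basic, passes strict).
results = {s: (bool(BASIC.search(s)), bool(STRICT.search(s))) for s in samples}

for value, (basic_ok, strict_ok) in results.items():
    print(f"{value:<16} basic={basic_ok} strict={strict_ok}")
```

The basic pattern only counts digits, so it accepts values such as 192.256.0.1, while the strict pattern also checks that every part is between 0 and 255.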
In Pandas, the validation is applied with the method str.contains
by passing the regex:
df['ip'].str.contains(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$")
With the basic validation every row passes, including the invalid 192.256.0.1, because the pattern only checks the number of digits:
0 True
1 True
2 True
Name: ip, dtype: bool
Using the strict validation we get the correct result:
regex = r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
df['ip'].str.contains(regex)
result:
0 True
1 False
2 True
Name: ip, dtype: bool
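In practice the column may contain missing values, and the Boolean result is usually used to filter rows. The sketch below assumes a DataFrame with one missing entry (not part of the sample data above) and uses the `na=False` argument of `str.contains` so that missing values count as invalid. The names `STRICT_IP`, `mask` and `valid_rows` are illustrative only.

```python
import pandas as pd

# Strict pattern from above, written as a raw string.
STRICT_IP = (
    r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}"
    r"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
)

# Assumed sample data: same addresses as before plus one missing value.
df = pd.DataFrame({'ip': ['192.168.0.1', '192.256.0.1', None, '192.168.0.2']})

# na=False turns the missing entry into False instead of a missing result,
# so the mask is purely Boolean and can be used for row selection.
mask = df['ip'].str.contains(STRICT_IP, na=False)

# Keep only the rows whose address passed the strict check.
valid_rows = df[mask]
print(valid_rows)
```

Without `na=False` the missing entry would stay missing in the result, which typically has to be handled before the mask can be used for filtering.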
validate with custom function
Alternatively, we can use a custom function to validate IP addresses in Pandas.
This creates a new Boolean column 'valid_ip' in the DataFrame that indicates whether each IP address is valid or not:
def validate_ip(ip):
    try:
        parts = ip.split('.')
        # exactly four parts, each an integer between 0 and 255
        return len(parts) == 4 and all(0 <= int(part) < 256 for part in parts)
    except (ValueError, AttributeError, TypeError):
        # non-numeric part, or a value that is not a string (e.g. NaN)
        return False
df['valid_ip'] = df['ip'].apply(validate_ip)
| | ip | valid_ip |
|---|---|---|
| 0 | 192.168.0.1 | True |
| 1 | 192.256.0.1 | False |
| 2 | 192.168.0.2 | True |
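As a variation on the custom function, the parsing can be delegated to Python's standard `ipaddress` module instead of splitting the string by hand. This is a sketch of an alternative, not the approach used above: the helper name `is_valid_ipv4` and the extra `None` row are assumptions added for the illustration.

```python
import ipaddress

import pandas as pd


def is_valid_ipv4(value):
    """Return True if value is a string holding a valid IPv4 address."""
    # ipaddress also accepts integers, so anything that is not a string
    # (numbers, None, NaN) is rejected up front.
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        # AddressValueError is a subclass of ValueError
        return False
    return True


# Assumed sample data: the addresses from this article plus one missing value.
df = pd.DataFrame({'ip': ['192.168.0.1', '192.256.0.1', '192.168.0.2', None]})
df['valid_ip'] = df['ip'].apply(is_valid_ipv4)
print(df)
```

The standard library parser may be stricter than the hand-written check in some edge cases (for example signs or surrounding whitespace in a part), so the two functions are not guaranteed to agree on every input.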