To extract everything up to a new line in a Pandas column, we can use one of two regex patterns:
- '(.*?)\n'
- '.+?(?=\n)'
Steps to extract everything until/after
Below are the steps I usually follow for regex extraction in Pandas:
- analyse the data from which I will extract
- clean the data
- choose a pandas method - split, extract, etc.
- define regex pattern
- create new column(s)
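The steps above can be sketched roughly as follows. This is a minimal illustration, not a fixed recipe: the two sample rows, the `has_newline` check, and the `street` column name are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data with stray whitespace around the values
df = pd.DataFrame({'address': [
    ' 778 Brown Plaza\nNorth Jenniferfurt, VT 88077 ',
    'PSC 4115, Box 7815\nAPO AA 41945',
]})

# 1. analyse: check that every row contains the new line delimiter
has_newline = df['address'].str.contains('\n')

# 2. clean: remove leading and trailing whitespace
df['address'] = df['address'].str.strip()

# 3-4. choose the method (str.extract) and define the regex pattern
pattern = r'(.*?)\n'

# 5. create a new column; expand=False returns a Series
df['street'] = df['address'].str.extract(pattern, expand=False)
```

Checking the delimiter first helps explain any `NaN` values later, since `str.extract` returns `NaN` for rows where the pattern does not match.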
Data
Let's create a simple sample DataFrame to be used for regex extraction:
from faker import Faker
import pandas as pd
Faker.seed(0)
fake = Faker()
addr = []
for _ in range(5):
    addr.append(fake.address())
df = pd.DataFrame({'address':addr})
| | address |
|---|---|
| 0 | 48764 Howard Forge Apt. 421\nVanessaside, VT 79393 | 
| 1 | PSC 4115, Box 7815\nAPO AA 41945 | 
| 2 | 778 Brown Plaza\nNorth Jenniferfurt, VT 88077 | 
| 3 | 3513 John Divide Suite 115\nRodriguezside, LA 93111 | 
| 4 | 398 Wallace Ranch Suite 593\nIvanburgh, AZ 80818 | 
Example 1 - Capturing group and characters
Extract everything in Pandas column up to new line
df['address'].str.extract('(.*?)\n')
result:
| | 0 |
|---|---|
| 0 | 48764 Howard Forge Apt. 421 | 
| 1 | PSC 4115, Box 7815 | 
| 2 | 778 Brown Plaza | 
| 3 | 3513 John Divide Suite 115 | 
| 4 | 398 Wallace Ranch Suite 593 | 
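The same idea can be turned around to extract everything after the new line. The sketch below uses two addresses copied from the sample table; the group names `street` and `city_line` are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Two addresses copied from the sample table above
df = pd.DataFrame({'address': [
    '48764 Howard Forge Apt. 421\nVanessaside, VT 79393',
    'PSC 4115, Box 7815\nAPO AA 41945',
]})

# Everything after the new line: the capturing group starts right after \n
after = df['address'].str.extract(r'\n(.*)', expand=False)

# Both parts at once; named groups become the column names
# (street and city_line are hypothetical names for this sketch)
parts = df['address'].str.extract(r'(?P<street>.*?)\n(?P<city_line>.*)')
```

Named groups are convenient here because `str.extract` uses them as column names, so no renaming step is needed afterwards.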
Example 2 - Non-capturing lookahead
Extract everything in Pandas column up to new line
df['address'].str.extract('(.+?)(?=\n)')
result:
| | 0 |
|---|---|
| 0 | 48764 Howard Forge Apt. 421 | 
| 1 | PSC 4115, Box 7815 | 
| 2 | 778 Brown Plaza | 
| 3 | 3513 John Divide Suite 115 | 
| 4 | 398 Wallace Ranch Suite 593 | 
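When the delimiter is a single fixed character, `str.split` (the other method mentioned in the steps) may be a simpler alternative to a regex. A minimal sketch, assuming each address contains at least one new line; the column names `before` and `after` are made up for the example.

```python
import pandas as pd

# Two addresses copied from the sample table above
df = pd.DataFrame({'address': [
    '778 Brown Plaza\nNorth Jenniferfurt, VT 88077',
    '3513 John Divide Suite 115\nRodriguezside, LA 93111',
]})

# Split on the first new line only; expand=True returns a DataFrame
parts = df['address'].str.split('\n', n=1, expand=True)

# Create new columns from the two parts
df['before'] = parts[0]
df['after'] = parts[1]
```

Using `n=1` keeps any further new lines inside the second part instead of producing extra columns.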