If you're seeing the warning:

FutureWarning: Passing literal html to 'read_html' is deprecated and will be removed in a future version

this means you're using pandas' read_html() function in a way that will soon be unsupported.

Why the Change?

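Since pandas 2.1.0, passing a literal HTML string directly to read_html() is deprecated. A plain string argument is meant to be a file path or URL, so letting it also carry raw markup forces pandas to guess which one you meant. Literal HTML should instead be wrapped in a file-like object such as io.StringIO.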

How to Fix It

Instead of:

pd.read_html("<table>...</table>")

which raises a warning today and will raise an error in a future version:

FutureWarning: Passing literal html to 'read_html' is deprecated and will be removed in a future version

You should now use:

from io import StringIO
pd.read_html(StringIO("<table>...</table>"))

Or for HTML files:

pd.read_html("path/to/file.html")  # This is still valid

Use StringIO for HTML strings

import pandas as pd
from io import StringIO

html_data = """
<table>
    <tr><th>name</th><th>age</th></tr>
    <tr><td>Alice</td><td>25</td></tr>
    <tr><td>Bob</td><td>30</td></tr>
</table>
"""

df = pd.read_html(StringIO(html_data))[0]
print(df)

Result:

    name  age
0  Alice   25
1    Bob   30
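
If literal HTML can reach your code as either str or bytes, a small wrapper keeps the read_html() call sites uniform. The sketch below is only an illustration: wrap_literal_html and read_literal_html are hypothetical helper names, not part of pandas, and pandas is imported lazily so the wrapper itself depends only on the standard library.

```python
from io import BytesIO, StringIO


def wrap_literal_html(html):
    """Wrap literal HTML (str or bytes) in a file-like object.

    Hypothetical helper for illustration; not a pandas function.
    """
    if isinstance(html, bytes):
        return BytesIO(html)
    if isinstance(html, str):
        return StringIO(html)
    raise TypeError(f"expected str or bytes, got {type(html).__name__}")


def read_literal_html(html, **kwargs):
    """Parse literal HTML and return the list of DataFrames from read_html()."""
    import pandas as pd  # imported here so the wrapper above has no pandas dependency

    return pd.read_html(wrap_literal_html(html), **kwargs)
```

With this in place, read_literal_html(html_data)[0] should give the same DataFrame as the StringIO example above.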

Use requests for web content

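from io import StringIO

import pandas as pd
import requests

response = requests.get('https://example.com/data.html')
response.raise_for_status()  # fail on HTTP errors rather than parsing an error page
tables = pd.read_html(StringIO(response.text))  # a list with one DataFrame per <table>
df = tables[0]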

Why This Matters

Making this change now will:

  1. Future-proof your code
  2. Remove annoying warning messages
  3. Ensure compatibility with upcoming pandas versions
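
To track down which calls still trigger the warning, one option is to record warnings around a call with the standard warnings module. The following is a minimal sketch: future_warnings_from is a hypothetical helper name, and legacy_call merely stands in for a deprecated read_html() call so that the snippet runs without pandas installed.

```python
import warnings


def future_warnings_from(func, *args, **kwargs):
    """Call func and return the messages of any FutureWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is suppressed
        func(*args, **kwargs)
    return [str(w.message) for w in caught if issubclass(w.category, FutureWarning)]


def legacy_call(html):
    # Stand-in for a call that passes literal HTML; it emits the same kind of warning.
    warnings.warn("Passing literal html to 'read_html' is deprecated", FutureWarning)
    return html


messages = future_warnings_from(legacy_call, "<table>...</table>")
print(messages)
```

In a test suite, a similar effect can be had with warnings.simplefilter("error", FutureWarning), which turns any remaining deprecated call into a failure.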

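As for why pandas made the change, security is one consideration: when a single string argument may be either markup or a location, untrusted input can potentially be interpreted as a file path or URL rather than as HTML. By reserving plain strings for file paths and URLs and requiring literal HTML to be wrapped in a file-like object, pandas makes the caller's intent explicit and encourages safer data handling.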

API clarity is another driving factor. Having a single function that accepts multiple input types can lead to confusion about expected behavior and error handling. Separating these concerns makes the API more predictable and easier to maintain.