[Bug]: The LLM strategy always sends all tokens from the URL to the LLM server, even when the URL input is raw HTML content

### crawl4ai version

0.6.3

### Expected Behavior

My example crawler:

```python
llm_strategy = LLMExtractionStrategy(
    llm_config=self.llm_config,
    schema=PdfDoc.model_json_schema(),
    extraction_type="schema",
    instruction="""
    From the crawled content, extract data from html
    - data in html source include pdf file name and href url under data-pdf style attribute
    - One extracted post JSON format should look like this:

        {
            "name": "Volume 1 - 2024 CMD Rate Case - E-filing.pdf",
            "data_pdf": "/DMS/pdfview/``wportal`Documents`DMS`31`2494`/Volume 1 - 2024 CMD Rate Case - E-filing~pdf",
        }
    """,
    input_format="cleaned_html",
)

run_conf = CrawlerRunConfig(
    extraction_strategy=llm_strategy,
    cache_mode=CacheMode.DISABLED,
    target_elements=[
        ".btnOpenPdfFile",
        ".btn",
        ".btn-default",
        ".btn-sm",
    ],
    excluded_tags=[
        "header",
        "footer",
        "nav",
        "meta",
        "script",
        "style",
        "iframe",
        "li",
        "ul",
    ],
    prettiify=True,
)


resp = requests.get(url)
async with AsyncWebCrawler(config=self.browser_conf) as crawler:
    result = await crawler.arun(url=f"raw://{resp.text}", config=run_conf)
    assert isinstance(result, CrawlResultContainer)
```
Expected: only the `cleaned_html` content should be sent to the LLM server.

### Current Behavior

The LLM extraction strategy sends both the URL and the HTML content to the LLM server. I think we don't handle the case where the `url` input is raw HTML (a `raw://` URL). As a result, the full raw HTML passed as `url` is always sent to the LLM server, in addition to the `cleaned_html`. This is likely unintended behavior, and it can waste the token quota.
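
To illustrate the kind of guard that could avoid this, here is a minimal sketch. It assumes the prompt is built by substituting the URL and the content into a template; `prompt_safe_url`, `build_prompt`, and the template text are hypothetical names for illustration, not crawl4ai's actual internals.

```python
# Hypothetical sketch: keep inline HTML out of the "URL" slot of an LLM prompt.
# None of these names come from crawl4ai; they only illustrate the idea.

RAW_PREFIXES = ("raw://", "raw:")  # assumed prefixes that mark inline HTML input


def prompt_safe_url(url: str, placeholder: str = "Raw HTML") -> str:
    """Return a short placeholder when the 'URL' actually carries inline HTML."""
    if url.startswith(RAW_PREFIXES):
        return placeholder
    return url


def build_prompt(url: str, content: str) -> str:
    """Assumed prompt template: only the sanitized URL and the content go in."""
    return f"URL: {prompt_safe_url(url)}\nCONTENT:\n{content}"


# A large raw page versus the small cleaned fragment we actually want to send.
html = "<html><body>" + "<p>filler</p>" * 1000 + "</body></html>"
cleaned = '<a class="btnOpenPdfFile" data-pdf="/DMS/pdfview/example~pdf">example.pdf</a>'

prompt = build_prompt(f"raw://{html}", cleaned)
```

With a guard like this, the prompt size depends only on the cleaned content, not on the size of the raw page passed through `url`.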

### Is this reproducible?

Yes

### Inputs Causing the Bug

```bash

```

### Steps to Reproduce

```bash

```

### Code snippets

```python

```

### OS

Linux

### Python version

3.12.3

### Browser

_No response_

### Browser version

_No response_

### Error logs & Screenshots (if applicable)

_No response_

[Bug]: The LLM strategy always sends all tokens from the URL to the LLM server even the URL input is HTML content #1178
