Warcserver - Performance issues / warmup effects / caching #942

@msvensson222

Hey!

Really appreciate this project & repo, been super fun to work with!

Now, I've got a question I can't seem to figure out the answer to, so I've come here in the hope that some kind soul can help me out :)

Background and problem

I have ~750k WARC records locally (25 files, ~1GB each), with corresponding .cdxj files (one per, so also 25 files).

I start the warcserver locally on my MacBook Pro like this: warcserver -t 10
Now, if I sequentially perform a lot of requests like:

endpoint = "http://localhost:8070/"
full_request_url = f"{endpoint}my-coll/resource?url={url}"

with random URLs, it takes around 700 requests before the average response time stabilizes (see the attached image below).
Not sure if it's relevant, but I also see a lot of "Dir collections/my-col/indexes/ unchanged" messages in the warcserver logs between the requests.
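
For anyone who wants to reproduce the measurement, here is a minimal sketch of such a timing loop. It assumes the endpoint and collection name shown above; the helper names (build_request_url, fetch, time_requests, rolling_mean), the URL-encoding of the url parameter, and the window size are illustrative choices, not part of pywb.

```python
import time
from typing import Callable, Iterable, List
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

ENDPOINT = "http://localhost:8070/"  # warcserver address from the post
COLLECTION = "my-coll"               # collection name from the post


def build_request_url(url: str) -> str:
    """Build the resource URL; percent-encoding the target is an assumption."""
    return f"{ENDPOINT}{COLLECTION}/resource?url={quote(url, safe='')}"


def fetch(request_url: str) -> int:
    """Fetch one record and return the body size (0 on an HTTP error status)."""
    try:
        with urlopen(request_url) as resp:
            return len(resp.read())
    except HTTPError:
        return 0


def time_requests(urls: Iterable[str],
                  fetch_fn: Callable[[str], int] = fetch) -> List[float]:
    """Issue the requests sequentially and record each latency in seconds."""
    latencies = []
    for url in urls:
        start = time.perf_counter()
        fetch_fn(build_request_url(url))
        latencies.append(time.perf_counter() - start)
    return latencies


def rolling_mean(values: List[float], window: int) -> List[float]:
    """Mean over the last `window` values (fewer at the start of the run)."""
    means, total = [], 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]
        means.append(total / min(i + 1, window))
    return means
```

Plotting rolling_mean(time_requests(urls), 50) against the request index should make a warmup period visible; running the same loop a second time without restarting the server is one way to check whether the slowdown is tied to the first accesses.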

My questions

  1. Why is this? Is there some type of caching going on? I've searched the entire pywb docs but can't seem to find anything relating to caching.
  2. Can I somehow "avoid" this warmup period?

Thanks in advance!

[Attached image: average response time versus number of requests]
