Real-world testing initialization using the EntityEvent approach #120
Replies: 2 comments
-
Thanks for looking into this, Bryan. Some of the key-fetch results are similar to what we found when we were researching Web API Core and replication in 2018, when we first started discussing it and EntityEvent.
Current Test Results
In terms of average next link times, we haven't sampled them for DD 2.0 (and it's not guaranteed to work in all cases currently because we haven't certified people on it), so we don't have data for it yet. Most don't pass on the first try. Also, most people aren't respecting the header value that sets a larger page size at the moment, so it may not be as efficient now as it will be in 6-12 months. It will be interesting to retest this against a larger sample of providers once they've all been certified for DD 2.0.
However, we do have extensive stats on how fast it is to fetch using modification timestamp queries (which many still use because nextLink may not be generally available). It's roughly 12s per 1,000 records across all resources, standard or local, with a page size of 100 records per page and a sample size of 661 markets and up to 100,000 records per resource. This means it would take roughly 120s for 10,000 records, on average. Many support more than 100 records per page using this approach. YMMV.
The Property Resource on its own is roughly 22s / 1,000 records, meaning 220s for 10,000. This seems faster than the average of your results above? I'd be curious if you have a similar result locally.
As a side note, we don't test concurrent requests since their behavior varies widely between providers, and when we've tried in testing we've been rate limited a lot or they didn't support them. Everything is done sequentially, one page at a time per resource, for a more "apples to apples" comparison.
EntityEvent Goals
While performance is one aspect of the EntityEvent proposal, it is not the only consideration. Some others are:
It's great if it can be as fast as possible, and we probably wouldn't want things to be much slower than they are now for the items above alone. However, there are some limitations:
-
Thank you for doing this detailed analysis, Bryan! This does help build the case for EntityEvent and its effectiveness for replication. Nicely done with the headers by the way, it was really handy to tell which of those were our systems. I can tell you that the
-
At the 2024 RESO Winter Dev conference (March 5–6, 2024) there was discussion on the best ways for consumers to "initialize" against a producer – the best way for consumers to get all of the data they need before they get to the point where they can "keep up" with new updates.
The general consensus at the meeting was that using the new EntityEvent standard, and then fetching records directly by key, would be the preferred way to initialize and would offer a significant speedup over the current best practice of following @odata.nextLink.
Because we also talked about using evidence to make decisions several times, I thought it would be beneficial to try out these ideas and see what results I could come up with. SCIENCE!
tl;dr
Methodology
I tested five different MLSes across (I think) four different providers that Zenlist has access to.
I initialized between 10,000 and 40,000 properties for each MLS (depending on a variety of factors), first using the next link approach, and then using the entity event approach, and recorded information.
As far as I am aware, none of these MLSes support the very new EntityEvent standard yet. However, it's still possible to test the entity event approach because EntityEvent provides a list of keys to fetch, so it's easy to simulate as long as the test knows the list of keys in advance. Since I did the next link test first, I had the list of keys in advance.
In this test, I have not factored in the time it would take to traverse the entity event history; I assume that traversing that will not significantly add to the time.
When using the entity event approach, for four MLSes I generated a request whose filter used the in operator to list the keys to fetch. The fifth MLS did not support the in operator, so instead I generated a request whose filter chained the keys together with or.
For all of the MLSes, I initially tried the entity event approach using 100 concurrent requests. In two cases, I needed to scale that down to complete the tests.
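To illustrate the two filter styles, here is a minimal sketch of how such filters could be built. The field name ListingKey and the helper names are assumptions for illustration, not the exact requests used in the test.

```python
from typing import Iterable


def quote_key(key: str) -> str:
    """Quote an OData string literal: wrap in single quotes, double embedded quotes."""
    return "'" + key.replace("'", "''") + "'"


def in_filter(field: str, keys: Iterable[str]) -> str:
    """Build a filter using the `in` operator (needs server support for OData 4.01 `in`)."""
    return f"{field} in ({','.join(quote_key(k) for k in keys)})"


def or_filter(field: str, keys: Iterable[str]) -> str:
    """Fallback for servers without `in`: chain equality comparisons with `or`."""
    return " or ".join(f"{field} eq {quote_key(k)}" for k in keys)


# "ListingKey" is an assumed key field name; substitute the resource's real key.
keys = ["A1", "B2", "C3"]
print(in_filter("ListingKey", keys))
# ListingKey in ('A1','B2','C3')
print(or_filter("ListingKey", keys))
# ListingKey eq 'A1' or ListingKey eq 'B2' or ListingKey eq 'C3'
```

The or form produces a much longer filter for the same number of keys, which matters for the per-request limits discussed later.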
Rate limits
Rate limits were not a major concern while performing these tests. Only one API rate limited me, and turning down the concurrency took care of that.
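As a rough sketch of what "turning down the concurrency" can look like in practice, the following caps the number of in-flight requests with a semaphore. FakeServer is a hypothetical stand-in for a real API so the example runs without network access; it is not the code used in the test.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def fetch_all(
    batches: Sequence[List[str]],
    fetch: Callable[[List[str]], Awaitable[List[dict]]],
    concurrency: int,
) -> List[List[dict]]:
    """Fetch every batch, never running more than `concurrency` requests at once."""
    semaphore = asyncio.Semaphore(concurrency)

    async def run(batch: List[str]) -> List[dict]:
        async with semaphore:
            return await fetch(batch)

    # gather() returns results in the same order as the input batches.
    return await asyncio.gather(*(run(b) for b in batches))


class FakeServer:
    """Hypothetical stand-in for a real API; records peak in-flight requests."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak = 0

    async def fetch(self, batch: List[str]) -> List[dict]:
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # simulated network latency
        self.in_flight -= 1
        return [{"ListingKey": key} for key in batch]


if __name__ == "__main__":
    server = FakeServer()
    key_batches = [[f"K{i}"] for i in range(50)]
    results = asyncio.run(fetch_all(key_batches, server.fetch, concurrency=10))
    print(len(results), "batches fetched; peak concurrency", server.peak)
```

Lowering the concurrency argument is then a one-parameter change when an API starts rate limiting.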
However, I opted not to test one particular provider because I know from past experience that they wouldn't rate limit Zenlist's access token while running the test, but would rate limit our access token for a full hour later in the day.
Results
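[Table: results for each tested MLS, with columns dvbgs, fewdk, ftzal, snozd, zxyar]
More details
[Table: detailed results for each tested MLS, with columns dvbgs, fewdk, ftzal, snozd, zxyar]
Choosing the number of properties per request for the entity event approach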
The discussion at the dev conference suggested 1,000 properties per request. Across the 5 MLSes tested, I was never able to get more than 200 properties per request for a variety of reasons.
I typically started at 1,000, but would receive an HTTP 414: URI Too Long or an HTTP 413: Content Too Large error.
But even when I kept the URI within the HTTP spec limits, there was still typically an application limit that prevented going over 200.
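One way to respect both kinds of limit is to split the key list so that each request stays under a maximum key count and a maximum URL length. This is a minimal sketch under assumed values: the base URL, the ListingKey field name, and the limits passed in are all placeholders, and real limits vary by provider.

```python
from typing import List
from urllib.parse import quote


def build_url(base_url: str, field: str, keys: List[str]) -> str:
    """Build a request URL with a percent-encoded `in` filter over the given keys."""
    values = ",".join("'" + k.replace("'", "''") + "'" for k in keys)
    return f"{base_url}?$filter=" + quote(f"{field} in ({values})")


def split_keys(
    keys: List[str],
    base_url: str,
    field: str,
    max_keys: int,
    max_url_length: int,
) -> List[List[str]]:
    """Greedily group keys so each batch fits both the count and URL-length limits."""
    batches: List[List[str]] = []
    current: List[str] = []
    for key in keys:
        candidate = current + [key]
        too_many = len(candidate) > max_keys
        too_long = len(build_url(base_url, field, candidate)) > max_url_length
        if current and (too_many or too_long):
            # Close the current batch and start a new one with this key.
            batches.append(current)
            current = [key]
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches


# Placeholder endpoint and limits, chosen only for illustration.
all_keys = [f"KEY{i:06d}" for i in range(1000)]
groups = split_keys(all_keys, "https://api.example.com/Property", "ListingKey", 200, 2000)
print(len(groups), "requests; largest batch has", max(len(g) for g in groups), "keys")
```

Rebuilding the URL for every candidate is wasteful but keeps the sketch short; a real implementation could track the running length instead.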
For dvbgs, choosing anything over 150 resulted in an error "The query specified in the URI is not valid. Max Node Limit Exceeded 750".
For fewdk and ftzal, anything over 200 would result in an error "Maximum value for $top is 200". Making a request without a $top=... would return 10 items.
For snozd, anything over 200 would result in an error "You have exceeded the maximum value count. Please limit to 200 values in a single expression."
For zxyar, anything over 200 would return successfully, but would only include 200 results.
Per-MLS caveats
dvbgs
Did not support the in approach; I used the or approach instead.
fewdk
Using @odata.nextLink to initialize worked for 10,000 listings. After that, it returned the error "Maximum value for $skip is 10000". Custom code would need to be written to initialize beyond 10,000 listings.
ftzal
Using @odata.nextLink to initialize worked for 10,000 listings. After that, it returned the error "Maximum value for $skip is 10000". Custom code would need to be written to initialize beyond 10,000 listings.
snozd
Which MLSes were tested?
The point of this was not to glorify or shame any particular MLS or provider. It was to get an early idea of how the entity event approach performs in the real world. So I'm not actively reporting which MLSes or providers were tested.
I used the user agent Zenlist/1.0 EntityEventTest/{5-letter id} (+bryan@zenlist.com) when making requests, so if your MLS was tested and you want to correlate with the above table, you can look in your request logs. Or reach out to me privately.