ERA-10519: [Events API Optimizations] Page Size change and Parallelization of calls #1202
Conversation
The deployment failed. An ESLint issue, it seems?
Well done Gabo. This is an interesting pattern that we can reuse in other places in the code to optimize our requests. I have many questions and suggestions, since the code is kind of hard to understand in certain places. Maybe adding comments or refactoring the code to make it more readable would help.
src/ducks/events.js
Outdated
@@ -441,17 +442,15 @@ export const fetchMapEvents = (map, parameters) => async (dispatch, getState) =>
      });

-     let resultsToDate = [];
-     const onEachRequest = onePageOfResults => {
+     const onEachRequest = ({ results: onePageOfResults }) => {
We could keep the naming uniform by renaming this method to onPageFetch
👍
@@ -16,7 +16,15 @@ describe('fetchMapEvents', () => {
      });

      test('appending a bbox parameter from the map object', async () => {
Another good place to add an improvement. A long time ago we decided to stop mocking or spying on requests. Instead, we now intercept them and mock a server response with MSW, so the client code runs intact and our test quality improves. We could add MSW handlers here and stop spying on axios 😄
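For reference, a minimal sketch of that MSW setup, assuming an MSW v1-style API with Jest; the endpoint path and response body here are hypothetical and would need to match the real events API:

```javascript
// Sketch only: the URL and payload shape below are illustrative assumptions,
// not the actual events API contract.
import { rest } from 'msw';
import { setupServer } from 'msw/node';

const server = setupServer(
  rest.get('/api/v1.0/activity/events', (req, res, ctx) =>
    res(ctx.json({ data: { results: [], next: null } })))
);

// Standard MSW + Jest lifecycle hooks
beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());
```

With this in place the test exercises the real axios code path instead of a `jest.fn()` stand-in.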
        (_, i) => i
      );

      axios.get = jest.fn(() => {
Same as before, but here I'd say it's more important, since these tests are good. Let's not mock axios; let's use MSW to intercept requests 👍
      console.log('Wait for any fetch to complete.');
    }

    const fetchPromise = processPage(page)
We are mixing the async/await and .then/.catch patterns. Could we stick with async/await everywhere?
I get your point, but in this case I'm working with the reference to the promise, not with the promise result. I want to register the finally callback now, so that when the promise is eventually resolved or rejected it removes itself from the Set, and I also need to add that promise to the pool. Both of those steps have to happen without awaiting the promise (i.e. without blocking code execution); blocking is only required when Promise.race comes into play.
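To make that intent concrete, here is a self-contained sketch of the pattern being described; runWithPool and its parameters are illustrative names, not the PR's exact code:

```javascript
// Concurrency-limited pool: promises are started without awaiting, register
// their own cleanup via .finally, and we only block (via Promise.race) when
// the pool is full.
const runWithPool = async (tasks, concurrencyLimit = 5) => {
  const activeFetches = new Set();
  const results = [];
  for (const task of tasks) {
    // Start the task now; do NOT await it yet.
    const fetchPromise = task().then(result => { results.push(result); });
    // Register the finally callback up front: whenever the promise settles,
    // it removes itself from the Set, freeing a slot.
    const tracked = fetchPromise.finally(() => activeFetches.delete(tracked));
    activeFetches.add(tracked);
    // Only block when the pool is full; race resolves as soon as ANY
    // in-flight promise settles.
    if (activeFetches.size >= concurrencyLimit) {
      await Promise.race([...activeFetches]);
    }
  }
  // Drain whatever is still in flight.
  await Promise.all([...activeFetches]);
  return results;
};
```

The key point is that `fetchPromise.finally(...)` and `activeFetches.add(...)` run synchronously at scheduling time; only `Promise.race` suspends the loop.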
    for (const page of pagesToFetch) {
      while (activeFetches.size >= concurrencyLimit) {
        await Promise.race([...activeFetches]);
Hmm, I don't understand why we are using Promise.race here 🤔 The nature of this feature is not a race at all: we don't only care about the response of the fastest request. We want to run 5 promises in parallel and, every time one settles, add another one, like a pool. It seems like every time the race finishes, we iterate in the while loop and call Promise.race again with the 4 requests that didn't finish, possibly issuing the same requests several times (?). Could you explain the reasoning behind the race here, please?
That's a typo, I thought I had fixed it; it should be an if statement.
Regarding Promise.race: it finishes when one of the promises in the set settles, meaning it succeeded or failed; that result is taken and Promise.race finishes as well. As far as I understand, it only checks promise statuses, so there is no request duplication.
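A tiny sketch backing that claim: racing a Set of already-created promises only observes their state; it never re-runs the underlying work (names here are illustrative):

```javascript
// Each call to makeRequest starts its "work" exactly once, at creation time.
let callCount = 0;
const makeRequest = (ms) => new Promise(resolve => {
  callCount += 1; // the executor runs once, when the promise is created
  setTimeout(() => resolve(ms), ms);
});

const pool = new Set([makeRequest(10), makeRequest(30)]);

(async () => {
  const first = await Promise.race([...pool]);  // settles with the 10 ms promise
  const again = await Promise.race([...pool]);  // re-racing does NOT re-issue requests
  console.log(first, again, callCount);         // 10 10 2
})();
```

The second race resolves immediately with the already-fulfilled promise's value; `callCount` never goes above 2.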
The Promise.race is still confusing for me; I can't really link how a race relates to what we are doing here (I understand how it works, though; I just wonder if there is a more straightforward way to implement it). But now with the env I can see it working, and it does what is expected.
However, I get the impression that the environment gets kind of slow when using this new fetching system. I'm not sure if I'm drawing a false correlation here, and maybe my computer is slow because of something else. I'd leave further testing on that to the QA team @fonzy911 @AlanCalvillo
    return data;
  }

  async function parallelPaginatedRequest(apiUrl, requestConfig = {}, { itemsPerPage = 150, maxRetries = 3, concurrencyLimit = 5, onPageFetch = null } = {}) {
I'm OK having defaults, I guess, but I see you added the defaults of the specific scenario you are working on (fetching map events). This method is designed to be reused, though, so maybe these are not the best generic defaults. I'd say we shouldn't have a default itemsPerPage at all, since that will vary every time. The rest seems OK to me.
We can discuss the implementation in more detail; maybe the pool can be handled with just a counter or something way simpler? Not sure, haha. Also, I'm not sure whether I'm misreading the behavior/usage of Promise.race, so it would be great to have more discussion on it. Regarding the slowness of the site, it's quite a point; let's see the QA results 👀
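For reference, the "way simpler" shape could be a fixed number of workers pulling pages from a shared queue, with no Set and no race; fetchAllPages and its arguments are hypothetical names sketching the alternative, not a proposal from the PR:

```javascript
// Worker-queue alternative to the Promise.race pool: concurrency is bounded
// simply by how many workers exist.
const fetchAllPages = async (pages, fetchPage, concurrencyLimit = 5) => {
  const queue = [...pages];
  const results = [];
  const worker = async () => {
    // JS is single-threaded, so the length check and shift() are safe
    // as long as no await sits between them.
    while (queue.length > 0) {
      const page = queue.shift();
      results.push(await fetchPage(page));
    }
  };
  // Launch the workers and wait for all of them to drain the queue.
  await Promise.all(Array.from({ length: concurrencyLimit }, () => worker()));
  return results;
};
```

Each worker grabs the next page as soon as it finishes its current one, which gives the same "refill the pool" behavior without racing promises.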
This change has been addressed; I sent the commit this morning but forgot to mention it here 🫡
Hi @gaboDev, this is looking great so far. I'm able to see the adjusted item count for both feed and map; however, I just wanted to run something by you real quick @netomo.
When it comes to the map items, each batch of 150 within each page is taking longer as fetching progresses.
On root.dev, even though the map item count is lower (25), it takes considerably less time to fetch.
Not sure if these are the numbers we were expecting or if we should dive in a tad more.
@fonzy911 this isn't an apples-to-apples comparison. root.dev is fetching way fewer events than the feature env; can you ensure you are testing similar amounts of data?
Hi @AlanCalvillo, yes, thank you very much for that input. You are right. I have now made the comparison using my local instance, pointing to this backend: https://era-10519.dev.pamdas.org, and I see the following. Scoping this to map calls, 6 calls of 25 map items take around 3.30 seconds per 150 items to render. With Gabo's fix, 1 call of 150 map events takes around 2.24 seconds to complete, and the backend params are showing appropriately.
@@ -0,0 +1,75 @@
import axios from 'axios';

async function fetchPage(apiUrl, requestConfig, page, pageSize, maxRetries) {
nit: we're pretty consistent throughout the repo with the const fetchPage = async (...whatever) convention. Why go async function instead?
  }

  if (data === null) { // just logging the error; this spot can be used to trigger a callback when maxRetries is exhausted for a page
    console.error(`Failed to fetch page ${page} after ${maxRetries} attempts.`);
This is logging on fetch cancellations, which are intentional and happen quite frequently. As a result, the console is getting spammed in Chrome. Can we tighten that up to only log errors if they're not cancellations?
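A minimal sketch of that guard, assuming cancellations surface with the name AbortError (fetch + AbortController) or CanceledError (axios); the helper names are hypothetical:

```javascript
// Treat intentional aborts as non-errors so they don't spam the console.
const isCancellation = (error) =>
  error?.name === 'AbortError' || error?.name === 'CanceledError';

// Returns true when something was actually logged, to keep the sketch testable.
const logPageFailure = (error, page, maxRetries) => {
  if (isCancellation(error)) return false; // stay quiet on cancellations
  console.error(`Failed to fetch page ${page} after ${maxRetries} attempts.`);
  return true;
};
```

With axios specifically, `axios.isCancel(error)` is the built-in way to make the same check.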
@fonzy911 Did you gather metrics on how long the client takes to load X number of events, not only on a per-call basis but as an overall sum?
Hi @AlanCalvillo, yes, I was able to draft some numbers at a larger scale. This is what I got: without parallelization, every call (for 25 items) takes on average 570 ms to complete; with parallelization, it takes an average of 3.55 seconds per 150-item call, which works out to 591 ms per 25 items called.
So if the time with parallelization > without parallelization, to me that kind of suggests the objective is not being met. Why was it approved?
Previously, when testing on an individual basis, the parallelized process seemed faster than the original feature; that's when I originally approved it. But yes, it seems that performance diminishes as the number of calls rises. I have removed my approval due to the new findings.
Requesting changes as per the recent QA dialogue
After some meetings and discussions, we agreed on changing the original QA strategy, since it was not covering the whole process; it was changed to something more suitable in regard to the technical approach of this ticket. cc
Hi @gaboDev Gabo,
Thank you very much for implementing the new feature and addressing my previous comments so thoughtfully. I really appreciate the effort you’ve put into this.
To provide additional context, I conducted some tests to evaluate the performance of the new parallelization code compared to the previous implementation. Here's a quick summary of my approach:
I loaded a local instance of the app without the parallelization code and measured the time it took to render 1000, 2000, and 3000 items.
I repeated the same process using an instance with the new parallelization code applied.
I then compared the rendering times for both setups.
Initially, the results showed that the instance with the parallelization code took slightly more time to load all the items.
After discussing this with the development team, they clarified that the primary objective of this story was to implement the parallelization feature. They emphasized that summing individual API response times (e.g., A=2s, B=3s, C=4s) is no longer an accurate measure due to concurrent execution. Instead, testing should focus on improvements in overall UX, error handling, and HTTP call pool behavior.
When I reviewed this live using a waterfall view in the network tab, I observed that the map items were indeed rendered while calls were being executed concurrently within the same period, which supports the devs' explanation.
Based on these findings, I’m approving this PR. Testing it in a higher environment will give us a clearer picture of how it behaves under more extensive conditions.
Great work, and thanks again for your effort on this!
cc: @amicavi , @JoshuaVulcan
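The point above about summed response times can be seen in a few lines: under concurrency, wall-clock time tracks the slowest call, not the sum of all calls (the delays here are arbitrary illustration values):

```javascript
// Three overlapping 50 ms "requests" take roughly 50 ms of wall-clock time,
// even though their individual durations sum to 150 ms.
const delay = (ms) => new Promise(resolve => setTimeout(() => resolve(ms), ms));

(async () => {
  const start = Date.now();
  const durations = await Promise.all([delay(50), delay(50), delay(50)]);
  const wallClock = Date.now() - start;
  const sum = durations.reduce((a, b) => a + b, 0); // 150
  // wallClock lands near 50 ms: the calls overlapped instead of queuing.
  console.log({ sum, wallClock });
})();
```

This is why the QA strategy shifted from summing per-call timings to measuring overall rendering time and pool behavior.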
What does this PR do?
Relevant link(s)