Skip to content

Conversation

lijunwangs
Copy link

@lijunwangs lijunwangs commented Oct 12, 2025

Problem

Vote using QUIC requires a simpler QOS instead of stake weighted as we want to allow all staked nodes to be able to vote and there is no need to differentiate the count of votes per vote interval based on stakes.

Summary of Changes

  1. Use a trait QosController to encapsulate different Qos Implementations -- declared in qos.rs
  2. Logic specific to SWQOS is refactored to swqos.rs from quic.rs
  3. Logic specific to simple QOS is implemented in simple_qos.rs
  4. Added unit tests for testing simple QOS
  5. Change votes QUIC serverice to use simple QOS.

There should be 0 impacts on existing SwQOS implementations as it is pure refactoring it.

I have annotated the code on major functions being refactored.

Fixes #

const WAIT_FOR_CONNECTION_TIMEOUT: Duration = Duration::from_secs(1);
debug!("spawn quic server");
let mut last_datapoint = Instant::now();
let unstaked_connection_table: Arc<Mutex<ConnectionTable>> = Arc::new(Mutex::new(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic moved to different QOS implementations as it is QOS specific.

tasks
}

fn prune_unstaked_connection_table(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic moved to swqos.rs

))
}

fn compute_max_allowed_uni_streams(peer_type: ConnectionPeerType, total_stake: u64) -> usize {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic moved to swqos.rs

MaxStreamError,
}

#[derive(Clone)]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed as replaced by the Qos trait and ConnectionContext.

}
}

fn handle_and_cache_new_connection(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic of handling adding to connection cache is moved to QosController::try_cache_connection

}
}

async fn prune_unstaked_connections_and_add_new_connection(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connection caching part is moved to swqos.rs

}

/// Calculate the ratio for per connection receive window from a staked peer
fn compute_receive_window_ratio_for_staked_node(max_stake: u64, min_stake: u64, stake: u64) -> u64 {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to swqos.rs

return;
}

let params = get_connection_stake(&new_connection, &staked_nodes).map_or(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is refactored into the QOS implementations. Essentially it does the following 3 things:

build the connection context,
try cache the connection cache.
handle the connection.

qos.on_stream_closed(&context);
}

let stable_id = connection.stable_id();
Copy link
Author

@lijunwangs lijunwangs Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic encapuslated in Qos's remove_connection handler.

}

enum ConnectionTableType {
pub(crate) enum ConnectionTableType {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made pub(crate) as needed by both type of QOS

assert_eq!(stats.open_connections.load(Ordering::Relaxed), 0);
}

#[test]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests moved to swqos.

@codecov-commenter
Copy link

codecov-commenter commented Oct 12, 2025

Codecov Report

❌ Patch coverage is 91.78378% with 76 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.2%. Comparing base (0199a3a) to head (1be7ec2).

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #8437    +/-   ##
========================================
  Coverage    83.1%    83.2%            
========================================
  Files         840      842     +2     
  Lines      367669   368120   +451     
========================================
+ Hits       305719   306342   +623     
+ Misses      61950    61778   -172     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

(t, receiver, server_address, cancel)
}

fn setup_simple_qos_quic_server_with_params(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for testing simple Qos server

}

impl SwQos {
fn cache_new_connection(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic moved from previous handle_and_cache_new_connection with the caching part only without handling the connection.

}
}

fn prune_unstaked_connection_table(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from previous quic.rs's prune_unstaked_connection_table

}
}

async fn prune_unstaked_connections_and_add_new_connection(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved from previous quic.rs's prune_unstaked_connections_and_add_new_connection

}

impl QosController<SwQosConnectionContext> for SwQos {
fn derive_connection_context(&self, connection: &Connection) -> SwQosConnectionContext {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic moved from previous quic.rs's setup_connection's code on determining the connection stake and peer type

}

#[allow(clippy::manual_async_fn)]
fn try_cache_connection(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From previous quic.rs's setup_connection on handling caching the connection.


#[cfg(test)]
pub mod test {
use super::*;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test cases moved from previous quic.rs specific to SWQOS

@lijunwangs lijunwangs marked this pull request as ready for review October 12, 2025 23:03
};

// Convservatively allow 50 TPS per validator.
pub const MAX_VOTES_PER_SECOND: u64 = 50;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this defined here? I'd define this in the streamer itself. Having cross-crate deps makes reasoning harder.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it really belongs to here as it is this service controlling the streamer service. The streamer is generic and has no idea about vote or not.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By this logic maybe this belongs in TPU? Voting service is the client, not the server in this relationship.
Also I'd probably cut this down to 20 or so - there is no reasonable way we'd have to cast over 10 votes per second.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I moved it to tpu and updated the value to 20/s.

@lijunwangs lijunwangs requested a review from wen-coding October 13, 2025 17:13
minimum one stream per throttle window for simple QOS

WIP streamer_support_qos_as_trait

WIP refactoring QUIC streamer

cleaning up code

WIP refactoring

continued refactoring

continued refactoring

clean up code

fmt code

Fixed some comp issues

Fixed some comp issues

Ignore manual async warning as it is intentional

Moved more swqos functions out
@lijunwangs lijunwangs force-pushed the streamer_support_qos_as_trait.rb branch from 744e899 to 1be7ec2 Compare October 15, 2025 02:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants