Add Count Segments String Algorithm in R #214
Open
+284
−106
This PR adds a documented R implementation of the Count Segments problem, which counts the segments (words) in a string by detecting word boundaries.
Overview
The algorithm counts the number of segments (words) in a given string by detecting word boundaries.
A segment is defined as a sequence of non-space characters, and segments are separated by one or more spaces.
It efficiently identifies transitions from space to non-space characters to determine segment counts.
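As a minimal sketch of the transition rule described above (a segment starts at a non-space character that is either first in the string or directly preceded by a space), the scan might look like the following. The name `count_segments_sketch` is hypothetical and chosen for illustration; the PR's own `count_segments()` may be written differently.

```r
# Hypothetical sketch of the boundary scan; not the PR's actual code.
count_segments_sketch <- function(s) {
  chars <- strsplit(s, "")[[1]]  # split the string into single characters
  count <- 0L
  for (i in seq_along(chars)) {
    # A segment starts at a non-space character that is first in the
    # string or directly preceded by a space.
    if (chars[i] != " " && (i == 1L || chars[i - 1L] == " ")) {
      count <- count + 1L
    }
  }
  count
}

count_segments_sketch("Hello,  my name is R")  # repeated spaces do not add segments
count_segments_sketch("   ")                   # only spaces
count_segments_sketch("")                      # empty string
```

Only space characters act as separators here, matching the definition above; tabs and newlines would count as part of a segment.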
Features
Boundary condition: current != ' ' AND (i == 1 OR previous == ' ')
count_segments() – Primary boundary detection method
count_segments_regex() – Regex-based alternative for validation
count_segments_vectorized() – Vectorized version for multiple string inputs
Located with the string_manipulation scripts
Complexity
Time: O(n), each character is processed exactly once
Space: O(1), constant extra space (or O(n) due to R's internal character vector representation)
Directory
DIRECTORY.md: added a "Count Segments" entry under String Manipulation
Demonstration
Run the following script to execute built-in examples and test cases:
PowerShell
Rscript "string_manipulation/count_segments.r"
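For comparison, the regex-based and vectorized variants named under Features could be sketched roughly as below. Both function names carry a `_sketch` suffix to mark them as hypothetical; they only illustrate the idea and are not taken from the PR.

```r
# Hypothetical sketches; not the PR's actual implementations.
count_segments_regex_sketch <- function(s) {
  # gregexpr() returns the start position of every run of non-space
  # characters, or -1 when the string contains no such run.
  m <- gregexpr("[^ ]+", s)[[1]]
  if (m[1] == -1L) 0L else length(m)
}

count_segments_vectorized_sketch <- function(strings) {
  # vapply() applies the scalar counter to each element and enforces
  # an integer result of length one per input string.
  vapply(strings, count_segments_regex_sketch, integer(1), USE.NAMES = FALSE)
}

count_segments_vectorized_sketch(c("one two", "", "  padded  text  "))
```

A regex count like this is a convenient cross-check for the character-by-character scan, since both should agree on any input that uses plain spaces as separators.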