- Implement regex functionality to remove noisy tokens (e.g., new lines, special characters).
- Leverage LLMs to assist users in cleaning up their unstructured data.
- Provide options for data transformation and normalization.
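
As a rough illustration of the regex and normalization items, the sketch below shows one way such a cleaning step might look in Python. The function name `clean_text`, the `lowercase` option, and the exact set of characters treated as noise are all assumptions made for this example, not part of any existing implementation. The LLM-assisted cleanup is not shown.

```python
import re
import unicodedata

# Hypothetical patterns: what counts as "noise" is an assumption for this sketch.
_WHITESPACE = re.compile(r"\s+")                 # newlines, tabs, repeated spaces
_SPECIAL = re.compile(r"[^\w\s.,;:!?'\"()-]")    # anything outside words and basic punctuation


def clean_text(text: str, lowercase: bool = False) -> str:
    """Remove noisy tokens and apply simple normalization (illustrative only)."""
    # Unicode NFKC normalization folds compatibility characters (e.g. ligatures).
    text = unicodedata.normalize("NFKC", text)
    # Replace special characters with a space so adjacent words do not merge.
    text = _SPECIAL.sub(" ", text)
    # Collapse all whitespace runs (including newlines) into single spaces.
    text = _WHITESPACE.sub(" ", text).strip()
    return text.lower() if lowercase else text


if __name__ == "__main__":
    print(clean_text("Hello,\n\nworld!  ### $$"))
```

Exposing choices such as `lowercase` as parameters is one possible way to offer the transformation and normalization options mentioned above; a real implementation would likely let users configure the patterns as well.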