Skip to content

Macros to encode/decode URL #1013

@etx121

Description

@etx121

Describe the feature

With dbt-utils, we can use this to extract params for url:

{{ dbt_utils.get_url_parameter(field='page_url', url_parameter='utm_source') }}

However, sometimes, in the params, it is another URL which is encoded.
It would have been great to have macros directly included inside the package to encode/decode the URLs.

Describe alternatives you've considered

I am currently creating my own macros, it does the job (not 100% sure), but it would be better inside a maintained package, especially the one I am using to extract URL params. Here are the macros I am using:

{% macro url_decode(encoded_string) %}
    REGEXP_REPLACE(
        REGEXP_REPLACE(
            REGEXP_REPLACE(
                REGEXP_REPLACE(
                    REGEXP_REPLACE(
                        REGEXP_REPLACE(
                            REGEXP_REPLACE(
                                REGEXP_REPLACE(
                                    REGEXP_REPLACE(
                                        REGEXP_REPLACE(
                                            REGEXP_REPLACE(
                                                REGEXP_REPLACE(
                                                    REGEXP_REPLACE(
                                                        REGEXP_REPLACE(
                                                            REGEXP_REPLACE(
                                                                REGEXP_REPLACE(
                                                                    {{ encoded_string }},
                                                                    E'%3F', E'?', 'g'  -- Question mark - escaped in pattern, not in replacement
                                                                ),
                                                                E'%20', E' ', 'g'
                                                            ),
                                                            E'%21', E'!', 'g'
                                                        ),
                                                        E'%22', E'"', 'g'
                                                    ),
                                                    E'%23', E'#', 'g'
                                                ),
                                                E'%24', E'$', 'g'  -- Dollar sign - no escape in replacement
                                            ),
                                            E'%25', E'%', 'g'
                                        ),
                                        E'%26', E'&', 'g'
                                    ),
                                    E'%27', E'''', 'g'
                                ),
                                E'%28', E'(', 'g'  -- Parenthesis - no escape in replacement
                            ),
                            E'%29', E')', 'g'  -- Parenthesis - no escape in replacement
                        ),
                        E'%2B', E'+', 'g'  -- Plus sign - no escape in replacement
                    ),
                    E'%2C', E',', 'g'
                ),
                E'%2F', E'/', 'g'
            ),
            E'%3A', E':', 'g'
        ),
        E'%3D', E'=', 'g'
    )
{% endmacro %}
{% macro url_encode(string) %}
    REGEXP_REPLACE(
        REGEXP_REPLACE(
            REGEXP_REPLACE(
                REGEXP_REPLACE(
                    REGEXP_REPLACE(
                        REGEXP_REPLACE(
                            REGEXP_REPLACE(
                                REGEXP_REPLACE(
                                    REGEXP_REPLACE(
                                        REGEXP_REPLACE(
                                            REGEXP_REPLACE(
                                                REGEXP_REPLACE(
                                                    REGEXP_REPLACE(
                                                        REGEXP_REPLACE(
                                                            REGEXP_REPLACE(
                                                                REGEXP_REPLACE(
                                                                    {{ string }},
                                                                    E'%', E'%25', 'g'  -- Must encode % first to avoid double encoding
                                                                ),
                                                                E' ', E'%20', 'g'
                                                            ),
                                                            E'!', E'%21', 'g'
                                                        ),
                                                        E'"', E'%22', 'g'
                                                    ),
                                                    E'#', E'%23', 'g'
                                                ),
                                                E'\\$', E'%24', 'g'
                                            ),
                                            E'&', E'%26', 'g'
                                        ),
                                        E'''', E'%27', 'g'
                                    ),
                                    E'\\(', E'%28', 'g'
                                ),
                                E'\\)', E'%29', 'g'
                            ),
                            E'\\+', E'%2B', 'g'
                        ),
                        E',', E'%2C', 'g'
                    ),
                    E'/', E'%2F', 'g'
                ),
                E':', E'%3A', 'g'
            ),
            E'=', E'%3D', 'g'
        ),
        E'\\?', E'%3F', 'g'
    )
{% endmacro %}

Additional context

None

Who will this benefit?

Example:
https://myurl.com?u=https%3A%2F%anotherurl.com%2Fsection%2F
=> extracted params from u: https%3A%2F%anotherurl.com%2Fsection%2F
=> decoded params: https://anotherurl.com/section/

Are you interested in contributing this feature?

See my macros above, but I am it can be better than that

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions