Skip to content

Add user-defined headers before forward response to client #872

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 26, 2018

Conversation

vankoven
Copy link
Contributor

fix #511

Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple extensions are required.

#
# Default:
# None.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add the description to the Wiki

@@ -74,6 +75,8 @@ static size_t tfw_capolicy_sz = 0; /* Current size. */
*/
#define TFW_NIPDEF_ARRAY_SZ (64)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary empty line

@@ -1762,6 +1789,10 @@ tfw_http_adjust_resp(TfwHttpResp *resp, TfwHttpReq *req)
if (r < 0)
return r;

r = tfw_http_add_loc_hdrs(hm, req);
if (r < 0)
return r;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only requests are must be adjustable - an example from the issue hdr_add Expires "Thu, 15 Apr 2020 20:00:00 GMT" is about responses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually only responses are adjusted in the patch: tfw_http_add_loc_hdrs() function is called in the tfw_http_adjust_resp() as a part of response processing. Is it required to adjust requests before forwarding them to backend servers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my bad - of course I was commenting tfw_http_adjust_resp() :)

We absolutely have to be able to adjust requests before forwarding them to a backend. X-Forwarded-For is the most trivial example which we hard code for now. Other examples could be security tokens passed to most powerful WAFs or removing some danger headers from suspicious requests.

BUG_ON(!loc);

for (i = 0; i < loc->usr_hdrs_sz; ++i) {
r = tfw_http_msg_hdr_add(hm, loc->usr_hdrs[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the issue isn't clear: it's titled "transform custom HTTP headers", but specifies hdr_add only. There is reference to Nginx's proxy_set_header which adds a header in all cases except Host and Connection headers when it actually replaces them. That's absolutely non-flexible. It's better to allow a user to delete and replace any headers as well as add it, so please add header deletion and substitution as well:

    hdr_del <header name>
    hdr_sub <header name> <header value>

# TAG: req_hdr_set
#
# Replace or remove a user-defined header from HTTP request message before
# forwarding it to backend server.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, make sense. But isn't it better to leave only one configuration option like req_hdr_xfrm? E.g. to add or replace content of X-Forwarded-For (as oposite to current hardcoded version it replaces the header if it's present in the request:

req_hdr_xfrm X-Forwarded-For 127.0.0.1

or to remove we just omit the value:

req_hdr_xfrm X-Forwarded-For

Just like tfw_http_msg_hdr_xfrm(). The same for responses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only difference between req_hdr_add and req_hdr_set is ability of _add to append user defined string to the existing header. In this case original value of header will be saved, but a user defined string will be added to the header. If the feature is not required, req_hdr_add directive can be easily removed.

E.g. for the configuration resp_hdr_add Set-Cookie "ck=djnfsdnfkdn" the header
Set-Cookie: c=fvfkm will be transformed into Set-Cookie: c=fvfkm, ck=djnfsdnfkdn

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So resp_hdr_set deletes a header, adds it to a HTTP message or replaces a header if it's already present, is it correct? If so, then please describe the difference in https://github.com/tempesta-tech/tempesta/wiki/Modify-HTTP-Messages , preferably with some examples.

TFW_HTTP_MSG_RESP,

TFW_HTTP_MSG_NUM
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are clearly HTTP constants, why aren't they in http_msg.h?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is still not addressed.

@@ -79,7 +140,6 @@ __http_msg_hdr_val(TfwStr *hdr, unsigned id, TfwStr *val, bool client)
[TFW_HTTP_HDR_X_FORWARDED_FOR] = SLEN("X-Forwarded-For:"),
[TFW_HTTP_HDR_KEEP_ALIVE] = SLEN("Keep-Alive:"),
[TFW_HTTP_HDR_TRANSFER_ENCODING] = SLEN("Transfer-Encoding:"),
[TFW_HTTP_HDR_USER_AGENT] = SLEN("User-Agent:"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why User-Agent is removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks confusing to init the same array member twice, since

TFW_HTTP_HDR_SERVER = TFW_HTTP_HDR_USER_AGENT,

The User-agent case is checked in a few lines below:

nlen = SLEN("User-Agent:");

@@ -26,6 +26,67 @@
#include "http_msg.h"
#include "ss_skb.h"

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the copyright year in all the affected files.

#undef TfwStr_string
};
DEFINE_TFW_STR(user_agent, "User-Agent:");
DEFINE_TFW_STR(if_none_match, "If-None-Match:");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the two headers aren't in the array? It's better to use bit different code for requests and responses. The function is used in tfw_cfgop_mod_hdr() only and there you have msg_type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to tfw_http_hdr_t:

TFW_HTTP_HDR_SERVER = TFW_HTTP_HDR_USER_AGENT,

TFW_HTTP_HDR_ETAG = TFW_HTTP_HDR_IF_NONE_MATCH,

So the headers can't be listed in the array due to duplicated indexes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this why I wrote "It's better to use bit different code for requests and responses. The function is used in tfw_cfgop_mod_hdr() only and there you have msg_type." It's can be figured out why the code looks exactly this, but from first glance it looks strange and it's better to rework it in more straightforward way.

if (!tfw_stricmpspn(hdr, &user_agent, ':'))
return TFW_HTTP_HDR_USER_AGENT;
if (!tfw_stricmpspn(hdr, &if_none_match, ':'))
return TFW_HTTP_HDR_IF_NONE_MATCH;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the function is used on configuration phase only. Please write this in a coment or use binary search in it since tfw_stricmpspn() returns the strings relation - it's just confusing to see the linear scan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there is __tfw_http_msg_find_hdr() function for binary search. But it requires alphabetically sorted array. That's not true for the array since original tfw_http_hdr_t is not sorted. The function is called only at configuration processing, so i've decided to keep it as clean as possible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then just point out this in a comment.

My concern is that we have several library-like files (str.[ch], ss_skb.[ch], http_msg.[ch]) which contain logic useful in various situations. The function also looks like very useful in future, so it's necessary either to limit it's usage or write in in very solid way.

* Get header id in header table for header @hdr.
*/
unsigned int
tfw_http_msg_is_spec_hdr(const TfwStr *hdr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to rename the function to tfw_http_msg_spec_hid() or tfw_http_msg_spec_hdr_id() - it returns the ID and doesn't answer IS question.

}

return dst;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test for the function to t/unit/test_tfw_str.c

Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to merge with the couple cleanups.

TFW_HTTP_MSG_RESP,

TFW_HTTP_MSG_NUM
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is still not addressed.

unsigned int id; /* id in TfwHttpHdrTbl */
} TfwHdrDef;
#define TfwStrHdr(v, id) {{ (v), NULL, sizeof(v) - 1, 0 }, (id) }
static TfwHdrDef resp_hdrs[] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add const type specification to allow the compiler do better optimization job.

/**
* Get header id in header table for header @hdr.
*
* TODO: return error if header is not allowed in @msg_type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good topic to discuss and think about. Really, having Server header in a request doesn't look normal. However, there are Web applications setting custom headers. Moreover, there are proprietary soft (e.g. AntiBViruses) actively working with their clouds using HTTP and the soft sometimes sends quite strange traffic.

Maybe this is a subject for machine learning (to learn the normal set of HTTP headers). Maybe we should provide a configuration option to allow a user to specify which headers are allowed, like we're going to do this for HTTP strings alphabets in #628.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function checks only headers added by end user. The (resp|req)_hdr_set gives user opportunity to make very strange requests and responses and even broke network interactions between agents and server. But user must know what he does.

Maybe we should provide a configuration option to allow a user to specify which headers are allowed

It's already here: user can block specific headers in network by with (resp|req)_hdr_set <SomeHeader>; all the headers will be removed from forwarded traffic.


return tfw_strdup(pool, &tmp_hdr);
}
EXPORT_SYMBOL(tfw_http_msg_make_hdr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to export the symbol. Please never export symbols in vain and for unit tests (I fixed several such exports)!

@@ -26,14 +26,39 @@
#include "http_msg.h"
#include "ss_skb.h"

/**
* Build TfwStr repesented HTTP header.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repesented -> represented

}

/**
* Get header id in requestf header table for header @hdr.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requestf -> request

@@ -532,49 +613,37 @@ __hdr_sub(TfwHttpMsg *hm, char *name, size_t n_len, char *val, size_t v_len,

/**
* Transform HTTP message @hm header with identifier @hid.
* Raw header transforers must provide the header name by @name and @n_len.
* If @val is NULL, then the header will be deleted from @hm.
* @hdr must be compaund string and containt two or three parts:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

containt -> contain

@@ -45,7 +70,7 @@ __tfw_http_msg_find_hdr(const TfwStr *hdr, const TfwStr array[], size_t size)

while (start < end) {
size_t mid = start + (end - start) / 2;
const TfwStr *sh = &array[mid];
const TfwStr *sh = (array + mid * member_sz);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that brackets are excess here.


return tfw_strdup(pool, &tmp_hdr);
}

/**
* Find @hdr in @array. @array must be in lowercase and sorted in alphabetical
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the lowercase requirement still relevant for this function?

* and current location.
*
* @req - http request to deturmine location;
* @msg_type - Target message type (TFW_HTTP_MSG_REQ or TFW_HTTP_MSG_RESP).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguments description doesn't seem to reflect current function signature.

};

/* Pool for 'hdr_add' directive (TfwLocation->usr_hdrs) */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like obsolete comment: TfwLocation->usr_hdrs.

Copy link
Contributor

@aleksostapenko aleksostapenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to merge with some minor corrections.

@vankoven vankoven merged commit ec4bab8 into master Feb 26, 2018
@vankoven vankoven deleted the ik-add-user-hdr branch February 26, 2018 00:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP: transform custom HTTP headers
3 participants