http - How to split header values?
I'm parsing HTTP headers. I want to split the header values into arrays where it makes sense.
For example, Cache-Control: no-cache, no-store
should return ['no-cache','no-store'].
HTTP RFC 2616 says:
Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.
But I'm not sure if the reverse is true: is it safe to split on commas?
I've found one example where this causes problems. My User-Agent string, for example, is
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36
i.e., it contains a comma after "KHTML". Obviously I don't have more than one user agent, so it doesn't make sense to split this header.
Is the User-Agent string the only exception, or are there more?
No, it is not safe to split headers based on commas. For example, Accept: foo/bar;p="a,b,c", bob/dole;x="apples,oranges"
is a valid header, but if you try to split it on commas with the intention of getting a list of MIME types, you'll get invalid results.
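To make the failure concrete, here is a minimal sketch in Python comparing a naive split with one that ignores commas inside double-quoted strings. The function name split_list_header is made up for illustration. It assumes the header is already known to be a comma-separated list, and it does not handle parenthesised comments such as the one in a User-Agent value.

```python
def split_list_header(value):
    """Split a header value on commas that are outside double-quoted strings.

    Illustrative helper (hypothetical name); only meaningful for headers
    whose grammar is a comma-separated list.
    """
    items, current = [], []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            # The character after a backslash inside quotes is taken literally.
            current.append(ch)
            escaped = False
        elif in_quotes and ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            # A comma outside quotes ends the current list element.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    # Drop empty elements such as those produced by "a,,b".
    return [item for item in items if item]


accept = 'foo/bar;p="a,b,c", bob/dole;x="apples,oranges"'
naive = accept.split(",")          # cuts through the quoted parameters
aware = split_list_header(accept)  # keeps each media range intact
```

Even the quote-aware version only produces the list elements. Each element still has to be parsed according to the grammar of the header it came from.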
The correct answer is that each header is specified using ABNF, most of them in various RFCs; e.g. Accept:
is defined in RFC 7231, Section 5.3.2.
I had this specific problem, so I wrote a parser and tested it on edge cases. Not only is parsing the header non-trivial, but interpreting it and giving the correct result is also non-trivial.
Some headers are more complex than others, but each header has its own grammar, which should be respected for correct (and secure) processing.
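One way to respect per-header grammars is to choose the parsing strategy by header name instead of splitting everything. The following is a rough sketch in Python, not a complete parser. LIST_HEADERS is a hypothetical and deliberately incomplete table, parse_header is a made-up name, and the regular expression only approximates the quoted-string rule.

```python
import re

# Hypothetical, incomplete table of headers whose field-value is defined
# as a comma-separated list. A real parser would follow each header's ABNF.
LIST_HEADERS = {"accept", "cache-control", "connection"}

# One list element: a run of characters that are neither a comma nor a quote,
# or a complete double-quoted string (backslash escapes allowed inside).
_ELEMENT = re.compile(r'(?:[^,"]|"(?:[^"\\]|\\.)*")+')


def parse_header(name, value):
    """Return a list for known list-valued headers, else the value as-is."""
    if name.lower() in LIST_HEADERS:
        return [m.strip() for m in _ELEMENT.findall(value) if m.strip()]
    # Headers not known to be lists (e.g. User-Agent) are never split.
    return value.strip()


ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36")
cache = parse_header("Cache-Control", "no-cache, no-store")
agent = parse_header("User-Agent", ua)
```

With this shape, Cache-Control comes back as a list, while the User-Agent value stays a single string even though it contains a comma.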