Caching Modified URLs by Varnish instead of the original requested URL

Guillaume Quintard guillaume.quintard at gmail.com
Thu Aug 31 20:06:03 UTC 2023


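I'm pretty sure it's lowercasing "\2" correctly. The problem is that you
want to lowercase the *value* referenced by "\2" instead: std.tolower() runs
first, on the literal string "\2" (which has no letters to lowercase), and
only then does regsuball() substitute the captured group.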

On this, I don't think you have a choice, you need to make that captured
group its own string, lowercase it, and only then concatenate it. Something
like:

set req.http.hash-url =
    regsuball(req.http.hash-url, ".*(q=)(.*?)(\&|$).*", "\1") +
    std.tolower(regsuball(req.http.hash-url, ".*(q=)(.*?)(\&|$).*", "\2")) +
    regsuball(req.http.hash-url, ".*(q=)(.*?)(\&|$).*", "\3");

It's disgusting, but eh, we started with regex, so...
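
As far as I can tell, the expression above keeps only the q parameter and
drops the rest of the string. If the rest of the URL has to survive, one
possible variant splits the string into prefix, value and suffix. This is a
sketch, not something tested against your setup: the subroutine name
lowercase_q and the vcl_recv/vcl_hash wiring are assumptions made for
illustration, and req.http.hash-url is assumed to start out as a copy of
req.url.

```vcl
import std;

sub lowercase_q {
    # Hypothetical helper: lowercase only the value of "q" in req.http.hash-url.
    # \1 = everything up to and including "q=", \2 = the value, \3 = the rest.
    # The guard matters: without a match, regsub() returns its input unchanged
    # and the three pieces would concatenate into a tripled string.
    if (req.http.hash-url ~ "[?&]q=") {
        set req.http.hash-url =
            regsub(req.http.hash-url, "^(.*?[?&]q=)([^&]*)(.*)$", "\1") +
            std.tolower(regsub(req.http.hash-url, "^(.*?[?&]q=)([^&]*)(.*)$", "\2")) +
            regsub(req.http.hash-url, "^(.*?[?&]q=)([^&]*)(.*)$", "\3");
    }
}

sub vcl_recv {
    # Assumption: the hash key starts as a copy of the request URL.
    set req.http.hash-url = req.url;
    call lowercase_q;
}

sub vcl_hash {
    # Key the cache on the normalized copy; req.url itself is left untouched,
    # so the backend still receives the original URL.
    hash_data(req.http.hash-url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}
```

Since only the header is rewritten, the request that reaches the backend
keeps its original casing and parameters.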

Other options include vmod_querystring
<https://github.com/Dridi/libvmod-querystring/blob/master/src/vmod_querystring.vcc.in>
(Dridi might possibly be of assistance on this topic), vmod_urlplus
<https://docs.varnish-software.com/varnish-enterprise/vmods/urlplus/#query_get>
(Varnish Enterprise), and the last, and possibly most promising one, vmod_re2
<https://gitlab.com/uplex/varnish/libvmod-re2/-/blob/master/README.md>, which
would allow you to do something like:

if (myset.match(".*(q=)(.*?)(\&|$).*", "\1")) {
   set req.http.hash-url = myset.matched(1) + std.tolower(myset.matched(2)) +
       myset.matched(3);
}
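
The snippet above is only meant to convey the idea. A slightly more concrete
sketch, assuming the regex object interface described in the vmod_re2 README
(re2.regex() in vcl_init, then .match() and .backref()), could look like the
following. The object name q_param and the pattern are illustrative
assumptions, and the README remains the authority on the exact signatures.

```vcl
import std;
import re2;

sub vcl_init {
    # Compile the pattern once at load time:
    # group 1 = prefix up to and including "q=", group 2 = the value,
    # group 3 = the rest of the string.
    new q_param = re2.regex("^(.*?[?&]q=)([^&]*)(.*)$");
}

sub vcl_recv {
    # Only rewrite when a q parameter is present; otherwise leave the
    # header as it is.
    if (q_param.match(req.http.hash-url)) {
        set req.http.hash-url = q_param.backref(1)
            + std.tolower(q_param.backref(2))
            + q_param.backref(3);
    }
}
```

The regex then runs once per request instead of three times, and the
captured groups are plain strings that std.tolower() can work on.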

-- 
Guillaume Quintard


On Thu, Aug 31, 2023 at 1:03 AM Uday Kumar <uday.polu at indiamart.com> wrote:

> Hi Guillaume,
>
> In the process of modifying the query string in VCL code, we have a
> requirement of *lowercasing the value of a specific parameter*, instead of
> the *whole query string*.
>
> *Example Request URL:*
> /search/ims?q=CRICKET bat&country_code=IN
>
> *Requirement:*
> We have to modify the request URL by lowercasing the value of only the *q*
> parameter, i.e. /search/ims?q=cricket bat&country_code=IN
>
> *For that, we have found below regex:*
> set req.http.hash-url = regsuball(req.http.hash-url, "(q=)(.*?)(\&|$)",
> "\1"+std.tolower("\2")+"\3");
>
> *ISSUE:*
> std.tolower("\2") in the above statement is *not lowercasing* the
> string that's captured, but if I test it using std.tolower("SAMPLE"), it's
> lowercasing as expected.
>
> 1. May I know why it's not lowercasing if std.tolower("\2") is used?
> 2. Also, please provide possible optimal solutions for the same. (using
> regex)
>
> Thanks & Regards
> Uday Kumar
>
>
> On Wed, Aug 23, 2023 at 12:01 PM Uday Kumar <uday.polu at indiamart.com>
> wrote:
>
>> Hi Guillaume,
>>
>> *use includes and function calls*
>> This is great, thank you so much for your help!
>>
>> Thanks & Regards
>> Uday Kumar
>>
>>
>> On Wed, Aug 23, 2023 at 1:32 AM Guillaume Quintard <
>> guillaume.quintard at gmail.com> wrote:
>>
>>> Hi Uday,
>>>
>>> I'm not exactly sure how to read those diagrams, so I apologize if I'm
>>> missing the mark or if I'm too broad here.
>>>
>>> There are a few points I'd like to attract your attention to. The first
>>> one is that varnish doesn't cache the request or the URL. The cache is
>>> essentially a big hashmap/dictionary/database, in which you store the
>>> response. The request/url is the key for it, so you need to have it in its
>>> "final" form before you do anything.
>>>
>>> From what I read, you are not against it, and you just want to sanitize
>>> the URL in vcl_recv, but you don't like the idea of making the main file
>>> too unwieldy. If I got that right, then I have a nice answer for you: use
>>> includes and function calls.
>>>
>>> As an example:
>>>
>>> # cat /etc/varnish/url.vcl
>>> sub sanitize_url {
>>>   # do whatever modifications you need here
>>> }
>>>
>>> # cat /etc/varnish/default.vcl
>>> include "./url.vcl";
>>>
>>> sub vcl_recv {
>>>   call sanitize_url;
>>> }
>>>
>>>
>>> That should get you going.
>>>
>>> Hopefully I didn't miss the mark too much here, let me know if I did.
>>>
>>> --
>>> Guillaume Quintard
>>>
>>>
>>> On Tue, Aug 22, 2023 at 3:45 AM Uday Kumar <uday.polu at indiamart.com>
>>> wrote:
>>>
>>>> Hello All,
>>>>
>>>>
>>>> For our spring boot application, we are using Varnish Caching in a
>>>> production environment.
>>>>
>>>>
>>>>
>>>>
>>>> Requirement: [To utilize cache effectively]
>>>>
>>>> Modify the URL (Removal of unnecessary parameters) while caching the
>>>> user request, so that the modified URL can be cached by varnish which
>>>> helps improve cache HITS for similar URLs.
>>>>
>>>>
>>>> For Example:
>>>>
>>>> Let's consider the below Request URL
>>>>
>>>> Url at time t, 1.
>>>> samplehost.com/search/ims?q=bags&source=android&options.start=0
>>>>
>>>>
>>>> Our Requirement:
>>>>
>>>> To make varnish consider URLs with options.start=0 and URLs without the
>>>> options.start parameter as EQUIVALENT, such that a single cached
>>>> response (single key) can be utilized in both cases.
>>>>
>>>>
>>>> *1st URL after modification:*
>>>>
>>>> samplehost.com/search/ims?q=bags&source=android
>>>>
>>>>
>>>> *Cached URL at Varnish:*
>>>>
>>>> samplehost.com/search/ims?q=bags&source=android
>>>>
>>>>
>>>>
>>>> Now, Url at time t+1, 2.
>>>> samplehost.com/search/ims?q=bags&source=android
>>>>
>>>>
>>>> At present, varnish considers the above URL as different from the 1st URL
>>>> and uses a different key while caching the 2nd URL (so it will be a
>>>> miss).
>>>>
>>>>
>>>> *So, URL after Modification:*
>>>>
>>>> samplehost.com/search/ims?q=bags&source=android
>>>>
>>>>
>>>> Now, 2nd URL will be a HIT at varnish, effectively utilizing the cache.
>>>>
>>>>
>>>>
>>>> NOTE:
>>>>
>>>> We aim to execute this URL modification without implementing the logic
>>>> directly within the default.vcl file. Our intention is to maintain a clean
>>>> and manageable codebase in the VCL.
>>>>
>>>>
>>>>
>>>> To address this requirement effectively, we have explored two potential
>>>> Approaches:
>>>>
>>>>
>>>> Approach-1:
>>>>
>>>>
>>>>
>>>> Approach-2:
>>>>
>>>>
>>>>
>>>>
>>>> 1. Please go through the approaches mentioned above and let me know the
>>>> effective solution.
>>>>
>>>> 2. Regarding Approach-2
>>>>
>>>> At Step 2:
>>>>
>>>> May I know if there is any way to access and execute a custom
>>>> subroutine from another VCL, for modifying the Request URL? If yes,
>>>> please help with details.
>>>>
>>>> At Step 3:
>>>>
>>>> Tomcat Backend should receive the Original Request URL instead of the
>>>> Modified URL.
>>>>
>>>> 3. Please let us know if there is any better approach that can be
>>>> implemented.
>>>>
>>>>
>>>>
>>>> Thanks & Regards
>>>> Uday Kumar
>>>> _______________________________________________
>>>> varnish-misc mailing list
>>>> varnish-misc at varnish-cache.org
>>>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>>>>
>>>