Friday, April 30, 2021

Re: What is fuzzy matching?

Hi,

On Fri, Apr 30, 2021 at 2:15 AM BPJ <bpj@melroch.se> wrote:
>
>
> On Fri, Apr 30, 2021 at 07:56 Tony Mechelynck <antoine.mechelynck@gmail.com> wrote:
>>
>> On Fri, Apr 30, 2021 at 5:10 AM Yegappan Lakshmanan <yegappanl@gmail.com> wrote:
>> >
>> > Hi Tony,
>> >
>> > On Thu, Apr 29, 2021 at 3:07 AM Tony Mechelynck
>> > <antoine.mechelynck@gmail.com> wrote:
>> > >
>> > > The help for :vimgrep (in quickfix.txt with "Last change: 2021 Feb
>> > > 05", maybe that date is in error) now mentions an [f] flag without
>> > > saying what it does. One recent vim_dev thread makes me think that
>> > > with the 'f' flag "fuzzy matching" is used. So I used :helpgrep
>> > > \<fuzzy\> and found several mentions of fuzzy matching, but AFAICT
>> > > they all assume that the reader knows what fuzzy matching is. Nowhere
>> > > did I see the expression defined. So what is fuzzy matching?
>> > >
>> >
>> > We should add a description for "fuzzy matching" to the Vim help.
>> > I will send out a PR.
>> >
>> > Fuzzy matching refers to matching strings using non-exact matches.
>> > For example, when you search for the 'get pat' string using fuzzy
>> > matching, it will match the strings 'GetPattern', 'PatternGet',
>> > 'getPattern', 'patGetter', 'getSomePattern', 'MatchpatternGet' etc.
>> >
>> > :echo matchfuzzy(['GetPattern', 'PatternGet', 'getPattern',
>> >     \ 'patGetter', 'getSomePattern', 'MatchpatternGet'], 'get pat')
>> > ['patGetter', 'GetPattern', 'PatternGet', 'getPattern',
>> > 'getSomePattern', 'MatchpatternGet']
>> >
>> > Fuzzy matching will match a string if all the characters in the search
>> > string are present in the string in the same order. Case is ignored during
>> > the search. Other characters can be present between two characters
>> > of the search string. If the search string has multiple words, then each
>> > word is matched separately, so the words in the search string can be
>> > present in any order in a string.
>> >
>> > Fuzzy matching assigns a score for each match based on some criteria.
>> > The match with the highest score is returned first.
>> >
>> > Regards,
>> > Yegappan
>>
>> Ah I see. So IIUC Vim's fuzzy matching will match (caselessly) if
>> there is an extra letter but not if there is a missing letter, and it
>> won't match swapped letters: if the search string is 'word', then Vim
>> will find 'WoRd' or 'worrd' but not 'wrd' or 'wrod'. Thanks for
>> explaining.
>
>
>
> IOW fuzzy 'word' is equivalent to regex 'w.*o.*r.*d.*' or '.*w.*o.*r.*d.*'?
>

Fuzzy matching ignores case, so a fuzzy 'word' is roughly equivalent to
the regex '\c.*w.*o.*r.*d.*'. But because multiple matches are sorted by
their fuzzy match score, it does more than a regular expression match.
For example, the score is higher when there is less distance between the
matched characters. Similarly, matches at the beginning of a word, or
after a camel-case boundary or an underscore, are given a higher score.
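
As a rough illustration of that comparison, here is a minimal sketch that
runs both the regex approximation and matchfuzzy() over the four strings
from Tony's example. The list name "candidates" is only an illustrative
choice, and the sketch assumes a Vim build that has matchfuzzy() and
lambda support. The regex only says which strings qualify; matchfuzzy()
also orders its result by score.

```vim
" Illustrative sketch only: regex approximation vs. matchfuzzy().
let candidates = ['WoRd', 'worrd', 'wrd', 'wrod']

" Strings accepted by the case-insensitive regex approximation.
" 'wrd' lacks an 'o' and 'wrod' has no 'r' after its 'o', so both fail.
echo filter(copy(candidates), {_, s -> s =~ '\c.*w.*o.*r.*d.*'})

" Strings accepted by fuzzy matching, sorted by match score.
if exists('*matchfuzzy')
  echo matchfuzzy(candidates, 'word')
endif
```

Both calls should select the same strings for a single-word search like
this one, but only matchfuzzy() ranks them. For a multi-word search
string such as 'get pat' a single regex of this form is no longer
equivalent, since each word is matched separately.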

- Yegappan
