Wednesday, March 2, 2011

Re: Non sequential diff

On Wed, 2 Mar 2011, Schorschie wrote:

> Hello Forum,
>
> I'm quite new here, so I'm not familiar with your etiquette yet. I can
> introduce myself later if you want.
>
> I'm currently comparing a lot of text*) so I was wondering whether
> there is something which finds blocks of similar text in different
> contexts within a file. Google couldn't help me so far and I think this
> should be quite tricky, too.

So are you looking for similar strings, or, as the subject says, differences?
If you are looking for matching sequences, then you may want a suffix array,
I think:

https://secure.wikimedia.org/wikipedia/en/wiki/Suffix_array

That page doesn't cite "Programming Pearls" by Jon Bentley,
but it could cite section 15.2.
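
To make the suffix array idea concrete, here is a minimal Python sketch,
assuming the whole file fits in memory as one string. It sorts all suffix
start positions and compares neighbouring suffixes to find the longest block
that occurs twice. The function names are my own, not from sary or any
library, and the naive sort is only sensible for small inputs.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix itself.
    # Fine for small inputs; real tools use much faster algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a, b):
    # Number of leading characters that a and b share.
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_repeated_block(text):
    # Suffixes that start with the same text end up next to each other
    # after sorting, so only adjacent pairs need to be compared.
    sa = suffix_array(text)
    best_len, best_pos = 0, (0, 0)
    for i, j in zip(sa, sa[1:]):
        n = common_prefix_len(text[i:], text[j:])
        if n > best_len:
            best_len, best_pos = n, (i, j)
    start = best_pos[0]
    # Return the block and the two start offsets where it was found.
    return text[start:start + best_len], sorted(best_pos)
```

For example, longest_repeated_block("banana") returns ("ana", [1, 3]).
Note that the two occurrences may overlap, as they do there; filtering
those out would need an extra check on the offsets.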

If you don't want to write the code to do this,
http://sary.sourceforge.net/#usage-tools
might be useful.

Other things to look at could be suffix trees,
https://secure.wikimedia.org/wikipedia/en/wiki/Suffix_tree

and the longest common substring problem
https://secure.wikimedia.org/wikipedia/en/wiki/Longest_common_substring_problem
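
If the two passages are already available as separate strings, the longest
common substring can be sketched with the usual dynamic programming table.
This is only an illustration, assuming small inputs (it takes time
proportional to len(a) * len(b)); the function name is made up for the
example.

```python
def longest_common_substring(a, b):
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1];
    # curr[j] holds the same for a[i-1], b[j-1].
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended one character earlier in both.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    # Slice the best run out of a, using its end position and length.
    return a[best_end - best_len:best_end]
```

For example, longest_common_substring("xabcdy", "zzabcdw") returns "abcd".
Only two rows of the table are kept at a time, which keeps memory use
proportional to the length of b.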
>
> Greetings,
> Grzegorz
>
HTH
Hugh
>
> *) Not the dissertation of MdB Karl Theodor (...) von und zu
> Guttenberg...
>
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>

