Thursday, May 26, 2011

Re: modifying hex codes with a regex replace

Hmm. I think we might be almost stuck. Vim has an oddity where you can't
distinguish between a null and a newline in the output of submatch().

I think you'll have to treat a newline in that output as a null in the
ordinary case, and then handle a real newline as a special case with a
separate command.

Like this:

function! AddToByte(b,n)
  " Function arguments must be referenced with the a: prefix.
  if a:b == "\n"
    " Actually represents a null
    let r = nr2char(a:n)
  else
    let r = nr2char(char2nr(a:b)+a:n)
  endif
  if r == "\n" | return "\r" | endif
  if r == "\r" | return "\\\r" | endif
  return r
endfunction
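
The arithmetic inside the function can be checked on its own. This is
only a sketch using the built-ins directly (it does not call
AddToByte()), and the sample bytes are my own choice:

```vim
" "a" is 0x61; adding 4 gives 0x65, which is "e".
echo nr2char(char2nr("a") + 4)
" Byte 6 plus 4 is 10, i.e. the newline byte that needs special handling.
echo nr2char(6 + 4) ==# "\n"
```

The second line should echo 1, which is the case the function rewrites
before handing the result back to :s.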

:s!\%x00Heading level 1\%x00\+.\{-}\%x00\+\(\d\+\)\%x00\+
\Body text\%x00vel 1\%x00\+\(.\)[\x04\x01\x00]\+!
\\=escape("\x00Body text\x00vel 2\x00\x00","\n").
\AddToByte(submatch(2),4).
\escape("\x00\x00\x19#".submatch(1)."\x1a","\n")

:s!\%x00Heading level 1\%x00\+.\{-}\%x00\+\(\d\+\)\%x00\+
\Body text\%x00vel 1\%x00\+\(\n\)[\x04\x01\x00]\+!
\\=escape("\x00Body text\x00vel 2\x00\x00","\n").
\nr2char(14).
\escape("\x00\x00\x19#".submatch(1)."\x1a","\n")
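
For what it's worth, the 14 in the second command is just the newline
byte plus 4. A small check, assuming nothing beyond the built-ins:

```vim
" A real line break is byte 0x0a (10); adding 4 gives 14 (0x0e, shown as ^N).
echo char2nr("\n") + 4
" So nr2char(14) is the shifted character for that special case.
echo nr2char(14) ==# "\x0e"
```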

Ben.

On 27/05/11 1:32 AM, Dylan Evans wrote:
> I'm assuming the extra \ at the beginning of each line refers to the sed
> "continue line" command. I'm omitting them when putting it into gvim as one
> continuous line. This worked almost perfectly, but it replaced all my
> instances of ^@ with the carriage return (\x0a).
> When using hex escapes, sometimes the escaped characters are not inserted,
> yielding the replacement string:
> =2^Z^Y
> Any ideas?
> Dylan
> On Thu, May 26, 2011 at 9:26 AM, Ben Schmidt <mail_ben_schmidt@yahoo.com.au> wrote:
>
> Oh, right!
>
> You just have to use an expression in the substitute part (using \=), and
> nr2char(), char2nr() and submatch(). Whether you use literal characters
> or hex escapes (\x, \%x etc.) doesn't matter. But something like this
> should work:
>
>
> :s!\%x00Heading level 1\%x00\+.\{-}\%x00\+\(\d\+\)\%x00\+
> \Body text\%x00vel 1\%x00\+\(\_.\)[\x04\x01\x00]\+!
> \\="\x00Body text\x00vel 2\x00\x00".nr2char(char2nr(submatch(2))+4).
> \"\x00\x00\x19#".submatch(1)."\x1a"
>
> HTH,
>
> Ben.
>
>
>
>
> On 27/05/11 12:02 AM, Dylan Evans wrote:
>
> I don't believe I expressed the problem very well. My regular expression works
> perfectly except for string 2. String 2 reads in a single character. I then
> need to write out a different character, which is related to string 2 by
> having a hex value 4 more than what was read. For instance, if string 2 was an
> "a" (hex code 61), I would need to print out an "e" (hex code 65), because
> 61 + 4 = 65. I would like to do this in gvim because I don't have access to a
> Perl compiler on the Windows machine that will be performing this task.
> Thanks again
>
> On Wed, May 25, 2011 at 8:27 PM, Ben Schmidt <mail_ben_schmidt@yahoo.com.au> wrote:
>
> On 26/05/11 1:34 AM, Floobit wrote:
>
> I'm trying to modify a series of binary files made with a legacy
> program, and need to change a certain character in my search string to
> the character with hex code +4. For context, here is my sed regex:
>
> :s!^@Heading level 1^@\+.\{-}^@\+\(\d\+\)^@\+Body text^@vel 1^@\+\(\_.\{1}\)[^D^A^@]\+!^@Body text^@vel 2^@^@\2^@^@^Y#\1^Z
>
> with HEX(^@) =00, etc. String 2 is only 1 character long, but is
> occasionally rendered as a carriage return, thus the need for the
> \_.\{1} pattern. Instead of writing the exact character of string 2, I
> need to write the character +4 to its hex code. For instance, if
> HEX(string2)=97, I would need to print ASCII(9b).
>
>
> To match it, either put it in directly (type Ctrl-V then Ctrl-D to get
> ^D, for example) or match with \%x outside a collection ( :help /\%x )
> or just \x inside a collection ( :help /\] ).
>
> To include it in the substitution, either put it in directly (except
> there are a few oddities, e.g. with \n or ^J representing null--:help
> sub-replace-special) or use an expression (:help sub-replace-expression)
> which can use nr2char() (:help nr2char()) or \x in a string (:help
> expr-string).
>
> Here's one option that avoids control characters, which makes it
> robust in something like .vimrc because encoding changes won't come into
> play. It turns out something like this:
>
> :s!\%x00Heading level 1\%x00\+.\{-}\%x00\+\(\d\+\)\%x00\+
> \Body text\%x00vel 1\%x00\+\(\_.\)[\x04\x01\x00]\+!
> \\="\x00Body text\x00vel 2\x00\x00".submatch(2).
> \"\x00\x00\x19#".submatch(1)."\x1a"
>
> (The backslashes at the beginnings of lines are just for line
> continuation if including in a .vimrc or script; omit them if you're
> joining the lines together, e.g. on the commandline.)
>
> I have no idea why you would use \{1}, so I omitted it, too. I may have
> made a bunch more booboos if I didn't understand the original regex
> (e.g. because sed has differences to Vim, which I know it does, but am
> not sure on any specifics).
>
> There are many, many other possible ways of achieving the same, too, so
> the above is just one opinion....
>
> Ben.
>
>
>
>
>
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php

