Thursday, April 21, 2016

Capture column numbers of matches ending with double-byte chars

Ever since I started using Vim I have had trouble with double-byte characters.

I want to capture the strings of all matches together with the start column and end column of each match (line by line). I need not only the strings but also the column numbers, for use in other functions.

For the last few years I used match()/matchend(), then I noticed that they did not correctly capture double-byte characters within the string.
Then I adapted everything to use searchpos(), but today I found out that it gives trouble with strings that have a double-byte character at the end.


" mylist = list with all line numbers having matches
for n in range(0, len(mylist)-1)
  let idx = []
  let edx = []
  let matches_between_cols = []

  " FIND ALL IDX MATCHES
  " idx --> forward search
  call cursor(mylist[n], 1)
  while line(".") == mylist[n]
    let S = searchpos(@/, '')
    if S[0] == mylist[n]
      call add(idx, S[1]-1)
    endif
  endwhile
  " idx --> backward search (to include matches on first column)
  call cursor(mylist[n], len(getline(mylist[n])))
  while line(".") == mylist[n]
    let S = searchpos(@/, 'b')
    if S[0] == mylist[n]
      call add(idx, S[1]-1)
    endif
  endwhile

  " FIND ALL EDX MATCHES
  " edx --> forward search
  call cursor(mylist[n], 1)
  while line(".") == mylist[n]
    let E = searchpos(@/, 'e')
    if E[0] == mylist[n]
      call add(edx, E[1])
    endif
  endwhile
  " edx --> backward search (to include matches on first column)
  call cursor(mylist[n], len(getline(mylist[n])))
  while line(".") == mylist[n]
    let E = searchpos(@/, 'eb')
    if E[0] == mylist[n]
      call add(edx, E[1])
    endif
  endwhile

  if len(idx) > 0
    for i in range(0, len(idx)-1)
      let r = strpart(getline(mylist[n]), idx[i], edx[i]-idx[i])
      call add(matches_between_cols, r)
    endfor
  endif
endfor

-----------------------------------
Buffer:
city | Felicità
whatever | Peach
pmg00000001 | Perché
text| Céline
bMgbXuEWo | Université


With @/ = "| \zs\S\+" it captures:
Felicit<c3>
Peach
Perch<c3>
Céline
Universit<c3>

Expected:
Felicità
Peach
Perché
Céline
Université

Can you please tell me what I did wrong?
(Is it not possible to treat every character as a single-byte char, as in languages such as Python?)
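
A possible explanation, offered as a sketch rather than a definitive answer: searchpos() with the 'e' flag leaves the cursor on the last character of the match, and the column it returns is the byte index at which that character starts. When the last character is two bytes wide (à and é in UTF-8), edx[i]-idx[i] is one byte short, so strpart() keeps only the first byte (<c3>) of that character. One way around it, assuming the line text is available as a string, is to skip the cursor movement entirely and loop over match()/matchend() with a start offset, since both return byte offsets that strpart() can use directly. MatchesWithCols() below is a hypothetical helper name, not a Vim built-in.

" Hypothetical helper (not part of Vim): return a list of
" [text, startcol, endcol] for every match of a:pat in a:line.
" Both columns are 0-based byte offsets; endcol is exclusive.
function! MatchesWithCols(line, pat) abort
  let result = []
  let start = 0
  while 1
    let s = match(a:line, a:pat, start)
    if s < 0
      break
    endif
    let e = matchend(a:line, a:pat, start)
    " s and e are byte offsets, so a multi-byte last character is kept whole.
    call add(result, [strpart(a:line, s, e - s), s, e])
    if e > s
      let start = e
    else
      " Empty match: step over one whole character to avoid looping forever.
      let start = s + max([1, len(matchstr(a:line, '.', s))])
    endif
  endwhile
  return result
endfunction

Inside the original loop this could be called as MatchesWithCols(getline(mylist[n]), @/); for the first buffer line the captured text should then be the complete "Felicità". If the cursor-based loop is kept instead, the end column would need the byte length of the character under the cursor added to it before calling strpart().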

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

