Wednesday, May 11, 2016

Re: SIGBUS in Vim 7.4

Dominique wrote:

> Frew Schmidt <frioux@gmail.com> wrote:
>
> > On Wednesday, May 11, 2016 at 2:18:31 AM UTC-7, Dominique Pelle wrote:
> >> [ ... ]
> >> Thanks. I should add: asan writes errors (if any) on stderr by default,
> >> and since vim puts the terminal in cooked mode, asan error can be
> >> garbled. So it's best to redirect stderr. The ASAN_OPTIONS
> >> environment variable can be used for this. I put this in my ~/.bashrc:
> >>
> >> export ASAN_OPTIONS="check_initialization_order=1:strict_init_order=1:detect_stack_use_after_return=1:detect_leaks=0:log_path=/home/pel/asan.log:abort_on_error=1"
> >>
> >> The relevant part to redirect errors is "log_path=/home/pel/asan.log"
> >> (obviously you can put a different file name)
> >>
> >> See https://github.com/google/sanitizers/wiki/AddressSanitizerFlags
> >> for more details.
> >>
> >> > FWIW though, I suspect that a skilled practitioner might be able
> >> > to suss out the actual cause of the bug with the coredump I already have.
> >>
> >> It may not be so easy with a gdb stack alone: the memory error may
> >> have happened much earlier than where the crash is detected in gdb.
> >> That's why asan helps: if freed memory is accessed for example, the
> >> error will be reported immediately when accessed, and asan will give
> >> more information:
> >> - the stack where freed memory is accessed
> >> - the stack where memory was previously freed
> >> - the stack where memory was previously allocated
> >>
> >> Regards
> >> Dominique
> >
> > Ok, sounds good. I've set the logging to $HOME/asan-$$.pid. Will report back when I have results.
>
>
> Hi Frew
>
> I downloaded your dotfiles from github and used them.
> I noticed dotfiles for awesome so I suspected that you used
> awesome and so I used it too. Awesome window manager
> also happened to be used in the other crash reported a
> few days ago. Awesome causes more resizes than non
> tiling window manager, which I think is the key to reproduce
> the bug.
>
> I managed to reproduce a crash with a asan build of
> vim-7.4.1829 using your dotfile. I'm not sure that my
> crash is related to your crash, but it's crash anyway.
> Asan reports:
>
> =================================================================
> ==3602==ERROR: AddressSanitizer: stack-use-after-return on address
> 0x7f31e405da48 at pc 0x000000f9676f bp 0x7ffca9f03c70 sp
> 0x7ffca9f03c68
> READ of size 8 at 0x7f31e405da48 thread T0
> #0 0xf9676e in nfa_regtry /home/pel/sb/vim/src/./regexp_nfa.c:6982
> #1 0xf93115 in nfa_regexec_both /home/pel/sb/vim/src/./regexp_nfa.c:7156
> #2 0xf91b4b in nfa_regexec_multi /home/pel/sb/vim/src/./regexp_nfa.c:7366
> #3 0xf8f195 in vim_regexec_multi /home/pel/sb/vim/src/regexp.c:8177
> #4 0x111f33c in next_search_hl /home/pel/sb/vim/src/screen.c:7700
> #5 0x10b105d in win_line /home/pel/sb/vim/src/screen.c:3527
> #6 0x10a3d0c in update_single_line /home/pel/sb/vim/src/screen.c:824
> #7 0x1771f96 in main_loop /home/pel/sb/vim/src/main.c:1306
> #8 0x175fc65 in main /home/pel/sb/vim/src/main.c:1051
> #9 0x7f31eba0bf44 in __libc_start_main
> /build/eglibc-IRFzFh/eglibc-2.19/csu/libc-start.c:287
> #10 0x4ecf1c in _start ??:?
>
> Address 0x7f31e405da48 is located in stack of thread T0 at offset 584 in frame
> #0 0x12eb47f in syn_current_attr /home/pel/sb/vim/src/syntax.c:1885
>
> This frame has 36 object(s):
> [32, 36) ''
> [48, 52) ''
> [64, 68) ''
> [80, 88) ''
> [112, 116) ''
> [128, 132) 'syn_id'
> [144, 160) 'endpos'
> [176, 192) 'hl_startpos'
> [208, 224) 'hl_endpos'
> [240, 256) 'eos_pos'
> [272, 288) 'eoe_pos'
> [304, 308) 'end_idx'
> [320, 324) 'idx'
> [336, 344) 'spp'
> [368, 376) 'cur_si'
> [400, 408) 'sip'
> [432, 436) 'startcol'
> [448, 452) 'endcol'
> [464, 472) 'flags'
> [496, 500) 'cchar'
> [512, 520) 'next_list'
> [544, 548) 'found_match'
> [560, 564) 'do_keywords'
> [576, 912) 'regmatch' <== Memory access at offset 584 is inside
> this variable
> [976, 992) 'pos'
> [1008, 1012) 'lc_col'
> [1024, 1032) 'cur_extmatch'
> [1056, 1088) 'buf_chartab'
> [1120, 1128) 'line'
> [1152, 1156) 'keep_next_list'
> [1168, 1172) 'zero_width_next_list'
> [1184, 1208) 'zero_width_next_ga'
> [1248, 1252) 'r'
> [1264, 1280) 'startpos'
> [1296, 1304) 'lspp'
> [1328, 1344) 'sps'
> HINT: this may be a false positive if your program uses some custom
> stack unwind mechanism or swapcontext
> (longjmp and C++ exceptions *are* supported)
> SUMMARY: AddressSanitizer: stack-use-after-return ??:0 ??
> Shadow bytes around the buggy address:
> 0x0fe6bc803af0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b00: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b10: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b20: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b30: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> =>0x0fe6bc803b40: f5 f5 f5 f5 f5 f5 f5 f5 f5[f5]f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b50: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b60: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b70: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b80: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> 0x0fe6bc803b90: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
> Shadow byte legend (one shadow byte represents 8 application bytes):
> Addressable: 00
> Partially addressable: 01 02 03 04 05 06 07
> Heap left redzone: fa
> Heap right redzone: fb
> Freed heap region: fd
> Stack left redzone: f1
> Stack mid redzone: f2
> Stack right redzone: f3
> Stack partial redzone: f4
> Stack after return: f5
> Stack use after scope: f8
> Global redzone: f9
> Global init order: f6
> Poisoned by user: f7
> Container overflow: fc
> ASan internal: fe
> ==3602==ABORTING
>
> Code at regexp_nfa.c:6982 were error is detected:
>
> 6982 if (reg_startpos[0].lnum < 0)
>
> reg_startpos is a global variable (pointer).
> This pointer was set to a local variable somehow
> in a function which already returned, so it's pointing
> to invalid memory as a result of stack unwinding.

This usually indicates recursive use of the pattern matching functions.
Because of the global variables this doesn't work. There already is a
note about this in the todo list. It's a bit of work though. A first
(simpler) step could be to detect the recursiveness and produce an error
when it happens, not when it later uses invalid memory.
The problem might be caused by the new asynchronous functionality,
autocommands or something else.

> That local variable according to asan is regmatch
> in stack frame of syn_current_attr(...) at syntax.c:1885.
> Asan says:
>
> Address 0x7f31e405da48 is located in stack of thread T0 at offset 584 in frame
> #0 0x12eb47f in syn_current_attr /home/pel/sb/vim/src/syntax.c:1885
> ...
> [576, 912) 'regmatch' <== Memory access at offset 584 is inside this variable
>
> So the invalid memory access is at offset 584 - 576 = 8 in
> regmatch. regmatch is declared in syntax.c:1904 as:
>
> 1904 regmmatch_T regmatch;
>
> With regmatch_T type being:
>
> 144 typedef struct
> 145 {
> 146 regprog_T *regprog;
> 147 lpos_T startpos[NSUBEXP];
> 148 lpos_T endpos[NSUBEXP];
> 149 int rmm_ic;
> 150 colnr_T rmm_maxcol; /* when not zero: maximum column */
> 151 } regmmatch_T;
>
> Offset 8 in that struct is the beginning of field startpos
> (since I used x86_64, so sizeof(regprog) == 8)
> which is consistent with were error is reported in
> regexp_nfa.c:6982:
>
> 6982 if (reg_startpos[0].lnum < 0)
>
> I think I see where the bug comes from: local variable
> regmatch at syntax.c:1904 is passed by address
> as 1st argument of syn_regexp(...) in syntax.c:
>
> 2116 r = syn_regexec(&regmatch,
> 2117 current_lnum,
> 2118 (colnr_T)lc_col,
> 2119 IF_SYN_TIME(&spp->sp_time));
>
> syn_regexec(...) is:
>
> 3291 static int
> 3292 syn_regexec(
> 3293 regmmatch_T *rmp,
> 3294 linenr_T lnum,
> 3295 colnr_T col,
> 3296 syn_time_T *st UNUSED)
> 3297 {
> 3298 int r;
> 3299 #ifdef FEAT_PROFILE
> 3300 proftime_T pt;
> 3301
> 3302 if (syn_time_on)
> 3303 profile_start(&pt);
> 3304

No comments: