Tuesday, November 14, 2017

Re: Vim surprisingly slow?

Erik Falor <ewfalor@gmail.com> wrote:

> On Mon, Nov 13, 2017 at 09:33:49PM +0100, Christian Brabandt wrote:
>>
>> On So, 12 Nov 2017, Tim Chase wrote:
>>
>> > Taking on the challenge listed here
>> >
>> > https://dev.to/jorinvo/csv-challenge-1al
>> >
>> > it provides a .json file link to mung into date-named files (never
>> > mind that these happen to all be the same date), I came up with this
>> > vistrace -f  -e open,close,read,write -otrace.log ./vim  -u NONE data.json  -S old_engine.vimm solution:


This is how long it takes on my x86_64 xubuntu-16.04 machine
with SSD, using vim-8.0.1298 (huge)

$ cat old_engine.vim
set regexpengine=1
silent g/.*name":"\([^"]*\)".*card":"\([^"]*\)".*/let
s=substitute(getline('.'), '.*stamp":"\(\d\+\)-\(\d\+\)-\(\d\+\).*',
'\1\2\3','').'.txt'|s//\1,\2/|exec ".w!>>".s
q!

$ cat new_engine.vim
set regexpengine=2
silent g/.*name":"\([^"]*\)".*card":"\([^"]*\)".*/let
s=substitute(getline('.'), '.*stamp":"\(\d\+\)-\(\d\+\)-\(\d\+\).*',
'\1\2\3','').'.txt'|s//\1,\2/|exec ".w!>>".s
q!

$ time ./vim -u NONE data.json  -S new_engine.vim
real    0m15.182s
user    0m1.652s
sys    0m0.712s

$ time ./vim -u NONE data.json  -S old_engine.vim
real    0m14.363s
user    0m0.972s
sys    0m0.600s


Where data.json is found at https://dev.to/jorinvo/csv-challenge-1al

While Vim is running, it's only uses ~ 10% of the CPU.
So it's not CPU bound.

Using...

$ strace -f  -e open,close,read,write,fsync -otrace.log \
    ./vim  -u NONE data.json  -S old_engine.vim

I see that Vim keeps opening, writing, closing, opening, writing, closing (etc)
the file 20150425.txt. It only writes a few bytes at at time (between
~30 and ~40 bytes).  We also keep opening/closing directory ".":

...snip...
4753  open(".", O_RDONLY)               = 8
4753  close(8)                          = 0
4753  open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753  write(8, "Lyla Schultz IV,1212-1221-1121-1"..., 36) = 36
4753  fsync(8)                          = 0
4753  close(8)                          = 0
4753  open(".", O_RDONLY)               = 8
4753  close(8)                          = 0
4753  open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753  write(8, "Ricardo Durgan,1234-2121-1221-12"..., 35) = 35
4753  fsync(8)                          = 0
4753  close(8)                          = 0
4753  open(".", O_RDONLY)               = 8
4753  close(8)                          = 0
4753  open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753  write(8, "Barton Graham,1212-1221-1121-123"..., 34) = 34
4753  fsync(8)                          = 0
4753  close(8)                          = 0

etc. (many other open/write/close like this)

Running this command without filtering syscalls, I get more details
of the repeating open/write/close/fsync calls:

$ strace -f -otrace.log ./vim -u NONE data.json-S old_engine.vim

...snip...
4560  getcwd("/home/pel/sb/vim/src", 4096) = 21
4560  stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560  open(".", O_RDONLY)               = 8
4560  fchdir(8)                         = 0
4560  chdir("/home/pel/sb/vim/src")     = 0
4560  getcwd("/home/pel/sb/vim/src", 4096) = 21
4560  fchdir(8)                         = 0
4560  close(8)                          = 0
4560  stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560  stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560  readlink("20150425.txt", 0x7ffef493bd60, 4095) = -1 EINVAL (Invalid argument)
4560  stat(".20150425.txt.swp", 0x7ffef493ccf0) = -1 ENOENT (No such file or directory)
4560  stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560  access("20150425.txt", W_OK)      = 0
4560  getxattr("20150425.txt", "system.posix_acl_access", 0x7ffef493cb90, 132) = -1 ENODATA (No data available)
4560  stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560  open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4560  write(8, "Keeley Bosco,1228-1221-1221-1431"..., 33) = 33
4560  fsync(8)                          = 0
4560  close(8)                          = 0
4560  chmod("20150425.txt", 0100664)    = 0
4560  setxattr("20150425.txt", "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\6\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
4560  recvmsg(5, 0x7ffef493dda0, 0)     = -1 EAGAIN (Resource temporarily unavailable)
4560  recvmsg(5, 0x7ffef493dda0, 0)     = -1 EAGAIN (Resource temporarily unavailable)
4560  poll([{fd=5, events=POLLIN}], 1, 0) = 0 (Timeout)
4560  select(6, [0 3 5], [], [0 3], {0, 0}) = 0 (Timeout)
4560  getcwd("/home/pel/sb/vim/src", 4096) = 21
4560  stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560  open(".", O_RDONLY)               = 8
4560  fchdir(8)                         = 0
4560  chdir("/home/pel/sb/vim/src")     = 0
4560  getcwd("/home/pel/sb/vim/src", 4096) = 21
4560  fchdir(8)                         = 0
4560  close(8)                          = 0
4560  stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560  stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560  readlink("20150425.txt", 0x7ffef493bd60, 4095) = -1 EINVAL (Invalid argument)
4560  stat(".20150425.txt.swp", 0x7ffef493ccf0) = -1 ENOENT (No such file or directory)
4560  stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560  access("20150425.txt", W_OK)      = 0
4560  getxattr("20150425.txt", "system.posix_acl_access", 0x7ffef493cb90, 132) = -1 ENODATA (No data available)
4560  stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560  open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4560  write(8, "Rubye Jerde,1228-1221-1221-1431\n", 32) = 32
4560  fsync(8)                          = 0
4560  close(8)                          = 0
4560  chmod("20150425.txt", 0100664)    = 0
4560  setxattr("20150425.txt", "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\6\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
4560  recvmsg(5, 0x7ffef493dda0, 0)     = -1 EAGAIN (Resource temporarily unavailable)
4560  recvmsg(5, 0x7ffef493dda0, 0)     = -1 EAGAIN (Resource temporarily unavailable)
4560  poll([{fd=5, events=POLLIN}], 1, 0) = 0 (Timeout)
4560  select(6, [0 3 5], [], [0 3], {0, 0}) = 0 (Timeout)
4560  getcwd("/home/pel/sb/vim/src", 4096) = 21
4560  stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029561, ...}) = 0
...snip...

Notice that it's use fsync() after each small write to file "20150425.txt".

Using my attached io.pl script which summarizes the trace.log file
and shows where I/Os come from, I get this:

$ strace -f -e open,close,read,write,fsync -otrace.log \
    ./vim -u NONE data.json -S old_engine.vim

$ io.pl < trace.log
read_bytes   #read  write_bytes  #write  #open  #fsync  filename
----------   -----  -----------  ------  -----  ------  --------
   1612897      27          0       0       2       0  data.json
    135992      35          0       0       2       0  /usr/share/X11/locale/locale.alias
     36864       9          0       0       1       0  /usr/share/X11/locale/locale.dir
     16720       8          0       0       4       0  /home/pel/.ICEauthority
      4250       3          0       0       1       0  /usr/share/X11/locale/en_US.UTF-8/XLC_LOCALE
      3361       2          0       0       1       0  /lib/terminfo/x/xterm
      3171      16        216       7       1       0  /usr/share/locale/locale.alias
       832       1          0       0       1       0  /lib/x86_64-linux-gnu/libnss_nis.so.2
...snip many files with little amount of I/Os...
         0       0          0       0       1       0  .data.json.swx
         0       0     179281    5004    5004    5004  20150425.txt
         0       0          0       0       5       0  /etc/ld.so.cache
         0       0          0       0    5004       0  .
         0       0      16384       4       2       0  .data.json.swp
         0       0          0       0       1       0  /usr/lib/locale/locale-archive
----------   -----  -----------  ------  ------  -----  TOTAL
   1883068     191     195881    5015   15121    5004

As highlighted in red, notice that we only write 179,281 bytes to file
"20150425.txt" but we open/write/close/fsync 5004 times.
That repeated I/O patterns kills performance, especially because
of the frequent fsync().

We also read 1,612,897 bytes from data.json, but that should be
fast since we do only 27 read and opened that file only twice
according to output of io.pl. Vim reads file deta.json by chunks
of 64 KB. 1,612,897 bytes is exactly the file size of data.json.

Maybe we can change vim to avoid the constant open/write/close/fsync
on file "20150425.txt" which should speed it up greatly.

Regards
Dominique

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

---
You received this message because you are subscribed to the Google Groups "vim_use" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

No comments: