Erik Falor <ewfalor@gmail.com> wrote:
> On Mon, Nov 13, 2017 at 09:33:49PM +0100, Christian Brabandt wrote:
>>
>> On So, 12 Nov 2017, Tim Chase wrote:
>>
>> > Taking on the challenge listed here
>> >
>> > https://dev.to/jorinvo/csv-challenge-1al
>> >
>> > it provides a .json file link to mung into date-named files (never
>> > mind that these happen to all be the same date), I came up with this
>> > vistrace -f -e open,close,read,write -otrace.log ./vim -u NONE data.json -S old_engine.vimm solution:
This is how long it takes on my x86_64 xubuntu-16.04 machine
with SSD, using vim-8.0.1298 (huge)
$ cat old_engine.vim
set regexpengine=1
silent g/.*name":"\([^"]*\)".*card":"\([^"]*\)".*/let
s=substitute(getline('.'), '.*stamp":"\(\d\+\)-\(\d\+\)-\(\d\+\).*',
'\1\2\3','').'.txt'|s//\1,\2/|exec ".w!>>".s
q!
$ cat new_engine.vim
set regexpengine=2
silent g/.*name":"\([^"]*\)".*card":"\([^"]*\)".*/let
s=substitute(getline('.'), '.*stamp":"\(\d\+\)-\(\d\+\)-\(\d\+\).*',
'\1\2\3','').'.txt'|s//\1,\2/|exec ".w!>>".s
q!
$ time ./vim -u NONE data.json -S new_engine.vim
real 0m15.182s
user 0m1.652s
sys 0m0.712s
$ time ./vim -u NONE data.json -S old_engine.vim
real 0m14.363s
user 0m0.972s
sys 0m0.600s
Where data.json is found at https://dev.to/jorinvo/csv-challenge-1al
While Vim is running, it's only uses ~ 10% of the CPU.
So it's not CPU bound.
Using...
$ strace -f -e open,close,read,write,fsync -otrace.log \
./vim -u NONE data.json -S old_engine.vim
I see that Vim keeps opening, writing, closing, opening, writing, closing (etc)
the file 20150425.txt. It only writes a few bytes at at time (between
~30 and ~40 bytes). We also keep opening/closing directory ".":
> On Mon, Nov 13, 2017 at 09:33:49PM +0100, Christian Brabandt wrote:
>>
>> On So, 12 Nov 2017, Tim Chase wrote:
>>
>> > Taking on the challenge listed here
>> >
>> > https://dev.to/jorinvo/csv-challenge-1al
>> >
>> > it provides a .json file link to mung into date-named files (never
>> > mind that these happen to all be the same date), I came up with this
>> > vistrace -f -e open,close,read,write -otrace.log ./vim -u NONE data.json -S old_engine.vimm solution:
This is how long it takes on my x86_64 xubuntu-16.04 machine
with SSD, using vim-8.0.1298 (huge)
$ cat old_engine.vim
set regexpengine=1
silent g/.*name":"\([^"]*\)".*card":"\([^"]*\)".*/let
s=substitute(getline('.'), '.*stamp":"\(\d\+\)-\(\d\+\)-\(\d\+\).*',
'\1\2\3','').'.txt'|s//\1,\2/|exec ".w!>>".s
q!
$ cat new_engine.vim
set regexpengine=2
silent g/.*name":"\([^"]*\)".*card":"\([^"]*\)".*/let
s=substitute(getline('.'), '.*stamp":"\(\d\+\)-\(\d\+\)-\(\d\+\).*',
'\1\2\3','').'.txt'|s//\1,\2/|exec ".w!>>".s
q!
$ time ./vim -u NONE data.json -S new_engine.vim
real 0m15.182s
user 0m1.652s
sys 0m0.712s
$ time ./vim -u NONE data.json -S old_engine.vim
real 0m14.363s
user 0m0.972s
sys 0m0.600s
Where data.json is found at https://dev.to/jorinvo/csv-challenge-1al
While Vim is running, it's only uses ~ 10% of the CPU.
So it's not CPU bound.
Using...
$ strace -f -e open,close,read,write,fsync -otrace.log \
./vim -u NONE data.json -S old_engine.vim
I see that Vim keeps opening, writing, closing, opening, writing, closing (etc)
the file 20150425.txt. It only writes a few bytes at at time (between
~30 and ~40 bytes). We also keep opening/closing directory ".":
...snip...
4753 open(".", O_RDONLY) = 8
4753 close(8) = 0
4753 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753 write(8, "Lyla Schultz IV,1212-1221-1121-1"..., 36) = 36
4753 fsync(8) = 0
4753 close(8) = 0
4753 open(".", O_RDONLY) = 8
4753 close(8) = 0
4753 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753 write(8, "Ricardo Durgan,1234-2121-1221-12"..., 35) = 35
4753 fsync(8) = 0
4753 close(8) = 0
4753 open(".", O_RDONLY) = 8
4753 close(8) = 0
4753 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753 write(8, "Barton Graham,1212-1221-1121-123"..., 34) = 34
4753 fsync(8) = 0
4753 close(8) = 0
4753 write(8, "Lyla Schultz IV,1212-1221-1121-1"..., 36) = 36
4753 fsync(8) = 0
4753 close(8) = 0
4753 open(".", O_RDONLY) = 8
4753 close(8) = 0
4753 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753 write(8, "Ricardo Durgan,1234-2121-1221-12"..., 35) = 35
4753 fsync(8) = 0
4753 close(8) = 0
4753 open(".", O_RDONLY) = 8
4753 close(8) = 0
4753 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4753 write(8, "Barton Graham,1212-1221-1121-123"..., 34) = 34
4753 fsync(8) = 0
4753 close(8) = 0
etc. (many other open/write/close like this)
Running this command without filtering syscalls, I get more details
of the repeating open/write/close/fsync calls:
$ strace -f -otrace.log ./vim -u NONE data.json-S old_engine.vim
...snip...
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 open(".", O_RDONLY) = 8
4560 fchdir(8) = 0
4560 chdir("/home/pel/sb/vim/src") = 0
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 fchdir(8) = 0
4560 close(8) = 0
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 readlink("20150425.txt", 0x7ffef493bd60, 4095) = -1 EINVAL (Invalid argument)
4560 stat(".20150425.txt.swp", 0x7ffef493ccf0) = -1 ENOENT (No such file or directory)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 access("20150425.txt", W_OK) = 0
4560 getxattr("20150425.txt", "system.posix_acl_access", 0x7ffef493cb90, 132) = -1 ENODATA (No data available)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4560 write(8, "Keeley Bosco,1228-1221-1221-1431"..., 33) = 33
4560 fsync(8) = 0
4560 close(8) = 0
4560 chmod("20150425.txt", 0100664) = 0
4560 setxattr("20150425.txt", "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\6\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 poll([{fd=5, events=POLLIN}], 1, 0) = 0 (Timeout)
4560 select(6, [0 3 5], [], [0 3], {0, 0}) = 0 (Timeout)
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 open(".", O_RDONLY) = 8
4560 fchdir(8) = 0
4560 chdir("/home/pel/sb/vim/src") = 0
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 fchdir(8) = 0
4560 close(8) = 0
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 readlink("20150425.txt", 0x7ffef493bd60, 4095) = -1 EINVAL (Invalid argument)
4560 stat(".20150425.txt.swp", 0x7ffef493ccf0) = -1 ENOENT (No such file or directory)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 access("20150425.txt", W_OK) = 0
4560 getxattr("20150425.txt", "system.posix_acl_access", 0x7ffef493cb90, 132) = -1 ENODATA (No data available)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4560 write(8, "Rubye Jerde,1228-1221-1221-1431\n", 32) = 32
4560 fsync(8) = 0
4560 close(8) = 0
4560 chmod("20150425.txt", 0100664) = 0
4560 setxattr("20150425.txt", "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\6\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 poll([{fd=5, events=POLLIN}], 1, 0) = 0 (Timeout)
4560 select(6, [0 3 5], [], [0 3], {0, 0}) = 0 (Timeout)
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029561, ...}) = 0
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 open(".", O_RDONLY) = 8
4560 fchdir(8) = 0
4560 chdir("/home/pel/sb/vim/src") = 0
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 fchdir(8) = 0
4560 close(8) = 0
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 readlink("20150425.txt", 0x7ffef493bd60, 4095) = -1 EINVAL (Invalid argument)
4560 stat(".20150425.txt.swp", 0x7ffef493ccf0) = -1 ENOENT (No such file or directory)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 access("20150425.txt", W_OK) = 0
4560 getxattr("20150425.txt", "system.posix_acl_access", 0x7ffef493cb90, 132) = -1 ENODATA (No data available)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029496, ...}) = 0
4560 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4560 write(8, "Keeley Bosco,1228-1221-1221-1431"..., 33) = 33
4560 fsync(8) = 0
4560 close(8) = 0
4560 chmod("20150425.txt", 0100664) = 0
4560 setxattr("20150425.txt", "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\6\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 poll([{fd=5, events=POLLIN}], 1, 0) = 0 (Timeout)
4560 select(6, [0 3 5], [], [0 3], {0, 0}) = 0 (Timeout)
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 open(".", O_RDONLY) = 8
4560 fchdir(8) = 0
4560 chdir("/home/pel/sb/vim/src") = 0
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 fchdir(8) = 0
4560 close(8) = 0
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 readlink("20150425.txt", 0x7ffef493bd60, 4095) = -1 EINVAL (Invalid argument)
4560 stat(".20150425.txt.swp", 0x7ffef493ccf0) = -1 ENOENT (No such file or directory)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 access("20150425.txt", W_OK) = 0
4560 getxattr("20150425.txt", "system.posix_acl_access", 0x7ffef493cb90, 132) = -1 ENODATA (No data available)
4560 stat("20150425.txt", {st_mode=S_IFREG|0664, st_size=3029529, ...}) = 0
4560 open("20150425.txt", O_WRONLY|O_CREAT|O_APPEND, 0664) = 8
4560 write(8, "Rubye Jerde,1228-1221-1221-1431\n", 32) = 32
4560 fsync(8) = 0
4560 close(8) = 0
4560 chmod("20150425.txt", 0100664) = 0
4560 setxattr("20150425.txt", "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\6\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 recvmsg(5, 0x7ffef493dda0, 0) = -1 EAGAIN (Resource temporarily unavailable)
4560 poll([{fd=5, events=POLLIN}], 1, 0) = 0 (Timeout)
4560 select(6, [0 3 5], [], [0 3], {0, 0}) = 0 (Timeout)
4560 getcwd("/home/pel/sb/vim/src", 4096) = 21
4560 stat("/home/pel/sb/vim/src/20150425.txt", {st_mode=S_IFREG|0664, st_size=3029561, ...}) = 0
...snip...
Notice that it's use fsync() after each small write to file "20150425.txt".
Using my attached io.pl script which summarizes the trace.log file
and shows where I/Os come from, I get this:
$ strace -f -e open,close,read,write,fsync -otrace.log \
./vim -u NONE data.json -S old_engine.vim
$ io.pl < trace.log$ strace -f -e open,close,read,write,fsync -otrace.log \
./vim -u NONE data.json -S old_engine.vim
read_bytes #read write_bytes #write #open #fsync filename
---------- ----- ----------- ------ ----- ------ --------
1612897 27 0 0 2 0 data.json
135992 35 0 0 2 0 /usr/share/X11/locale/locale.alias
36864 9 0 0 1 0 /usr/share/X11/locale/locale.dir
16720 8 0 0 4 0 /home/pel/.ICEauthority
4250 3 0 0 1 0 /usr/share/X11/locale/en_US.UTF-8/XLC_LOCALE
3361 2 0 0 1 0 /lib/terminfo/x/xterm
3171 16 216 7 1 0 /usr/share/locale/locale.alias
832 1 0 0 1 0 /lib/x86_64-linux-gnu/libnss_nis.so.2
---------- ----- ----------- ------ ----- ------ --------
1612897 27 0 0 2 0 data.json
135992 35 0 0 2 0 /usr/share/X11/locale/locale.alias
36864 9 0 0 1 0 /usr/share/X11/locale/locale.dir
16720 8 0 0 4 0 /home/pel/.ICEauthority
4250 3 0 0 1 0 /usr/share/X11/locale/en_US.UTF-8/XLC_LOCALE
3361 2 0 0 1 0 /lib/terminfo/x/xterm
3171 16 216 7 1 0 /usr/share/locale/locale.alias
832 1 0 0 1 0 /lib/x86_64-linux-gnu/libnss_nis.so.2
...snip many files with little amount of I/Os...
0 0 0 0 1 0 .data.json.swx
0 0 179281 5004 5004 5004 20150425.txt
0 0 0 0 5 0 /etc/ld.so.cache
0 0 0 0 5004 0 .
0 0 16384 4 2 0 .data.json.swp
0 0 0 0 1 0 /usr/lib/locale/locale-archive
---------- ----- ----------- ------ ------ ----- TOTAL
1883068 191 195881 5015 15121 5004
0 0 179281 5004 5004 5004 20150425.txt
0 0 0 0 5 0 /etc/ld.so.cache
0 0 0 0 5004 0 .
0 0 16384 4 2 0 .data.json.swp
0 0 0 0 1 0 /usr/lib/locale/locale-archive
---------- ----- ----------- ------ ------ ----- TOTAL
1883068 191 195881 5015 15121 5004
As highlighted in red, notice that we only write 179,281 bytes to file
"20150425.txt" but we open/write/close/fsync 5004 times.
That repeated I/O patterns kills performance, especially because
of the frequent fsync().
We also read 1,612,897 bytes from data.json, but that should be
fast since we do only 27 read and opened that file only twice
according to output of io.pl. Vim reads file deta.json by chunks
of 64 KB. 1,612,897 bytes is exactly the file size of data.json.
Maybe we can change vim to avoid the constant open/write/close/fsync
on file "20150425.txt" which should speed it up greatly.
Regards
Dominique
fast since we do only 27 read and opened that file only twice
according to output of io.pl. Vim reads file deta.json by chunks
of 64 KB. 1,612,897 bytes is exactly the file size of data.json.
Maybe we can change vim to avoid the constant open/write/close/fsync
on file "20150425.txt" which should speed it up greatly.
Regards
Dominique
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
---
You received this message because you are subscribed to the Google Groups "vim_use" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
No comments:
Post a Comment