Hi pra007!
On Thu, 22 Oct 2015, pra007 wrote:
> I am using vim for windows
> 
> I have a large (800 MB plus) file in the following format
> 
> the file is space separated
> 
> 8232394 06774483 N 19850910 19870818 19910818 EXP. 
> 8309716 06774483 N 19850910 19870818 19910319 REM. 
> 4687262 06908244 N 19860917 19870818 19990815 EXP. 
> 4687262 06908244 N 19860917 19870818 19990309 REM. 
> 4687262 06908244 N 19860917 19870818 19950221 M184 
> 4687262 06908244 N 19860917 19870818 19910108 M173 
> 4687262 06908244 N 19860917 19870818 19880802 ASPN 
> 4687263 06868897 N 19860527 19870818 19990128 M185 
> 4687263 06868897 N 19860527 19870818 19950509 RMPN 
> 4687263 06868897 N 19860527 19870818 19950509 ASPN 
> 4687263 06868897 N 19860527 19870818 19950119 M184 
> 4687263 06868897 N 19860527 19870818 19910311 ASPN 
> 4687263 06868897 N 19860527 19870818 19910124 M173 
> 4687264 06882047 N 19860703 19870818 19990815 EXP. 
> 4687264 06882047 N 19860703 19870818 19990309 REM. 
> 4687264 06882047 N 19860703 19870818 19950503 RMPN 
> 4687264 06882047 N 19860703 19870818 19950503 ASPN 
> 4687264 06882047 N 19860703 19870818 19950119 M184 
> 4687264 06882047 N 19860703 19870818 19910311 ASPN
> RE45781 14176526 N 20140210 20151027 20150929 ASPN 
> RE45786 14260890 N 20140424 20151027 20150929 ASPN 
> RE45790 14454285 Y 20140807 20151103 20151008 ASPN 
> RE45793 13445791 N 20120412 20151103 20151006 ASPN
> 
> I have another (small) .txt file in the following format
> 4687264 
> 4687264 
> 4687264 
> RE45781
> RE45786
> RE45790
> RE45793
> 
> Now I want to extract the lines from the big file whose first column
> matches an entry in the small file, so that the result contains only
> lines whose first column is present in the small txt file
> 
> The result file should look like this
> 
> 4687264 06882047 N 19860703 19870818 19990815 EXP. 
> 4687264 06882047 N 19860703 19870818 19990309 REM. 
> 4687264 06882047 N 19860703 19870818 19950503 RMPN 
> 4687264 06882047 N 19860703 19870818 19950503 ASPN 
> 4687264 06882047 N 19860703 19870818 19950119 M184 
> 4687264 06882047 N 19860703 19870818 19910311 ASPN
> RE45781 14176526 N 20140210 20151027 20150929 ASPN 
> RE45786 14260890 N 20140424 20151027 20150929 ASPN 
> RE45790 14454285 Y 20140807 20151103 20151008 ASPN 
> RE45793 13445791 N 20120412 20151103 20151006 ASPN
> 
> Is there any way?
If you have awk available it should be trivial and fast:
#v+
0 14908 chrisbra@debian /tmp % awk 'NR==FNR {a[$1]}
 NR!=FNR && $1 in a' ids.txt large_file.txt
 4687264 06882047 N 19860703 19870818 19990815 EXP.
 4687264 06882047 N 19860703 19870818 19990309 REM.
 4687264 06882047 N 19860703 19870818 19950503 RMPN
 4687264 06882047 N 19860703 19870818 19950503 ASPN
 4687264 06882047 N 19860703 19870818 19950119 M184
 4687264 06882047 N 19860703 19870818 19910311 ASPN
 RE45781 14176526 N 20140210 20151027 20150929 ASPN
 RE45786 14260890 N 20140424 20151027 20150929 ASPN
 RE45790 14454285 Y 20140807 20151103 20151008 ASPN
 RE45793 13445791 N 20120412 20151103 20151006 ASPN
#v-
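To spell out what that one-liner does, here is a commented sketch of the same idea. The sample data is a small subset of the lines quoted above, the temporary directory and the array name `ids` are only for illustration, and the `next` is a common variant of the `NR!=FNR` test rather than anything the command above requires.

```shell
#!/bin/sh
# Build two small sample files so the sketch is self-contained.
# ids.txt and large_file.txt are the file names used in the command above.
tmpdir=$(mktemp -d)
cat > "$tmpdir/ids.txt" <<'EOF'
4687264
RE45781
EOF
cat > "$tmpdir/large_file.txt" <<'EOF'
4687263 06868897 N 19860527 19870818 19910124 M173
4687264 06882047 N 19860703 19870818 19990815 EXP.
RE45781 14176526 N 20140210 20151027 20150929 ASPN
RE45786 14260890 N 20140424 20151027 20150929 ASPN
EOF

result=$(awk '
    # While reading the first file, NR (total line count) equals FNR
    # (line count within the current file). Store each id as an array
    # key and skip to the next line.
    NR == FNR { ids[$1]; next }
    # For the second file: print the line if column 1 is a stored id.
    $1 in ids
' "$tmpdir/ids.txt" "$tmpdir/large_file.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Because only the ids are held in memory and the large file is streamed line by line, the size of the large file should not matter much. Redirect the output (`> result.txt`) to keep the original file untouched.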
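It can also be done with VimL, but this will most likely be slower.
Something like this should do it. Open your file with the ids, then run:
:let ids=getline(1,'$')
:let @/='^\V\%('.join(ids, '\|').'\)'
:e logfile
:v//d
(The \%(...\) group makes the leading '^' apply to every id and not only
to the first one. An empty line in the ids file would match every line,
so remove those first. Note that this changes your logfile, so use 'u'
to undo the modification.)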
Best,
Christian
-- 
Boss: "We need an SQL database too!"
Employee thinks: "Does he know what he is talking about, or did he just
pick that up somewhere again?"
says: "OK, what color should it be?"
Boss: "Well, I think lilac has the most RAM!"
-- 
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php