Saturday, January 6, 2024

Re: DirDiffVim for folders, but files are missed?

Thanks.
Yes, the Vim question has morphed into a programming question, which is not only 'OT' for this forum, but definitely 'OT' for my skills.
I simply need Google Photos to download my photos. Imagine computer-illiterate people trying this!

I'm stuck on:
step: generating a list of md5 values, with their counts of how often they occur (counts >1 indicate duplicates);
step: select md5 values with counts >1;
step: use grep to find that md5 with its (fileName.jpg?) in the original (TakeoutAlbumYears?) file.

Here's where I'm at:
ubuntu@ubuntu:~/Documents$ sort |uniq -c|sort -nr fileNamesCutOutOfSum.md5
126a8a51b9d1bbd07fddc65819a542c3
126a8a51b9d1bbd07fddc65819a542c3
3e7705498e8be60520841409ebc69bc1
3bc3be114fb6323adc5b0ad7422d193a
d8e8fca2dc0f896fd7cb4cb0031ba249
d8e8fca2dc0f896fd7cb4cb0031ba249
^C
ubuntu@ubuntu:~/Documents$ cat fileNamesCutOutOfSum.md5
3bc3be114fb6323adc5b0ad7422d193a
126a8a51b9d1bbd07fddc65819a542c3
3e7705498e8be60520841409ebc69bc1
d8e8fca2dc0f896fd7cb4cb0031ba249
126a8a51b9d1bbd07fddc65819a542c3
d8e8fca2dc0f896fd7cb4cb0031ba249
ubuntu@ubuntu:~/Documents$ sort |uniq -c|sort -nr fileNamesCutOutOfSum.md5 > fNCOOSSorted.md5
^C
ubuntu@ubuntu:~/Documents$ ls
fileNamesCutOutOfSum.md5  fNCOOSSorted.md5  NoMachine  sum.md5  test1  test2
ubuntu@ubuntu:~/Documents$ cat fNCOOSSorted.md5
126a8a51b9d1bbd07fddc65819a542c3
126a8a51b9d1bbd07fddc65819a542c3
3e7705498e8be60520841409ebc69bc1
3bc3be114fb6323adc5b0ad7422d193a
d8e8fca2dc0f896fd7cb4cb0031ba249
d8e8fca2dc0f896fd7cb4cb0031ba249
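
A note on the transcript above: in `sort | uniq -c | sort -nr FILE` the file name belongs to the last `sort`, so the first `sort` sits waiting on the terminal (hence the ^C) and what gets printed is simply `sort -nr FILE`, with no counts. Below is a minimal sketch of the three steps with the file fed in at the start of the pipeline instead. The sample data is the sum.md5 listing from this thread, copied as shown; the file names counts.txt, dup_hashes.txt and duplicates.txt are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in ~/Documents is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data: the sum.md5 listing from this thread.
cat > sum.md5 <<'EOF'
3bc3be114fb6323adc5b0ad7422d193a test1/test1.1/test1.1.1/test1.1.1file2.JPG
126a8a51b9d1bbd07fddc65819a542c3 test1/test1.1/test1.1.1/test1.1.1file1.JPG.json
3e7705498e8be60520841409ebc69bc1 test1/test1.1/test1.1.1/test1.1.1file1.JPG
d8e8fca2dc0f896fd7cb4cb0031ba249 test1/test2/test2.2/test2.2.2/test2.2.2file1.JPG
126a8a51b9d1bbd07fddc65819a542c3 test1/test2/test2.2/test2.2.2/test1.1.1file1.JPG.json
d8e8fca2dc0f896fd7cb4cb0031ba249 test1/test2/test2.2/test2.2.2/test2.2.2file1.JPG.json
EOF

# Step 1: count how often each hash occurs.
# The input file is given to the FIRST command of the pipeline.
cut -d ' ' -f1 sum.md5 | sort | uniq -c | sort -nr > counts.txt

# Step 2: keep only the hashes whose count is greater than 1.
awk '$1 > 1 { print $2 }' counts.txt > dup_hashes.txt

# Step 3: look up every line of the original list that carries one of
# those hashes (-F: fixed strings, -f: read the patterns from a file).
grep -F -f dup_hashes.txt sum.md5 | sort > duplicates.txt

cat duplicates.txt
```

With the sample data this should leave two hashes in dup_hashes.txt and four lines in duplicates.txt. If the counts themselves are not needed, `cut -d ' ' -f1 sum.md5 | sort | uniq -d` prints the repeated hashes directly. Note that `grep -F` matches the hash anywhere on the line, which is only a problem if a file name happens to contain a 32-character hash.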

On Sunday 7 January 2024 at 03:25:16 UTC+10 jr wrote:
hi,

(you do realise we're somewhat OT for this forum ? :-))

On Sat, 6 Jan 2024 at 15:00, K otgc <kontheg...@gmail.com> wrote:
> Thanks.
> I ran these commands, and I'm up to the final step of using those hash values to look up all the matching filenames in the original md5 file.
> I'm researching a command for that, as command fdupes seems to be for files, but I need to match up the md5 hash values?
> ubuntu@ubuntu:~/Documents$ find -H test1/ ! -type d -exec md5sum {} + > sum.md5
> ubuntu@ubuntu:~/Documents$ ls
> NoMachine sum.md5 test1 test2
> ubuntu@ubuntu:~/Documents$ cat sum.md5
> 3bc3be114fb6323adc5b0ad7422d193a test1/test1.1/test1.1.1/test1.1.1file2.JPG
> 126a8a51b9d1bbd07fddc65819a542c3 test1/test1.1/test1.1.1/test1.1.1file1.JPG.json
> 3e7705498e8be60520841409ebc69bc1 test1/test1.1/test1.1.1/test1.1.1file1.JPG
> d8e8fca2dc0f896fd7cb4cb0031ba249 test1/test2/test2.2/test2.2.2/test2.2.2file1.JPG
> 126a8a51b9d1bbd07fddc65819a542c3 test1/test2/test2.2/test2.2.2/test1.1.1file1.JPG.json
> d8e8fca2dc0f896fd7cb4cb0031ba249 test1/test2/test2.2/test2.2.2/test2.2.2file1.JPG.json
> ubuntu@ubuntu:~/Documents$ sort|uniq -c|sort -nr sum.md5 |cut -d ' ' -f1
> 126a8a51b9d1bbd07fddc65819a542c3
> 126a8a51b9d1bbd07fddc65819a542c3
> 3e7705498e8be60520841409ebc69bc1
> 3bc3be114fb6323adc5b0ad7422d193a
> d8e8fca2dc0f896fd7cb4cb0031ba249
> d8e8fca2dc0f896fd7cb4cb0031ba249


ubuntu@ubuntu:~/Documents$ find -H test1/ ! -type d -exec md5sum {} + > sum.md5

why not use '-type f' ? anyway, the following should do what you're looking for:

$ find -H test1/ ! -type d -exec md5sum {} + | awk -f kotgc.awk

the awk code is:
-----<snip>-----
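# collect the file names seen for each md5 hash ($1 = hash, $2 = name)
{
    if ($1 in arr)
        arr[$1] = arr[$1] ", " $2
    else
        arr[$1] = $2
}

# print only the hashes that collected more than one name
END {
    for (m in arr)
        if (arr[m] ~ ".*,.*")
            print m " " arr[m]
}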
-----<snip>-----
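
One caveat with the script above: `$2` holds only the first whitespace-separated piece of the path, so a name such as "my album/a 1.JPG" would be cut short, and a comma inside a single file name would satisfy the `".*,.*"` test. The following is a sketch of one possible variant, not a correction of the method above. It assumes GNU md5sum's usual output layout (hash, two separator characters, then the name), counts occurrences instead of looking for a comma, and uses '-type f' as suggested. The directory and file names are invented for the demonstration, as is report.txt.

```shell
#!/bin/sh
# Build a small throwaway tree with two identical files and one unique file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p "test1/my album"
printf 'same\n'  > "test1/my album/a 1.JPG"
printf 'same\n'  > "test1/copy.JPG"
printf 'other\n' > "test1/unique.JPG"

find -H test1/ -type f -exec md5sum {} + |
awk '
{
    hash = $1
    # Everything after the hash and its two separator characters is the
    # file name, spaces included.
    name = substr($0, length(hash) + 3)
    if (count[hash]++)
        names[hash] = names[hash] "\n  " name
    else
        names[hash] = name
}
END {
    # Report only the hashes that were seen more than once.
    for (h in count)
        if (count[h] > 1)
            print h "\n  " names[h]
}
' > report.txt

cat report.txt
```

Each duplicated hash is printed on its own line, followed by the matching file names, one per indented line. File names containing a backslash or a newline are escaped by md5sum and are not handled by this sketch.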

--
regards, jr.

You have the right to free speech, as long as you're not dumb enough
to actually try it.
(The Clash 'Know Your Rights')

To view this discussion on the web visit https://groups.google.com/d/msgid/vim_use/ac56ff91-886d-493a-8572-be0c614768ccn%40googlegroups.com.
