Diff files in the shell, but ignore some of the text during the comparison

Posted by vbkedwbf 7 months ago in Shell

I have two files:
file1.txt:

boring stuff
interesting value = 123  !ts:yesterday
interesting value = 456  !ts:yesterday
boring stuff

file2.txt:

boring stuff
interesting value = 123  !ts:today
interesting value = 789  !ts:today
boring stuff


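Now, diff -u file1.txt file2.txt highlights both "interesting value" lines, because the text after !ts: has changed. That is not what I want. I only want to see that the second "interesting value" line has changed, without being told that the first one, whose value is the same, is followed by a different annotation.
So I can do this: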

diff -u <(sed -e 's/\!ts:.*$//' file1.txt) <(sed -e 's/\!ts:.*$//' file2.txt)
@@ -1,4 +1,4 @@
 boring stuff
 interesting value = 123  
-interesting value = 456  
+interesting value = 789  
 boring stuff

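Great!
But now I can't see the filtered-out text in the output. After doing the filtered comparison I need to put the filtered-out text back, so that I can understand the context of the changes in the diff without that context itself being treated as a mismatch.
Is there any way to achieve this? The only filtering switch I could find, -I, drops the matching lines entirely.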
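
To make the problem with -I concrete, here is a minimal sketch that recreates the two sample files in a scratch directory and compares a plain diff with diff -I '!ts:'. It assumes GNU diff, where -I ignores a change only when every inserted and deleted line in it matches the regex. The variable names (dir, plain, ignored) are just for illustration.

```shell
#!/bin/sh
# Recreate the sample files from the question (two spaces before "!ts:").
dir=$(mktemp -d)
printf '%s\n' 'boring stuff' \
    'interesting value = 123  !ts:yesterday' \
    'interesting value = 456  !ts:yesterday' \
    'boring stuff' > "$dir/file1.txt"
printf '%s\n' 'boring stuff' \
    'interesting value = 123  !ts:today' \
    'interesting value = 789  !ts:today' \
    'boring stuff' > "$dir/file2.txt"

# Plain unified diff: both "interesting value" lines show up as changed.
# (diff exits non-zero when files differ, hence the "|| true".)
plain=$(diff -u "$dir/file1.txt" "$dir/file2.txt") || true

# With -I every changed line contains "!ts:", so the whole change is
# treated as ignorable, including the real 456 -> 789 edit.
ignored=$(diff -u -I '!ts:' "$dir/file1.txt" "$dir/file2.txt") || true

printf 'plain diff:\n%s\n\n' "$plain"
printf 'diff -I output: [%s]\n' "$ignored"
rm -rf "$dir"
```

Since -I matches whole lines rather than masking part of a line, it cannot tell the timestamp-only change apart from the real one.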

Answer #1 (bvhaajcl)

filterdiff.awk

# print lines saved up to this point
# (i is declared as an extra parameter so that it stays local to emit)
function emit(    i) {
    for (i=1; i<=n; i++)
        if (i in lines)
            print lines[i]
}

/^@@/ {
    emit()

    # reset state
    split(_,lookup)
    n = split(_,lines)

    # no longer in diff header section
    filtering = 1
}

filtering {
    if (i = index($0,"!ts:")) {
        want = substr($0, 2, i-2)

        if (/^[-]/) {
            # save state
            lookup[want] = n+1
        }
        else if (/^[+]/ && (i = lookup[want])) {
            # line exists in old and new - edit it

            # keep old or new version?
            if (keep_new) lines[i] = $0

            sub(/^./, " ", lines[i])
            next
        }
    }
}

# save lines to print later
{ lines[++n] = $0 }

END { emit() }

$ diff -u file1.txt file2.txt | awk -f filterdiff.awk
--- file1.txt   2023-11-04 02:01:45.062527075 +0000
+++ file2.txt   2023-11-04 02:01:48.306560296 +0000
@@ -1,4 +1,4 @@
 boring stuff
 interesting value = 123  !ts:yesterday
-interesting value = 456  !ts:yesterday
+interesting value = 789  !ts:today
 boring stuff
$
$ diff -u file1.txt file2.txt | awk -f filterdiff.awk keep_new=1
--- file1.txt   2023-11-04 02:01:45.062527075 +0000
+++ file2.txt   2023-11-04 02:01:48.306560296 +0000
@@ -1,4 +1,4 @@
 boring stuff
 interesting value = 123  !ts:today
-interesting value = 456  !ts:yesterday
+interesting value = 789  !ts:today
 boring stuff

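The code above requires the - and + lines to appear in the same @@ hunk.
Here is a variant that can match them up across hunks. It probably requires the "interesting" data to be unique.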

filterdiff2.awk

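# buffer the whole diff so it can be edited before printing
{ lines[NR] = $0 }

# index changed lines by the text between the +/- prefix and "!ts:"
(i = index($0, "!ts:")) {
    q = substr($0, 2, i-2)
    if (/^[-]/) old[q] = NR
    if (/^[+]/) new[q] = NR
}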

END {
    for (q in old)
        if (q in new) {
            if (keep_new)
                lines[ old[q] ] = lines[ new[q] ]
            sub(/^[-+]/, " ", lines[ old[q] ])
            delete lines[ new[q] ]
        }
    for (i=1; i<=NR; i++)
        if (i in lines)
            print lines[i]
}
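
Both scripts hard-code the marker !ts:. As a rough sketch only, the cross-hunk variant could be wrapped in a shell function that takes the marker as an optional third argument. The function name filterdiff and the awk variable marker are my own, not part of the answer, and this version always keeps the old line (no keep_new switch). The same uniqueness caveat applies.

```shell
#!/bin/sh
# Hypothetical wrapper; the names filterdiff and marker are illustrative.
# Usage: filterdiff OLD NEW [MARKER]    (MARKER defaults to "!ts:")
filterdiff() {
    diff -u "$1" "$2" | awk -v marker="${3:-!ts:}" '
        # buffer the whole diff so it can be edited before printing
        { lines[NR] = $0 }

        # index changed lines by the text before the marker
        (i = index($0, marker)) {
            q = substr($0, 2, i - 2)
            if (/^[-]/) old[q] = NR
            if (/^[+]/) new[q] = NR
        }

        END {
            # a -/+ pair with the same key differs only after the marker:
            # turn the old line into context and drop the new one
            for (q in old)
                if (q in new) {
                    sub(/^[-]/, " ", lines[old[q]])
                    delete lines[new[q]]
                }
            for (i = 1; i <= NR; i++)
                if (i in lines)
                    print lines[i]
        }
    '
}
```

It would be called as filterdiff file1.txt file2.txt, or with a different marker as filterdiff file1.txt file2.txt '#rev:'.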
