Tuesday, October 2, 2012

Counting log messages

Kees Cook made an interesting observation in a Google+ post.
$ git log --no-merges v3.5..v3.6 |egrep -i '(integer|counter|buffer|stack|fix) (over|under)flow' | wc -l
31
It finds phrases like "integer overflow", "fix underflow", etc. in the log messages of commits made after the v3.5 release, up to the v3.6 release.  There are 31 lines with such phrases among roughly 10k non-merge commits.
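The size of a range such as v3.5..v3.6 can be checked with "git rev-list --count".  The sketch below does so in a throwaway repository rather than the kernel tree; the tags v0.1 and v0.2, the branch "topic" and the helper g are made up for illustration.

```shell
# Throwaway repository (hypothetical, not the kernel tree).
repo=$(mktemp -d)
git init -q "$repo"

# Helper: run git inside the toy repository with a fixed identity.
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }

g commit -q --allow-empty -m 'base'
g tag v0.1
main=$(g symbolic-ref --short HEAD)   # whatever the initial branch is called

g checkout -q -b topic
g commit -q --allow-empty -m 'topic work'
g checkout -q "$main"
g commit -q --allow-empty -m 'mainline work'
g merge -q --no-ff -m 'Merge topic' topic
g tag v0.2

# All commits in the range, then the same range with merges left out.
total=$(g rev-list --count v0.1..v0.2)
non_merge=$(g rev-list --count --no-merges v0.1..v0.2)
echo "total: $total, non-merge: $non_merge"
```

In this toy history the range holds two ordinary commits and one merge, so the two numbers differ by one.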

This, however, does not necessarily mean that there are 31 commits.  There are only 23 such commits, and you can count them like this (the "--grep=" option takes a BRE, so syntactic metacharacters need to be quoted with backslashes):
$ git log --oneline --no-merges --regexp-ignore-case --grep='\(integer\|counter\|buffer\|stack\|fix\) \(over\|under\)flow' v3.5..v3.6 | wc -l 
23
This is because some commits have these phrases multiple times.  For example, commit dd03e734 reads like this:

mlx4_core: Fix integer overflows so 8TBs of memory registration works
This patch adds on the fixes done in commits 89dd86db78e0 ("mlx4_core: Allow large mlx4_buddy bitmaps") and 3de819e6b642 ("mlx4_core: Fix integer overflow issues around MTT table") so that memory registration of up to 8TB (log_num_mtt=31) finally works.
It fixes integer overflows in a few mlx4_table_yyy routines in icm.c by using a u64 intermediate variable, and int/uint issues that caused table indexes to become nagive by setting some variables to be u32 instead of int.  These problems cause crashes when a user attempted to register > 512GB of RAM.
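
The gap between the two counts is easy to reproduce in a throwaway repository.  This is only a sketch: the commit message is invented, and it uses -E (--extended-regexp) and -i (--regexp-ignore-case) so that the same pattern can be given to both grep and "git log --grep" without backslashes.

```shell
# Toy repository (hypothetical, not the kernel tree) with one commit
# whose log message mentions the phrase twice.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty \
    -m 'demo: Fix integer overflows' \
    -m 'It fixes integer overflows in the table code.'

pattern='(integer|counter|buffer|stack|fix) (over|under)flow'

# Counts matching lines: the subject and the body line both match.
line_count=$(git -C "$repo" log --no-merges | grep -E -i -c "$pattern")

# Counts commits: a commit is listed once however often it matches.
commit_count=$(git -C "$repo" log --oneline --no-merges -E -i \
    --grep="$pattern" | wc -l | tr -d ' ')

echo "matching lines: $line_count, matching commits: $commit_count"
```

The pipeline through grep sees two matching lines, while "git log --grep" reports the single commit once.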

Note that the regular expression used here does not catch a commit whose log message does not contain the phrase exactly as spelled. For example, it misses this:

mtdchar: fix offset overflow detection
Sasha Levin has been running trinity in a KVM tools guest, and was able to trigger the BUG_ON() at arch/x86/mm/pat.c:279 (verifying the range of the memory type).  The call trace showed that it was mtdchar_mmap() that created an invalid remap_pfn_range().
To catch a commit with a log message like this, we can loosen the condition to say "the commit log must have one of these words (integer|counter|buffer|stack|fix), and also one of these words (overflow|underflow)".

The way to spell that is like this:
$ git log --all-match --no-merges --regexp-ignore-case \
  --grep='\(integer\|counter\|buffer\|stack\|fix\)' \
  --grep='\(over\|under\)flow' v3.5..v3.6
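Each --grep= pattern is matched individually against the log message, and --all-match tells git log to show only the commits for which all of the patterns trigger; that is why this finds "fix" and "overflow" even when they are not next to each other.  Without --all-match, a commit is shown when any one of the patterns triggers.  Adding --oneline and piping the output to "wc -l" counts 53 commits.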
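
The three ways of matching can be compared side by side in a throwaway repository.  This is a sketch under assumptions: two of the three commit subjects are invented, the helpers commit and count are hypothetical, and -E is used so the patterns need no backslashes.

```shell
# Toy repository (hypothetical, not the kernel tree).
repo=$(mktemp -d)
git init -q "$repo"

# Helper: record an empty commit with the given subject.
commit() {
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}
commit 'mtdchar: fix offset overflow detection'   # both words, not adjacent
commit 'demo: fix typo in comment'                # first word group only
commit 'demo: describe overflow handling'         # second word group only

# Helper: count the commits selected by the given git log options.
count() {
    git -C "$repo" log --oneline --no-merges -E -i "$@" | wc -l | tr -d ' '
}

words='(integer|counter|buffer|stack|fix)'
flows='(over|under)flow'

phrase=$(count --grep="$words $flows")                      # adjacent phrase
either=$(count --grep="$words" --grep="$flows")             # any pattern
both=$(count --all-match --grep="$words" --grep="$flows")   # every pattern

echo "phrase: $phrase, either: $either, both: $both"
```

Only the mtdchar commit survives --all-match; the phrase pattern misses it, and the plain two-pattern form also picks up the two commits that mention just one of the words.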

By the way, I haven't written "Git guide" material on my blog for quite some time, and it shows. I need to practice writing a bit more.
