Finally sorted out per-developer code coverage, and served one fish two ways

Why is this even an issue?

Teams that enforce quality gates usually put an (incremental) code coverage gate on every MR/PR.

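If all the code in an MR/PR comes from a single developer, then when the quality gate fails, the person who opened the MR/PR is the one responsible, and you simply ask them to fix it. That is the usual logic behind the red/green light of a quality gate.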

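Recently, though, I ran into an oddball case. While analyzing one development team's MRs, I found that they came from a shared, "group-rental" feature branch. The commits on that branch came not from one, two, or three people, but from an entire development squad!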

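When the merge request was rejected by the quality gate, the developer who got the notification threw up their hands: not all of this code is mine, and I can only answer for my own. The department head and the team lead threw up their hands too: there is little we can do, we have already stressed this to everyone, and we can't make the whole team stand in the corner because one person cut corners. Could the platform help us find the culprit?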

So, after going around in circles, the problem landed back with us.

Approach

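After failing to persuade the team that requirements should be split up and kept MECE, I decided to simply identify who was falling short. With that data in hand, the argument carries more weight too.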

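The rough plan is as follows:

1) Use git blame to get, for every line of every source file, the line number, content, last modifier, commit, and so on

2) Use JaCoCo to get the (incremental) code coverage report

3) Stitch the two together, joining each line's author to its coverage data by file and line number

4) Aggregate by author to get the number of lines each developer is responsible for and how many of those lines are covered

5) Work out whose line coverage is below the threshold

6) Branch coverage follows the same pattern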
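Steps 3 to 5 can be sketched without any Git or JaCoCo dependency. This is only a minimal illustration under my own assumptions: the class `PersonalCoverageSketch`, its records, and the 0.8 threshold in the usage note are hypothetical names and values, and a line is treated as covered when at least one of its instructions was executed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: join blame rows to coverage rows by (file, line), then aggregate per author. */
class PersonalCoverageSketch {

    /** One blamed line: who last touched line `number` of `file`. */
    record BlameLine(String file, int number, String author) {}

    /** One coverage row: covered and missed instruction counts for a line. */
    record CoverageLine(String file, int number, int coveredInstrs, int missedInstrs) {}

    /** Per-author totals over the lines that appear in the coverage report. */
    record AuthorCoverage(long coveredLines, long totalLines) {
        double rate() {
            return totalLines == 0 ? 0.0 : (double) coveredLines / totalLines;
        }
    }

    static Map<String, AuthorCoverage> perAuthor(List<BlameLine> blame, List<CoverageLine> coverage) {
        // Step 3: index blame rows by "file:line" so each coverage row is matched in O(1).
        Map<String, String> authorByLine = new HashMap<>();
        for (BlameLine b : blame) {
            authorByLine.put(b.file() + ":" + b.number(), b.author());
        }
        // Step 4: count covered and total lines per author.
        Map<String, long[]> counts = new TreeMap<>(); // author -> {covered, total}
        for (CoverageLine c : coverage) {
            String author = authorByLine.get(c.file() + ":" + c.number());
            if (author == null) {
                continue; // no blame data for this line, so nobody can be charged for it
            }
            long[] n = counts.computeIfAbsent(author, k -> new long[2]);
            if (c.coveredInstrs() > 0) {
                n[0]++;
            }
            n[1]++;
        }
        Map<String, AuthorCoverage> result = new TreeMap<>();
        counts.forEach((author, n) -> result.put(author, new AuthorCoverage(n[0], n[1])));
        return result;
    }

    /** Step 5: authors whose line coverage rate is strictly below the threshold. */
    static List<String> belowThreshold(Map<String, AuthorCoverage> stats, double threshold) {
        List<String> out = new ArrayList<>();
        stats.forEach((author, s) -> {
            if (s.rate() < threshold) {
                out.add(author);
            }
        });
        return out;
    }
}
```

Calling `belowThreshold(perAuthor(blame, coverage), 0.8)` would list the authors under an 80% line coverage gate; branch coverage could be handled the same way by summing covered and missed branches instead of counting lines.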

Implementation

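Take git blame as an example, using the JGit library:

Clone the repo and check out the target branch
Filter the repository tree to get the list of files to blame, for example files under src/main/java with the .java suffix
Run git blame on each file to get its blame result. If you know the starting commit, you can set it here too, to get incremental results.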
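One way to approximate the incremental result, independent of any blame option, is to filter the blamed lines afterwards. The sketch below assumes you already have the set of commit ids that are new on the feature branch (for example from `git rev-list <base>..HEAD`, where the base is whatever the MR targets); `IncrementalBlameFilter` and `BlamedLine` are hypothetical names, not part of JGit.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch: keep only blamed lines that were last changed by a commit new to the branch. */
class IncrementalBlameFilter {

    /** A blamed line reduced to the fields needed here. */
    record BlamedLine(String file, int number, String author, String commit) {}

    static List<BlamedLine> onlyNewLines(List<BlamedLine> all, Set<String> newCommits) {
        return all.stream()
                .filter(line -> newCommits.contains(line.commit())) // drop lines inherited from the base
                .collect(Collectors.toList());
    }
}
```

Lines that survive the filter are the ones a developer added or changed on the branch, which is the population an incremental coverage gate cares about.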

Assuming the repo has already been cloned and the branch is ready, here is a brief implementation:


package com.github;

import com.github.domain.AuthorStats;
import com.github.domain.BlamedFile;
import com.github.domain.BlamedJacocoLine;
import com.github.domain.Line;
import org.eclipse.jgit.api.BlameCommand;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.errors.GitAPIException;
import org.eclipse.jgit.blame.BlameResult;
import org.eclipse.jgit.lib.PersonIdent;
import org.eclipse.jgit.lib.Repository;

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class GitBlameConcurrentProcessor {

    public static List<BlamedJacocoLine> blamedJacocoLines = new ArrayList<>();

    /*
     1) Use git blame to get the line number, content, last modifier, commit, etc. of every line of every source file
     2) Use JaCoCo to get the (incremental) code coverage report
     3) Stitch the two together, joining authors to coverage data by line number
     4) Aggregate by author: lines each developer is responsible for, and lines covered
     5) Work out whose line coverage is below the threshold
     6) Branch coverage follows the same pattern
     */
    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();
        String jacocoReportPath = "D:\\antony\\repo\\redis-embedded-server\\target\\site\\jacoco\\jacoco.xml";
        String gitRepoPath = "D:\\antony\\repo\\redis-embedded-server";
        ConcurrentHashMap<String, BlamedFile> resultMap = new ConcurrentHashMap<>();

        // 1 - git blame: line number, content, last modifier, commit, etc. for every line of every file
        gitBlamedFiles(gitRepoPath, resultMap);

        // 2 - read the (incremental) coverage report produced by JaCoCo
        List<XmlReportParser.SourceFile> sourceFiles =
                (new XmlReportParser(Paths.get(jacocoReportPath))).parse();
        // The parser above is a utility class borrowed from the sonar-jacoco-plugin project

        // 3 - stitch the two together, joining authors to coverage data by line number
        bindJacocoWithBlameResult(sourceFiles, resultMap);
        // The result is stored in List<BlamedJacocoLine> blamedJacocoLines.
        // If step 2 was given an incremental coverage report, this is the incremental per-developer report.

        // 4 - compute each developer's code coverage (totals per author)
        Map<String, AuthorStats> authorStatsMap = getAuthorTotalCoverage(blamedJacocoLines);
        System.out.println("total files blamed:" + sourceFiles.size());
        System.out.println(authorStatsMap);
        System.out.println("total time used(second):" + (System.currentTimeMillis() - start) / 1000);
    }

    static Map<String, AuthorStats> getAuthorTotalCoverage(List<BlamedJacocoLine> blamedJacocoLines) {
        Map<String, AuthorStats> authorStatsMap = blamedJacocoLines.stream()
                // Lines with no matching blame entry have no author; groupingBy rejects null keys
                .filter(line -> line.getAuthor() != null)
                .collect(Collectors.groupingBy(BlamedJacocoLine::getAuthor,
                        Collectors.collectingAndThen(
                                // true -> lines with at least one covered instruction, false -> missed lines
                                Collectors.partitioningBy(line -> line.getCoveredInstrs() > 0, Collectors.counting()),
                                partitioningByResult ->
                                        AuthorStats.builder()
                                                .coveredLines(partitioningByResult.get(true))
                                                .missedLines(partitioningByResult.get(false))
                                                .build())));
        return authorStatsMap;
    }

    public static void gitBlamedFiles(String gitRepoPath, ConcurrentHashMap<String, BlamedFile> resultMap) throws IOException {
        long start = System.currentTimeMillis();
        // Initialize JGit Repository
        Repository repository = initRepository(gitRepoPath);
        ExecutorService executorService = Executors.newFixedThreadPool(20);
        List<String> fileNames = walkFiles(gitRepoPath);
        List<String> converted = relativePath(fileNames, gitRepoPath);
        for (String fileName : converted) {
            // Blame each file on the pool; results are collected in the concurrent map
            executorService.execute(() -> {
                BlameResult blameResult = runBlameCommand(repository, fileName);
                if (blameResult != null) {
                    BlamedFile blamedFile = BlamedFile.builder().name(fileName).lines(new ArrayList<>()).build();
                    processBlameResult(blameResult, blamedFile);
                    resultMap.put(fileName, blamedFile);
                }
            });
        }
        executorService.shutdown();
        try {
            executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        // Don't forget to close the repository when done
        repository.close();
        System.out.println("total files blamed:" + converted.size());
        System.out.println("total time used(second):" + (System.currentTimeMillis() - start) / 1000);
    }

    public static void bindJacocoWithBlameResult(List<XmlReportParser.SourceFile> sourceFiles, ConcurrentHashMap<String, BlamedFile> resultMap) {
        for (XmlReportParser.SourceFile sourceFile : sourceFiles) { // package + file name
            String fullName = ("src/main/java/" + sourceFile.packageName() + "/" + sourceFile.name()).replace('\\', '/');
            if (resultMap.containsKey(fullName)) {
                System.out.println("matched::" + fullName);
                BlamedFile file = resultMap.get(fullName);
                List<Line> blamedLines = file.getLines();
                for (XmlReportParser.Line line : sourceFile.lines()) {
                    BlamedJacocoLine blamedJacocoLine = BlamedJacocoLine.builder()
                            .coveredBranches(line.coveredBranches())
                            .missedBranches(line.missedBranches())
                            .coveredInstrs(line.coveredInstrs())
                            .missedInstrs(line.missedInstrs())
                            .fileName(fullName).build();
                    // Find the blamed line with the same line number and copy its author data over
                    for (Line blamedLine : blamedLines) {
                        if (line.number() == blamedLine.getNumber()) {
                            blamedJacocoLine.setAuthor(blamedLine.getAuthor());
                            blamedJacocoLine.setCommit(blamedLine.getCommit());
                            blamedJacocoLine.setWhen(blamedLine.getWhen());
                            blamedJacocoLine.setNumber(blamedLine.getNumber());
                        }
                    }
                    blamedJacocoLines.add(blamedJacocoLine);
                }
            }
        }
    }

    public static List<String> walkFiles(String repoPath) throws IOException {
        Path base = Paths.get(repoPath);
        // Files.walk holds open directory handles, so close the stream when done
        try (Stream<Path> walk = Files.walk(base)) {
            return walk.map(Path::toString)
                    .filter(f -> f.endsWith(".java"))
                    .collect(Collectors.toList());
        }
    }

    public static List<String> relativePath(List<String> files, String repoPath) {
        List<String> relativePathFiles = new ArrayList<>();
        Path base = Paths.get(repoPath);
        for (String file : files) {
            Path relativePath = base.relativize(Paths.get(file));
            // Git paths always use forward slashes, even on Windows
            String fileReplaced = relativePath.toString().replace('\\', '/');
            relativePathFiles.add(fileReplaced);
        }
        return relativePathFiles;
    }

    private static Repository initRepository(String gitRepoPath) {
        try {
            return Git.open(new File(gitRepoPath)).getRepository();
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

    private static BlameResult runBlameCommand(Repository repository, String fileName) {
        try {
            BlameCommand blameCommand = new BlameCommand(repository);
            blameCommand.setFilePath(fileName);
            return blameCommand.call();
        } catch (GitAPIException e) {
            e.printStackTrace();
            return null;
        }
    }

    private static void processBlameResult(BlameResult blameResult, BlamedFile blamedFile) {
        // Process the blameResult and record the desired information for each line
        for (int i = 0; i < blameResult.getResultContents().size(); i++) {
            PersonIdent personIdent = blameResult.getSourceAuthor(i);
            int number = i + 1; // line numbers start with 1
            String author = personIdent.getEmailAddress();
            String commit = blameResult.getSourceCommit(i).getName();
            String date = personIdent.getWhen().toString();
            // TODO: fix timeZone
            Line line = Line.builder().author(author).commit(commit).when(date).number(number).build();
            blamedFile.getLines().add(line);
        }
    }
}
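The listing relies on an XmlReportParser class that is not shown. If you do not have it to hand, a rough stand-in for the per-line part can be written with the JDK's DOM parser, since jacoco.xml nests `line` elements (attributes `nr`, `mi`, `ci`, `mb`, `cb`) inside `sourcefile` inside `package`. `MiniJacocoXmlReader` and `CoveredLine` below are hypothetical names, and this sketch ignores everything in the report except those line rows.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical stand-in for the parser used above: reads only the per-line rows of jacoco.xml. */
class MiniJacocoXmlReader {

    record CoveredLine(String packageName, String fileName, int number,
                       int missedInstrs, int coveredInstrs, int missedBranches, int coveredBranches) {}

    static List<CoveredLine> read(InputStream xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // jacoco.xml declares an external DTD (report.dtd); tell the parser not to load it.
            factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            Document doc = factory.newDocumentBuilder().parse(xml);

            List<CoveredLine> result = new ArrayList<>();
            NodeList packages = doc.getElementsByTagName("package");
            for (int p = 0; p < packages.getLength(); p++) {
                Element pkg = (Element) packages.item(p);
                NodeList files = pkg.getElementsByTagName("sourcefile");
                for (int f = 0; f < files.getLength(); f++) {
                    Element file = (Element) files.item(f);
                    NodeList lines = file.getElementsByTagName("line");
                    for (int i = 0; i < lines.getLength(); i++) {
                        Element line = (Element) lines.item(i);
                        result.add(new CoveredLine(
                                pkg.getAttribute("name"), file.getAttribute("name"),
                                intAttr(line, "nr"), intAttr(line, "mi"), intAttr(line, "ci"),
                                intAttr(line, "mb"), intAttr(line, "cb")));
                    }
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse JaCoCo XML report", e);
        }
    }

    private static int intAttr(Element element, String name) {
        return Integer.parseInt(element.getAttribute(name));
    }
}
```

Each row carries the package name in slash form, so the same `"src/main/java/" + packageName + "/" + fileName` key used in the listing can be built from it to look up the blamed file.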

One fish, two ways

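Above, we merged git blame data with JaCoCo data, which tells us each developer's code coverage. A metrics platform is also commonly asked to answer questions like: across all of the company's code repositories, how many repos, how many files, and how many lines of code are there? People want a clear inventory of the company's code assets and of how programming language usage is trending. Likewise, the tech stack of a given team or individual can be inferred from line-level code data in a similar way.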

All of this data is in fact already sitting in the ConcurrentHashMap<String, BlamedFile> resultMap data structure.

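For example, grouping files by extension (such as .java) tells us how many files of that kind a repo contains and how many lines they total. To go finer, you can further split development code from test code, or use each line's content to exclude blank lines, and so on.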
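As a rough illustration of that grouping, here is a sketch that takes a map from relative file path to the file's lines of text and counts files, lines, and non-blank lines per extension. The input shape is my assumption (the Line objects in the listing above do not store line content, so the text would have to be captured separately), and `RepoInventorySketch` and `ExtensionStats` are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: per-extension inventory of files and lines. */
class RepoInventorySketch {

    record ExtensionStats(int files, long lines, long nonBlankLines) {}

    static Map<String, ExtensionStats> byExtension(Map<String, List<String>> fileContents) {
        Map<String, ExtensionStats> stats = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : fileContents.entrySet()) {
            String path = entry.getKey();
            List<String> lines = entry.getValue();
            // The extension is whatever follows the last dot in the file name, if there is one.
            int slash = path.lastIndexOf('/');
            int dot = path.lastIndexOf('.');
            String ext = dot > slash + 1 ? path.substring(dot) : "(none)";
            long nonBlank = lines.stream().filter(l -> !l.isBlank()).count();
            // Merge this file's counts into the running totals for its extension.
            stats.merge(ext, new ExtensionStats(1, lines.size(), nonBlank),
                    (a, b) -> new ExtensionStats(a.files() + b.files(),
                            a.lines() + b.lines(),
                            a.nonBlankLines() + b.nonBlankLines()));
        }
        return stats;
    }
}
```

Splitting development from test code would only need a different grouping key, for instance the `src/main` versus `src/test` path prefix instead of the extension.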

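On performance: in an internal test on a repository of about 10,000 files, running git blame on 1,500 files and analyzing the 500 Java files referenced in jacoco.xml took under 30 seconds in total (10 concurrent threads).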
