CodeZhi

This is the online support page for the CodeZhi project. CodeZhi is a feature-extraction tool for Java methods. This page provides the CodeZhi jar package and a sampled dataset used by the tool. It also describes the features CodeZhi uses to profile methods and explains how to run the tool on a Java project, a package, a class, or a single method.

We look for representative features that distinguish methods well. We selected 103 features as the metrics used to profile Java methods.
The tool is used as follows:
java -jar CodeZhi.jar -source ‹parameter1›
java -jar CodeZhi.jar -source ‹parameter1› -package ‹parameter2› ...
java -jar CodeZhi.jar -source ‹parameter1› -class ‹parameter2› ...
java -jar CodeZhi.jar -source ‹parameter1› -method ‹parameter2› ...
For example,
java -jar CodeZhi.jar -source project_path
java -jar CodeZhi.jar -source project_path -package package_name
java -jar CodeZhi.jar -source project_path -class class_name
java -jar CodeZhi.jar -source project_path -method class_name#method_signature
In these examples, the inputs are as follows:
  • the path of the project that contains the Java source files (e.g. c:/pro/src): project_path
  • Path of training set of faults: inPathFault_busybox
  • the name of the target package: package_name
  • the name of the target class: class_name
  • the signature of the target method, which must be qualified with its class (like A#a()): class_name#method_signature
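When CodeZhi is driven from another Java program (for example, a batch script over many projects), the command lines above can be assembled programmatically. The following is a minimal sketch, not part of CodeZhi itself: the class `CodeZhiCommand` and its `build` helper are hypothetical names, and only the flags shown in the usage above (`-source`, `-package`, `-class`, `-method`) are used. It handles a single target per call and does not cover the trailing `...` form.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CodeZhi: it only assembles the
// command line documented in the usage section above.
class CodeZhiCommand {

    /**
     * Builds the argument list for one CodeZhi run.
     *
     * @param jarPath    path to CodeZhi.jar
     * @param sourcePath project path passed to -source
     * @param scopeFlag  "-package", "-class", "-method", or null to profile the whole project
     * @param target     package name, class name, or class_name#method_signature (ignored if scopeFlag is null)
     */
    static List<String> build(String jarPath, String sourcePath, String scopeFlag, String target) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add("-source");
        cmd.add(sourcePath);
        if (scopeFlag != null) {
            cmd.add(scopeFlag);
            cmd.add(target);
        }
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("CodeZhi.jar", "project_path", "-class", "class_name");
        // Print the command instead of running it, so the sketch works without the jar.
        System.out.println(String.join(" ", cmd));
        // To actually run it, one option is:
        //   new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping the command as a list of arguments (rather than one string) avoids quoting problems when the project path contains spaces.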
As mentioned above, CodeZhi uses a selected set of features to profile methods. The features are described in detail below.

| type | description | notes | feature-name |
| --- | --- | --- | --- |
| about statements | the number of declarations | counts all declarations in the method, like "int a;" | num_declaration |
|  | the number of initializations and definitions | counts all initializations and definitions in the method, like "int x=1;" | num_initialization |
|  | the number of assignments | counts all assignments in the method, like "x=1;" | num_assignment |
|  | the number of lines | counts all lines in the method | loc_executable |
|  | the max length of a code block | the largest number of lines in any code block in the method | max_loc_inblock |
|  | the min length of a code block | the smallest number of lines in any code block in the method | min_loc_inblock |
| about variables | the number of local variables | counts all local variables defined in the method | num_local_var |
|  | the number of local variables never used | counts the local variables that are never used in the method | num_unsed_var |
|  | the number of local variables used once | counts the local variables that are used only once in the method | num_local_var_once |
|  | the number of local variables used more than once | counts the local variables that are used more than once in the method | num_local_var_overonce |
|  | the total number of uses of local variables | counts the total number of times local variables are used in the method | sum_times_used_var |
|  | the max number of uses of a local variable | the highest number of times any local variable is used in the method | max_times_used_var |
|  | the number of fields | counts the fields used in the method | num_filed |
|  | the number of fields used once |  | num_filed_once |
|  | the number of fields used more than once |  | num_field_overonce |
|  | the max number of uses of a field | the highest number of times any field is used in the method | max_times_used_filed |
|  | the total number of uses of fields | counts the total number of times fields are used in the method | sum_times_used_field |
|  | the number of static variables | counts the static variables in the method | num_static_var |
|  | the number of static variables used once |  | num_static_var_once |
|  | the number of static variables used more than once |  | num_static_var_overonce |
|  | the max number of uses of a static variable | the highest number of times any static variable is used in the method | max_times_used_static_var |
|  | the total number of uses of static variables |  | sum_times_used_static_var |
|  | the number of parameters |  | num_param |
|  | the number of parameters used once |  | num_param_used_once |
|  | the number of parameters used more than once |  | num_param_used_overonce |
|  | the max number of uses of a parameter | the highest number of times any parameter is used in the method | max_times_used_param |
|  | the total number of uses of parameters | counts the total number of times parameters are used in the method | sum_times_used_param |
|  | the number of arrays | counts the arrays in the method | num_array |
|  | the max number of uses of an array | the highest number of times any array is used in the method | max_times_used_array |
|  | the total number of uses of arrays | counts the total number of times arrays are used in the method | sum_times_used_array |
| about objects | the number of objects | counts the total number of objects in the method | num_object |
|  | the number of objects created by a constructor | counts the objects created by a constructor | num_object_constructor |
|  | the number of objects not created by a constructor | counts the objects created by means other than a constructor | num_object_non_constructor |
| about methods | the number of methods called | counts the methods called in the method | num_called_method |
|  | the max number of calls to a method | the highest number of times any method is called in the method | max_times_called_method |
|  | the total number of method calls | counts the total number of times methods are called in the method | sum_times_called_method |
|  | the number of static methods called | counts the static methods called in the method | num_called_static_method |
|  | the max number of calls to a static method | the highest number of times any static method is called in the method | max_times_called_static_method |
|  | the total number of static method calls | counts the total number of times static methods are called in the method | sum_times_called_static_method |
| about operators and operands | the number of operators (called num_operator) | ( ) [ ] -> . ! ~ ++ -- + - * & / % << >> < <= > >= == != ^ \| && \|\| ?: = += -= /= %= ^= new instanceof | sum_times_used_operator |
|  | the max number of uses of an operator | the highest number of times any operator is used in the method | max_times_used_operator |
|  | the number of distinct operators used (called unique_operator) |  | num_operator |
|  | the number of operands (called num_operand) | counts all the variables that represent data | sum_times_used_operand |
|  | the max number of uses of an operand | the highest number of times any operand is used in the method | max_times_used_operand |
|  | the number of distinct operands (called unique_operand) |  | num_operand |
| about complexity | Halstead vocabulary complexity µ | µ = unique_operator + unique_operand | halstead_vocabulary_complexity |
|  | Halstead length complexity N | N = num_operator + num_operand | healstead_length_complexity |
|  | Halstead level complexity L | L = (2 * unique_operand) / (unique_operator * num_operand) | healstead_level_complexity |
|  | Halstead difficulty complexity D | D = 1 / L | healtead_difficulty_complexity |
|  | Halstead capacity complexity V | V = N * log2(unique_operator + unique_operand) | healstead_capacity_complexity |
|  | Halstead efficiency complexity E | E = V / L | healstead_efficientcy_complexity |
|  | Halstead time complexity T | T = E / 18 | healtead_time_complexity |
|  | Halstead faulty complexity B | B = E^(2/3) / 1000 | healtead_bug |
|  | McCabe complexity | McCabe = edge - node + 2 | McCabe |
| about code blocks | the number of if blocks |  | num_if |
|  | the number of if conditions that contain a method call |  | num_if_with_method_call |
|  | the number of else-if blocks |  | num_elseif |
|  | the max number of else-if blocks following an if statement | counts the else-if blocks that follow each if statement and returns the maximum | max_else_if_per_if |
|  | the number of else blocks |  | num_else |
|  | the number of try blocks |  | num_try |
|  | the number of catch blocks |  | num_catch |
|  | the max number of catch blocks following a try block | counts the catch blocks that follow each try block and returns the maximum | max_catch_per_try |
|  | the number of finally blocks |  | num_finally |
|  | the number of switch blocks |  | num_swtich |
|  | the number of case blocks |  | num_case |
|  | the max number of case blocks in a switch block |  | max_cases_per_swtich |
|  | the number of for blocks |  | num_for |
|  | the number of while blocks |  | num_while |
|  | the number of do-while blocks |  | num_do_while |
|  | the number of for-each blocks |  | num_for_each |
|  | the number of loop blocks | while + do-while + for + for-each | num_loop |
|  | the number of loop conditions that contain a method call |  | num_loop_with_method_call |
|  | the number of synchronized blocks |  | num_synchronized |
| about nested blocks | the deepest level of nested blocks | finds all the levels of nested blocks and returns the maximum | max_depth_nested_block |
|  | the deepest level of nested if blocks | an if block inside another if block; finds the deepest level | max_depth_nested_if |
|  | the number of if blocks that contain loops |  | num_if_contained_loop |
|  | the deepest level of nested for blocks | a for block inside another for block; finds the deepest level | max_depth_nested_for |
|  | the deepest level of nested while blocks | a while block inside another while block; finds the deepest level | max_depth_nested_while |
|  | the deepest level of nested do-while blocks | a do-while block inside another do-while block; finds the deepest level | max_depth_nested_do_while |
|  | the deepest level of nested for-each blocks | a for-each block inside another for-each block; finds the deepest level | max_depth_nested_for_each |
|  | the number of loops that contain an if block |  | num_loop_contain_if |
| about jumping statements | the number of break statements |  | num_break |
|  | the number of continue statements |  | num_continue |
|  | the number of return statements |  | num_return |
| about final and static | the number of final local variables |  | num_final_local_var |
|  | the number of final parameters |  | num_final_param |
| about assignments | the number of compound assignments | += -= *= /= %= &= \|= ^= <<= >>= >>>= | num_compand_assignment |
|  | the number of decrement/increment statements | ++ -- | num_increment_decrement |
|  | the number of prefix decrement/increment statements | ++x --x | num_increment_decrement_prefix |
|  | the number of postfix decrement/increment statements | x++ x-- | num_increment_decrement_post |
| about fundamental data | the number of literals | integer literals, like 231; floating-point literals, like 3.1415926; string literals, like "hahahha"; character literals, like 'a'; true/false | num_literal |
|  | the number of primitive-type variables | byte, short, int, long, float, double, char, boolean | num_primitive_var |
|  | the number of assert statements |  | num_assert |
| about others | the number of classes used in the method |  | num_used_class |
|  | the number of anonymous classes in the method |  | num_anonymous_class_definition |
|  | the number of inner classes in the method |  | num_inner_class_definition |
|  | the access level of the method | public, private, protected, and default are represented by 0, 1, 2, and 3 respectively | level_method_access |
|  | the number of annotations in the method |  | num_annotation |
|  | the number of exceptions thrown by the method |  | num_throw_exception |
| about distance | the min distance from definition to use | the distance from the definition of a variable to its use; finds the minimum | min_line_var_use |
|  | the max distance from definition to use | the distance from the definition of a variable to its use; finds the maximum | max_line_var_use |
|  | the average distance from definition to use | the distance from the definition of a variable to its use; finds the average | avg_line_var_use |
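
The Halstead features listed under "about complexity" are all derived from four counts: unique_operator and unique_operand (the distinct operators and operands) and num_operator and num_operand (their total occurrences). The sketch below works through those formulas in Java. It is an illustration only and not CodeZhi's implementation: the class `HalsteadSketch` is a hypothetical name, the counts in `main` are made-up inputs, and the exponent in B is taken to be 2/3, which is an assumption about the original notation.

```java
// Hypothetical illustration of the Halstead formulas listed above;
// this is not CodeZhi's own code.
class HalsteadSketch {
    final int uniqueOperator; // distinct operators
    final int uniqueOperand;  // distinct operands
    final int numOperator;    // total operator occurrences
    final int numOperand;     // total operand occurrences

    HalsteadSketch(int uniqueOperator, int uniqueOperand, int numOperator, int numOperand) {
        this.uniqueOperator = uniqueOperator;
        this.uniqueOperand = uniqueOperand;
        this.numOperator = numOperator;
        this.numOperand = numOperand;
    }

    // µ = unique_operator + unique_operand
    double vocabulary() { return uniqueOperator + uniqueOperand; }

    // N = num_operator + num_operand
    double length() { return numOperator + numOperand; }

    // L = (2 * unique_operand) / (unique_operator * num_operand)
    // Not finite when the denominator is zero (e.g. a method with no operands).
    double level() { return (2.0 * uniqueOperand) / ((double) uniqueOperator * numOperand); }

    // D = 1 / L
    double difficulty() { return 1.0 / level(); }

    // V = N * log2(µ)
    double capacity() { return length() * (Math.log(vocabulary()) / Math.log(2.0)); }

    // E = V / L
    double efficiency() { return capacity() / level(); }

    // T = E / 18
    double time() { return efficiency() / 18.0; }

    // B = E^(2/3) / 1000 (exponent 2/3 is an assumption, see the lead-in)
    double bug() { return Math.pow(efficiency(), 2.0 / 3.0) / 1000.0; }

    public static void main(String[] args) {
        // Made-up counts: 4 distinct operators, 4 distinct operands,
        // 8 operator occurrences, 8 operand occurrences.
        HalsteadSketch h = new HalsteadSketch(4, 4, 8, 8);
        System.out.println("vocabulary = " + h.vocabulary());
        System.out.println("length     = " + h.length());
        System.out.println("level      = " + h.level());
        System.out.println("difficulty = " + h.difficulty());
        System.out.println("capacity   = " + h.capacity());
        System.out.println("efficiency = " + h.efficiency());
        System.out.println("time       = " + h.time());
        System.out.println("bug        = " + h.bug());
    }
}
```

With these inputs the vocabulary is 8, the length is 16, the level is 0.25, and the difficulty is 4; the capacity is 16 * log2(8), about 48, up to floating-point rounding. Note that the feature names differ from the names used in the formulas: unique_operator is reported as num_operator, and the formula's num_operator is reported as sum_times_used_operator (likewise for operands).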