A duplicate block is any section of code longer than 100 tokens that appears more than once in a repository. Tokens are counted according to how the code is parsed.
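
As a rough illustration of the definition above, the following Python sketch finds token runs that occur more than once. It is only a sketch under stated assumptions, not the detection method actually used: the function name find_duplicate_blocks and the min_tokens parameter are hypothetical, tokens are plain whitespace-separated words rather than parser tokens, and a single token sequence stands in for a whole repository.

```python
from collections import defaultdict

def find_duplicate_blocks(tokens, min_tokens=100):
    """Map each run of `min_tokens` consecutive tokens that occurs more
    than once to the list of positions where it starts.

    Hypothetical helper: the threshold and the sliding-window approach
    are assumptions made for illustration only.
    """
    positions = defaultdict(list)
    for start in range(len(tokens) - min_tokens + 1):
        window = tuple(tokens[start:start + min_tokens])
        positions[window].append(start)
    # Keep only the windows seen at more than one position.
    return {window: starts for window, starts in positions.items() if len(starts) > 1}

# A small threshold keeps the example readable.
tokens = "x = 1 + y ; z = 2 ; x = 1 + y".split()
dupes = find_duplicate_blocks(tokens, min_tokens=5)
print(dupes)  # {('x', '=', '1', '+', 'y'): [0, 10]}
```

The run x = 1 + y starts at positions 0 and 10, so it is reported once with both positions.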


The following example breaks a statement down into 11 tokens, the way a parser would see it. Each named node in the tree counts as one token.

Example statement:

a = 3 + b

statement(
  assignment_expression(
    left_side(
      name(a)
    ),
    operand(=),
    right_side(
      infix_expression(
        left_side(int),
        operand(+),
        right_side(
          name(b)
        )
      )
    )
  )
)