Semantically-based Patent Thicket Identification

“Semantically-based patent thicket identification”
Mateusz Gątkowski, Marek Dietl, Łukasz Skrok, Ryan Whalen, Katharine Rockett
Research Policy
Published online on 3 Feb 2020
Abstract: Patent thickets have been identified as a major stumbling block in the development of new technologies, creating the need to accurately identify thicket membership. Various citation-based methodologies (Graevenitz et al., 2011; Clarkson, 2005) have been proposed, which have relied on broad survey results (Cohen et al., 2000) for validation. Expert evaluation is an alternative direct method of judging thicket membership at the individual patent level. While this method is potentially robust to drafting and jurisdictional differences in patent design, it is also costly to use on a large scale. We employ a natural language processing technique, which does not carry these large costs, to proxy expert views closely. Furthermore, we investigate the relation between our semantic measure and citation-based measures, finding them quite distinct. We then combine a variety of thicket indicators into a statistical model to assess the probability that a newly added patent belongs to a thicket. We also study the role each measure plays, as part of creating a prospective screening model that could improve the efficiency of the patent system, in response to Lemley (2001).