An Optimal Decomposition Algorithm for Tree Edit Distance
This page contains links to the conference and journal versions of this paper, as well as the conference presentation. The implementation includes our O(n3)-time O(n2)-space algorithm (DMRW) as well as an O(n2)-space implementation of Shasha & Zhang's algorithm and of Klein's. We provide a framework to compare the implementations and to change the distance metric. The code may be download and used for scholarly and academic purposes, in which case we request that you include a reference to our paper.
AbstractThe edit distance between two ordered rooted trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. We present a worst-case O(n3)-time algorithm for the problem when the two trees have size n, improving the previous best O(n3log n)-time algorithm. We prove the optimality of our algorithm among the family of "decomposition strategy" algorithms - which also includes the previous fastest algorithms - by tightening the known lower bound of Omega(n2log2n) to Omega(n3), matching our algorithm's running time.
Tree Format(root-label (child1)(child2)(child3)...)
root-label is a string labeling this node
child1 - is the leftmost child of root-label
child2 - is the second from the left child of root-label
C++ Implementations of different Tree Edit Distance AlgorithmsBoris Pismenny coded this as a final undergraduate project. The implementation includes DMRW, Klein, and Shasha-Zhang all in O(n^2) space. You can compare the algorithms using a test procedure which generates random test cases. Available are Windows (including GUI) and non-Windows (no GUI provided) versions.
DocumentationMake sure you read this before using the software.