An Optimal Decomposition Algorithm for Tree Edit Distance


This page contains links to the conference and journal versions of this paper, as well as the conference presentation. The implementation includes our O(n3)-time O(n2)-space algorithm (DMRW) as well as an O(n2)-space implementation of Shasha & Zhang's algorithm and of Klein's. We provide a framework to compare the implementations and to change the distance metric. The code may be download and used for scholarly and academic purposes, in which case we request that you include a reference to our paper.

  • "An Optimal Decomposition Algorithm for Tree Edit Distance". Erik Demaine, Shay Mozes, Benjamin Rossman, Oren Weimann. In Proceedings of the 34th International Colloquium on Automata, Languages and Programming (ICALP 2007). slides
  • "An Optimal Decomposition Algorithm for Tree Edit Distance". Erik Demaine, Shay Mozes, Benjamin Rossman , Oren Weimann. ACM Transactions on Algorithms (TALG).
  • Abstract

    The edit distance between two ordered rooted trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. We present a worst-case O(n3)-time algorithm for the problem when the two trees have size n, improving the previous best O(n3log n)-time algorithm. We prove the optimality of our algorithm among the family of "decomposition strategy" algorithms - which also includes the previous fastest algorithms - by tightening the known lower bound of Omega(n2log2n) to Omega(n3), matching our algorithm's running time.

    Tree Format

    (root-label (child1)(child2)(child3)...)
    Where:
    root-label is a string labeling this node
    child1 - is the leftmost child of root-label
    child2 - is the second from the left child of root-label
    etc.

    Implementations

    C++ Implementations of different Tree Edit Distance Algorithms

    Boris Pismenny coded this as a final undergraduate project. The implementation includes DMRW, Klein, and Shasha-Zhang all in O(n^2) space. You can compare the algorithms using a test procedure which generates random test cases. Available are Windows (including GUI) and non-Windows (no GUI provided) versions.

    TED Implementations

  • TED Windows.zip - The Tree Edit Distance for Windows users only, with GUI. Includes an executable and source code
  • TED non-Windows.zip - The Tree Edit Distance for non-Windows users only, without GUI.  Includes an executable and source code
  • Documentation

    Make sure you read this before using the software.
  • TED User Guide for Windows.pdf - User manual for the windows Tree Edit Distance implementation
  • TED User Guide for non-Windows.pdf - User manual for the non-windows Tree Edit Distance implementation
  • Programmers Manual.pdf - Programmers manual for the Tree Edit Distance project