Documentation
¶
Overview ¶
Package lcs provides implementations of algorithms to find the longest common subsequence/shortest edit script (LCS/SES) between two slices suitable for use with unicode/utf8 and other alphabets.
Index ¶
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type DP ¶
type DP[T comparable] struct { // contains filtered or unexported fields }
DP represents a dynamic programming based implementation for finding the longest common subsequence and shortest edit script (LCS/SES) for transforming A to B. See https://en.wikipedia.org/wiki/Longest_common_subsequence_problem. This implementation can return all LCS and SES rather than just the first one found. If a single LCS or SES is sufficient then the Myer's algorithm implementation is lilkey a better choice.
Example ¶
package main import ( "fmt" "unicode/utf8" "cloudeng.io/algo/codec" "cloudeng.io/algo/lcs" ) func main() { runeDecoder := codec.NewDecoder(utf8.DecodeRune) a, b := runeDecoder.Decode([]byte("AGCAT")), runeDecoder.Decode([]byte("GAC")) all := lcs.NewDP(a, b).AllLCS() for _, lcs := range all { fmt.Printf("%s\n", string(lcs)) } }
Output: GA GA GC AC
func NewDP ¶
func NewDP[T comparable](a, b []T) *DP[T]
NewDP creates a new instance of DP. The implementation supports slices of bytes/uint8, rune/int32 and int64s.
type Edit ¶
type Edit[T comparable] struct { Op EditOp A, B int Val T }
Edit represents a single edit. For deletions, an edit specifies the index in the original (A) slice to be deleted. For insertions, an edit specifies the new value and the index in the original (A) slice that the new value is to be inserted at, but immediately after the existing value if that value was not deleted. Insertions also provide the index of the new value in the new (B) slice. A third operation is provided, that identifies identical values, ie. the members of the LCS and their position in the original and new slices. This allows for the LCS to retrieved from the SES.
An EditScript that can be trivially 'replayed' to create the new slice from the original one.
var b []uint8 for _, action := range actions { switch action.Op { case Insert: b = append(b, action.Val.(int64)) case Identical: b = append(b, a[action.A]) } }
type EditScript ¶
type EditScript[T comparable] []Edit[T]
EditScript represents a series of Edits.
func (*EditScript[T]) Apply ¶
func (es *EditScript[T]) Apply(a []T) []T
Apply transforms the original slice to the new slice by applying the SES.
func (*EditScript[T]) FormatHorizontal ¶
func (es *EditScript[T]) FormatHorizontal(out io.Writer, a []T)
FormatVertical prints a representation of the edit script across three lines, with the top line showing the result of applying the edit, the middle line the operations applied and the bottom line any items deleted, eg:
CB AB AC -+|-||-|+ A C B
func (*EditScript[T]) FormatVertical ¶
func (es *EditScript[T]) FormatVertical(out io.Writer, a []T)
FormatVertical prints a representation of the edit script with one item per line, eg:
- 6864772235558415538 -8997218578518345818
- -6615550055289275125
- -7192184552745107772 5717881983045765875
func (*EditScript[T]) Reverse ¶
func (es *EditScript[T]) Reverse() *EditScript[T]
Reverse returns a new edit script that is the inverse of the one supplied. That is, of the original script would transform A to B, then the results of this function will transform B to A.
type Myers ¶
type Myers[T comparable] struct { // contains filtered or unexported fields }
Myers represents an implementation of Myer's longest common subsequence and shortest edit script algorithm as as documented in: An O(ND) Difference Algorithm and Its Variations, 1986.
Example ¶
package main import ( "fmt" "unicode/utf8" "cloudeng.io/algo/codec" "cloudeng.io/algo/lcs" ) func main() { runeDecoder := codec.NewDecoder(utf8.DecodeRune) a, b := runeDecoder.Decode([]byte("ABCABBA")), runeDecoder.Decode([]byte("CBABAC")) fmt.Printf("%v\n", string(lcs.NewMyers(a, b).LCS())) }
Output: BABA
func NewMyers ¶
func NewMyers[T comparable](a, b []T) *Myers[T]
NewMyers returns a new instance of Myers. The implementation supports slices of bytes/uint8, rune/int32 and int64s.
func (*Myers[T]) SES ¶
func (m *Myers[T]) SES() *EditScript[T]
SES returns the shortest edit script.