diff(3) - Linux man page
Name
diff - compute a shortest edit script (SES) given two sequencesSynopsis
#include <mba/diff.h>
typedef const void *(*idx_fn)(const void *s, int idx, void *context);typedef
int (*cmp_fn)(const void *e1, const void *e2, void *context);
typedef enum {DIFF_MATCH = 1,DIFF_DELETE,DIFF_INSERT} diff_op;
struct diff_edit {short op; /* DIFF_MATCH, DIFF_DELETE or DIFF_INSERT
*/int off; /* off into a if MATCH or DELETE, b if INSERT */int len;};
int diff(const void *a, int aoff, int n,
const void *b, int boff, int m, idx_fn
idx, cmp_fn cmp, void *context, int dmax,
struct varray *ses, int *sn, struct varray
*buf);
Description
[1] E. Myers, ''An O(ND) Difference Algorithm and Its Variations,'' Algorithmica 1, 2 (1986), 251-266. http://www.cs.arizona.edu/people/gene/PAPERS/diff.ps
- diff
- The diff(3m) function computes the minimumedit distancebetween the sequences a (from aoff for length n) and b
(from boff for length m) and optionally collects theedit scriptnecessary to transform a into b storing the result in the
varray identified by the ses parameter. The idx paremeter is a pointer to an idx_fn that returns the element at the numeric index
in a sequence. The cmp parameter is a pointer to a cmp_fn function that returns zero if the two elements e1 and e2 are equal and
non-zero otherwise. The supplied context parameter will be passed directly to both functions with each invokation. If idx and cmp are
NULL a and b will be compared as continuous sequences of unsigned char (i.e. raw memory diff).
If the ses parameter is not NULL it must be a varray(3m) with a membsize of sizeof(struct diff_edit). Each struct diff_edit element in the varray(3m) starting from 0 will be populated with the op, off, and len that together constitute the edit script. The number of struct diff_edit elements in the edit script is written to the integer pointed to by the sn parameter. If the ses or sn parameter is NULL, the edit script will not be collected.
If the dmax parameter is not 0, the calculation will stop as soon as it is determined that the edit distance of the two sequences equals or exceeds the specified value. A value of 0 indicates that there is no limit.
If the buf parameter is not NULL it must be a varray(3m) with membsize of sizeof(int) and will be used as temporary storage for the dynamic programming tables. If buf is NULL storage will be temporarily allocated and freed with malloc(3) and free(3).
Returns
- diff
- The diff(3m) function returns the edit distance between the two sequences a and b or dmax if the edit distance has been determined to meet or exceed the specified dmax value. If an error occurs -1 is returned and errno is set to indicate the error.