Abstract
In many applications, it is necessary to determine field similarity. This paper introduces a package of new substring-based algorithms for determining field similarity. Combined, the new algorithms not only achieve higher accuracy but also attain a time complexity of O(knm) (k<0.75) in the worst case, O(β*n) with β<6 in the average case, and O(1) in the best case. Throughout the paper, we use comparative examples to show that our algorithms are more accurate than that proposed by Lee et al. [1]. Theoretical analysis, concrete examples and experimental results show that our algorithms can significantly improve both the accuracy and the time complexity of field similarity calculation.
| Original language | English |
|---|---|
| Pages (from-to) | 122-133 |
| Number of pages | 12 |
| Journal | Pattern Analysis and Applications |
| Volume | 6 |
| Issue number | 2 |
| State | Published - 2003 |
| Externally published | Yes |
Keywords
- Data cleaning
- Data mining
- Field similarity
- Pattern recognition
- Record similarity
- String similarity
Fingerprint
Dive into the research topics of 'Faster algorithm of string comparison'. Together they form a unique fingerprint.