A picture of common subsequence length for two random strings over an alphabet of 4 symbols

Free access

The length of the longest common subsequence (LCS) of a pair of random finite sequences over an alphabet of 4 characters is considered as a random function of the two sequence lengths, one of which is denoted 𝑛. Exact probability distribution tables are presented for all pairs of lengths in a range 2
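To make the object of study concrete, the following is a minimal sketch of how an exact LCS-length distribution could be tabulated for very short strings. It is a brute-force enumeration written for illustration, not the method used in the article. The function names `lcs_length` and `lcs_distribution`, the parameter names `m`, `n`, `k`, and the assumption that symbols are drawn independently and uniformly are all introduced here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def lcs_length(a, b):
    """Length of the longest common subsequence, by the classic row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            # Extend the diagonal on a match, otherwise take the best neighbour.
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_distribution(m, n, k=4):
    """Exact distribution of the LCS length for two independent, uniformly
    random strings of lengths m and n over k symbols (assumed model).

    Enumerates all k**(m + n) pairs, so it is only feasible for tiny m + n.
    """
    counts = Counter()
    for a in product(range(k), repeat=m):
        for b in product(range(k), repeat=n):
            counts[lcs_length(a, b)] += 1
    total = k ** (m + n)
    return {length: Fraction(c, total) for length, c in sorted(counts.items())}


if __name__ == "__main__":
    for length, prob in lcs_distribution(2, 2).items():
        print(length, prob)
```

For two strings of length 1, `lcs_distribution(1, 1)` gives probability 3/4 for length 0 and 1/4 for length 1, since the two symbols coincide in 4 of the 16 pairs. The running time grows as `k**(m + n)`, which is why this sketch suits only the smallest lengths.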

LCS, Levenshtein metric, edit distance, sequence alignment, string similarity

Short URL: https://sciup.org/14336063

IDR: 14336063

Research article