Use native implementation of difflib

Profiling a license scan of the test file [meta-quest-oss-notice.md](https://github.com/user-attachments/files/21373464/meta-quest-oss-notice.md) with pyinstrument shows that a large share of the time is spent in the sequence matching portion: about 5.4 s of the 11.9 s total is in `find_longest_match`, almost all of it in the function's own body:
```
11.902 get_licenses  scancode/api.py:150
├─ 11.858 detect_licenses  licensedcode/detection.py:2180
│  ├─ 9.619 LicenseIndex.match  licensedcode/index.py:892
│  │  ├─ 9.469 LicenseIndex.match_query  licensedcode/index.py:960
│  │  │  ├─ 8.915 LicenseIndex.get_approximate_matches  licensedcode/index.py:718
│  │  │  │  ├─ 6.753 LicenseIndex.get_query_run_approximate_matches  licensedcode/index.py:808
│  │  │  │  │  ├─ 6.360 match_sequence  licensedcode/match_seq.py:48
│  │  │  │  │  │  ├─ 5.469 match_blocks  licensedcode/seq.py:107
│  │  │  │  │  │  │  ├─ 5.446 find_longest_match  licensedcode/seq.py:19
│  │  │  │  │  │  │  │  ├─ 5.289 [self]  licensedcode/seq.py
│  │  │  │  │  │  │  │  ├─ 0.112 dict.get  <built-in>
│  │  │  │  │  │  │  │  └─ 0.045 extend_match  licensedcode/seq.py:84
│  │  │  │  │  │  │  │     ├─ 0.037 [self]  licensedcode/seq.py
│  │  │  │  │  │  │  │     └─ 0.008 <lambda>  <string>:1
│  │  │  │  │  │  │  │        ├─ 0.007 [self]  <string>
│  │  │  │  │  │  │  │        └─ 0.001 tuple.__new__  <built-in>
```

Currently, we use a pure Python implementation of difflib to perform license detection (https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/licensedcode/seq.py). One way to improve the performance of this part of license detection would be to use a native implementation of difflib, such as https://pypi.org/project/cdifflib/ or https://github.com/rapidfuzz/CyDifflib.
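
As a rough illustration of the idea, here is a minimal sketch of swapping in a native matcher with a fallback to the standard library. It assumes both packages expose a class compatible with `difflib.SequenceMatcher` (`CSequenceMatcher` in cdifflib, `SequenceMatcher` in cydifflib). The helper `matching_blocks` and the token lists are hypothetical and only for illustration; `licensedcode/seq.py` is a customized variant of difflib, so whether these libraries can replace it directly is not established here.

```python
import difflib

# Prefer a native implementation when one is installed; otherwise fall back
# to the pure Python standard library matcher. Which backend is best for
# scancode is an open question, this only shows the drop-in pattern.
try:
    from cdifflib import CSequenceMatcher as SequenceMatcher
    BACKEND = "cdifflib"
except ImportError:
    try:
        from cydifflib import SequenceMatcher
        BACKEND = "cydifflib"
    except ImportError:
        SequenceMatcher = difflib.SequenceMatcher
        BACKEND = "difflib"


def matching_blocks(query_tokens, rule_tokens):
    """Return non-empty (query_pos, rule_pos, length) matching blocks.

    Hypothetical helper: the arguments stand in for sequences of token ids.
    """
    # autojunk=False disables difflib's "popular element" heuristic so that
    # frequent tokens are still considered for matching.
    matcher = SequenceMatcher(None, query_tokens, rule_tokens, autojunk=False)
    # Unpack as plain tuples and drop the zero-length sentinel block that
    # difflib appends at the end.
    return [(a, b, size) for a, b, size in matcher.get_matching_blocks() if size]


if __name__ == "__main__":
    print(BACKEND, matching_blocks([1, 2, 3, 4, 5], [0, 2, 3, 4, 9]))
```

Running the same scan under pyinstrument with each backend would show whether the time in `find_longest_match` actually drops.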

Use native implementation of difflib #4484

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Use native implementation of difflib #4484

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions