
Revisit Sørensen–Dice coefficient
Open, Needs Triage, Public

Description

The English Wikipedia article on the Sørensen–Dice coefficient gives an example of applying it to strings, using bigrams.
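For reference, the bigram variant can be sketched as below. This is a minimal Python illustration, not the Keruald code: the function names `bigrams` and `dice_bigrams` are hypothetical, bigrams are treated as a set (duplicates ignored), and returning 1.0 for two inputs without any bigram is an assumption.

```python
from __future__ import annotations


def bigrams(text: str) -> set[str]:
    """Return the set of adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice_bigrams(a: str, b: str) -> float:
    """Sørensen–Dice coefficient over character bigram sets: 2|X∩Y| / (|X| + |Y|)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Assumption: two inputs with no bigrams at all count as identical.
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


# "night" and "nacht" share only the bigram "ht", out of 4 bigrams each.
print(dice_bigrams("night", "nacht"))  # 0.25
```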

This is what's currently implemented in D2052.

https://pganalyze.com/blog/similarity-in-postgres-and-ruby-on-rails-using-trigrams uses trigrams instead of bigrams and gives more weight to the start of each word by padding the strings with spaces. This approach is implemented in PostgreSQL and Rails.
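A rough Python sketch of what padded trigrams could look like when plugged into the Dice formula. It assumes the padding used by PostgreSQL's pg_trgm (two spaces before each word, one after, lowercased input). The names `padded_trigrams` and `dice_trigrams` are hypothetical, and splitting words with `\w+` is a simplification (it keeps underscores, for instance). Note that pg_trgm itself computes its similarity as shared trigrams over the union, not as a Dice coefficient.

```python
from __future__ import annotations

import re


def padded_trigrams(text: str) -> set[str]:
    """Return the set of trigrams of each word, padded pg_trgm-style."""
    grams: set[str] = set()
    for word in re.findall(r"\w+", text.lower()):
        # Two leading spaces and one trailing space: the leading padding
        # yields extra trigrams anchored on the start of the word.
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def dice_trigrams(a: str, b: str) -> float:
    """Sørensen–Dice coefficient over padded trigram sets."""
    x, y = padded_trigrams(a), padded_trigrams(b)
    if not x and not y:
        # Assumption: two inputs with no words at all count as identical.
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


# "night" and "nacht" share "  n" and "ht " out of 6 trigrams each.
print(round(dice_trigrams("night", "nacht"), 4))  # 0.3333
```

With padding, the shared first letter of "night" and "nacht" contributes a matching trigram, which the plain bigram version does not capture.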

We should determine whether switching our Sørensen–Dice code to such padded trigrams would improve it.

Event Timeline

dereckson created this task. Feb 16 2020, 04:21
dereckson moved this task from Backlog to Dev on the easy board.
dereckson moved this task from Backlog to Feature requests on the Keruald board.