German heatmap based on Wikipedia articles 🇩🇪
Deykun to DACH - jetzt auf [email protected] • 1 year ago • media.kbin.social • 27 comments
Deykun (OP) • 1 year ago
To clarify, the figure is not the total number of words but the number of unique words considered. IMHO a million unique words is okay. A bigger concern for me is that words on Wikipedia can be overly specific.
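For anyone unsure about the distinction between total and unique words, here is a minimal sketch. It is not the code behind the heatmap; the function name `count_words`, the regex tokenizer, and the sample sentence are assumptions made purely for illustration.

```python
import re
from collections import Counter


def count_words(text: str) -> tuple[int, int]:
    """Return (total words, unique words) for a piece of text."""
    # Lowercase the text and extract runs of letters, including German
    # umlauts and ß. This is a deliberately naive tokenizer (an assumption).
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    counts = Counter(tokens)
    # len(tokens) counts every occurrence; len(counts) counts distinct words.
    return len(tokens), len(counts)


sample = "Der Hund sieht den Hund und die Katze sieht den Hund"
total, unique = count_words(sample)
print(total, unique)
```

A repeated word such as "Hund" adds to the total every time it appears but only once to the unique count, which is why the unique figure grows much more slowly than the raw word count.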
@[email protected] • 1 year ago
Have you considered a similarity search approach? It would handle your issue with oddly specific synonyms.
Deykun (OP) • 1 year ago
I only have a pre-spellchecked list of words from here: http://www.aaabbb.de/WordList/WordList_en.php
That million words sounds like a lot.