What languages does Cludo support?
The natural language processing at Cludo consists of multiple steps:
- Tokenization – Splitting a sentence into individual words
- Elision – Removing elided prefixes, for example in French: l’amour → amour, m’appelle → appelle
- Stop words – Removing filler words such as a, an, it, is, that, this, me, you, and your, as they add little context to the content
- Stemming – Converting words to their root form, e.g. pilots → pilot, grew → grow, living → live (so that derived forms of a word are matched)
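The four steps above can be sketched as a small pipeline. This is a minimal illustration, not Cludo's implementation: the stop-word list, the elision prefixes, the function names (`tokenize`, `strip_elision`, `stem`, `analyze`), and the toy plural-only stemmer are all assumptions made for the example. A production system would use a proper per-language stemming algorithm.

```python
import re

# Illustrative data only: assumed for this sketch, not Cludo's configuration.
STOP_WORDS = {"a", "an", "it", "is", "that", "this", "me", "you", "your", "the"}
ELISION_PREFIXES = {"l", "m", "t", "d", "j", "n", "s", "qu"}  # French elided forms


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens, keeping apostrophes inside words."""
    return re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())


def strip_elision(token: str) -> str:
    """Drop an elided prefix such as l' or m' (l'amour -> amour)."""
    prefix, sep, rest = token.replace("’", "'").partition("'")
    if sep and rest and prefix in ELISION_PREFIXES:
        return rest
    return token


def stem(token: str) -> str:
    """Toy stemmer: strips a plural -s only (pilots -> pilot)."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text: str, elision: bool = False) -> list[str]:
    """Run tokenization, optional elision removal, stop-word removal, stemming."""
    tokens = tokenize(text)
    if elision:  # only enable for languages that use elision, e.g. French
        tokens = [strip_elision(t) for t in tokens]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [stem(t) for t in tokens]


print(analyze("The pilots"))
print(analyze("Je m’appelle l’amour", elision=True))
```

Elision is behind a flag here because, as the table below indicates, not every step applies to every language. Irregular forms such as grew → grow need a dictionary or a fuller algorithm and are out of reach of this toy stemmer.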
Supported Languages
Language | ISO code | Tokenization | Elision | Stop words | Stemming |
---|---|---|---|---|---|
Arabic | ar | ||||
Armenian | hy | ||||
Basque | eu | ||||
Brazilian Portuguese | pt-br | ||||
Bulgarian | bg | ||||
Catalan | ca | ||||
Chinese (Simplified) | zh | ||||
Czech | cs | ||||
Danish | da | ||||
Dutch | nl | ||||
English | en | ||||
Estonian | et | ||||
Finnish | fi | ||||
French | fr | ||||
Galician | gl | ||||
German | de | ||||
Greek | el | ||||
Hindi | hi | ||||
Hungarian | hu | ||||
Icelandic | is | ||||
Indonesian | id | ||||
Irish | ga | ||||
Italian | it | ||||
Japanese | jp | ||||
Korean | ko | ||||
Latvian | lv | ||||
Lithuanian | lt | ||||
Norwegian (bokmål) | no | ||||
Norwegian (nynorsk) | nn | ||||
Persian | fa | ||||
Polish | pl | ||||
Portuguese | pt | ||||
Romanian | ro | ||||
Russian | ru | ||||
Serbian | sr | ||||
Sorani (Kurdish) | ku | ||||
Spanish | es | ||||
Swahili | sw | ||||
Swedish | sv | ||||
Thai | th | ||||
Turkish | tr | ||||
Ukrainian | uk | ||||
Vietnamese | vi |