What languages does Cludo support?
Cludo’s natural language processing consists of four steps:
- Tokenization – Splitting a sentence into individual words
- Elision – Removing elided prefixes; for example, in French: l’amour → amour, m’appelle → appelle
- Stop words – Removing filler words such as a, an, it, is, that, this, me, you, your, etc., because they add little meaning to the content
- Stemming – Converting words into their root form, e.g. pilots → pilot, grew → grow, living → live, so that searches also match derived forms of a word
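
The four steps above can be sketched as a small pipeline. This is a toy illustration only, not Cludo’s implementation: the function names, the word lists, and the suffix rules are all assumptions made for the example, and real analyzers use far larger, language-specific stop-word lists and stemming algorithms.

```python
import re

# Toy data for illustration only; NOT Cludo's actual word lists or rules.
STOP_WORDS = {"a", "an", "it", "is", "that", "this", "me", "you", "your", "the"}
ELISION_PREFIXES = {"l", "m", "d", "j", "t", "s", "n", "c", "qu"}
IRREGULAR_STEMS = {"grew": "grow"}
SUFFIX_RULES = (("ing", "e"), ("s", ""))  # (suffix to strip, replacement)


def tokenize(text):
    """Lowercase the text and split it into words, keeping inner apostrophes."""
    return re.findall(r"\w+(?:['’]\w+)*", text.lower())


def strip_elision(token):
    """Drop a known elided prefix, e.g. l’amour -> amour."""
    head, sep, tail = token.replace("’", "'").partition("'")
    if sep and tail and head in ELISION_PREFIXES:
        return tail
    return token


def stem(token):
    """Very rough stemmer: a lookup for irregular forms, then suffix rules."""
    if token in IRREGULAR_STEMS:
        return IRREGULAR_STEMS[token]
    for suffix, replacement in SUFFIX_RULES:
        # Require at least three remaining letters to avoid over-stemming.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)] + replacement
    return token


def analyze(text, use_elision=True, use_stop_words=True, use_stemming=True):
    """Run the steps in order; each optional step can be switched off."""
    tokens = tokenize(text)
    if use_elision:
        tokens = [strip_elision(t) for t in tokens]
    if use_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if use_stemming:
        tokens = [stem(t) for t in tokens]
    return tokens


print(analyze("The pilots grew living"))
# ['pilot', 'grow', 'live']
print(analyze("Je m’appelle", use_stop_words=False, use_stemming=False))
# ['je', 'appelle']
```

The optional flags mirror the idea that a language may support only some of the steps: with all three switched off, only tokenization is applied.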
Supported Languages
| Language | ISO code | Tokenization | Elision | Stop words | Stemming |
|---|---|---|---|---|---|
| Arabic | ar | ✅ | ✅ | ✅ | ✅ |
| Armenian | hy | ✅ | ✅ | ✅ | ✅ |
| Basque | eu | ✅ | ✅ | ✅ | ✅ |
| Brazilian Portuguese | pt-br | ✅ | ✅ | ✅ | ✅ |
| Bulgarian | bg | ✅ | ✅ | ✅ | ✅ |
| Catalan | ca | ✅ | ✅ | ✅ | ✅ |
| Chinese (Simplified) | zh | ✅ | ✅ | ✅ | ✅ |
| Czech | cs | ✅ | ✅ | ✅ | ✅ |
| Danish | da | ✅ | ✅ | ✅ | ✅ |
| Dutch | nl | ✅ | ✅ | ✅ | ✅ |
| English | en | ✅ | ✅ | ✅ | ✅ |
| Estonian | et | ✅ | ❌ | ❌ | ❌ |
| Finnish | fi | ✅ | ✅ | ✅ | ✅ |
| French | fr | ✅ | ✅ | ✅ | ✅ |
| Galician | gl | ✅ | ✅ | ✅ | ✅ |
| German | de | ✅ | ✅ | ✅ | ✅ |
| Greek | el | ✅ | ✅ | ✅ | ✅ |
| Hindi | hi | ✅ | ✅ | ✅ | ✅ |
| Hungarian | hu | ✅ | ✅ | ✅ | ✅ |
| Icelandic | is | ✅ | ❌ | ❌ | ❌ |
| Indonesian | id | ✅ | ✅ | ✅ | ✅ |
| Irish | ga | ✅ | ✅ | ✅ | ✅ |
| Italian | it | ✅ | ✅ | ✅ | ✅ |
| Japanese | ja | ✅ | ✅ | ✅ | ✅ |
| Korean | ko | ✅ | ❌ | ❌ | ❌ |
| Latvian | lv | ✅ | ✅ | ✅ | ✅ |
| Lithuanian | lt | ✅ | ✅ | ✅ | ✅ |
| Norwegian (Bokmål) | no | ✅ | ✅ | ✅ | ✅ |
| Norwegian (Nynorsk) | nn | ✅ | ✅ | ✅ | ✅ |
| Persian | fa | ✅ | ✅ | ✅ | ❌ |
| Polish | pl | ✅ | ✅ | ✅ | ✅ |
| Portuguese | pt | ✅ | ✅ | ✅ | ✅ |
| Romanian | ro | ✅ | ✅ | ✅ | ✅ |
| Russian | ru | ✅ | ✅ | ✅ | ✅ |
| Serbian | sr | ✅ | ❌ | ❌ | ❌ |
| Sorani (Kurdish) | ku | ✅ | ✅ | ✅ | ✅ |
| Spanish | es | ✅ | ✅ | ✅ | ✅ |
| Swahili | sw | ✅ | ❌ | ❌ | ❌ |
| Swedish | sv | ✅ | ✅ | ✅ | ✅ |
| Thai | th | ✅ | ✅ | ✅ | ❌ |
| Turkish | tr | ✅ | ✅ | ✅ | ✅ |
| Ukrainian | uk | ✅ | ✅ | ✅ | ✅ |
| Vietnamese | vi | ✅ | ❌ | ❌ | ❌ |