You can now use language recognition in Flow Builder to personalise and improve your customers' experience.
In this example, we want to reply with a different SMS depending on the language of the incoming message. To set this up, follow these steps:
1. Add a 'Branch' Step and select the option 'is in language' from the second drop-down menu.
2. Select the desired language from the third drop-down menu.
3. Add as many branches as you need, and don't forget to set up the variables for each one. For this example, we will set up three branches: Polish, Dutch and Spanish.
4. Add the desired response for each branch. For this example, we want to add a 'Reply using SMS' Step:
- For Polish, the SMS should say: Witamy!
- For Dutch: Welkom!
- For Spanish: ¡Bienvenido!
- For any other SMS whose language cannot be identified, the reply is in English, since that is the language the customer is most likely to know.
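The branching logic above can be pictured as a simple lookup with a default. The following Python sketch is only an illustration of that idea, not how Flow Builder is implemented: the language codes, the `choose_reply` function and the English text "Welcome!" are assumptions made for this example.

```python
# Hypothetical mapping from a detected language code to the SMS reply.
# The codes ("pl", "nl", "es") are assumed for illustration only.
REPLIES = {
    "pl": "Witamy!",
    "nl": "Welkom!",
    "es": "¡Bienvenido!",
}

# Assumed English fallback text; the article only says the reply is in English.
DEFAULT_REPLY = "Welcome!"


def choose_reply(detected_language):
    """Return the reply for a detected language, or the English fallback."""
    if detected_language is None:
        # The language could not be identified at all.
        return DEFAULT_REPLY
    # Any language without its own branch also gets the fallback.
    return REPLIES.get(detected_language.lower(), DEFAULT_REPLY)


if __name__ == "__main__":
    for code in ("pl", "nl", "es", "de", None):
        print(code, "->", choose_reply(code))
```

Each entry in `REPLIES` corresponds to one branch in the flow, and the default plays the role of the English reply for everything else.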
You're all set! The final flow using the language recognition feature looks like this:
Language detection accuracy
The language detection service is a machine learning model: it learns to make predictions for a task from a large set of examples of that task. The dataset used to train the model can never contain every combination of characters used in every language. No such dataset exists, and if it did, it would be too big to handle. Because the model learns from an incomplete dataset, we cannot guarantee that its predictions are always correct.
The model used in the language service was tested on a dataset of ~64,000 tweets, and it classified the language correctly for 89% of them. However, this is not necessarily the accuracy you will see in practice, since we do not know what data customers will send.
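To make the accuracy figure concrete: it is the share of test messages for which the predicted language matches the true language. The sketch below shows that calculation on a tiny made-up sample. The `accuracy` function and the sample labels are hypothetical and are not the actual evaluation code or data.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


if __name__ == "__main__":
    # Made-up example: one Spanish message is mistaken for Portuguese.
    predicted = ["es", "pt", "nl", "pl"]
    actual = ["es", "es", "nl", "pl"]
    print(f"{accuracy(predicted, actual):.0%}")
```

Measuring the same number on a sample of your own messages is one way to see how well the figure carries over to your traffic.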
The model runs on the MessageBird servers alone. Your data is safe; it will never be sent to any third parties.
Some scenarios where the model is known to have difficulty predicting the correct language with high certainty:
- With very short text, mainly single words. Because single words are likely to occur in multiple languages, the model has difficulty correctly predicting these.
- Languages that are very similar. For instance, Spanish and Portuguese are fairly similar. For some sentences which are very alike in the two languages, the model might have difficulty predicting the correct one.
- Text containing irregular characters (or irregular combinations of characters). This mainly happens when a user tries to trick the system, for example by sending a single emoji, for which language detection is fairly arbitrary because an emoji contains no language clues. Another example is using characters that look alike but are different, such as the “regular space” and the “Japanese space”. Both spaces look the same, but the model will have difficulty predicting the language when the Japanese space is used consistently in an English text.
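If you process messages yourself before they reach a flow, some of these cases can be softened with light pre-processing. The Python sketch below is one possible approach, not part of the language detection service. It assumes the “Japanese space” is the ideographic space U+3000, and the `MIN_LETTERS` threshold and both function names are made up for illustration.

```python
import unicodedata

# Hypothetical threshold: below this many letters, treat the text as too
# short to trust a language prediction.
MIN_LETTERS = 10


def normalize_text(text):
    """Apply Unicode NFKC normalization.

    NFKC maps compatibility characters to their plain equivalents, so the
    ideographic space (U+3000) becomes a regular space (U+0020).
    """
    return unicodedata.normalize("NFKC", text)


def has_enough_signal(text, min_letters=MIN_LETTERS):
    """Return True if the text contains enough letters to be worth classifying."""
    letters = sum(1 for ch in normalize_text(text) if ch.isalpha())
    return letters >= min_letters


if __name__ == "__main__":
    print(repr(normalize_text("Hello\u3000world")))
    print(has_enough_signal("\U0001F44D"))  # a single emoji has no letters
    print(has_enough_signal("Hello, how are you today?"))
```

Note that NFKC also rewrites other compatibility characters (for example, full-width Latin letters), so check that this suits your messages. Text that fails the `has_enough_signal` check could simply be sent to the English fallback branch.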