
How to Extend SentenceTransformers to Swedish Language - Ebbot Blog
Extending SentenceTransformers to Swedish language
The story of the NLP team from Hello Ebbot extending SentenceTransformers to Swedish started a month ago, when we unexpectedly received a call from Santa Claus...
🎅🏼 Santa: Hello, is this Hello Ebbot's NLP team? It's Santa Claus speaking! Hello Ebbot is on the Nice List this year and I have a gift for you.
👾 Hello Ebbot team: Oh Santa!! Really, you have a gift for us?
🎅🏼 Santa: Yes of course, you have all been working very hard in the year 2020. How may I help reduce your workload?
👾 Hello Ebbot team: Hmm, there is actually one thing that we want to improve right now! For our digital co-worker to respond to human language, he has to be trained to detect intent, which is basically the purpose of a message. He learns to predict it accurately from 10-20 example sentences for every intent. It would be nice if we could have an application that takes one sentence as input and outputs many sentences with the same meaning, so we don't have to come up with these examples ourselves.
🎅🏼 Santa: Aaah, then I know exactly what you need, how about my intelligent SentenceTransformers model? He can help you translate the sentences into numbers and you can use cosine similarity to find similar sentences in a big corpus.
👾 Hello Ebbot team: That's great! We will prepare and clean our list of example sentences in our database and wait for your gift!
🎅🏼 Santa: One little problem, you have to teach SentenceTransformers Swedish! He only speaks English.
👾 Hello Ebbot team: That's okay Santa, we know you have to talk to other companies on the nice list. Let us take care of this from here.
That's when we decided to train SentenceTransformers so that the model could embed Swedish text. And finally, after hours of training and many cups of coffee... SentenceTransformers now speaks Swedish fluently! 🥳 🎉
How we extended SentenceTransformers to Swedish
Following the publication Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation, we extended an English SentenceTransformers teacher model to a Swedish student model using a dataset of English-Swedish parallel sentences: the TED2020 corpus, which contains 119,602 sentences. We trained our Transformer with UKPLab's example training script in a Colab Pro notebook. Thanks to Colab Pro's Graphics Processing Unit (GPU), training took only two hours, and we achieved an accuracy of 95.6% on the test set.
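To make the teacher-student idea concrete, here is a minimal sketch of the training objective described in that publication: the student is pushed to reproduce the teacher's embedding of the English sentence for both the English sentence and its Swedish translation, using mean squared error. This is not our training code; the function names and the tiny two-dimensional vectors are made up for illustration, and real embeddings would come from the teacher and student models.

```python
# Toy sketch of the knowledge-distillation objective.
# All vectors are hypothetical stand-ins for real sentence embeddings.

def mse(u, v):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def distillation_loss(teacher_en, student_en, student_sv):
    """Average over a batch of parallel sentences of
    MSE(teacher(en), student(en)) + MSE(teacher(en), student(sv))."""
    total = 0.0
    for t, s_en, s_sv in zip(teacher_en, student_en, student_sv):
        total += mse(t, s_en) + mse(t, s_sv)
    return total / len(teacher_en)


# Two parallel sentence pairs with made-up embeddings.
teacher_en = [[1.0, 0.0], [0.0, 1.0]]  # teacher on the English sentences
student_en = [[1.0, 0.0], [0.0, 1.0]]  # student on the English sentences
student_sv = [[0.0, 0.0], [0.0, 1.0]]  # student on the Swedish translations

loss = distillation_loss(teacher_en, student_en, student_sv)
print(f"distillation loss: {loss:.3f}")
```

The loss only reaches zero when the student maps both the English sentence and its Swedish translation onto the teacher's vector, which is what places Swedish text in the same embedding space as English.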
Hello Ebbot's application built using SentenceTransformers
After extending SentenceTransformers to Swedish, we used the model to embed our corpus, a cleaned list of 56,538 example phrases that we had written in the past to teach Ebbot. We then applied cosine similarity to measure the semantic similarity between a given text and each sentence in the corpus. The application prints the most similar sentences along with their similarity scores. Using Streamlit, our NLP team built a simple web app that lets users choose how many similar phrases they want to generate. There is also an option to print either the top similar sentences or all sentences whose score falls within a chosen percentage range.
Let's take a look at some examples!
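The search step can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: the helper names (rank_corpus, top_k, in_range) are hypothetical, and the three-dimensional vectors are invented so the example runs without a model. In practice each vector would be the embedding the Swedish model produces for a sentence.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_corpus(query_vec, corpus):
    """Return (sentence, score) pairs, most similar first."""
    scored = [(s, cosine_similarity(query_vec, v)) for s, v in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def top_k(query_vec, corpus, k):
    """The k most similar sentences."""
    return rank_corpus(query_vec, corpus)[:k]


def in_range(query_vec, corpus, low, high):
    """All sentences whose score lies in [low, high]."""
    return [(s, sc) for s, sc in rank_corpus(query_vec, corpus) if low <= sc <= high]


# Made-up embeddings for three phrases taken from the examples in this post.
corpus = [
    ("när skickas min beställning", [0.9, 0.1, 0.0]),
    ("när får jag mitt paket som jag beställt", [0.8, 0.2, 0.1]),
    ("tack för hjälpen ha det så bra", [0.0, 0.1, 0.9]),
]
query_vec = [1.0, 0.1, 0.0]  # hypothetical embedding of a delivery question

for sentence, score in top_k(query_vec, corpus, 2):
    print(f"- {sentence} (Score: {score:.2f})")
```

The two options in the web app correspond to the two helpers: top_k returns a fixed number of phrases, while in_range returns everything between a lower and an upper similarity bound.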
När skickar ni grejerna som jag beställt?
- undrar när ni skickar iväg det jag beställt av er (Score: 0.93)
- jag undrar om när jag får grejerna som jag beställt (Score: 0.91)
- hej jag har beställt varor utav er fått undrar vart resterande tagit vägen (Score: 0.90)
- när får jag mina beställda varor (Score: 0.89)
- när får jag mitt paket som jag beställt (Score: 0.89)
- när måste jag hämta beställning (Score: 0.89)
- när kommer saker jag beställer fram (Score: 0.88)
- när skickas min beställning (Score: 0.88)
- och du undrar jag hur jag ska gå till väga skickar ni hit någon som hämtar den då jag hade hemleverans (Score: 0.88)
- vart är mina saker som jag har beställt (Score: 0.88)
Tack för all hjälp, ni är bäst!
- kanon tack för all hjälp ha det gott (Score: 0.98)
- superbra tack så mycket för hjälpen (Score: 0.98)
- toppen tack så mycket för hjälpen (Score: 0.98)
- toppen tack för din hjälp 👍🏾 (Score: 0.98)
- tack för hjälpen ha det så bra (Score: 0.98)
- stort tack du har varit till stor hjälp (Score: 0.98)
- perfekt tack så mycket för hjälpen (Score: 0.98)
- toppen tack tack för bra service (Score: 0.98)
- oh toppen tack för din hjälp (Score: 0.98)
- excellent thanks for your help (Score: 0.98)
You can see that the application is not only finding other sentences with similar words, but is actually able to return sentences with the same meaning. This is what makes SentenceTransformers a powerful and helpful tool for us, because the more creative we are with the example phrases, the better Ebbot becomes at detecting intents!
Extremely excited about our result, Santa 🎅🏼 called to congratulate us and asked when the application would be ready for production. Even though we are proud of ourselves for successfully extending SentenceTransformers to Swedish, we told him that we still want to test it internally and make improvements before the official release. We thanked Santa 🎅🏼 again and promised him we would work even harder in the year 2021 to stay on the nice list 🎄 And so Hello Ebbot's journey for the year 2021 begins...