Artificial Intelligence

When we bought our very first PlayStation console, it came with Tekken 3. We played almost every single day. My favorite character was Eddie Gordo, and unlike the newbies who just mashed random buttons, I actually knew how to play him. But one thing that always fascinated me in Tekken. In practice mode, you could configure your opponent to block every single attack. No matter what move I tried, the opponent would effortlessly anticipate and block it, almost as if it were reading my mind. Switch the mode to attacking, and suddenly the it would give me a beating like a pro player.

When I was a kid, we called it the CPU. Today, most people would call it the "AI". However, looking back, I realize that the humble PS One didn’t have the resources to run anything close to what we now call artificial intelligence. It wasn’t "thinking" or predicting my moves like modern AI might; it was just a well-written program executing specific rules set by the developers. In fact, even in many modern games, CPU opponents still don’t use true AI. They rely on complex patterns and scripted responses to simulate intelligence.

This brings us to an interesting observation: the term "AI" is often used as a catch-all for any complex, intelligent-seeming software. But in reality, much of what we call AI today—especially in video games or even some chatbots—isn’t intelligence in the human sense. It's simply advanced programming designed to give the illusion of intelligence.

Any sufficiently advanced software is indistinguishable from AI. Whether it’s a character blocking every move in Tekken or a chatbot answering questions, the line between advanced software and what we perceive as AI often gets blurred. In this chapter, we'll explore what AI really is, how it's different from other forms of software, and how it fits into the bigger picture.

What is AI?

Artificial Intelligence refers to the simulation of human intelligence by machines that are designed to perform tasks that typically require human cognitive functions, such as problem-solving, learning, and understanding language. AI has many applications, but when it comes to chatbots, we're working with a specific subset of AI.

In the context of chatbots, AI is not a monolithic entity. It relies on several key components working together:

Classifier: This helps the chatbot understand the intent behind a user's message. For example, when a customer asks, "Where is my order?" the classifier identifies that the intent is to request order tracking information.
Sentiment Analyzer: This AI tool gauges the emotional tone of a customer’s message. It can detect whether a customer is frustrated, satisfied, or neutral. A sentiment analyzer helps the chatbot adjust its responses, offering more empathetic replies when dealing with an upset customer.
Entity Recognition: This component identifies specific pieces of information from a user's message, such as dates, locations, product names, or order numbers. If a customer asks, "Where is my blue jacket?" the entity recognizer will extract "blue jacket" as the relevant item.

Each of these AI tools works in tandem to make chatbots more responsive and capable of handling complex customer inquiries. Instead of a simple script that looks for keywords, AI in chatbots ensures better understanding and more personalized interactions, improving the overall customer experience.

When we think of AI in the context of chatbots, it’s easy to imagine a single, all-encompassing model that handles everything from understanding the user’s intent to generating a response. But in reality, we can leverage multiple AI models, each with their own unique specialty.

By using multiple AI passes, we can extract more nuanced information from user messages, making the chatbot not only more responsive but also more empathetic and contextually aware. This multi-layered approach ensures that the chatbot isn’t just reacting to what the customer is saying but is also considering how they are saying it and why they might be saying it. Each additional AI model we add becomes a new feature that can enhance the chatbot’s effectiveness in resolving customer issues.

The Classifier

A classifier's main job is to categorize or classify input data into predefined categories. In the context of chatbots, a classifier helps to determine the type of question or request a user is making by analyzing their input and mapping it to a specific class, or intent.

At a basic level, the classifier works by learning from a large dataset of labeled examples during its training process. For instance, if the goal is to identify whether a user is asking about an order status, making a return, or requesting help with a product, the classifier is trained on thousands of real-world examples of user queries related to these categories.

When a user sends a message to the chatbot, the classifier takes this message and analyzes it to predict the most likely category or intent. The accuracy of this prediction depends on the quality of the training data and how well the classifier is tuned to handle the various nuances of language.

In eCommerce, the classifier is particularly useful because customer queries often fall into specific, repeated categories like order tracking, product details, or shipping concerns. By efficiently classifying these inquiries, the chatbot can provide faster, more accurate responses, ultimately improving the user experience and reducing the need for human intervention.

In summary, a classifier allows a chatbot to understand the type of message it’s receiving and respond in a relevant way, forming the backbone of a chatbot's ability to interpret and act on user inputs.

Sentiment Analyzer

A sentiment analyzer is a crucial tool in chatbots that allows them to detect and interpret the emotional tone or attitude behind a user's message. Rather than just focusing on what the user says, a sentiment analyzer helps the chatbot understand how the user is feeling, whether they are happy, frustrated, neutral, or upset.

A sentiment analyzer uses natural language processing (NLP) techniques to assess the emotional content of a message. It does this by examining the words, phrases, and sentence structure to determine the overall tone. The process often involves:

Lexical Analysis: The sentiment analyzer looks at specific words (e.g., "happy," "angry," "terrible") and assigns them positive, negative, or neutral sentiment scores.
Contextual Understanding: Beyond just words, it also considers the context and combinations of words. For example, "I didn't like it" is negative, but "I thought I wouldn't like it, but I did!" is positive. The sentiment analyzer uses context to avoid misclassifying such cases.
Polarity Score: Many sentiment analyzers will produce a score, called a "polarity score," which ranges from very negative to very positive. This numerical value helps the chatbot measure how strong the emotion is and adjust its responses accordingly.

Role in Chatbots

In chatbot systems, sentiment analysis plays an essential role in managing user interactions effectively. It helps the chatbot gauge the user's emotional state and respond in a more empathetic and relevant manner. Here’s how it benefits the interaction:

Tone-Adjusted Responses: If the sentiment analyzer detects that a customer is frustrated or angry (e.g., "This is taking forever!"), the chatbot can offer a more empathetic response, such as "I'm really sorry for the delay, let me help you right away!"—rather than a neutral or scripted reply.
Escalation Triggers: In many cases, if the sentiment analysis reveals that a customer is extremely dissatisfied or upset, the chatbot can automatically escalate the conversation to a human agent who can handle the situation more delicately.
Improving User Experience: By recognizing emotions, the chatbot can not only provide the right answers but also foster a better relationship with the customer. A message like "I'm having a hard time with your product" might prompt the chatbot to offer more assistance, while a positive message like "This is great, thanks!" may receive a cheerful confirmation in return.
Personalizing Responses: Chatbots that are aware of the customer’s emotional state can adapt their responses to better match the situation. If a user is happy, the chatbot might keep the conversation light and friendly. If a user is upset, the chatbot might prioritize offering solutions and minimize pleasantries.

Applications in eCommerce

In the eCommerce world, sentiment analysis can be especially useful because customer interactions are often charged with emotion—especially when things go wrong. Sentiment analysis helps in:

Handling Complaints: When a customer expresses frustration about a delayed package or a defective product, the sentiment analyzer identifies the negative emotion and prompts the chatbot to be more apologetic and helpful.
Upselling or Cross-Selling: If the sentiment analyzer detects that a customer is pleased with their purchase or experience, the chatbot might see this as an opportunity to suggest related products or offer discounts.
Proactive Support: For neutral or mildly negative sentiment, the chatbot can offer proactive help. For example, if a user seems unsure (e.g., "I'm not sure this is the right product"), the chatbot might suggest more information or recommend alternatives.

Sentiment analysis gives chatbots the ability to "read between the lines" and respond to not just the words but also the emotions of users. By doing so, it can create more satisfying and human-like interactions, even in automated customer service settings.

Entity Recognition

Entity recognition is a key AI feature in chatbots that enables the system to identify and extract specific pieces of information from a user's message. This could include names, dates, locations, product names, or order numbers. Rather than interpreting the whole message as a block of text, entity recognition breaks it down and highlights the most important elements, allowing the chatbot to process the message with a more precise understanding of the user's needs.

How Does Entity Recognition Work?

Entity recognition, often called named entity recognition (NER), is a natural language processing (NLP) technique that categorizes different parts of the text into predefined categories, such as:

Person Names: Recognizing individuals, e.g., "John Doe."
Locations: Identifying places, e.g., "New York" or "the office."
Dates and Times: Understanding references to specific dates or time periods, e.g., "tomorrow," "August 15," or "next week."
Product Names: Recognizing specific product names in an eCommerce context, e.g., "iPhone 14" or "Sony headphones."
Numbers: Identifying numeric values, such as "2 items," or order numbers, e.g., "Order #123456."
Monetary Values: Identifying currency or prices, e.g., "$150" or "€200."
Email Addresses: Recognizing and extracting email addresses.
Other Specific Entities: Depending on the chatbot's domain, entity recognition can be customized to pick out important categories specific to that context, such as order status, shipping companies, or special product features.

How Entity Recognition Works in Chatbots

Entity recognition allows a chatbot to extract meaningful information from customer messages, helping it better understand and respond to the user's request. Here's how it works:

Breaking Down the Message: When a customer types a message like "I need help with my order #123456," entity recognition identifies "order #123456" as a key element (in this case, the order number). It classifies "order #123456" as an entity related to order management, which the chatbot can use to look up relevant information in the eCommerce system.
Understanding User Intent: Recognizing specific entities in the message gives the chatbot context about what the customer is referring to. For example, if a user says, "I'm going to be out of town from July 5th to July 10th, can you delay my shipment?" the chatbot identifies "July 5th" and "July 10th" as dates and "shipment" as a relevant entity. The bot can then provide a more accurate response, such as offering to adjust the shipping date.
Providing Precise Responses: Entity recognition enables the chatbot to offer specific answers by connecting these recognized entities to backend systems or databases. If a user asks, "Where's my Sony headphones order?" the chatbot identifies "Sony headphones" as the product entity and can use this to track the order status more efficiently.
Data Extraction for Further Use: Even if the chatbot isn't able to fully resolve a query, it can pass along extracted entities (like order numbers, product names, or dates) to a human agent, saving time and reducing back-and-forth questioning.

Applications in eCommerce

Entity recognition is especially valuable in eCommerce chatbots because customers often provide critical details in their queries that the chatbot must capture to function effectively. Here are some common use cases:

Order Status Inquiries: When a user asks, "Where is my order #123456?" entity recognition pulls out the order number, allowing the chatbot to instantly look up the order and provide the customer with real-time shipping updates.
Product Availability: If a customer inquires, "Do you have iPhone 14 in stock?" entity recognition extracts "iPhone 14" as the product and can cross-reference it with the inventory to check availability.
Returns and Refunds: When a customer says, "I want to return my laptop I bought last week," the entity recognition system identifies "return" as the action and "laptop" as the product, making it easier for the chatbot to guide the customer through the return process.
Appointment Scheduling: If a business chatbot is involved in booking or scheduling, a customer might say, "I need an appointment on Friday at 3 PM." The bot recognizes "Friday" and "3 PM" as entities related to time, allowing it to check availability and confirm or reschedule appointments.
Billing Issues: When a user mentions, "I was charged $100 on my last order," entity recognition highlights "$100" as a monetary value and "last order" as an important reference, allowing the chatbot to quickly find and resolve the billing issue.

Why It's Important

Without entity recognition, chatbots would struggle to handle complex or detailed queries. The ability to extract and categorize important pieces of information allows a chatbot to go beyond generic conversations and provide truly valuable assistance to users. It turns a chatbot into a functional tool that understands specific details and can act on them accordingly, making it indispensable for eCommerce platforms and other industries where precision is critical.

The Role of AI in the Chatbot

You can think of AI as one of the many integrations. Its initial goal is to extract information from the customer’s message, pass that information to the core application, and use a company's rules to craft a response that resolves the customer's issue.

For example, when a customer types "Where's my order #123456?" into the chatbot, AI components like Entity Recognition extract "order #123456" and pass it along to the backend system to retrieve the order status. A Classifier can determine that this message falls under the category of "Order Status Inquiry" and guide the chatbot to provide the correct response based on company policy.

Sentiment Analysis can also come into play, detecting if the customer is frustrated based on the tone of their message. For instance, if a user types, "I’m so tired of waiting for my order!" the AI can recognize frustration, adjust the tone of the reply, and perhaps even escalate the issue to a human agent for quicker resolution.

Through this process, AI not only acts as a tool for understanding user inputs but also helps to generate thoughtful, tailored responses that align with company goals—whether it’s providing quick updates, resolving product issues, or handling more complex queries.

The Hand off

In an ideal world, a chatbot or AI system would be able to handle every customer inquiry seamlessly. However, the reality is that AI has limitations and can't always provide the necessary solution. When a chatbot encounters an issue it cannot resolve, the next step is crucial: the hand-off to a human agent.

The hand-off process is not just about transferring the conversation; it's about ensuring a smooth transition from automated to human assistance. When chatbot application can't answer a question or resolve an issue using AI, it should be able to recognize its limitations and immediately initiate a hand-off. This is vital to maintaining a positive customer experience, as a quick and efficient hand-off shows that the company values their time and concerns.

During this transition, the chatbot should collect as much relevant information as possible about the customer's query. This might include the nature of the issue, any attempted solutions, the customer's mood or sentiment, and any contextual data such as order history or previous interactions. By packaging this information and presenting it to the human agent, the AI can help streamline the support process. This way, the agent has all the context they need to effectively resolve the issue without making the customer repeat themselves.

A well-executed hand-off not only saves time but also enhances the customer experience by ensuring that the human agent is fully informed and ready to assist. The goal is to make the transition feel seamless, minimizing any frustration the customer might feel and maximizing their satisfaction with the service they receive. In the end, while AI can handle many tasks, the ability to recognize when it needs to step back and pass the baton to a human is just as critical to the success of a chatbot.

Keeping the AI Models up to date

AI models are only as good as the data they are trained on, and in the diverse world of eCommerce, customer communication varies significantly across different platforms and product types. For example, customers buying digital goods often use jargon that is unique to the tech-savvy community, while those purchasing physical arts and craft may use language that is more descriptive and emotive.

This variation in language can become a challenge for AI models, especially when they are trained primarily on data from one type of customer. An AI model optimized for handling inquiries about software downloads might struggle to accurately classify and respond to questions about handcrafted jewelry.

This is why it is important to have a robust training pipeline in place to allow regular updates of the AI models. This pipeline should focus on three areas in order of priority:

Handling False Positives:
False positives occur when the AI misclassifies a customer inquiry, leading to an incorrect or irrelevant response. By having a path in place to identify and report these errors, the model can be improved immediately.
Addressing Low Confidence Scores:
When the AI encounters a message it is unsure about, it gives it a low score. These low score interactions are a goldmine for improving the model.
Random Sample of new messages:
When starting integration with a new company, it is important to grab a random sample of the messages they receive and train the models with it. This ensures that false positives are less likely to occur.

Regularly updating the AI models through this pipeline not only improves their ability to classify and respond to a wider range of customer inquiries but also ensures that the chatbot remains relevant as customer communication styles evolve. Continuous training allows the chatbot to adapt to new trends, slang, and industry-specific language, ultimately expanding the number of customers it can assist effectively.

Training the model means having humans look at customer messages and decide what they actually mean. This means exposing customer personal information. To properly perform training, we need to take security and data privacy into account.

Training

github Github Source