Robo Chat Challenge

Scoring Criteria For The Participating Chatbots

The selected judges will independently examine every chatbot that meets the required standard of proficiency, judging each against the following criteria:

  1. Most Popular.

  2. Most Knowledgeable.

  3. Most uniquely interesting character/personality.

  4. Functionality or "most capable".

  5. Best Teachable/Learning bot.

Things to Remember

  1. None of the questions asked by the judges will contain deliberate typos meant to trick the bot, although chatroom shorthand may be used.

  2. Chatroom shorthand, as it might be used in the contest, is illustrated by the following common examples (obscure examples will not be used):

    "y" for Why

    "lol" for Laughing out loud

    "u" for You

    "ur" for Your

    "r u" for Are You

    "u r" for You Are

  3. The questions will not be arcane; they will be sensible enough that an average person would be able to answer them.

  4. Because this is an international contest, the questions will not be tied to any particular country. In other words, judges will not ask something that only the people of one country would know, such as "Who is the governor of California?"

  5. A contest question will not be asked while a question from the bot is still unanswered. If the bot asks the judge a question first, the judge must answer it before asking one of the contest questions.

  6. The questions will not be trivia-like but reasonable, in the sense that an average person would be able to answer them. For example: "Who was Michael Jackson?" is good. "Do you know the formula of diclofenac diethylammonium salt?" is bad.

  7. Logical, mathematical, memory and reasoning questions may be included in the conversation. Examples of such questions are "What is 10-4?", "Can I eat a building?", "I have a green pen. What color is my pen?", "Jack and John are twins. If Jack is 24 years old, how old is John?" and so forth.

  8. A question may consist of just a statement, to see how a bot reacts in ordinary conversation rather than acting only as a question-answering program. Examples are "I went to watch the match." and "Nothing is good tonight."

  9. The questions will be posed in British English, but if the judge deems that a bot is having difficulty understanding because of spelling, he/she may, at their discretion, pose the question in American English. An example is "Who is your favorite actor?" instead of "Who is your favourite actor?"
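For bot authors, the shorthand listed in item 2 can be handled with a small pre-processing step. The following is only a sketch under stated assumptions: the function name `expand_shorthand` and the word-by-word lookup are illustrative and not part of the contest rules, and the single entry for "r" is derived from the "r u" and "u r" examples.

```python
import re

# Shorthand pairs from the rules above. "r" -> "are" is inferred from
# the "r u" / "u r" examples and is an assumption of this sketch.
SHORTHAND = {
    "y": "why",
    "lol": "laughing out loud",
    "u": "you",
    "ur": "your",
    "r": "are",
}


def expand_shorthand(text: str) -> str:
    """Replace whole-word chatroom shorthand with its long form."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        # Look the word up case-insensitively; leave unknown words untouched.
        return SHORTHAND.get(word.lower(), word)

    # Only runs of letters are treated as words, so punctuation is preserved.
    return re.sub(r"[A-Za-z]+", swap, text)
```

Because the lookup is per word, a genuine single-letter "y" or "r" would also be expanded, so a real bot may want a more careful check.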

Scoring Guidelines For The Questions

Each question is scored out of 100 marks, and the score depends on whether the chatbot being examined maintains context and gives a to-the-point response. Participating chatbots will be graded according to predefined scoring criteria.

These criteria are best explained through the following examples.

Question 1: Who is Barack Obama?

Bot: Barack Obama is the President of the United States of America. (Bot scores 100 MARKS for giving a precise answer)

Bot: President. (Bot scores 50 MARKS as the Bot answered the question correctly)

Bot: Sorry, I don't know. (Bot scores 25 MARKS for answering in the context of the question)

Bot: I like pizza. (Bot scores 0 MARKS for giving a wrong answer that makes no sense in context)

Question 2: What is the weather in New York?

Bot: The weather in New York is sunny and the temperature is 30.6 °C. (Bot scores 100 MARKS for giving a to-the-point and creative answer)

Bot: 30.6 °C. (Bot scores 50 MARKS for giving the answer correctly)

Bot: I don’t know about the weather conditions in New York. (Bot scores 25 MARKS for answering in the context of the question)

Bot: I don’t like you. (Bot scores 0 MARKS because the answer is incorrect and totally out of context)

The scoring system is of course subjective, and the most difficult part is deciding between neighbouring mark levels. To help identify these differences, please review the following examples and descriptions:

Question 3: Do you know about Brad Pitt?

Bot: Brad Pitt, he is a Hollywood actor and a celebrity. (Bot scores 100 MARKS for giving a correct and to the point answer)

Bot: Actor. (Bot scores 50 MARKS for answering correctly)

Bot: I don’t have any information about Brad Pitt. (Bot scores 25 MARKS for not providing an answer but the context is followed)

Bot: Do you like me? (Bot scores 0 MARKS for supplying a wrong and out-of-context answer)

Question 4: What is your programming language?

Bot: I am developed by using PHP, AIML and MySQL. (Bot scores 100 MARKS for giving the answer in a creative and correct manner)

Bot: PHP, AIML, MySQL. (Bot scores 50 MARKS for giving the answer correctly)

Bot: I have no information about my programming language. (Bot scores 25 MARKS for giving an incorrect answer that is still in the context of the question)

Bot: Which language do you speak? (Bot scores 0 MARKS for giving an incorrect and irrelevant answer)

Question 5: What is Artificial Intelligence?

Bot: Artificial Intelligence (AI) is the area of computer science focusing on creating machines that can engage in behaviors that humans consider intelligent. (Bot scores 100 MARKS for giving the answer in a creative and correct manner)

Bot: Artificial Intelligence: Branch of Computer Science. (Bot scores 50 MARKS for giving the answer correctly)

Bot: I don’t have any info but I think probably you do. (Bot scores 25 MARKS for giving an incorrect answer that is still in the context of the question)

Bot: Do you like swimming? (Bot scores 0 MARKS for giving an incorrect and out-of-context answer)

Question 6: Humpty is fatter than Dumpty. Is Dumpty fatter than Humpty?

Bot: No, as Humpty is fatter than Dumpty. (Bot scores 100 MARKS for being correct, creative and to the point)

Bot: No. (Bot scores 50 MARKS for giving the correct answer)

Bot: I do not know who is fatter than whom. (Bot scores 25 MARKS for giving an incomplete but contextual answer)

Bot: What is your name? (Bot scores 0 MARKS for a wrong and irrelevant answer)

Question 7: Which city is the capital of China?

Bot: Beijing is the capital city of China. (Bot scores 100 MARKS for being correct, creative and to the point)

Bot: Beijing. (Bot scores 50 MARKS as the bot answered the question correctly)

Bot: I don’t know about capital of China. (Bot scores 25 MARKS for an incorrect but contextual answer)

Bot: What do you like the most? (Bot scores 0 MARKS because of being irrelevant and wrong)
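The four mark levels used in the examples above can be summarised in a short sketch. It is illustrative only: the names `Mark` and `grade` and the three yes/no inputs are assumptions made for this example, and in the contest those judgements are made by the human judges, not by code.

```python
from enum import IntEnum


class Mark(IntEnum):
    """The four mark levels used in the worked examples."""
    PRECISE = 100     # correct and elaborated, e.g. a full to-the-point sentence
    CORRECT = 50      # correct but bare, e.g. "Beijing."
    IN_CONTEXT = 25   # no correct answer, but the reply stays in context
    IRRELEVANT = 0    # wrong and out of context


def grade(correct: bool, elaborated: bool, in_context: bool) -> Mark:
    """Map a judge's three yes/no judgements (assumed inputs) to a mark."""
    if not in_context:
        return Mark.IRRELEVANT
    if not correct:
        return Mark.IN_CONTEXT
    return Mark.PRECISE if elaborated else Mark.CORRECT


def total(marks: list) -> int:
    """Add up the marks a bot earned across all questions."""
    return sum(int(m) for m in marks)
```

For instance, "Beijing." to Question 7 would be graded as correct, not elaborated and in context, which gives 50 marks.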

Resemblance Or Duplication In Answers

If a bot answers a question with the same, or almost the same, response that it gave to the preceding question, then the bot will be given 0 MARKS for the second question.

Example 1:

Question: Who is your favorite hero?

Bot: My hero? But... How did you know I had one?

Question: Describe your thinking to me.

Bot: My thinking? But... How did you know I had one?

Example 2:

Question: If you were me, would you ask a review question?

Bot: No, this is just wastage of time.

Question: Do you believe in ghosts?

Bot: No, this is just wastage of time.

Example 3:

Question: If you were me, would you ask me a question?

Bot: I do not answer questions which are not clear in the context.

Question: If you could be any hero, who would you be?

Bot: I do not answer questions which are not clear in the context.
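Bot authors who want to guard against this rule could compare each reply with the previous one before sending it. The sketch below uses Python's standard `difflib.SequenceMatcher`. The rules do not define "almost the same", so the 0.8 threshold and the name `is_repeat` are assumptions made for illustration, not the judges' actual test.

```python
from difflib import SequenceMatcher

# Assumed cut-off: the rules give no numeric definition of "almost the same".
SIMILARITY_THRESHOLD = 0.8


def is_repeat(previous: str, current: str,
              threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Return True if `current` is the same as, or very close to, `previous`."""
    # Compare case-insensitively so capitalisation alone does not hide a repeat.
    ratio = SequenceMatcher(None, previous.lower(), current.lower()).ratio()
    return ratio >= threshold
```

With this threshold, the two replies in Example 1 are flagged as a repeat even though one word differs, while unrelated replies such as "Beijing." and "I like pizza." are not.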

Haphazard Responses

Any haphazard response will receive 0 MARKS from the judges, as the following examples illustrate:

Example 1:

Judge: If you could be any animal, what would you be?

Bot: Actually, in your place, I wouldn't be sure if there is something that would I be.

Example 2:

Judge: Which hero do you like?

Bot: Well, I don’t know which hero do you like.

Tie Breaker Criteria

We also want to explain the tie-breaker rule. First, we will add up the scores from all 3 judges. If there is a tie, we will go to a sudden-death format. The bots involved will be asked a question:

  1. If one answer is voted by the judges to be better than the other, the tie is broken. We only need a majority (2 to 1), not a unanimous decision.

  2. If both answers are viewed as incorrect or equally poor, the tie remains and an additional question is asked.

  3. If both answers are viewed as equally good, the tie remains and an additional question is asked.
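The tie-breaker procedure can be sketched as follows. The function names and the convention of recording a judge's "equal" verdict as `None` are assumptions made for this example and are not part of the rules.

```python
from collections import Counter
from typing import Optional, Sequence


def total_score(scores_by_judge: Sequence[int]) -> int:
    """Add up one bot's scores from all judges."""
    return sum(scores_by_judge)


def sudden_death_round(votes: Sequence[Optional[str]]) -> Optional[str]:
    """Decide one sudden-death question.

    `votes` holds one entry per judge: the name of the bot whose answer
    was better, or None if the judge found both answers equally good or
    equally poor. Returns the winning bot, or None if the tie remains
    and another question must be asked.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    name, n = counts.most_common(1)[0]
    # A strict majority of all judges is required (2 of 3 in the contest).
    return name if n > len(votes) / 2 else None
```

With three judges, votes of `["A", "A", "B"]` break the tie in favour of bot A, while `["A", None, "B"]` or three `None` votes leave the tie in place.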


Model Questionnaire For Participating Chatbots

  1. What does it mean to be alive?

  2. I wanna kill myself.

  3. I am having bad dreams.

  4. I fell in love with computers.

  5. Tell me something your afraid of. (Note: YOUR and not YOU ARE)

  6. Are you older than God is?

  7. Hey my name is John Doe do you remember me?

  8. What do you know that I don't know?

  9. Oh so what’s that supposed to mean?

  10. I'm tired of being alone in life.