SEMANTIC SIMILARITY

Verbal communication with machines has been one of the main goals of computer science since its early days. In 1968, Arthur C. Clarke caught the imagination of half the planet with his novel 2001: A Space Odyssey, which was made into an enormously successful film by Stanley Kubrick in the same year as the book was launched. Computer science was still in its infancy; the first microprocessor had not even been developed. Still, the idea of artificial intelligence such as that of HAL 9000 had already seduced a generation, even though personal computers were still a thing of the future.

From left to right: Aitor González Aguirre, Dr. Lluís Padró and Dr. Horacio Rodríguez Hontoria.

As often happens in these situations, in mid-2016 we are still far from replicating the imaginary communication capabilities of HAL 9000 in 2001. It may appear extremely easy to understand human language – people do it every day – but it is a very difficult task for machines. Language is full of factors that make comprehension complicated: polysemy, irony, sarcasm and double meanings. There are many ways of saying the same thing. As if this were not enough, comprehension depends on our knowledge of the world, which we use to reason and understand each other in a process that we carry out almost without realising. The following two statements are a clear example:

  • Yesterday I saw a plane flying over New York.
  • Yesterday I saw a train flying over New York.

Although the statements are identical apart from one word, they have very different meanings. It is easy for us to imagine a plane flying over New York, but when we read the second statement we immediately realise that a train cannot fly. Consequently, we know that the meaning of this sentence must be different: for example, it may be that the speaker was flying over New York in a plane (because people cannot fly either) and saw a train from the window. It is very difficult for computers to make these kinds of inferences, but relatively easy for people.

Semantic textual similarity (STS) is a task that was originally introduced in SemEval 2012 to tackle one of the aspects of artificial intelligence that will enable machines to communicate naturally with people: the evaluation of meaning. Knowing whether two sentences have the same meaning is vital to good communication. However, meaning is not black and white: there is an entire spectrum of greys. The aim of STS is to automatically assess the similarity of two sentences on a scale from 0 (completely different) to 5 (fully equivalent). Each band on the scale describes, in an easily understandable way, the differences that make two sentences equivalent or not.

There are many techniques and resources for tackling this task, such as WordNet, Wikipedia and ontologies like SUMO, which address many of the aforementioned difficulties, including polysemy, the identification of entities (such as people and companies) and reasoning. However, these resources are created manually, which is very expensive in terms of both time and money. Recently, advances in Deep Learning have produced new, automatically generated resources that improve STS systems. One of these is word embeddings: representations that map the semantic characteristics of a word to a vector, so that words with similar meanings are mapped close to each other and words with different meanings end up further apart. The models that learn these embeddings analyse the contexts in which words appear across a very wide range of texts. On the assumption that similar words appear in similar contexts, each word is placed in an N-dimensional space in such a way that similar words remain close to each other. Once the vectors have been generated, we can calculate the similarity of two words as the cosine of the angle between their vectors.
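To make the idea concrete, here is a minimal sketch in Python. The toy vectors are invented for illustration; real embeddings, such as those produced by word2vec or GloVe, are learned from large corpora and typically have hundreds of dimensions.

```python
# A minimal sketch of word similarity via the cosine of word vectors.
# The toy vectors below are invented for illustration; real embeddings
# (e.g. from word2vec or GloVe) are learned from large corpora.
import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between u and v: 1 = same direction, ~0 = unrelated.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

plane = np.array([0.9, 0.1, 0.3, 0.7])   # hypothetical embedding
train = np.array([0.8, 0.2, 0.4, 0.6])   # close to "plane": both vehicles
apple = np.array([0.1, 0.9, 0.8, 0.1])   # far from both

print(cosine_similarity(plane, train))   # high similarity (~0.99)
print(cosine_similarity(plane, apple))   # low similarity (~0.34)
```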

The same vector that converts Man into King converts Woman into Queen (Figure 1).

However, the most interesting characteristic of these vectors is that meaning is preserved, so we can operate on them with additions and subtractions. For example, the result of the operation King − Man + Woman is approximately Queen; put another way, if the vector that codes the meaning of Man is subtracted from the vector that means King, and the vector for Woman is added, the result is the vector that means Queen (see Figure 1).
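The following sketch illustrates the analogy with invented three-dimensional toy vectors, whose components stand for crude features such as “royal”, “male” and “female”; a real embedding model learns this structure automatically from text.

```python
# Illustrative sketch of the King - Man + Woman ≈ Queen analogy.
# The toy vectors are invented so that the analogy holds exactly;
# real embeddings only approximate it.
import numpy as np

def cosine_similarity(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

vocab = {
    "king":  np.array([0.9, 0.8, 0.1]),   # royal, male
    "queen": np.array([0.9, 0.1, 0.8]),   # royal, female
    "man":   np.array([0.1, 0.8, 0.1]),   # male
    "woman": np.array([0.1, 0.1, 0.8]),   # female
}

result = vocab["king"] - vocab["man"] + vocab["woman"]

# Find the vocabulary word whose vector is closest to the result
# (real toolkits usually exclude the query words from the candidates).
best = max(vocab, key=lambda w: cosine_similarity(result, vocab[w]))
print(best)  # "queen"
```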

These measures of equivalence of meaning are useful in many tools: for example, in the well-known Siri, the personal assistant for iOS, or to interpret voice commands in home automation systems, so that the house can deduce that the phrase “I need more light” may mean “Put up the blinds” or “Turn on the lights”, depending on how much light there is outside at the time. Other potential applications are assistance for the elderly, who in general find it harder to learn and use rigid voice commands, and assistance for teaching, where an STS system can assess whether a student’s response means the same as the correct response provided by the teacher, which makes the task of teaching easier.

Another very popular task is Question Answering (QA), an area in which our group is working actively. QA can be defined as a task in which the user asks a question in natural language and the system must give an answer (instead of a list of documents, which is what a search engine like Google would return). The first QA systems tackled questions of a factual nature: Where was Obama born? When did the Second World War end? Who discovered penicillin? Then, more complex questions were tackled, in which the response was not found in just one document but had to be built from partial components, for example: Who were the first three chancellors of the Federal Republic of Germany?; questions related to definitions: Who was Picasso?, What is paracetamol used for?; and questions on opinions, such as arguments for and against arms control in the USA. In parallel, restricted-domain systems have been developed, as well as systems that seek a response not only in textual documents but also in structured sources, such as Linked Open Data repositories (Freebase, DBpedia, BioPortal and others). In all of these systems, distance measures between the question and potential responses are used widely.
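As a toy illustration of one such distance measure, the sketch below represents each sentence as the average of its word vectors and ranks candidate answers by cosine similarity to the question. The tiny embedding table is invented, and a real QA system would combine many more features.

```python
# Hedged sketch: rank candidate answers by cosine similarity between
# averaged word vectors. The embedding table is invented for illustration.
import numpy as np

def cosine_similarity(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

embeddings = {
    "obama":   np.array([1.0, 0.0, 0.0]),
    "born":    np.array([0.0, 1.0, 0.0]),
    "hawaii":  np.array([0.5, 0.5, 0.0]),
    "weather": np.array([0.0, 0.0, 1.0]),
    "warm":    np.array([0.0, 0.1, 0.9]),
}

def sentence_vector(sentence):
    # Represent a sentence as the average of the vectors of its known words.
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    return np.mean(vectors, axis=0)

question = "Where was Obama born"
candidates = [
    "Obama was born in Hawaii",
    "The weather in Hawaii is warm",
]

q = sentence_vector(question)
ranked = sorted(candidates,
                key=lambda a: cosine_similarity(q, sentence_vector(a)),
                reverse=True)
print(ranked[0])  # "Obama was born in Hawaii" ranks first
```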

One type of QA that has become fairly well-known is Community QA (CQA). In CQA, an initial question is asked by a member of the community. This activates a complex structure of interventions by members of the community, who reply, give opinions, reconsider and refine, among other actions. This leads to the use of different measures of similarity: between pairs of questions, between questions and answers, and between different answers. Obviously, the measures are different for different sub-tasks. Recently, our group participated in a CQA competition as part of SemEval 2016, which had the added difficulty that texts in their original language, Arabic, were compared with their English translation.

Both STS and QA are partial approximations to Natural Language Understanding (NLU), in which a program reads a text and, on this basis, constructs a conceptual representation of its meaning. If two texts have the same or similar meanings, they will have similar conceptual representations. Such a representation could also be used in other applications, such as translation and summarisation. Future research will move in this direction, combining Deep Learning with major online semantic resources for the automatic comprehension of language. Perhaps HAL is not so far away after all.

Dr Lluís Padró and Dr Horacio Rodríguez Hontoria, researchers at the Center for Language and Speech Technologies and Applications (TALP UPC), and Aitor González Aguirre, from the IXA research group

 

The next hurdle

How can you be a successful entrepreneur? It is the question of the decade. We live amid a wave of proposals designed to take advantage of the constant need for rapid innovation in different industries, and discovering the factors of success or failure is a kind of new Holy Grail. Indeed, the question itself is a business opportunity, as we can see by looking at related content on Amazon or LinkedIn, and as shown by the boom in coaches and experts flourishing in the field of entrepreneurship.

Dr. Santiago Royo, director of the Center for Sensors, Instruments and Systems Development (CD6 UPC)

When I was persuaded to write this article, I thought that I could identify some ideas that could serve as guidance, drawn from the experiences of the eleven technology-based companies (TBCs) that we have created at the Center for Sensors, Instruments and Systems Development (CD6), a centre for innovation in optical engineering and photonics at the UPC. The article also serves to highlight the role of universities in Spain as key stakeholders in technology transfer, a function that is not always easy to assimilate, within and outside the institutions.

Key factors

It is not easy to extract common ideas and summarise them in an article, because what the cases share gets diluted in the execution of each one. However, I think I can identify at least four key factors. Due to the nature of the CD6, the following points refer to technology-based companies that develop physical equipment; those that develop software differ considerably in many respects.

The first point is that, fortunately, awareness that there is a certain theoretical basis for tackling this kind of project has become widespread among entrepreneurs. Many basic concepts are well known and enable a much less suicidal approach than the one taken around 15 years ago. The legal framework is clearer, the stakeholders know what they are talking about, and the concepts are fairly common to all projects, although they should be adapted to each one. There are some wonderful books on this, notably, in my opinion, ‘Technology Ventures’, by Thomas H. Byers and other authors. So if entrepreneurs want to put their savings on the line, it is a good idea to read a bit first.

Secondly, the entrepreneurial team is key. Clear personal objectives regarding the company, complementary technical, commercial and management skills, and full-time dedication are all bonuses. Here I would add that there is a clear advantage if all the members of the team have already worked in a technology-based company or experienced one at close hand. If you are planning to create a TBC, working in one first and learning from others seems like an intelligent option. Then, you should build a balanced team of professional, reliable people who are committed to the company.

Assessment of the business opportunity, and how to protect it, is another key factor. It is essential to evaluate whether you can do something that is both profitable and difficult to copy, thanks to control over some vulnerable point in the industry’s value chain, the existence of intellectual property rights over the technology, or specific know-how that is difficult to replicate. The business opportunity should solve a problem of a real user who is willing to pay for the solution. There are several ways to assess the opportunity, many of which have nice names in English, but in most cases they boil down to “the earlier you ask the potential client, the better”.

Breaking even

Another important point to consider is that someone has to pay for the party. The aim is to have enough resources to get by until enough repeat sales are obtained to break even. The balance between financial resources and commitments is critical and is one of the main problems that awaits entrepreneurs; it often takes them by surprise. It is important to identify all the available channels of support (shareholders’ loans, equity and presales) to match the project to its funding needs. The idea is to obtain enough money for the overall project to be profitable, no more and, of course, no less. This is where most technology-based companies fail: it almost always takes longer to sell than expected during the planning phase.
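As a back-of-the-envelope illustration (all figures below are invented), a few lines of Python are enough to check whether a funding cushion survives until the planned sales ramp reaches break-even:

```python
# Hedged sketch of a runway calculation; every number here is invented.
funding = 300_000           # EUR raised: capital, loans, presales
monthly_burn = 15_000       # EUR of fixed costs per month
margin_per_sale = 2_000     # EUR of gross margin per unit sold
planned_sales = [0, 0, 1, 1, 2, 3, 5, 8, 8, 8, 8, 8]  # units per month

cash = funding
for month, units in enumerate(planned_sales, start=1):
    income = units * margin_per_sale
    cash += income - monthly_burn
    status = "break-even" if income >= monthly_burn else "burning cash"
    print(f"month {month:2d}: cash {cash:8,.0f} EUR ({status})")
    if cash < 0:
        print("Out of money before break-even: more funding or earlier sales needed.")
        break
```

Shifting the sales ramp a few months to the right, which is what usually happens in practice, shows how quickly the cushion disappears.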

Despite so much advice, the main certainty is that there is no certainty in the world of technology-based companies. Physics cannot predict, more than statistically, where a leaf will fall from a tree, even though all the laws that are involved are known individually. Likewise, when we talk about TBCs, we may know the general laws, but each case is different (in terms of the market, internationalisation, the sales channel, the team, and other factors). We can only obey the general laws, quantify the likelihood of being successful, and let the leaf fall within the expected circle. We should enjoy the leaf dropping and all the adrenaline this involves, and learn as much as possible to better overcome the next hurdle.

Dr Santiago Royo, director of the Center for Sensors, Instruments and Systems Development (CD6 UPC)