– A few days ago, a series of cyber attacks on the virtually unknown American Dyn Corporation – which operates one of the main “switchboards” that keeps the Internet running in North America – brought the web to its knees in the US for several hours, blocking access to many major sites: Twitter, Spotify, Pinterest, parts of the Amazon galaxy, the New York Times, the BBC, CNN and several others.
The attacks themselves, however repeated and massive, were not particularly sophisticated. The US government in fact let it be known that the operation could even have been carried out by teenagers, and that there was no need to attribute it to “other governments.” There may have been some desire to downplay the incident, but the US suggestion is perfectly plausible – and a clear indication of how vulnerable the computer systems that increasingly govern our lives really are.
Read the article by Eugenio Santagata, CEO of CY4Gate SpA, Executive Vice President and Director.
The opportunities offered by today’s big data and unstructured information are a strong impetus for companies to choose solutions based on text mining approaches and applications. Internal and external text content essential for business—customer service records, insurance claims, clinical trial data, as well as emails, news and social media content—is everywhere today. Text mining approaches and applications are essential for helping companies take advantage of this information, improving strategic business activities and boosting decision making.
Unlike data mining, which is designed to work with structured data, text mining (or text analytics) focuses on text-heavy business data and is intended to handle full-text documents, emails and web content. In other words, text mining handles the most relevant part of today’s enterprise business data that is stored as unstructured content in the form of text (over 80% of business data is unstructured).
The most common text mining approach involves a representation of text that is based on keywords. A keyword-based methodology can be combined with other statistical elements (machine learning and pattern recognition techniques, for example) to discover relationships between different elements in text by recognizing repetitive patterns present in the content. These approaches do not attempt to understand language, and may only retrieve relationships at a superficial level.
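To make the keyword-based approach concrete, here is a minimal, illustrative Python sketch (not any vendor’s actual implementation): it ranks terms by raw frequency and counts document-level co-occurrences, surfacing “relationships” purely from repetitive patterns, with no understanding of language. The function names, stopword list and sample documents are all invented for illustration.

```python
import re
from collections import Counter
from itertools import combinations

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for"}

def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def keyword_scores(docs):
    """Rank keywords by raw frequency across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts

def cooccurrences(docs):
    """Count how often two keywords appear in the same document:
    a crude 'relationship' signal with no understanding of language."""
    pairs = Counter()
    for doc in docs:
        for a, b in combinations(sorted(set(tokenize(doc))), 2):
            pairs[(a, b)] += 1
    return pairs

docs = [
    "Customer complained about late delivery",
    "Late delivery caused a customer refund",
]
print(keyword_scores(docs).most_common(3))
print(cooccurrences(docs).most_common(2))
```

Note how the co-occurrence counts would happily pair any two frequent terms, which is exactly the superficial level of analysis described above.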
Text mining based on intelligent technologies such as artificial intelligence and semantic technology can leverage an understanding of language to more deeply understand a text. This enables extraction of the most useful information and knowledge hidden in text content and improves the overall analysis and management of information.
As a powerful approach that improves a number of activities, from information management, data analysis and business intelligence to social media monitoring, text mining is successfully applied in a variety of industries. Here are some examples of the most used text mining applications:
Text mining supports organizations in managing unstructured information, identifying connections and relationships within it, and extracting relevant entities to improve knowledge management activities.
Text mining helps companies get the most out of customer data by capturing new needs and opinions from text, improving customer support through the ability to understand what clients are saying (for example, via social media).
Text mining extracts the entities that matter the most (people, places, brands, etc. and also relevant relationships, concepts, facts and emotions) from text. A better analysis of text means better results for entity extraction that can be integrated into other platforms and applications to improve business intelligence activities, for example.
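As a rough illustration of what entity extraction does (a deliberately naive sketch, not the semantic approach described above), the following Python snippet treats capitalized word sequences as candidate entities; a real extractor would rely on linguistic and semantic evidence rather than surface patterns, and would also classify each entity as a person, place, brand, and so on.

```python
import re

def extract_entities(text):
    """Naive entity spotter: runs of capitalized words become candidate
    entities. Sentence-initial words will produce false positives."""
    return re.findall(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b", text)

sample = "Police say John Smith met Linda Brown in New York yesterday."
print(extract_entities(sample))
```

The false positives this produces (e.g. a sentence-initial common noun) are precisely why a deeper analysis of text yields better entity extraction results.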
– A few days ago Microsoft and the major dotcoms announced a joint effort to develop and promote Artificial Intelligence (AI): we discussed it with Luigi Conti, VP Corporate Strategy & Development at Expert System.
– Listen to the talk by Maurizio Mencarini, EMEA Head of Sales – Intelligence at Expert System, at the event dedicated to the second edition of “Il mese dell’investimento in Francia” (“The Month of Investment in France”), organized by the French Embassy in Italy and Business France Italia.
With many years of experience in developing and implementing different kinds of information management and process automation projects, we are in a unique position to share what we believe are the most effective text analytics best practices and the most innovative and value-rich text analytics applications.
Let’s start with Text Analytics best practices by focusing on a couple of areas that can have a significant impact on a project’s success, but that are often ignored by customers and/or consultants.
Together with the explosion of available information we are all experiencing, the number of text analytics applications that provide business value to organizations is growing by the day. The three applications below, while perhaps less known, offer organizations the highest ROI potential, based on our experience in the field:
– Prediction is tough and rarely a guarantee. If we were able to predict the future, there would be no Vegas, or at minimum a much poorer version of it. Weather would cause fewer surprises, and the World Series, the Davis Cup, the Stanley Cup, the World Cup, the Super Bowl (despite the not-so-predictable commercials) and the Daytona 500 would not be nearly as interesting to watch. In the world we live in, there are so many variables, nuances and unknowns that it is difficult, at best, to predict a specific outcome or what the future may hold. If we were able to consistently and accurately predict the future, it would be like having tomorrow’s Wall Street Journal, and those with access would retire rich the day after tomorrow.
A technology that strives to understand human communication must be able to understand meaning in language. In this post, we take a deeper look at a core component of our Cogito technology, the semantic disambiguator, and how it determines word meaning and sentence meaning.
To start, let’s clarify our definitions of words and sentences from a linguistic point of view.
A “word” is a string of characters that can have different meanings (jaguar: the car or the animal? driver: one who drives a vehicle or part of a computer? rows: the plural noun or the third-person singular of the verb to row?). A “sentence” is a group of words that expresses a specific thought; to capture it, we need to understand how words relate to other words (“Paul, Jack’s brother, is married to Linda”: Linda is married to Paul, not Jack).
To understand word meaning and sentence meaning, our semantic disambiguator engine must be able to automatically resolve ambiguities to understand the meaning of each word in a text.
Let’s consider this sentence:
John Smith is accused of the murders of two police officers.
To understand the word meaning and sentence meaning in any phrase, the disambiguator performs four consecutive phases of analysis:
This phase breaks up the stream of text into meaningful elements called tokens. The sequence of “atomic” elements resulting from this process will be further elaborated in the next phase of analysis.
John > human proper noun
Smith > human proper noun
is > verb
of > preposition
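The tokenization phase above can be sketched as follows (a simplified illustration, assuming a whitespace-and-punctuation splitter; a production tokenizer also handles abbreviations, clitics and other special cases):

```python
import re

def tokenize(text):
    """Break the character stream into atomic tokens:
    words, numbers and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("John Smith is accused of the murders of two police officers.")
print(tokens)
```

Each token in this sequence is the “atomic” element that the next phase of analysis elaborates.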
During this phase, each token in the text is assigned a part of speech. The semantic disambiguator is able to recognize inflected forms, conjugations and identify nouns, proper nouns and so on. Starting from a mere sequence of tokens, what results from this elaboration is a sequence of elements. Some of them have been grouped to form collocations (police officer) and every token or group of tokens is represented by a block that identifies its part of speech.
John Smith > human proper noun
is accused > nominal predicate
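A toy version of this phase might look like the following Python sketch, where both the lexicon and the collocation list are invented for illustration; a real tagger uses morphology and context rather than a fixed lookup table:

```python
# Toy lexicon-driven part-of-speech tagger (illustrative only).
LEXICON = {
    "john": "PROPN", "smith": "PROPN", "is": "VERB", "accused": "VERB",
    "of": "ADP", "the": "DET", "murders": "NOUN", "two": "NUM",
    "police": "NOUN", "officers": "NOUN", ".": "PUNCT",
}
COLLOCATIONS = {("police", "officers"): "police officer"}

def tag(tokens):
    """Assign a part of speech to each token, grouping known collocations
    (e.g. 'police officers') into a single block."""
    tagged, i = [], 0
    while i < len(tokens):
        pair = tuple(t.lower() for t in tokens[i:i + 2])
        if pair in COLLOCATIONS:
            tagged.append((COLLOCATIONS[pair], "NOUN"))
            i += 2
        else:
            tagged.append((tokens[i], LEXICON.get(tokens[i].lower(), "X")))
            i += 1
    return tagged

print(tag(["John", "Smith", "is", "accused", "of", "police", "officers", "."]))
```

The output is no longer a mere sequence of tokens but a sequence of blocks, each carrying its part of speech, with collocations merged into one element.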
During this phase, the disambiguator performs several word-grouping operations at different levels in order to reproduce the way words are linked to one another to form sentences. Sentences are further analyzed and elaborated to attribute a logical role to each phrase (subject, object, verb, complement, etc.) and to identify relationships between verbs, subjects and objects, and between these and other complements whenever possible. In our example, the sentence consists of a single independent clause, where John Smith is recognized as the subject of the sentence.
John Smith > subject
is accused > nominal predicate
During the last and most complex phase, the tokens recognized during grammatical analysis are associated with a specific meaning. Each token can be associated to several concepts; the choice is made by considering the base form of each token with respect to its part of speech, the grammatical and syntactical characteristics of the token, the position of the token in the sentence and the relation of the token to the syntactical elements surrounding it.
Like the human brain, the disambiguator eliminates all candidate terms for each token except one, which is definitively associated with the token. When the disambiguator comes across an unknown element in a text (for example, human proper names), it tries to infer word meaning and sentence meaning by considering the context in which each token appears to determine its meaning.
is accused > to accuse > to blame
police officer > policeman, police woman, law enforcement officer
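A simple way to illustrate this sense-selection step is a Lesk-style overlap heuristic: pick the sense whose clue words overlap most with the surrounding context. The sense inventory below is invented for illustration; the actual disambiguator relies on a full semantic network rather than hand-written clue lists.

```python
# Toy sense inventory for the ambiguous word "driver" (illustrative only).
SENSES = {
    "driver": {
        "person who drives a vehicle": {"car", "vehicle", "road", "license"},
        "software controlling a device": {"computer", "software", "device", "printer"},
    },
}

def disambiguate(word, context_words, senses=SENSES):
    """Choose the sense whose clue words overlap most with the words
    surrounding the ambiguous token (a Lesk-style heuristic)."""
    best, best_overlap = None, -1
    for sense, clues in senses[word].items():
        overlap = len(clues & set(context_words))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

print(disambiguate("driver", ["install", "the", "printer", "driver", "software"]))
```

Here the context words “printer” and “software” tip the choice toward the computing sense, mirroring how context resolves the jaguar/driver/rows ambiguities discussed earlier.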
Want to learn more about the disambiguation process? Try our demo
– While we will have to wait a few decades to see armies made up of robots, in the context of cyber warfare there are already systems capable of operating at speeds impossible for a human being. In the Pentagon’s intentions, these remain systems that do not aim to replace humans, but to support them effectively.
“Artificial intelligence is a tremendous opportunity for the Italian hi-tech industry, a train our country cannot afford to miss.” So says Stefano Spaggiari, founder and CEO of Expert System [EXSY.MI], the Modena-based company that is a leader in semantic artificial intelligence for Big Data analysis, who explains: “At the global level there is not yet a consolidated player established as the undisputed leader; it is a race in which no one yet holds a decisive commercial or technological advantage.”
Our country has lost the race in many, perhaps too many, fields of information technology, but the artificial intelligence game is still open, even for companies that are still small but technologically cutting-edge, like Expert System, ranked in the global top 10 for semantic big data analysis technologies, alongside giants like IBM and HP.
Expert System is the Platinum Sponsor at the 2016 KMWorld Taxonomy Boot Camp, November 14-17, 2016, in Washington, DC. We are thrilled to have two sessions featuring Expert System speakers: Daniel Mayer, CEO, on “Cognitive Meets Taxonomy,” and Bryan Bell, EVP, on “Context Navigation & Semantics.”
This year’s theme, Hacking KM: People, Processes, & Technologies, will look at novel ways to support knowledge sharing and organizational culture. It will examine processes and technologies that foster collaboration between self-organizing and cross-functional teams, promote early delivery, stimulate innovation and continuous improvement, and encourage rapid and flexible response to change. We look forward to seeing you at this year’s event!