The implicit social network according to Sinequa

One of the primary goals of businesses that implement an enterprise social network is expert location. “Expert” has been an overused word in recent years, above all from an HR standpoint, but the problem is real and well known: who has skills, abilities and knowledge on a given matter? As business issues become more and more complex, finding the right person in an organization is often like looking for a needle in a haystack, all the more so since the expertise in question may have nothing to do with someone’s current position or job title.

An enterprise social network helps to meet part of this need. It captures informal conversations and is able to link a person to a matter, to capitalize on conversations and know who knows what and whom. The links between people, matters and conversations help to locate experts and networks of experts on a matter.

An enterprise social network only identifies expertise that is declared on the social network

But it only covers a part of the need because the system has limits.

First, it will work only if people participate, and we all know how hard it is to get a critical mass of employees actively using enterprise social networks, all the more so since the strong focus on communities of practice leads to overlooking all the “operational” interactions that happen in day-to-day flows of work. The experts that emerge on the social network are nothing more than… the experts who are on the social network. That does not guarantee that they are the best people in the organization to help on a given matter.

Second, enterprise social networks are mainly declarative. People state that they have such and such skills and try to manage their employee brand. Of course there are mechanisms to limit such issues: governance that promotes sincerity, the ability for other users to endorse and promote one’s skills, and conversations that show whether one is actually a reliable expert or an impostor.

Why start from scratch when it’s possible to use existing data?

Finally, day one of the enterprise social network is often day one of information gathering for expert location. Things start with a blank page. Nothing that happened or was said before the social network was launched exists. And most of what happens outside of the social network will never be taken into account.

These limits make enterprise social networks necessary but not sufficient for expert location.

So this post is about the conversation I had with Xavier Pornain and Hans-Josef Jeanrond from Sinequa on what they call implicit social networks.

Data are the exhaust of work. Let’s recycle them usefully

First observation by Xavier Pornain: “For us big data is more than a marketing concept. 80% of the information available in the organization is unstructured and has been there long before social networks, contained in documents or business applications.” An obvious fact no one can dispute, and one that leads to the conclusion I stated above: where expertise is concerned, the real challenge is to make sense of data we already have but don’t know how to use. “The unstructured nature of data is a technology problem; a stack of documents is valueless if you can’t categorize it and link it to the client’s business.”

The “Implicit Social Network” approach consists of indexing all the documents and data that employees have produced at work in order to make networks of experts surface through content analytics. The strength of the approach is its exhaustiveness, as many enterprise social networks have an adoption rate of around 5 to 10%. Xavier Pornain adds: “today, no business offers employees a real unified search engine.” A real waste if we consider the potential, the knowledge, human and relational capital that is left unused. Moreover, it isn’t about making employees share their knowledge and experience but simply about bending down to pick up what already exists. It’s not about potential value but actual value.

The Astrazeneca case

How does it work? Let’s explore the Astrazeneca case.

Astrazeneca is a pharmaceutical company: 70,000 employees and 500 million documents, 40% of which are internal and 60% of which come from hosted and external medical databases.

The purpose was to build a system that detects and links business entities. The system has access to databases that reference diseases, drugs, etc., as well as structured frames of reference (products, clients, employees, etc.). Structured data guides the analysis of unstructured data: it’s important to link the analysis of unstructured data to the structure of the organization and of its lines of business to put it into perspective and make sense of it.

All these documents were then indexed, and analytics make it possible to understand, in a text, what refers to a person, a place, a drug, etc., and to create links between these entities. One can assume that if a person is mentioned in many work documents on a gene (even without being the author of those documents), she may have a certain expertise. From link to link, the implicit social network surfaces.
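To make the idea concrete, here is a minimal sketch of how such links could be counted. It is not Sinequa’s implementation: the names `PEOPLE`, `TOPICS`, `find_entities` and `build_links` are hypothetical, the sample documents are invented, and a naive substring match stands in for real semantic analysis. The only point is to show structured reference lists guiding the reading of unstructured text.

```python
from collections import Counter
from itertools import product

# Hypothetical reference lists standing in for the structured repositories
# (employees on one side; diseases, drugs, genes on the other).
PEOPLE = {"Alice Martin", "Bob Chen"}
TOPICS = {"BRCA1": "gene", "asthma": "disease", "DrugX": "drug"}

def find_entities(text, vocabulary):
    """Return the vocabulary terms found in the text (naive substring match)."""
    lowered = text.lower()
    return {term for term in vocabulary if term.lower() in lowered}

def build_links(documents):
    """Count, for each (person, topic) pair, how many documents mention both."""
    links = Counter()
    for text in documents:
        people = find_entities(text, PEOPLE)
        topics = find_entities(text, TOPICS)
        for pair in product(people, topics):
            links[pair] += 1
    return links

# Invented sample documents, for illustration only.
DOCS = [
    "Meeting notes: Alice Martin presented new BRCA1 results.",
    "Alice Martin and Bob Chen reviewed the BRCA1 assay for asthma patients.",
    "Bob Chen updated the DrugX dosage table.",
]

links = build_links(DOCS)
for (person, topic), count in sorted(links.items()):
    print(person, "<->", topic, f"({TOPICS[topic]})", count)
```

Note that authorship plays no role here: a person is linked to a topic simply because both appear in the same document, which is the assumption described above.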

One can then ask the search engine a question in natural language. One enters the name of a disease and the system returns the names of people who have been working on it. This is of inestimable value for a pharma company: no one covers the full scope of a disease alone, so it’s important to bring together practitioners from many different disciplines. But the engine can also propose names of drugs, genes… with the related experts. So it’s possible to identify who should be mobilized for a given matter and even to discover narrower sub-matters with, here again, the related network of experts.
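As an illustration only, the query side can be sketched on top of a table of person/topic links. Everything below is an assumption made for the example: the `LINKS` counts are invented, and `experts_for` and `related_topics` are hypothetical helpers, not functions of the Sinequa engine. Natural language parsing is left out; the query is reduced to a single topic name.

```python
from collections import Counter

# Hypothetical link table: (person, topic) -> number of documents mentioning both.
LINKS = Counter({
    ("Alice Martin", "asthma"): 5,
    ("Bob Chen", "asthma"): 2,
    ("Alice Martin", "DrugX"): 3,
    ("Carla Diaz", "DrugX"): 4,
    ("Carla Diaz", "BRCA1"): 1,
})

def experts_for(topic, links):
    """Rank the people linked to a topic, strongest link first."""
    ranked = [(person, n) for (person, t), n in links.items() if t == topic]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))

def related_topics(topic, links):
    """Suggest other topics that the experts on this topic are also linked to."""
    people = {person for (person, t) in links if t == topic}
    related = Counter()
    for (person, t), n in links.items():
        if person in people and t != topic:
            related[t] += n
    return related

print(experts_for("asthma", LINKS))
print(related_topics("asthma", LINKS))
```

The second function is a crude stand-in for the way the engine proposes drugs or genes next to a disease: it only follows links that pass through the people already identified.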

The engine also offers rich, dynamic profiles. In the case of Astrazeneca it displays each of the drugs, diseases, genes… for which a person has expertise, based on what they have created or worked on. Users can decide to remove (but not add) information from their profile if they think it’s not relevant.
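The “remove but not add” rule is easy to picture as a small data structure. The sketch below is purely illustrative and assumes nothing about how Sinequa stores profiles: `ExpertProfile`, `hide` and `refresh` are hypothetical names. The profile is recomputed from the index, and the user’s only lever is a list of topics to hide.

```python
class ExpertProfile:
    """Profile whose topics come from the index; the user may only hide some."""

    def __init__(self, name, derived_topics):
        self.name = name
        self._derived = set(derived_topics)  # computed from indexed documents
        self._hidden = set()                 # topics the user chose to remove

    def hide(self, topic):
        """Remove a topic from the visible profile; unknown topics are rejected."""
        if topic not in self._derived:
            raise ValueError(f"{topic!r} is not on the profile, nothing to remove")
        self._hidden.add(topic)

    def refresh(self, derived_topics):
        """Replace the derived topics after re-indexing; hidden ones stay hidden."""
        self._derived = set(derived_topics)

    @property
    def visible_topics(self):
        return sorted(self._derived - self._hidden)


profile = ExpertProfile("Alice Martin", ["asthma", "DrugX"])
profile.hide("DrugX")
profile.refresh(["asthma", "DrugX", "BRCA1"])
print(profile.visible_topics)
```

There is deliberately no method for adding a topic by hand: new topics can only arrive through `refresh`, that is, through actual work captured in the index.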

It’s a generic approach that can apply to many kinds of needs. In the services industry, for example, it can find people based on the assignments they have completed, using data from the CRM, sales proposals, CVs and project management tools. It’s possible to know who has actually done what, worked on what and with what results, and so to find the right person or team for a given job. When a need is formalized in a document or job description, the document itself can become the query, without having to enter all the data into the engine, which will crawl all the indexed sources. Use cases are easy to imagine in fields like sales (finding “pieces” of proposals, resources), legal (cases, legal precedents), etc.

Of course the system gets richer and smarter as the company produces documents, information and data, without needing users to participate in order to “let others know”. That solves part of the adoption issue that’s peculiar to social networks: things start from actual work without adding new tasks, without having to share here what’s been done there.

What’s key in such approaches is, of course, data. One can ask the best possible question; what leads to answers, and relevant ones, is the volume and variety of data. Keep in mind that, at Astrazeneca, more than 500 million documents are indexed in real time.

That’s all for the approach. Now let’s focus on a couple of questions it raises.

First, data privacy, which is an increasingly sensitive matter. According to Xavier Pornain there are noticeable differences depending on countries and cultures. “In the Anglo-Saxon culture such approaches do not raise many problems provided they deliver actual business value. Germans are very cautious but things are discussed with unions; there’s no restrictive law that prevents anything. They are quite pragmatic but need to be sure the system is used to find, not to rate. As for France… the fact that it’s impossible to index emails is a real loss of value.” That’s something I’ve often heard: the biggest Big Data issue in France is not technology but the CNIL (National Commission for Computing and Liberties). And competitiveness is at stake!

Human-Machine co-operation works better than brute power

Another question that went through my mind: why provide a link to the expert, to the right person, instead of the answer or a document? “Because the document may be confidential and not accessible. So we give the name of the expert. Moreover the expert brings context. That’s the difference with systems like Watson that give the answer. The best paradigm is to use humans for what they’re best at and to favor human-machine iterations.”

Hans-Josef Jeanrond adds: “I’ve been struck by this video from the TED Conference. The speaker says that, at the beginning, Deep Blue won against a chess master, then 3 students defeated Deep Blue with 3 average PCs and… their cumulative intelligence. That proves the value of human-machine interactions. The machine is here to provide assistance, not brute power.” As a matter of fact, sometimes the answer does not exist, so human interactions will be needed.

We can also wonder whether such systems will deliver on the promise social networks are failing to keep. According to Xavier Pornain, who confirms my feeling, that’s not exactly what will happen. Social networks allow interactions that were not possible before, generate data that wouldn’t have been generated otherwise and, consequently, must be one of the sources Sinequa integrates with. Sinequa’s dynamic user profile is not supposed to replace the social network’s profile but to feed it with data from the whole information system to increase its value. For Pornain, social networks will become the new enterprise portals. That does not mean they won’t capture data from elsewhere.

Impression

On this matter, I noted a very interesting quote from Hans-Josef Jeanrond: “we have a cultural and sociological issue with social networks. Why should one spend time writing when search brings the answer when I need it? That’s why few people use them.” Hence the need to make both work together.

Towards Unified Information Access

So does the future of social lie in search and big data? That’s what often comes to my mind: if it’s so difficult to make people participate, it’s better to make the most of what they produce as a matter of course. Data are the “exhaust” of our activities. We should try to recycle them as well as possible.

Sinequa sees the dawn of a new field called “Unified Information Access” and claims to have an original approach compared to other Big Data players:

• by offering real time processing. “Hadoop is nice for data scientists but end users want instant results”.

• by not focusing on NoSQL alone but making it possible to process both structured and unstructured data. What NoSQL should mean is “not only SQL”.

• by integrating semantic and natural language processing capabilities into analytics.

Bottom line: many businesses think they have a sharing problem when they actually have a search problem

What lessons can be learned from this long discussion?

1°) Many IT departments are looking for smart things to do with Big Data but wonder which business cases are worthwhile. Cases like this one should give them ideas.

2°) One of the first questions raised when an enterprise social network is launched is how to feed profiles with external data (mainly HR data). There is considerable value in being able to use everything that already exists without having to start from scratch and “re-create” everything.

3°) If it’s hard to make people participate, it’s possible to make the most of what they do in their day-to-day work with their usual business tools.

4°) Although unstructured information accounts for 80% of the available information, it needs to be correlated with structured information and the structure of the business to make sense of the data and to support approaches driven by lines of business.

5°) The value of data is low without interpretation and context. The value of big data does not lie in processing but in human-made insights.

To end, I’d like to thank Xavier Pornain and Hans-Josef Jeanrond for their time and Claire de Larrinaga, who made this meeting possible.