Gilbane Group Report Intelligenx

Research Report

Beyond Search: What to do when Your Enterprise Search System Doesn't Work

April 2, 2008 by Stephen E. Arnold

Beyond Search: Intelligenx

12. Intelligenx

www.intelligenx.com

Intelligenx is one of those companies with solid technology that is off the radar. But it was Intelligenx’s Discovery Engine that was the secret ingredient for the Carlyle Group when it sold Dex Media to R.H. Donnelley Corporation for $9.4 billion. Search technology from Intelligenx also substantively changed how the Office of AIDS Research manages and administers research grants at the U.S. NIH. And it was the Discovery Engine that helped to transform the way in which D&B licenses data to libraries around the country.

Iqbal Talib, his son, and a cadre of skilled engineers have built technology that permits users to search and interact with incredibly complex datasets. The core product offering, Discovery Engine, is unique in that it was built from the ground up to enable full-text search with categorizations. The display of intuitive refinements (with counts) that are derived from the structure in the data helps users to find and ‘discover’ information.

Item          Quick Facts
Product       Intelligenx Discovery Engine
Price         Starts at $50,000. Custom price quote required.
Key Feature   Full-text search with categorizations.
Purpose       Provide access to structured information, so that users can interact and discover
Clients       Publicar, Axesa, MediaTel, OAR at NIH, TDS, iLocal, D&B, WebVisible
Company       Privately held
Contact       +1-703-793-3270

Table 26: Quick Look at Intelligenx

Mr. Talib told Beyond Search, “Our company was one of the first to introduce a combined full-text search coupled with navigation. What we discovered was that there are far more effective ways to let users interact with information. We also found that we could engineer systems to deliver unprecedented search features and functionalities at far lower costs and without many of the challenges and bottlenecks associated with other conventional search methods.”

The son, Zubair Talib, is the CTO. He attended MIT and, with some friends from school, developed the first algorithms that are still the foundation of Discovery Engine.

The Carlyle Group purchased Denver, CO-based Dex Media for $7.05B. Over the next 26 months, Dex launched a new Internet strategy that harnessed the power of Discovery Engine. On DexOnline, users could conduct a Google-like full-text search and, for the first time anywhere, they could search all the text from all of Dex Media’s print directories. Users could refine the search results in order to find (or discover) what they were looking for. The site was responsive and users took to the interactive search functionality. During the time Carlyle owned Dex Media, usage of DexOnline skyrocketed (10-fold increase in traffic), propelling Dex from Internet obscurity to the number 1 traffic position within its 13-state region, ahead of Google Local, Yahoo Local, and Switchboard.

Since Dex, Intelligenx has won a number of highly competitive contracts with large directory publishers around the world that use Discovery Engine to provide interactive access to yellow page information over the Internet. Mr. Talib said, “We had success with directory publishers because our technology can easily handle very large traffic volumes, large data sets, and complex business logic. Directory publishers also face challenges with how to monetize their traffic and how to scale their business models – a problem that Discovery Engine solves quite naturally.”

©2008 Gilbane Group, Inc. http://gilbane.com

The company’s system allows you to search content from a print yellow page ad (brands, locations, and hours of operation, for instance) in addition to the standard name, address, and category fields. A user does not have to specify which fields to query. Each result set is then presented in “buckets,” or collections of on-target results, rather than as a flat list of results. You can then refine or “drill down” into these buckets to find particular listings quickly and intuitively. The suggestion of results that may be related to the initial query allows you to discover information that you may not have known existed.
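The bucket-and-drill-down behavior described above can be illustrated with a small sketch. It is not Intelligenx code: the sample listings, the field names, and the `search` and `buckets` functions are all hypothetical, and the matching is a simple whole-word test across every field rather than anything resembling the proprietary engine.

```python
from collections import Counter

# Hypothetical sample listings; field names are illustrative assumptions.
LISTINGS = [
    {"name": "Acme Plumbing", "category": "Plumbers", "city": "Denver",
     "ad_text": "24 hour emergency service, Kohler and Moen fixtures"},
    {"name": "Mile High Plumbing", "category": "Plumbers", "city": "Boulder",
     "ad_text": "Kohler fixtures, open weekends"},
    {"name": "Front Range Hardware", "category": "Hardware Stores", "city": "Denver",
     "ad_text": "Moen and Kohler parts in stock"},
    {"name": "Denver Florist", "category": "Florists", "city": "Denver",
     "ad_text": "Same day delivery"},
]


def search(query, listings, refinements=None):
    """Match every query term against all fields at once, then apply refinements."""
    terms = query.lower().split()
    refinements = refinements or {}
    hits = []
    for item in listings:
        # The user never names a field: all field values are searched together.
        text = " ".join(str(value) for value in item.values()).lower()
        words = set(text.replace(",", " ").split())
        term_match = all(term in words for term in terms)
        field_match = all(item.get(f) == v for f, v in refinements.items())
        if term_match and field_match:
            hits.append(item)
    return hits


def buckets(hits, fields=("category", "city")):
    """Group a result set into per-field buckets with counts."""
    return {field: Counter(item[field] for item in hits) for field in fields}


if __name__ == "__main__":
    hits = search("kohler", LISTINGS)
    print(buckets(hits))
    # Drill down into one bucket by adding a refinement.
    print([h["name"] for h in search("kohler", LISTINGS, {"city": "Denver"})])
```

In this sketch a query for "kohler" matches three listings through their ad text alone, and the counts per category and city are what a user would see as refinements before drilling down.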

Figure 42: Intelligenx Discovery Engine

The Discovery Engine includes separate APIs: one for indexing, one for acquiring content, and one for data transformation. The system can be integrated into almost any enterprise environment.


The Technology

Discovery Engine is proprietary technology. The approach combines full-text search with fielded search. The result is a system that provides all the benefits and capabilities of conventional full-text search technology and all the search capabilities that exist in relational database management systems (RDBMSs), combined with navigation and counts. Discovery Engine helps to exploit the underlying structure of the data for refinements and many other assisted search techniques; it also resolves failed queries.

With more than a decade of computer science and development behind it, the Discovery Engine incorporates innovative algorithms for compressing, optimizing and searching processed content. The approach required a “ground up” rethinking of content processing, according to the company. Innovations include algorithms for data compression and storage, content processing, and distributed parallel processing.

A high-level schematic of the Discovery Engine illustrates a number of incorporated components. The system does not require a third-party database. A licensee can use commodity servers to scale the system. Like Google, the Intelligenx approach allows additional storage and servers to be added without complicated configuration and certification processes. Intelligenx’s founder told Beyond Search, “Typical implementations achieve an 80 percent reduction in hardware, hosting and enterprise database costs. Our software simply bolts on to an existing enterprise infrastructure, eliminating expensive integration work. In fact, many of our customers retrofit our system into their existing data and maintenance infrastructure.”


Figure 43: Paginas Amarillas' use of the Intelligenx Interface

The Intelligenx system makes it possible to display a result set with hot links to other Web pages and related categories. The two-panel layout used in Paginas Amarillas shows related content in the left-hand panel.

Linguistics

The system includes support for linguistic techniques to improve query understanding. The standard Discovery Engine linguistics toolkit includes spelling checkers, stemmers, stop word removers, and synonym updating functions. These tools support multiple languages including multi-byte languages like Japanese, Chinese and Arabic.

The linguistics tools are used within the query transformation infrastructure that can be used to extend the capabilities of Discovery Engine. This infrastructure can also be used to perform complex query transformation tasks such as parsing complex Boolean queries, including Boolean NOTs, translating query operators from different languages, performing category matches preferentially, and constraining or loosening a query.
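To make the idea of a query transformation chain concrete, here is a minimal sketch that strings together three of the steps named above: stop word removal, stemming, and synonym expansion. Everything in it is an assumption for illustration: the word lists, the function names, and the deliberately naive suffix-stripping stemmer are hypothetical and do not reflect the Discovery Engine toolkit. Spelling correction is left out to keep the example short.

```python
# Hypothetical word lists; a real deployment would load these per language.
STOP_WORDS = {"the", "a", "an", "in", "of", "for", "near"}
SYNONYMS = {"attorney": ["lawyer"], "car": ["auto"]}


def remove_stop_words(terms):
    """Drop terms that carry little meaning for matching."""
    return [t for t in terms if t not in STOP_WORDS]


def stem(term):
    """Naive suffix stripping; production stemmers are far more careful."""
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def transform_query(query):
    """Return one group of alternatives per surviving query term."""
    terms = query.lower().split()
    terms = remove_stop_words(terms)
    terms = [stem(t) for t in terms]
    # Each group means "any of these words"; the groups together mean "all".
    return [[t] + SYNONYMS.get(t, []) for t in terms]


if __name__ == "__main__":
    print(transform_query("Attorneys in the Denver"))
    print(transform_query("car repairs"))
```

Each step is a plain function, so steps can be reordered, replaced, or extended, which is the kind of flexibility a pluggable transformation infrastructure is meant to offer.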

APIs

The architecture of the Discovery Engine includes a number of components. The application programming interfaces make it possible to integrate the Intelligenx system into other enterprise applications, Web pages, or a portal. The APIs and extensions are fully documented. The product is typically shipped with a Software Development Kit (SDK) that contains sample configuration files as well as the entire toolset required to manage a real application on a real deployment. The SDK contains a sample application along with data, source code and display files that can be used as a starter kit for developing a customer-specific application.

The Index API provides all of the functions required to construct an Intelligenx index from a copy of the customer's data feed. The Search API provides all of the functions required to search an Intelligenx index. Particular strengths of the Search API are the very flexible and customizable ranking and sorting methods, query expansion and linguistic modifiers, inclusion of complex search logic and search trees, and failed search handling methods. The index and search plug-ins are typically application-specific code written to process the customer's raw data feed and to satisfy the business requirements specified by the customer.

While accessible through an API or XML web service, Discovery Engine is also packaged with a presentation layer that consists of visualization pages, e.g., JSP or ASP, to accept a user's query and present the relevant results. Other APIs available include a Crawler API for crawling the web and accumulating a web index to augment the customer's data, a Reporting API for generating statistical information about the queries processed by the Search API, and a Management API for administering a deployment.

In addition to the public APIs, Intelligenx provides a number of documented extension sub-systems that can be used to enhance the capabilities of the basic search engine. These extensions can be used, among other tasks, to augment the indexing process, configure the query transformation process and control the results ranking process. Intelligenx also provides a suite of pre-written implementations of these extensions that suffice to satisfy the business rules of most customers. However, customer-specific requirements can be incorporated quickly by writing fresh implementations within this infrastructure.
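The report does not show how the Search API handles a failed search, so the following is only a sketch of one common approach: run the strict query first and, if nothing matches, loosen it and rank the looser matches by how many terms they contain. The document set and the `match` and `search_with_fallback` functions are hypothetical and are not part of any Intelligenx API.

```python
# Hypothetical mini-index: document id -> searchable text.
DOCS = {
    1: "emergency plumber denver",
    2: "plumber boulder weekend service",
    3: "denver florist same day delivery",
}


def match(terms, docs, require_all):
    """Return sorted ids of documents containing all terms or any term."""
    op = all if require_all else any
    return sorted(
        doc_id for doc_id, text in docs.items()
        if op(term in text.split() for term in terms)
    )


def search_with_fallback(query, docs):
    """Try a strict all-terms match; loosen to any-term if it fails."""
    terms = query.lower().split()
    if not terms:
        return {"mode": "empty", "ids": []}
    strict = match(terms, docs, require_all=True)
    if strict:
        return {"mode": "all_terms", "ids": strict}
    loose = match(terms, docs, require_all=False)
    # Rank the loosened matches by the number of query terms they contain.
    loose.sort(key=lambda d: -sum(term in docs[d].split() for term in terms))
    return {"mode": "any_term", "ids": loose}


if __name__ == "__main__":
    print(search_with_fallback("plumber denver", DOCS))
    print(search_with_fallback("weekend florist delivery", DOCS))
```

Returning the mode alongside the ids lets a presentation layer tell the user that the original query was relaxed instead of showing an empty page.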

Intelligenx Features

The system includes a number of interesting features. For example, content processed is automatically categorized and appropriate metadata generated and linked to the content. The system can process XML, structured data, or unstructured text. More recently, Intelligenx has packaged its internal data mining tools into rich business intelligence log analysis tools. These add-on products, AdOptimizer and SiteOptimizer, build on the Discovery Engine architecture to provide deep, interactive information about usage.

AdOptimizer tracks user behavior and generates real-time reports about those actions. One application of AdOptimizer is to permit real-time inspection of users’ interaction with suggested content. These reports can be syndicated to allow advertisers, users, or licensee staff to make adjustments to certain system components; for example, content boosting or advertising fees. SiteOptimizer helps determine relationships and correlations in user behavior and how those relationships can be used to drive improvements to the search application. Another recent add-on, Content Enhancer, crawls web pages and extracts relevant and meaningful content and entities from them in order to enhance the original content repository.


Feature                 Beyond Search Comment
Knowledgebase Support   None needed. The system “discovers” entities and categories
Query Types             Boolean, free text, and assisted navigation
Visualization           Outputs can be displayed as tables or other representations
Entity Extraction       Not applicable
Platforms Supported     Linux, Windows
Export                  Content can be generated in XML or user-defined formats
Third-Party Support     The Discovery Engine can be integrated with any third-party application
Vertical Support        Publishing
Analytic Functions      The system includes strong analytic support including various numeric functions. Additional mathematical processes may be integrated via the APIs

Table 27: Technical Highlights for Intelligenx

Other Intelligenx features include:

  • Geospatial data support so results can be searched, mapped or manipulated by geo parameters
  • Configurable categorization and relevance ranking thresholds
  • Key word highlighting in results
  • Near real-time index updating
  • Multi-threaded architecture to take advantage of multicore processors
  • Built-in content transformation tools
  • Federated search capability to search across disparate repositories

The system is language-independent and provides a configurable security model based on the operating system in use. For public access, the system supports hypertext transport protocol (HTTP) authentication. The system has no limit on the number of documents or the amount of content it can process and index.
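As an illustration of what searching results "by geo parameters" can mean in a directory setting, the sketch below filters a result set to listings within a radius of a point and orders them by distance. It uses the standard haversine great-circle formula; the listing data, coordinates, and the `within_radius` function are assumptions for the example, not a description of how Discovery Engine implements geospatial support.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(results, center, radius_km):
    """Keep results inside the radius, nearest first."""
    lat0, lon0 = center
    scored = [(haversine_km(lat0, lon0, r["lat"], r["lon"]), r) for r in results]
    scored.sort(key=lambda pair: pair[0])  # sort on distance only
    return [r for distance, r in scored if distance <= radius_km]


# Hypothetical listings with approximate city-center coordinates.
RESULTS = [
    {"name": "Colorado Springs shop", "lat": 38.8339, "lon": -104.8214},
    {"name": "Boulder shop", "lat": 40.0150, "lon": -105.2705},
    {"name": "Denver shop", "lat": 39.7392, "lon": -104.9903},
]

if __name__ == "__main__":
    denver = (39.7392, -104.9903)
    print([r["name"] for r in within_radius(RESULTS, denver, 50)])
```

A linear scan like this is fine for refining an already small result set; filtering a whole index by location would normally call for a spatial index instead.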

Discovery Engine in Action

You can explore the functionality of the Intelligenx system at Publicar’s Spanish language directory portal at http://www.paginasamarillas.com/. Publicar is the largest directory publisher in South America. Traffic has almost doubled for Publicar since deploying Discovery Engine and the site processes millions of queries per day with high performance. Publicar will add on extensions for wireless search and SMS that will utilize the core search infrastructure built on Intelligenx technology.

Other current Intelligenx customers include:

  • Axesa (Puerto Rico, formerly Verizon Information Systems Puerto Rico)
  • Conselho Federal da Justiça (Justice Department, Brazil)
  • DeTelefoongids (Netherlands)
  • Dun & Bradstreet (USA)
  • iLocal (Netherlands, Belgium, Luxembourg)
  • Localeze (USA)
  • National Institutes of Health (US Federal Government)
  • MediaTel (Czech Republic)
  • WebVisible (USA)
  • 411.ca (Canada)

Upside

The upsides of the Discovery Engine pivot on the system’s ability to handle very large volumes of content even at extremely high loads. Beyond Search’s tests revealed response times in the 100 millisecond range for our test queries. Other upsides include:

  • Support for structured and unstructured information regardless of the source document’s language or the physical location of the data.
  • A scalable architecture that allows licensees to expand the system’s infrastructure with commodity hardware. Note that Intelligenx also offers hosted solutions and a suite of web services for merchant-level reporting and search analytics.
  • Excellent failed-search handling.
  • A well-documented and comprehensive suite of APIs with sample code. Intelligenx makes integration and extension of its system less painful than some of the other companies profiled in this study.

Downside

The downside of Intelligenx is the low profile the company has adopted in its 10-year history. Even though the firm is projected to generate $4 to $6 million in profitable revenue in 2008, most information professionals are not aware of the company’s high-performance, feature-rich system. And because the company has captured a number of international customers (mostly directory publishers), Discovery Engine is perceived as only a local search technology. That’s not true. In reality, Discovery Engine can bolt on to any database or content repository, including native XML files, and deliver blinding performance, with features equal to or better than many of those associated with Endeca’s or Fast Search & Transfer’s systems. If your applications require scalable full-text search with categorizations, then you ought to know about Intelligenx. Other drawbacks include:

  • The system performs best when the source content is structured; for example, content from a database or well-formed XML.
  • The basic system can be used in its default mode. However, tuning the system or integrating it with third-party applications requires study of the API documentation and may involve writing scripts.
  • The company offers a range of professional services. Some of the work is performed by senior developers. If you want a large, custom project in a very short time, you may have to wait until the firm’s highly trained technical staff becomes available.

Net-Net

The truth is that processing so much information so quickly is not easy using conventional search technology. Using the wrong technology to achieve this sort of functionality has its limitations, including challenges with performance and scalability. Today, Intelligenx’s performance over the Internet and its high-speed indexing are closer to those delivered by Google than to those of most other Web search systems. The software has also been battle tested under heavy loads where it has delivered the goods.

The system is adept in its manipulation of structured data. It is even possible to use the Discovery Engine as a database engine, eliminating most of the hassles and processing bottlenecks associated with traditional relational database architectures. Like Google, Intelligenx technology works on commodity class clustered computing environments so that scaling is easy and cost effective.

The product is flexible enough to support custom query transformations to enhance the user experience. It can also provide totally customized ranking, sorting, and filtering schemes to control the relevance and ordering of search results. A full set of APIs, interfaces and complete documentation enables rapid application development and easy, rapid deployment.

If you want to make use of assisted navigation and offer key word searching, you will want to take a long, hard look at the Intelligenx system. With Discovery Engine as the data management foundation, it is relatively easy to hook in specialized visualization, statistical, and even additional content processing functionality.
