Enterprise search - Knowledge (XXG)


Content from different sources may have many different formats or document types, such as XML, HTML, Office document formats or plain text. The content processing phase converts the incoming documents to plain text using document filters. It is also often necessary to normalize content in various ways to improve recall or precision. These may include stemming, lemmatization, synonym expansion, entity extraction, and part-of-speech tagging.


Content awareness (or "content collection") usually follows either a push or a pull model. In the push model, a source system is integrated with the search engine in such a way that it connects to it and pushes new content directly to its APIs.


The processed query is then compared to the stored index, and the search system returns results (or "hits") referencing source documents that match. Some systems are able to present the document as it was indexed.
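The comparison of a processed query against a stored index can be sketched in a few lines. This is only an illustration: the index layout (term mapped to document ids and counts), the requirement that every query term must match, and the scoring by summed term counts are all assumptions made for the example, not something the text prescribes.

```python
def match(query_terms: list[str], index: dict[str, dict[str, int]]) -> list[str]:
    """Return ids of documents containing every query term, best score first.

    `index` maps term -> {doc_id: term count}. Scoring by summed counts is a
    deliberately simple, assumed ranking rule.
    """
    if not query_terms:
        return []
    postings = [index.get(term, {}) for term in query_terms]
    # Keep only documents that appear in the postings of every term.
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)

    def score(doc_id: str) -> int:
        return sum(posting[doc_id] for posting in postings)

    # Sort by descending score, then by id so the order is deterministic.
    return sorted(candidates, key=lambda doc_id: (-score(doc_id), doc_id))


# Hypothetical index contents, for illustration only.
example_index = {
    "vacation": {"hr-1": 2, "hr-2": 1},
    "policy": {"hr-1": 1, "hr-2": 3, "it-7": 1},
}
print(match(["vacation", "policy"], example_index))
```

The returned ids play the role of "hits": references to source documents rather than the documents themselves.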

The resulting text is stored in an index, which is optimized for quick lookups without storing the full text of the document. The index may contain the dictionary of all unique words in the corpus as well as information about ranking and term frequency.
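A minimal sketch of such an index, assuming whitespace tokenization and a plain in-memory dictionary (real systems use far more compact structures), might look like this. The document ids and texts are invented for the example.

```python
from collections import Counter, defaultdict


def build_index(docs: dict[str, str]) -> dict[str, dict[str, int]]:
    """Map each term to {doc_id: term frequency}; the full text is not kept."""
    index: dict[str, dict[str, int]] = defaultdict(dict)
    for doc_id, text in docs.items():
        # Count how often each lower-cased term occurs in this document.
        for term, freq in Counter(text.lower().split()).items():
            index[term][doc_id] = freq
    return dict(index)


# Hypothetical documents, for illustration only.
docs = {
    "hr-1": "vacation policy and vacation requests",
    "it-7": "password policy",
}
index = build_index(docs)
print(sorted(index))       # the dictionary of unique terms
print(index["vacation"])   # postings with term frequency for one term
```

The keys of the mapping form the dictionary of unique words, and each posting carries the term frequency that a ranking function could use later.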

As part of processing and analysis, tokenization is applied to split the content into tokens, which are the basic matching units. It is also common to normalize tokens to lower case to provide case-insensitive search, as well as to normalize accents to provide better recall.
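The token normalization described here can be illustrated with the Python standard library. This is a sketch under simple assumptions: tokens are runs of word characters, and accents are removed by Unicode decomposition, which will not suit every language.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text: str) -> list[str]:
    """Strip accents, lower-case, then split into runs of word characters."""
    normalized = strip_accents(text).lower()
    return re.findall(r"\w+", normalized)


print(tokenize("Café menus, RÉSUMÉ templates"))
```

Applying the same function to both documents and queries is what makes a query for "resume" match a document containing "Résumé".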

The push model is used when real-time indexing is important. In the pull model, the software gathers content from sources using a connector such as a web crawler or a database connector.

The search is generally offered only to users internal to the company.


The connector typically polls the source at certain intervals to look for new, updated or deleted content.
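One way to picture the polling step is a connector that compares the current state of a source with what it saw on the previous poll. The sketch below is hypothetical: `PullConnector`, `diff_snapshots`, and the idea of a per-document version marker are names and assumptions introduced for illustration, not part of any particular product.

```python
from typing import Callable

# doc_id -> version marker (for example a hash or a modification timestamp)
Snapshot = dict[str, str]


def diff_snapshots(previous: Snapshot, current: Snapshot) -> dict[str, list[str]]:
    """Classify document ids as new, updated or deleted between two polls."""
    shared = previous.keys() & current.keys()
    return {
        "new": sorted(current.keys() - previous.keys()),
        "updated": sorted(d for d in shared if previous[d] != current[d]),
        "deleted": sorted(previous.keys() - current.keys()),
    }


class PullConnector:
    """Remembers the last snapshot and reports changes on each poll."""

    def __init__(self, list_source: Callable[[], Snapshot]) -> None:
        self._list_source = list_source
        self._last_seen: Snapshot = {}

    def poll(self) -> dict[str, list[str]]:
        current = self._list_source()
        changes = diff_snapshots(self._last_seen, current)
        self._last_seen = dict(current)  # copy, so later source edits are detected
        return changes


# A dictionary stands in for the source repository in this example.
source = {"a": "v1", "b": "v1"}
connector = PullConnector(lambda: source)
print(connector.poll())
source["a"] = "v2"
del source["b"]
print(connector.poll())
```

In practice `poll` would be called on a timer, and the reported changes would be handed to the content processing and indexing stages.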


Many enterprise search systems integrate structured and unstructured data in their collections. Enterprise search systems also use access controls to enforce a security policy on their users.
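A common way to picture access control in search is filtering hits against the groups a user belongs to. The sketch below assumes a simple group-based access control list and a deny-by-default rule; both are illustrative choices, and the names are hypothetical.

```python
def visible_hits(
    hits: list[str],
    acl: dict[str, set[str]],
    user_groups: set[str],
) -> list[str]:
    """Keep only hits whose allowed groups overlap the user's groups.

    Documents without an ACL entry are hidden (deny by default), which is an
    assumption of this sketch.
    """
    return [doc_id for doc_id in hits if acl.get(doc_id, set()) & user_groups]


# Hypothetical ACL and result list, for illustration only.
acl = {
    "hr-1": {"hr", "managers"},
    "it-7": {"it"},
}
hits = ["hr-1", "it-7", "fin-3"]
print(visible_hits(hits, acl, {"managers"}))
```

Filtering after matching keeps the example short; whether a real system filters at query time or at index time is a design decision the text does not specify.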


In an enterprise search system, content goes through various phases from source repository to search results:


Using a web page, the user issues a query to the system. The query consists of any terms the user enters as well as navigational actions such as faceting and paging information.
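A query that carries terms together with faceting and paging information could be modelled as below. The field names, the default page size, and the representation of hits as dictionaries of metadata are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    terms: list[str]
    facets: dict[str, str] = field(default_factory=dict)  # e.g. {"department": "hr"}
    page: int = 1        # 1-based page number
    page_size: int = 10  # assumed default


def apply_facets_and_paging(query: Query, hits: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep hits matching every selected facet value, then return one page."""
    filtered = [
        hit for hit in hits
        if all(hit.get(name) == value for name, value in query.facets.items())
    ]
    start = (query.page - 1) * query.page_size
    return filtered[start:start + query.page_size]


# Hypothetical hits that already matched the query terms.
hits = [
    {"id": "hr-1", "department": "hr"},
    {"id": "it-7", "department": "it"},
    {"id": "hr-2", "department": "hr"},
]
query = Query(terms=["policy"], facets={"department": "hr"}, page=2, page_size=1)
print(apply_facets_and_paging(query, hits))
```

Term matching itself is left out here; the example only shows how the navigational parts of the query narrow and slice a list of hits.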


Enterprise search systems index data and documents from a variety of sources such as file systems, intranets, document management systems, e-mail, and databases.


Enterprise search is software technology for searching data sources internal to a company, typically intranet and database content.

Enterprise search can be contrasted with web search, which applies search technology to documents on the open web, and desktop search, which applies search technology to the content on a single computer.


Enterprise search can be seen as a type of vertical search of an enterprise.

See also: Collaborative search engine, Data defined storage, Enterprise bookmarking, Enterprise information access, Faceted search, Information extraction, Knowledge management, List of search engines, Text mining, Vertical search.

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.