(ORNL). The software processes free-text documents and shows the relationships among them, a technique useful across many data domains, from health care fraud to national security. Results are presented as clusters prioritized by relevance. Piranha uses the term frequency/inverse corpus frequency (TF-ICF) term-weighting method, which lends itself to parallel processing of text and so to the analysis of large document sets.
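As a rough illustration of why TF-ICF suits parallel processing: a term's weight depends only on its count in the document at hand and on document frequencies taken from a fixed reference corpus, so each document can be weighted independently of the others. The Python sketch below uses one commonly cited form of the weight, log(1 + f) * log((N + 1) / (n + 1)). The function name `tf_icf`, its argument names, and the corpus statistics in the usage note are assumptions made for illustration, not Piranha's actual code.

```python
import math
from collections import Counter


def tf_icf(doc_tokens, corpus_doc_freq, corpus_size):
    """Weight the terms of one document against a fixed reference corpus.

    doc_tokens      -- list of tokens from the document being weighted
    corpus_doc_freq -- dict: term -> number of reference-corpus documents
                       containing it (hypothetical, precomputed once)
    corpus_size     -- number of documents in the reference corpus
    """
    counts = Counter(doc_tokens)
    weights = {}
    for term, freq in counts.items():
        # Terms never seen in the reference corpus get a document frequency of 0.
        n = corpus_doc_freq.get(term, 0)
        # Assumed form: log(1 + f) * log((N + 1) / (n + 1)).
        weights[term] = math.log(1 + freq) * math.log((corpus_size + 1) / (n + 1))
    return weights
```

Because nothing in `tf_icf` refers to other documents in the set being analyzed, calls such as `tf_icf(["the", "the", "piranha"], {"the": 999}, 999)` can be distributed across workers without coordination; in that call a term found in every reference document receives weight zero, while an unseen term receives a positive weight.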
J. W. Reed, T. E. Potok, and R. M. Patton, "A multi-agent system for distributed cluster analysis," in Proceedings of the Third International Workshop on Software Engineering for Large-Scale Multi-Agent Systems (SELMAS'04), W16L Workshop, 26th International Conference on Software Engineering, Edinburgh,
This work has resulted in eight patents (9,256,649; 8,825,710; 8,473,314; 7,937,389; 7,805,446; 7,693,903; 7,315,858; 7,072,883), commercial licenses (including TextOre and Pro2Serve), a spin-off company called VortexT Analytics formed by the inventors, Covenant Health, and Pro2Serve, two R&D 100
J. Reed, Y. Jiao, T. E. Potok, B. Klump, M. Elmore, and A. R. Hurson, "TF-ICF: A New Term Weighting Scheme for Clustering Dynamic Data Streams," in Proceedings of the 5th International Conference on Machine Learning and Applications (ICMLA'06), Orlando, FL, 2006,
R. M. Patton, B. G. Beckerman, T. E. Potok, G. Tourassi, "A Recommender System for Web-Based
Discovery and Refinement of Information Radiologists Seek", Radiological Society of North America (RSNA), 2012 Annual Meeting, Nov. 2012, Chicago, IL,
R. M. Patton, T. E. Potok, B. A. Worley, "Discovery & Refinement of
Scientific Information via a Recommender System", The Second International Conference on Advanced Communications and Computation, Oct. 2012, Venice,
Collecting and extracting: Millions of documents from sources such as databases and social media can be collected, and text can be extracted from hundreds of file formats. This information can be translated into other
Cui, X., Beaver, J., St. Charles, J., Potok, T. (September 2008). Proceedings of the IEEE Swarm
Intelligence Symposium, St. Louis, Mo.
Categorizing: Grouping items via supervised and semi-supervised machine learning methods and targeted search lists.
Storing and indexing: Documents can be stored and indexed in search servers, relational databases, and similar systems.
Method and system for determining precursors of health abnormalities from processing medical records
Visualizing: Showing relationships among documents so that users can quickly recognize connections.
Dynamic reduction of dimensions of a document vector in a document search and retrieval system
Recommending: The system can highlight the most valuable information for specific users.
Agent-Based Software for Gathering and Summarizing Textual and Internet Information
Dimensionality Reduction for High Dimensional Particle Swarm Clustering
Agent-based method for distributed clustering of textual information
Big Data Can Help the Federal Government Move Mountains. Here's How.
Clustering: Similarity is used to group documents hierarchically.
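The clustering element can be pictured with a small agglomerative example. The Python sketch below groups documents hierarchically by repeatedly merging the two clusters with the highest average cosine similarity. Average linkage, the sparse dictionary vectors, and the names `cosine` and `agglomerate` are assumptions chosen for illustration; the source does not say which hierarchical method Piranha uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerate(vectors):
    """Average-linkage agglomerative clustering.

    Returns the merge history as (members_a, members_b, similarity) tuples,
    where members are tuples of document indices. The history describes
    the hierarchy from the first (most similar) merge to the last.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the highest average pairwise similarity.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sims = [cosine(vectors[p], vectors[q])
                        for p in clusters[i] for q in clusters[j]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, i, j)
        score, i, j = best
        merges.append((clusters[i], clusters[j], score))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

Given three weighted documents where the first two share terms and the third shares none, the first merge joins the two related documents and the last merge attaches the unrelated one with a similarity of zero. This brute-force version is only practical for small collections.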
Swimming with Piranha: Testing Oak Ridge's text analysis tool
Franklin Jr., Curtis (Nov 30, 2012) Enterprise Efficiency.
Awards, and scores of peer-reviewed research publications.
Method for gathering and summarizing internet information
System for gathering and summarizing internet information
Energy lab's Piranha puts teeth into text analysis
Piranha Brings Affordable Big-Data to Government
DOE Energy Innovation Portal (2014)
Breeden II, John (Dec 7, 2012) GCN.
Scotland, UK: IEE, 2004, pp. 152–155.
Yasin, Rutrell (Nov 29, 2012) GCN.
2007 R&D 100 Magazine's Award
Kirby, Bob (Summer 2013) FedTech.
United States Department of Energy
system. It was developed for the
Piranha has six main elements:
Oak Ridge National Laboratory
U.S. patent 8,473,314
U.S. patent 7,937,389
U.S. patent 7,805,446
U.S. patent 7,693,903
U.S. patent 7,315,858
U.S. patent 7,072,883
ORNL Piranha website
Piranha (software)
pp. 258–263.
External links
text mining
References
languages.
(DOE) by
Patents
Piranha
Awards
Italy.
is a
USA.