126:
that reflect a particular problem domain (in Word's case, formatting at the character, paragraph, and document levels, definitions of styles, inclusion of citations, and so on) and that are nested within each other in complex ways. Understanding even a portion of such an XML document by reading it, let alone catching errors in its structure, is impractical without a deep prior understanding of the specific XML implementation and the help of software that understands the XML schema in use. Such text is no more "human-understandable" than a book written in
Swahili (which uses the Latin alphabet) would be to an American or Western European who does not know a word of that language: the tags are symbols that are meaningless to a person unfamiliar with the domain.
125:
The concept of XML as "human-readable", however, can only be taken so far. Some dialects of XML, such as the XML representation of the contents of a
Microsoft Word document, as implemented in Office 2007 and later versions, use dozens or even hundreds of different kinds of tags
110:
Some types of data described here as "semi-structured", especially XML, suffer from the impression that they are incapable of the structural rigor of
relational tables and rows. Indeed, the view of XML as inherently semi-structured (previously, it was referred to as
381:. Typically, the records in a semi-structured database are stored with unique IDs that are referenced with pointers to their location on disk. This makes navigational or path-based queries quite efficient, but for searches over many records (as is typical in
111:"unstructured") has handicapped its use for a widening range of data-centric applications. Even documents, normally thought of as the epitome of semi-structure, can be designed with virtually the same rigor as
122:
In view of this fact, XML might be referred to as having "flexible structure" capable of human-centric flow and hierarchy as well as highly rigorous element structure and data typing.
137:
or JavaScript Object Notation, is an open standard format that uses human-readable text to transmit data objects. JSON has been popularized by web services developed using
103:(Object Exchange Model) was created before XML as a means of making a data structure self-describing. XML has been popularized by web services that are developed using
259:
Prone to "garbage in, garbage out": because constraints are removed from the data model, less forethought is required to operate a data application.
38:
or other markers to separate semantic elements and enforce hierarchies of records and fields within the data. Therefore, it is also known as
242:
Support for lists of objects simplifies data models by avoiding messy translations of lists into a relational data model.
239:
Support for nested or hierarchical data often simplifies data models representing complex relationships between entities.
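As a rough illustration of the two points above, the sketch below holds one order as a single nested record and then flattens it into the separate row sets a relational model would typically need. The record layout, the field names, and the `flatten_order` helper are assumptions invented for this example.

```python
import json

# Hypothetical order record: a nested customer object plus a list of line items.
order = {
    "id": 1,
    "customer": {"name": "Ada", "city": "London"},
    "items": [
        {"sku": "A-100", "qty": 2},
        {"sku": "B-200", "qty": 1},
    ],
}


def flatten_order(doc):
    """Split one nested document into rows for two relational-style tables."""
    order_row = (doc["id"], doc["customer"]["name"], doc["customer"]["city"])
    # Each list element becomes its own row carrying the parent order ID.
    item_rows = [(doc["id"], item["sku"], item["qty"]) for item in doc["items"]]
    return order_row, item_rows


order_row, item_rows = flatten_order(order)

# The nested form round-trips through JSON as a single value.
restored = json.loads(json.dumps(order))
```

In the nested form the list of items travels with its parent; in the flattened form the relationship has to be carried by a repeated order ID.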
119:
and processed by both commercial and custom software programs without reducing their usability by human readers.
377:
is that queries cannot be made as efficiently as in a more constrained structure, such as in the
232:
Programmers persisting objects from their application to a database do not need to worry about
415:
152:
store data natively in JSON format, taking advantage of the strengths of a semi-structured data architecture.
357:
It can represent the information of some data sources that cannot be constrained by schema.
385:), it is not as efficient because it has to seek around the disk following pointers.
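The difference between the two query styles can be sketched with a small in-memory store. This is only an illustration under stated assumptions: the `store` layout, the record IDs, and the `follow_path` and `scan` helpers are hypothetical, and a Python dictionary does not reproduce the cost of disk seeks. It only shows which records each style has to touch.

```python
# Hypothetical store: each record is reachable by a unique ID and refers to
# related records by ID (standing in for on-disk pointers).
store = {
    "u1": {"type": "user", "name": "Ada", "orders": ["o1", "o2"]},
    "o1": {"type": "order", "total": 30},
    "o2": {"type": "order", "total": 45},
    "u2": {"type": "user", "name": "Grace", "orders": []},
}


def follow_path(store, start_id, field):
    """Navigational query: start at one record and follow its pointers."""
    return [store[ref] for ref in store[start_id][field]]


def scan(store, predicate):
    """Set-oriented query: every record is visited and tested."""
    return [rec for rec in store.values() if predicate(rec)]


ada_orders = follow_path(store, "u1", "orders")
large_orders = scan(
    store, lambda rec: rec.get("type") == "order" and rec.get("total", 0) > 40
)
```

`follow_path` touches only the starting record and the records it points to, while `scan` must look at every record regardless of how few match.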
363:
It can be helpful to view structured data as semi-structured (for browsing purposes).
360:
It provides a flexible format for data exchange between different types of databases.
67:
are not the only forms of data anymore, and different applications need a medium for
45:
In semi-structured data, the entities belonging to the same class may have different
471:
252:
The traditional relational data model has a popular and ready-made query language,
49:
even though they are grouped together, and the attributes' order is not important.
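A minimal sketch of this property, using Python dictionaries as stand-ins for semi-structured records. The `people` data and attribute names are made up for illustration.

```python
# Two hypothetical records of the same class ("person") with different attributes.
people = [
    {"name": "Ada", "email": "ada@example.org"},
    {"name": "Grace", "phone": "555-0100", "languages": ["COBOL"]},
]

# Attribute order carries no meaning: these two records compare equal.
a = {"name": "Ada", "email": "ada@example.org"}
b = {"email": "ada@example.org", "name": "Ada"}
same_record = a == b

# The attributes used by the class have to be discovered from the data itself,
# because no schema declares them up front.
all_attributes = sorted({key for person in people for key in person})
```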
52:
Semi-structured data have become increasingly common since the advent of the
26:
that does not obey the tabular structure of data models associated with
145:
392:(OEM) is one standard for expressing semi-structured data; another is
461:
The Penn database group has semi-structured and XML data project
492:
Semi-Structured data analytics: Relational or Hadoop platform?
236:, but can often serialize objects via a light-weight library.
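As a sketch of what such lightweight serialization can look like, the example below turns a nested application object into a JSON document using only the Python standard library. The `User` and `Address` classes are hypothetical, and the choice of `dataclasses.asdict` with `json.dumps` is an assumption for illustration, not a method named by the source.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Address:
    city: str
    country: str


@dataclass
class User:
    name: str
    address: Address
    tags: list = field(default_factory=list)


user = User("Ada", Address("London", "UK"), ["admin"])

# asdict() recursively converts nested dataclasses, so the object graph maps
# directly onto one nested document with no table mapping step.
document = json.dumps(asdict(user), sort_keys=True)
restored = json.loads(document)
```

The nested `Address` and the `tags` list are stored inside the same document as the user, which is the step that would otherwise require mapping onto separate tables.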
373:
The primary trade-off being made in using a semi-structured
350:, and the amount of structure used depends on the purpose.
353:The advantages of this model are the following:
488: – semi-structured data and XML
8:
448:Symposium on Principles of Database Systems
369:The data transfer format may be portable.
342:where there is no separation between the
75:, one often finds semi-structured data.
431:
99:are all forms of semi-structured data.
16:Data organized by tags but not tables
7:
234:object-relational impedance mismatch
14:
366:The schema can easily be changed.
272:
164:
472:Stanford University's Lore DBMS
1:
34:, but nonetheless contains
524:
91:, other markup languages,
73:object-oriented databases
439:Peter Buneman (1997).
69:exchanging information
441:"Semistructured data"
406:Semi-structured model
390:Object Exchange Model
336:semi-structured model
264:Semi-structured model
486:UPenn Database Group
28:relational databases
20:Semi-structured data
144:Databases such as
115:, enforced by the
30:or other forms of
416:Unstructured data
379:relational model
421:Structured data
113:database schema
40:self-describing
24:structured data
480:External links
375:database model
340:database model
508:Data modeling
247:Disadvantages
156:Pros and cons
22:is a form of
141:principles.
107:principles.
42:structure.
32:data tables
427:References
227:Advantages
117:XML schema
47:attributes
150:Couchbase
65:databases
61:documents
58:full-text
502:Category
400:See also
346:and the
54:Internet
146:MongoDB
494:by IBM
348:schema
95:, and
56:where
444:(PDF)
411:NoSQL
338:is a
93:email
79:Types
71:. In
388:The
344:data
334:The
148:and
139:REST
135:JSON
130:JSON
105:SOAP
63:and
36:tags
394:XML
383:SQL
254:SQL
101:OEM
97:EDI
89:XML
84:XML
504::
446:.
396:.
450:.
256:.