Document Processing Workflow
Below we sketch the document processing workflow; for details, see the poster. See also janta-components.
Implementation Roadmap
See milestones & tickets for up-to-date information and details.
Rendering
Rendering: Notation Adaptation
- Purpose: Testing of the NTN Framework of the JOMDoc library
- Retrieve rendered documents from the file system (later TNTBase) and adapt notations; see ticket:533
Rendering: XML to XML
- Purpose: Allow integration with various interfaces (most frontends require a specific XML input format; see, e.g., wyzbook's format or WordML)
- Optionally, requested documents are converted to specific XML formats
- For demo purposes: XSLT-based conversion from OMDoc to XHTML
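The conversion step above can be illustrated with a small sketch. The demo named here is XSLT-based; to stay within the Python standard library, this sketch swaps in a hand-written ElementTree traversal instead of an XSLT stylesheet. The input element names (`document`, `title`, `text`) are hypothetical placeholders, not actual OMDoc markup.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def to_xhtml(source_root):
    """Map each child of a (hypothetical) source document to an XHTML element."""
    html = ET.Element("html", {"xmlns": XHTML_NS})
    body = ET.SubElement(html, "body")
    for node in source_root:
        # Assumption: a <title> child becomes a heading, everything else a paragraph.
        tag = "h1" if node.tag == "title" else "p"
        out = ET.SubElement(body, tag)
        out.text = "".join(node.itertext()).strip()
    return html


source = ET.fromstring(
    "<document><title>Groups</title>"
    "<text>A group is a monoid with inverses.</text></document>"
)
xhtml = ET.tostring(to_xhtml(source), encoding="unicode")
```

A frontend that expects another XML format would get its own mapping function with the same signature, so the choice of target format can remain optional per request.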
Initialization of the document commons: Segmentation and Integration
- Purpose: Identification, creation, and maintenance of an in-memory document commons
- On initial retrieval, documents are segmented and integrated into the document commons
- Segmentation preserves the original document
- Segmentation splits the document into knowledge items (suitable for reuse to create new documents) with globally unique identifiers
- Integration: We do not identify or construct new interrelations between knowledge items but simply draw on explicit cross-references within and between documents
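A minimal sketch of segmentation and integration, under stated assumptions: every top-level child of a document counts as one knowledge item, cross-references are `ref` elements with a `target` attribute, and the commons is a plain dictionary. None of these names come from the actual implementation; they are illustrative only.

```python
import copy
import uuid
import xml.etree.ElementTree as ET


def segment_and_integrate(doc_root, commons):
    """Split a document into knowledge items and record explicit cross-references."""
    original = copy.deepcopy(doc_root)  # segmentation preserves the original document
    order = []
    for child in doc_root:
        gid = str(uuid.uuid4())  # globally unique identifier for the item
        commons["items"][gid] = child
        local_id = child.get("id")
        if local_id is not None:
            # Assumption: local ids are shared across documents, so later
            # documents can reference items integrated earlier.
            commons["ids"][local_id] = gid
        order.append(gid)
    # Integration: no new relations are inferred, only explicit references are followed.
    for gid in order:
        for ref in commons["items"][gid].iter("ref"):
            target = commons["ids"].get(ref.get("target"))
            if target is not None:
                commons["links"].append((gid, target))
    return original, order


commons = {"items": {}, "ids": {}, "links": []}
root = ET.fromstring(
    '<doc><definition id="d1">A group is a monoid with inverses.</definition>'
    '<example id="e1">See <ref target="d1"/>.</example></doc>'
)
original, order = segment_and_integrate(root, commons)
```

References whose target has not been integrated yet are skipped in this sketch; a fuller version would have to keep them pending until the target arrives.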
Extraction of documents
- Purpose: Testing the document commons
- Extract a (previously integrated) document from the document commons
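Since the purpose of extraction is to test the commons, a round-trip check is a natural illustration. The sketch below assumes, hypothetically, that the commons records the ordered item identifiers of each integrated document under a `documents` key and that items are plain strings.

```python
def extract(commons, doc_id):
    """Rebuild a previously integrated document from its knowledge items."""
    item_ids = commons["documents"][doc_id]  # item order recorded at integration time
    return [commons["items"][item_id] for item_id in item_ids]


# Hypothetical commons content for one integrated document.
commons = {
    "items": {
        "k1": "Definition of a group.",
        "k2": "Example: the integers under addition.",
    },
    "documents": {"algebra-notes": ["k1", "k2"]},
}

original_text = ["Definition of a group.", "Example: the integers under addition."]
round_trip_ok = extract(commons, "algebra-notes") == original_text
```

If the extracted document differs from the original, either segmentation lost information or the recorded item order is wrong.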
Contextualization
- Contextualization of the rendering of documents is completed
- TODO: Specify context annotations for items in the document commons
- User-specific parameters: language
- System-specific parameters: font, font-size, line-height, margins (these define the amount of information on a document's page)
Contextualize Doc Extraction
- Purpose: Extending the NTN Framework towards Variants
- Adapt extraction of initial documents to the user according to context parameters (exchange text fragments of the original document with appropriate variants, cache the new narrative path in the narrative commons)
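The variant exchange and path caching described above might look roughly like this. The sketch uses only the `language` parameter, and the classes `Item` and `Commons`, the fallback language, and the cache key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    variants: dict  # language code -> text variant of this knowledge item


@dataclass
class Commons:
    items: dict       # item id -> Item
    documents: dict   # document id -> ordered item ids
    narrative_paths: dict = field(default_factory=dict)  # cache of chosen variants


def contextualize(commons, doc_id, language, default="en"):
    """Extract a document, exchanging each fragment for the variant matching the context."""
    key = (doc_id, language)
    if key not in commons.narrative_paths:
        path = []
        for item_id in commons.documents[doc_id]:
            variants = commons.items[item_id].variants
            # Fall back to the default language when no matching variant exists.
            chosen = language if language in variants else default
            path.append((item_id, chosen))
        commons.narrative_paths[key] = path  # cache the new narrative path
    return [commons.items[i].variants[lang] for i, lang in commons.narrative_paths[key]]


demo = Commons(
    items={
        "k1": Item({"en": "Definition", "de": "Definition (de)"}),
        "k2": Item({"en": "Example"}),
    },
    documents={"doc": ["k1", "k2"]},
)
german = contextualize(demo, "doc", "de")
```

System-specific parameters such as font or margins would more plausibly affect rendering than variant selection, so they are left out here.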
Document creation
- Selection and arrangement of knowledge items in a new document structure based on strategies and templates
Template-based Path Generation
- Purpose: Towards Generating new documents
- Generating a new path for an initial document, e.g. a slide presentation, based on templates/strategies
Template-based Document Generation
- Purpose: Towards Generating new documents
- Creating templates to generate new documents by selecting and arranging knowledge items
- Example: A guided tour for a text fragment includes a definition and examples for each symbol in the text fragment
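The guided-tour example can be sketched as a small selection-and-arrangement function. The item fields (`kind`, `symbols`, `about`) and the choice to place the fragment before its definitions and examples are assumptions of this sketch, not part of a specified template format.

```python
def guided_tour(fragment_id, items):
    """Arrange a text fragment followed by a definition and examples per symbol."""
    tour = [fragment_id]
    for symbol in items[fragment_id]["symbols"]:
        # For each symbol, definitions come first, then examples.
        for kind in ("definition", "example"):
            tour.extend(
                item_id
                for item_id, item in items.items()
                if item["kind"] == kind and item.get("about") == symbol
            )
    return tour


# Hypothetical knowledge items keyed by identifier.
items = {
    "t1": {"kind": "text", "symbols": ["group", "monoid"]},
    "d-group": {"kind": "definition", "about": "group"},
    "e-group": {"kind": "example", "about": "group"},
    "d-monoid": {"kind": "definition", "about": "monoid"},
}
tour = guided_tour("t1", items)
```

The result is a list of item identifiers, that is, a path through the commons rather than a rendered document.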
Semantic Document Generation
- Purpose: Towards Generating new documents
- Processing the content commons, semantic path generation along theoretical dependencies
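One way to read "path generation along theoretical dependencies" is as a topological ordering: every item appears after the items it depends on. The sketch below takes that reading as an assumption and uses the standard-library `graphlib` module (Python 3.9+); the dependency table is invented for illustration.

```python
from graphlib import TopologicalSorter


def semantic_path(target, depends_on):
    """Order the target and all its prerequisites so dependencies come first."""
    needed = set()
    stack = [target]
    while stack:  # collect the target and its transitive prerequisites
        item = stack.pop()
        if item not in needed:
            needed.add(item)
            stack.extend(depends_on.get(item, ()))
    graph = {item: depends_on.get(item, set()) for item in needed}
    # static_order() yields each node only after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical dependencies: item -> items it builds on.
depends_on = {
    "semigroup": set(),
    "monoid": {"semigroup"},
    "group": {"monoid"},
    "ring": {"group", "monoid"},
}
path = semantic_path("group", depends_on)
```

`TopologicalSorter` raises `graphlib.CycleError` on circular dependencies, which would have to be handled if the content commons allows them.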
How to assure that a user receives the same document?
- User modeling is not part of the document processor but sits one level up: the user model generates requests to the document processor (maybe even an interface task)
Attachments
- workflow.png (77.3 KB), added by cmueller 3 years ago.

