Procedural knowledge is everywhere — from technical manuals and safety guidelines to research protocols and industrial documentation. Yet most of this knowledge remains locked inside static PDF files, difficult to access, reuse, or integrate into intelligent systems. PERKS Solution 1 (S1) tackles this challenge head-on by enabling the automatic extraction of procedural knowledge from documents and transforming it into structured, machine-readable knowledge graphs.

From PDFs to Procedural Intelligence

At the core of PERKS S1 is an end-to-end pipeline designed to extract shallow procedural knowledge graphs directly from PDF documents. Instead of treating documents as unstructured text, the pipeline understands them as a sequence of executable procedural steps and formalizes these steps using a well-defined ontology.

The result: procedural instructions that can be queried, evaluated, reused, and integrated into downstream AI applications.

How the S1 Pipeline Works

PERKS S1 follows a modular workflow that covers the entire process — from raw documents to interoperable knowledge representations:

  • Document Parsing Module
    Procedural documents are transformed into structured text-layout representations, preserving both content and layout information.
  • Procedural Instruction Extraction
    Parsed text is analyzed to identify and separate individual procedural actions, enabling step-by-step interpretation of instructions.
  • Knowledge Graph Construction
    Extracted actions are semantically aligned with a user-defined ontology (the PERKS ontology) and formalized into knowledge graph triplets or other interoperable formats.
  • Hybrid AI Approach
    The pipeline combines rule-based heuristics with large language models (LLMs) and vision-language models (VLMs) to achieve robust performance across diverse document types.
  • Automatic Evaluation Module
    Each component of the pipeline can be evaluated using standardized, task-specific metrics. Evaluation strategies are customizable to support different use cases and domains.

Importantly, the entire pipeline can be operated offline or on-premise, making it suitable for sensitive or proprietary data environments.
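
To make the workflow above more concrete, the following is a minimal sketch of the extraction and graph-construction stages only, written with the Python standard library. It is not the PERKS S1 implementation: the numbered-line heuristic stands in for the hybrid rule-based and LLM/VLM extraction, and the `ex:` vocabulary, the `extract_steps` and `build_triples` functions, and the sample text are all hypothetical names chosen for illustration, not terms from the PERKS ontology.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical vocabulary for illustration only; NOT the actual PERKS ontology.
STEP_CLASS = "ex:Step"
HAS_STEP = "ex:hasStep"
HAS_TEXT = "ex:hasText"
NEXT_STEP = "ex:nextStep"

# Matches lines such as "1. Do something" or "2) Do something else".
STEP_PATTERN = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")


def extract_steps(parsed_text: str) -> List[str]:
    """Rule-based stand-in for instruction extraction: keep numbered lines."""
    steps = []
    for line in parsed_text.splitlines():
        match = STEP_PATTERN.match(line)
        if match:
            steps.append(match.group(2))
    return steps


def build_triples(steps: List[str], procedure_id: str = "ex:procedure1") -> List[Triple]:
    """Turn an ordered list of step texts into (subject, predicate, object) triples."""
    triples: List[Triple] = []
    previous_id = None
    for index, text in enumerate(steps, start=1):
        step_id = f"{procedure_id}/step{index}"
        triples.append((step_id, "rdf:type", STEP_CLASS))
        triples.append((step_id, HAS_TEXT, text))
        triples.append((procedure_id, HAS_STEP, step_id))
        if previous_id is not None:
            # Preserve the order of the procedure as explicit edges.
            triples.append((previous_id, NEXT_STEP, step_id))
        previous_id = step_id
    return triples


# Hypothetical parsed text, as a document parser might hand it over.
SAMPLE = """Replacing the filter
1. Switch off the machine.
2. Open the cover.
3) Remove the old filter.
"""

if __name__ == "__main__":
    for triple in build_triples(extract_steps(SAMPLE)):
        print(triple)
```

Keeping extraction and graph construction as separate functions mirrors the modular design described above: either stage could be replaced (for example, by a model-based extractor) or evaluated on its own without touching the other.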

Who Is PERKS S1 For?

PERKS S1 is designed primarily for:

  • Researchers working on procedural knowledge, knowledge graphs, or document understanding
  • AI developers building systems that rely on structured domain knowledge
  • Academic and industrial users seeking to integrate procedural knowledge graphs into downstream applications, such as decision support systems, automation, or intelligent assistants

Why It Matters

PERKS S1 brings innovation and flexibility to a challenging research area:

  • It demonstrates how LLM/VLM technologies can be combined with graph-based representations in a specialized domain.
  • The pipeline is adaptable to other domains with similar constraints, such as low-resource settings or environments requiring strong data protection.
  • Its modular design supports integration across organizations and systems.
  • It directly addresses research gaps in complex procedural knowledge extraction and multi-component system evaluation.
  • Combining LLMs/VLMs with user-defined knowledge bases has the potential to improve model transparency and explainability.

Current Status and Next Steps

PERKS S1 is currently at the prototype stage, laying a strong foundation for further development, experimentation, and cross-domain adaptation.

Get in Touch

PERKS S1 is developed by the MLT (Multilingual Language Technology) department of DFKI (German Research Center for Artificial Intelligence). For more information or collaboration opportunities, please contact:

  • Günter Neumann – guenter.neumann@dfki.de
  • Jörg Steffen – joerg.steffen@dfki.de
  • Cennet Oguz – cennet.oguz@dfki.de
  • Noon Pokaratsiri Goldstein – noon.pokaratsiri@dfki.de