RemNote Community

Study Guide

📖 Core Concepts
- Metadata: “data about data”; it describes, defines, or gives information about other data.
- Core Functions: locate/retrieve data, support discovery, enable organization, preserve provenance, and allow interoperability across systems.
- Main Types
  - Descriptive: title, author, abstract, keywords (used for discovery).
  - Structural: how parts relate (pages → chapters, image layers).
  - Administrative: rights, permissions, creation details, technical specs.
- Metadata Models: NISO (descriptive, structural, administrative) and Kimball (technical, business, process).
- Schemas & Standards: formal structures (Dublin Core, ISO 11179, Data Catalog Vocabulary) that dictate allowed elements and syntax (XML, RDF, JSON-LD, etc.).
- Granularity: level of detail; high granularity = many fine-grained elements, low granularity = fewer, cheaper elements.
- Controlled Vocabularies: taxonomies, thesauri, and data dictionaries that enforce consistent term usage.

📌 Must Remember
- Metadata ≠ Documentation: metadata is a concise, standardized subset; documentation is narrative detail.
- Embedded vs. External: embedded travels with the file; external lives in a separate catalog/registry.
- Three NISO Types: Descriptive ↔ Structural ↔ Administrative.
- ISO 11179: the “registry” standard for meaning & technical structure of data.
- Dublin Core: 15 core elements (e.g., title, creator, subject, format).
- FAIR Principles: metadata should make data Findable, Accessible, Interoperable, Reusable.
- Metadata Quality Dimensions: completeness, accuracy, consistency, and timeliness.
- Legal/Privacy: metadata can be used in e-discovery; removal tools help protect sensitive info.

🔄 Key Processes
- Create Metadata
  - Automatic: extraction by software (e.g., camera EXIF, file system).
  - Manual: entered by a person (cataloguer, author).
- Register & Store
  - Submit to a metadata registry (ISO 11179-compliant).
  - Choose embedded (file header) or external (catalog, database).
- Validate Quality
  - Run completeness/consistency checks against the chosen schema.
- Manage Change
  - Version metadata records, log provenance, assess impact on downstream systems.
- Use in Data Warehouse
  - Technical metadata defines tables/fields; business metadata explains meaning; process metadata logs ETL steps.

🔍 Key Comparisons
- Descriptive vs. Structural vs. Administrative
  - Descriptive = “what is it?” (title, subject).
  - Structural = “how is it built?” (chapter order).
  - Administrative = “who can use it & how?” (rights, format).
- Embedded ↔ External Metadata
  - Embedded: stays with file → easy transport, risk of redundancy.
  - External: centralized → easier bulk editing, risk of misalignment.
- Hierarchical ↔ Linear Schemas
  - Hierarchical: parent-child (IEEE LOM).
  - Linear: flat list along one axis (Dublin Core).
- Technical ↔ Business ↔ Process Metadata (Kimball)
  - Technical: tables, data types.
  - Business: business meaning, source.
  - Process: ETL timestamps, CPU usage.

⚠️ Common Misunderstandings
- “All metadata is automatically generated.” Many domains (museum, library) rely on manual entry for quality.
- “Higher granularity is always better.” It increases cost and may overwhelm users; choose the level that matches discovery needs.
- “Metadata guarantees privacy.” Detailed metadata can expose sensitive context (e-discovery, location tags).
- “Metadata and documentation are interchangeable.” Documentation provides narrative context; metadata provides standardized, searchable descriptors.

🧠 Mental Models / Intuition
- Metadata = “Label on a Box.” The label tells you what’s inside, who made it, when, and how to handle it without opening the box.
- Data Warehouse DNA: think of technical, business, and process metadata as the three strands of DNA that define the warehouse’s structure, meaning, and behavior.
- Schema as a Blueprint: a schema (hierarchical or linear) is the building plan that tells you where each component fits.
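The Embedded ↔ External comparison above warns of misalignment between the copy that travels with a file and the copy held in a catalog. As a minimal sketch of how that drift might be detected, assuming both copies have already been read into plain dictionaries, the example below lists every field whose value differs. The function name `find_misaligned_fields` and the sample records are hypothetical illustrations, not part of any standard or library.

```python
def find_misaligned_fields(embedded: dict, external: dict) -> dict:
    """Return fields whose values differ between the two copies.

    Each entry maps a field name to (embedded value, external value).
    A field missing from one side is reported as None for that side.
    """
    mismatches = {}
    for field in sorted(set(embedded) | set(external)):
        in_file = embedded.get(field)
        in_catalog = external.get(field)
        if in_file != in_catalog:
            mismatches[field] = (in_file, in_catalog)
    return mismatches


# Hypothetical data: tags in a photo's header vs. its catalog entry.
embedded_tags = {
    "title": "Harbor at dawn",
    "creator": "A. Rivera",
    "rights": "CC BY 4.0",
}
catalog_record = {
    "title": "Harbor at Dawn",
    "creator": "A. Rivera",
    "subject": "harbors",
}

for name, (in_file, in_catalog) in find_misaligned_fields(
    embedded_tags, catalog_record
).items():
    print(f"{name}: file={in_file!r} catalog={in_catalog!r}")
```

The comparison is deliberately strict (exact equality), so even a capitalization difference counts as a mismatch; a real workflow would likely normalize values first and decide which copy is authoritative.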
🚩 Exceptions & Edge Cases
- Low-Granularity Use Cases: quick bulk ingestion (e.g., sensor streams) may accept minimal metadata to reduce overhead.
- Embedded Metadata Redundancy: when the same metadata is stored both in file headers and external catalogs, inconsistencies can arise.
- Controlled Vocabularies May Differ by Community: Getty vocabularies suit art; Library of Congress vocabularies suit books. Using the wrong set reduces retrieval precision.
- Legal Metadata in US Litigation: removal tools are essential; simply stripping EXIF may not erase all discoverable metadata.

📍 When to Use Which
- Choose a Standard
  - Dublin Core → simple web/resource discovery.
  - ISO 11179 → enterprise-wide data registries needing precise technical definitions.
  - Data Catalog Vocabulary (DCAT) → publishing datasets on the web.
- Select Storage Mode
  - Embedded for portable media (photos, audio).
  - External for large-scale catalogs (library collections, data warehouses).
- Pick Schema Type
  - Hierarchical when resources have nested parts (e-books, learning objects).
  - Linear for flat resource listings (catalog of datasets).
- Apply Controlled Vocabulary when consistent term use is critical (museum collections, ecological data).

👀 Patterns to Recognize
- “Who, What, When, Where, Why, How” → appears in ecological metadata and many provenance records.
- Parent-Child Elements → hierarchical schemas (e.g., chapter → page).
- “Title, Creator, Subject, Format” → the core Dublin Core pattern.
- Metadata Quality Flags: missing required fields, mismatched controlled-vocab terms, or inconsistent date formats.

🗂️ Exam Traps
- Confusing Descriptive with Structural: a question may list “page order” and expect you to label it structural, not descriptive.
- Assuming All Web Metadata Is Dublin Core: many sites use microformats, RDFa, or custom schemas; only some elements map to Dublin Core.
- Thinking Embedded Metadata Is Always Up-to-Date: files can be copied without updating embedded tags, leading to stale information.
- Believing More Metadata = Better Privacy: detailed geotags or timestamps can actually expose location and timing.
- Mixing Up Controlled Vocabularies: selecting Getty terms for a scientific dataset is a common distractor.

---

Use this guide to quickly recall the “what, why, and how” of metadata before the exam. Focus on the core concepts, compare the types, and watch out for the common traps!
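For practice, the three metadata quality flags listed under Patterns to Recognize (missing required fields, mismatched controlled-vocab terms, inconsistent date formats) can be turned into a small checker. This is only a sketch under stated assumptions: the required fields, the subject vocabulary, the YYYY-MM-DD date convention, and the function names are all hypothetical choices for illustration, and every value is assumed to be a string.

```python
import re
from datetime import datetime

# Assumptions for illustration only; a real schema defines its own rules.
REQUIRED_FIELDS = ("title", "creator", "date")
SUBJECT_VOCABULARY = {"ecology", "hydrology", "botany"}


def is_iso_date(value: str) -> bool:
    """True only for a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def quality_flags(record: dict) -> list:
    """Return a human-readable flag for each quality problem found."""
    flags = []
    # Flag 1: required fields that are absent or empty.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            flags.append(f"missing required field: {field}")
    # Flag 2: subject terms outside the controlled vocabulary.
    subject = record.get("subject")
    if subject is not None and subject not in SUBJECT_VOCABULARY:
        flags.append(f"subject not in controlled vocabulary: {subject}")
    # Flag 3: dates that do not follow the agreed format.
    date = record.get("date")
    if date and not is_iso_date(date):
        flags.append(f"date not in YYYY-MM-DD format: {date}")
    return flags


# Hypothetical record with one problem of each kind.
sample = {
    "title": "Stream gauge readings",
    "creator": "",
    "subject": "geology",
    "date": "03/05/2024",
}
for flag in quality_flags(sample):
    print(flag)
```

A record that passes all three checks returns an empty list, which makes the function easy to use as a gate before registering metadata.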